elasto
#1 Posted : Sunday, October 9, 2011 2:13:02 PM(UTC)
I just want to check my reading of the situation is correct.

Firstly, some background:

I have a large object (8k vertices, 44k indices) that I am splitting up into separate sub-meshes in order to do early culling - only drawing those sections of the mesh that are on screen. (Maybe this is too many indices to display on a mobile, but put that issue to one side for the purpose of this question!)

Currently, in my XNA code, I have a single Vertex Buffer shared across all the sub-meshes, and each sub-mesh has its own Index Buffer. To draw each mesh, therefore, I keep the Vertex Buffer the same and simply reference a new Index Buffer for each GraphicsDevice.DrawIndexedPrimitives call.
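To make that concrete, the XNA side looks roughly like this (SubMesh and the buffer names are just my own placeholders):

// One shared vertex buffer for the whole object.
GraphicsDevice.SetVertexBuffer(sharedVertexBuffer);
foreach (SubMesh subMesh in visibleSubMeshes)
{
    // Swap in this sub-mesh's own index buffer and draw it
    // against the shared vertices.
    GraphicsDevice.Indices = subMesh.IndexBuffer;
    GraphicsDevice.DrawIndexedPrimitives(PrimitiveType.TriangleList,
        0, 0, sharedVertexCount, 0, subMesh.TriangleCount);
}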

It seems to me (at first glance - I've only had this code a couple of hours!) that DE has Meshes which have GeometryData, and GeometryData has Vertices and Indices which presumably turn into objects cached on the GPU. In other words, the collections of Vertices and Indices seem to be tied together within the same object (GeometryData) rather than being addressable as objects in their own right, so I can't have a single Vertex Buffer utilised by multiple Index Buffers in quite the same way.

Presumably I could use a single GeometryData object and keep altering the Indices on it - but that would presumably keep streaming index data from the CPU to the GPU every frame and be less efficient (or am I misunderstanding, and this is what happens behind the scenes with indices anyway?).

Am I understanding all this right?

If I am, it's not a terribly big deal; it just presumably means each sub-mesh needs its own Vertex Buffer, which means the inefficiency of duplicating those vertices shared between meshes (I sketch the remapping below).

(Edit: Although, is splitting up the Vertex Buffer the better thing to do anyway, given the single Vertex Buffer might be too big to keep in the fastest level of GPU cache all at once?!)
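For what it's worth, if I did give each sub-mesh its own Vertex Buffer, I'd build each one's compact vertex array by remapping the global indices to local ones, roughly like this (all the names here are just my own):

// Remap one sub-mesh's slice of the global index list into a compact
// local vertex array plus local indices.
var globalToLocal = new Dictionary<int, short>();
var localVertices = new List<VertexPositionNormalTexture>();
var localIndices = new short[subMeshIndexCount];
for (int i = 0; i < subMeshIndexCount; i++)
{
    int globalIndex = globalIndices[subMeshStartIndex + i];
    short localIndex;
    if (!globalToLocal.TryGetValue(globalIndex, out localIndex))
    {
        // First time this sub-mesh sees this vertex: copy it across.
        localIndex = (short)localVertices.Count;
        localVertices.Add(globalVertices[globalIndex]);
        globalToLocal.Add(globalIndex, localIndex);
    }
    localIndices[i] = localIndex;
}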



Benjamin (Admin)
#2 Posted : Sunday, October 9, 2011 6:58:26 PM(UTC)
Hey PG.

First of all, 8k vertices and even 44k indices (which sounds like a lot for 8k vertices) is no problem on any platform (neither in terms of hardware limits nor performance).

We do have some mesh atlasing and sharing in the lower levels, but I am not sure if this code is activated right now; some hints could be in OpenTKGeometry.cs. The idea is that you just assign lots of data to GeometryData, or even better use Mesh.Draw, which calls Material.Draw with the geometry, and then everything is optimized in the best way possible on each platform.

In your case you could just split up the 8k vertices and 44k indices into many meshes (you can share the same data if you like) and just throw them all out; the MaterialManager will sort and put them back together in the fastest render order (since they probably share material, shader and mesh data, they will be rendered in one batch later anyway).

You could also try to do some of your own VertexBuffer and IndexBuffer stuff (all the low-level code is open), but we obviously can't guarantee it will run fine on other platforms. But you could try it out and we can help test and optimize it.
elasto
#3 Posted : Monday, October 10, 2011 2:29:02 AM(UTC)
Originally Posted by: Benjamin Nitschke (DeltaEngine)
Hey PG.

First of all, 8k vertices and even 44k indices (which sounds like a lot for 8k vertices) is no problem on any platform (neither in terms of hardware limits nor performance).

Cool!

Quote:
We do have some mesh atlasing and sharing in the lower levels, but I am not sure if this code is activated right now; some hints could be in OpenTKGeometry.cs. The idea is that you just assign lots of data to GeometryData, or even better use Mesh.Draw, which calls Material.Draw with the geometry, and then everything is optimized in the best way possible on each platform.

By 'mesh atlasing', you mean it automagically splits up the mesh into submeshes if it deems that a more optimal way to render it? I'll take a look at OpenTKGeometry and see what I can discern, anyway.

Quote:
In your case you could just split up the 8k vertices and 44k indices into many meshes (you can share the same data if you like) and just throw them all out; the MaterialManager will sort and put them back together in the fastest render order (since they probably share material, shader and mesh data, they will be rendered in one batch later anyway).

I notice this summary comment in the MaterialManager:

// Show all materials collected this frame with this MaterialManager. Renders out all materials in an optimized way (sorted by render layer, then by blendMode, then by shader (currently disabled, we only have one shader working now: DiffuseMaterial), and finally by material data (image and color)).

An additional reason I am splitting up the mesh into sub-meshes is that it allows me to draw the meshes in z-buffer order.

I do this in a somewhat crude way: I maintain a bounding box round the sub-mesh as I add vertices to it, and I assign a single z-buffer value to the whole sub-mesh - which I could obtain, for example, by taking the centre of the bounding box, multiplying by the WorldViewProjection matrix and storing the resulting z value. (I don't do it quite that way - I do something a bit more sophisticated - but I don't want to needlessly overcomplicate this post!)

Ok, so at the end of this, each sub-mesh has a single z-buffer value which I then sort by to determine rendering order. I find it makes about a 15% improvement in framerate compared to having it reverse-sorted - and, just as importantly, the framerate stays consistent (not better from one viewpoint and worse from another).
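In case it's useful, the simple version of that depth key calculation is just this (Bounds, DepthKey and worldViewProjection are my own names):

// One depth value for the whole sub-mesh: transform the bounding box
// centre into clip space and keep the perspective-divided z.
Vector3 centre = (subMesh.Bounds.Min + subMesh.Bounds.Max) * 0.5f;
Vector4 clip = Vector4.Transform(new Vector4(centre, 1f), worldViewProjection);
subMesh.DepthKey = clip.Z / clip.W;

// Then sort front-to-back by that key before issuing the draws.
subMeshes.Sort((a, b) => a.DepthKey.CompareTo(b.DepthKey));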

So, going back to the comment in the MaterialManager, it doesn't appear to offer any z-buffer ordering possibility (or does it?). This would clearly be an ordering that occurred after all the others. My suggestion would be a new property on GeometryData that could be left as zero, or populated by the coder in whatever way they saw fit, and, if present, would serve as the last ordering the MaterialManager sorts by (see the sketch below).
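To make the suggestion concrete, I'm imagining the final comparison looking something like this - all the field names are invented, obviously, since I can't see the real MaterialManager code:

// Hypothetical sort: the existing keys first, then the user-supplied
// depth value as the final tie-breaker (left at zero = don't care).
static int Compare(DrawCall a, DrawCall b)
{
    int result = a.RenderLayer.CompareTo(b.RenderLayer);
    if (result == 0)
        result = a.BlendMode.CompareTo(b.BlendMode);
    if (result == 0)
        result = a.MaterialData.CompareTo(b.MaterialData);
    if (result == 0)
        result = a.DepthSortValue.CompareTo(b.DepthSortValue);
    return result;
}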

(Am I right in thinking that the MaterialManager class is not open source?)

Quote:
You could also try to do some of your own VertexBuffer and IndexBuffer stuff (all the low-level code is open), but we obviously can't guarantee it will run fine on other platforms. But you could try it out and we can help test and optimize it.

It's all quite a lot to take in right now. That may be for a later time :)


Edit: (Edited out my earlier edit cos I think I answered my own question!)


Benjamin (Admin)
#4 Posted : Monday, October 10, 2011 3:13:28 AM(UTC)
Originally Posted by: PG
By 'mesh atlasing', you mean it automagically splits up the mesh into submeshes if it deems that a more optimal way to render it? I'll take a look at OpenTKGeometry and see what I can discern, anyway.


Yes, it's not even that magic, because it is required on some platforms when bone limits are reached and shaders would get too complex and slow when rendering everything at once. It also works nicely for dynamic geometry (created and changed each frame), because creating and disposing new vertex buffers, index buffers and VBOs or VAOs every frame is too time-consuming, so reusing is good.
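In plain XNA terms the reuse idea is just "allocate once, overwrite every frame", something like this rough sketch:

// Allocate once, sized for the worst case...
var dynamicVertices = new DynamicVertexBuffer(graphicsDevice,
    typeof(VertexPositionNormalTexture), maxVertexCount, BufferUsage.WriteOnly);

// ...then each frame overwrite the contents instead of creating and
// disposing a new buffer. Discard tells the driver the old data can
// be thrown away, which avoids stalling the GPU.
dynamicVertices.SetData(vertices, 0, vertexCount, SetDataOptions.Discard);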

Our Soulcraft team has done a lot of optimization to get very complex scenes with many hundreds of thousands of vertices and polygons running nicely, merging meshes based on this and also doing good culling.

Quote:
I notice this summary comment in the MaterialManager:

// Show all materials collected this frame with this MaterialManager. Renders out all materials in an optimized way (sorted by render layer, then by blendMode, then by shader (currently disabled, we only have one shader working now: DiffuseMaterial), and finally by material data (image and color)).

An additional reason I am splitting up the mesh into sub-meshes is that it allows me to draw the meshes in z-buffer order.

I do this in a somewhat crude way: I maintain a bounding box round the sub-mesh as I add vertices to it, and I assign a single z-buffer value to the whole sub-mesh - which I could obtain, for example, by taking the centre of the bounding box, multiplying by the WorldViewProjection matrix and storing the resulting z value. (I don't do it quite that way - I do something a bit more sophisticated - but I don't want to needlessly overcomplicate this post!)

Ok, so at the end of this, each sub-mesh has a single z-buffer value which I then sort by to determine rendering order. I find it makes about a 15% improvement in framerate compared to having it reverse-sorted - and, just as importantly, the framerate stays consistent (not better from one viewpoint and worse from another).


Sounds good. We tried to avoid sorting as much as possible because it really hurts CPU performance on slow mobile platforms, but from time to time it is definitely needed (e.g. for particle effects).

Quote:
So, going back to the comment in the MaterialManager, it doesn't appear to offer any z-buffer ordering possibility (or does it?). This would clearly be an ordering that occurred after all the others. My suggestion would be a new property on GeometryData that could be left as zero, or populated by the coder in whatever way they saw fit, and, if present, would serve as the last ordering the MaterialManager sorts by.

(Am I right in thinking that the MaterialManager class is not open source?)


It does, just not in the way you expect yet. We will provide more knobs in the future to customize rendering further. Currently what is mostly being done is rendering via the draw layers (pre-sorted by how you put stuff in), sorting per shader, material, etc., and rendering out in batches to be really quick, while also merging geometry as much as possible to reduce calls to the graphics layer.

MaterialManager is currently what we call protected (you might have noticed that Delta.Rendering.Effects is also protected). The reason is that the code is not final yet, so releasing it is too early, plus it would produce lots of problems when others make changes to the code (because we are also going to change it heavily). More and more code will be opened up during the beta process in v0.9.x; some code might stay protected, but 90%+ will be open source and visible at the end of the beta. I personally even plan to bring everything to open source and make the engine free on all platforms once we have our Library Store up, which is IMO a much better way to provide more services, but that is in the far future (it took long enough to convince everyone to make this much open source so far).

Good luck with your rendering. PS: We have not done much testing with 3D, as content generation is currently very limited; we plan to work on 3D in v0.9.2 and bring more high-level features back into the engine once we feel good about covering all the basics on all platforms.
elasto
#5 Posted : Monday, October 10, 2011 3:47:40 AM(UTC)
Originally Posted by: Benjamin Nitschke (DeltaEngine)

<snip>


Ok. I think what I'll do for now is keep my approach as simple as possible, and just keep an eye on what options open up as later versions come out.

When it's relatively close to the time we release our game, I might try a couple of more sophisticated methods, when I am better able to profile the performance differences on the target platform.

Quote:
Sounds good. We tried to avoid sorting as much as possible because it really hurts CPU performance on slow mobile platforms, but from time to time it is definitely needed (e.g. for particle effects).
Yes. In my game, calculating z-buffer values and doing a sort every frame makes the framerate almost as bad as simply rendering unsorted.

Fortunately I only need to re-sort when the WorldViewProjection matrix changes - and even then I could limit it to every tenth change or so, cos the camera doesn't move that quickly.
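In code the throttling amounts to something like this (the counter and the RecalculateDepthKeysAndSort call are just my own names):

// Only consider re-sorting when the camera has actually moved, and
// even then only on every tenth change.
if (worldViewProjection != lastWorldViewProjection)
{
    lastWorldViewProjection = worldViewProjection;
    if (++changesSinceSort >= 10)
    {
        changesSinceSort = 0;
        RecalculateDepthKeysAndSort();
    }
}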

Thanks for taking the time to reply when clearly this is a very busy time!