How to optimise updating a large octree?

Hey Everyone,

I’ve got a scene with a lot of meshes. Performance wasn’t great, and I only actually need a few meshes on screen at any one time. So I created an octree for the scene using scene.createOrUpdateSelectionOctree(), which worked wonderfully!

The issue I have now is that whenever I add or remove a mesh from the scene (a static one, not a dynamic one) I have to call scene.createOrUpdateSelectionOctree() again, which takes about 100ms to complete, and the scene skips.

I need to keep these meshes as static to keep the performance gains of the octree over time.

Is there a better way to do this?

As a side question: is there also a better way to handle dynamic content in the scene octree? I am adding and removing meshes from scene.selectionOctree.dynamicContent all the time. I add a mesh and it doesn’t show, and then I remember I have to add it to the dynamic content array too, or rebuild the octree.
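To make the bookkeeping explicit, here is a minimal sketch of the "add to the scene AND to the dynamic content array" pattern described above. It uses plain objects in place of Babylon.js types; in a real scene the array in question is `scene.selectionOctree.dynamicContent`, which the octree checks every frame for moving meshes.

```javascript
// Sketch only: `scene` and `mesh` are plain-object stand-ins for the
// Babylon.js Scene and Mesh types.
function addDynamicMesh(scene, mesh) {
  scene.meshes.push(mesh);
  // Without this second step the mesh is never selected for rendering,
  // because the octree replaces the default full-scene selection.
  scene.selectionOctree.dynamicContent.push(mesh);
}

function removeDynamicMesh(scene, mesh) {
  scene.meshes = scene.meshes.filter((m) => m !== mesh);
  scene.selectionOctree.dynamicContent =
    scene.selectionOctree.dynamicContent.filter((m) => m !== mesh);
}
```

Wrapping both updates in one helper like this is one way to stop forgetting the second step.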

So it really comes back to the topic question: is there a way to optimise the octree for updates, in both performance and API?

This is a tough question as the octree is mostly a static structure.

But I’m sure I can optimize it if you only add new meshes and do not remove existing ones

Can you share a repro in the Playground?

Hmm… it will be removing and adding. Is there a better way to approach this?

Just imagine a scene with 50000 “objects”. Then put them into an octree. Then start adding and removing some around the place. And just keep increasing that number of objects :slight_smile:

I can put a little demo together in the playground. But I thought it was quite a simple use case?

Are you gonna have 50000 objects that are unique? How many draw calls are you keeping it down to? With correct instancing and with the normal culling methods what kind of performance are you getting?

It is a simple use case but not a simple algorithm :slight_smile:
Remember that JavaScript has only ONE thread, so what could easily be done with threads is tough to do in JS

Here is an easy test environment for what you are doing and messing around with Octrees and performance.

Thanks so much guys.

I appreciate the effort it takes.

I’ll pop together a demo with the instanced spheres with an octree, with a timer adding and removing spheres.

I am using instancing to get the draw calls down.

For me the bigger issue is actually removing objects from the octree. I can add items as dynamic without a big performance hit for a while, then rebuild the octree just once. But I can’t remove objects. I guess I could just hide them and then do the same thing, batch-updating the octree. But it still isn’t optimal; there will be a bit of a “skip” as it updates.
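The batching idea above can be sketched as a tiny helper: queue up changes (hiding or adding meshes cheaply in the meantime) and pay the rebuild cost once per batch instead of on every change. `rebuild` here is a stand-in for a call like `scene.createOrUpdateSelectionOctree()`; the helper itself is plain JS.

```javascript
// Sketch of batched octree rebuilds. Instead of rebuilding on every
// add/remove (~100ms each), count pending changes and rebuild once.
function createOctreeBatcher(rebuild) {
  let pending = 0;
  return {
    // Call whenever a mesh is added/hidden; the octree stays untouched.
    queueChange() { pending++; },
    // Call once per batch, e.g. on a timer or after a burst of edits.
    flush() {
      if (pending > 0) {
        rebuild();
        pending = 0;
      }
    },
  };
}
```

This doesn’t remove the “skip” during the rebuild, but it caps it at one rebuild per batch rather than one per mesh.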

With your demo I’ll be able to improve that situation :)

I’ve put together a small demo.

Can you create an issue on the repo for it? I’ll try to work on it asap

Thanks so much!!! :slight_smile:

Here we are:

I also added octree.removeMesh()

Hope you like it!

Thanks! That’s awesome. Should keep me going for the next 6 months :smiley:

So. I’ve found out it isn’t the best idea.

If you start adding too many meshes, even very simple ones that are instanced, you quickly use up the system memory. Also, adding and removing meshes starts causing issues with the garbage collector.

I ended up adding the d3 octree library and using that for the octree. It only handles points, but it seems to perform well, so it gives a pretty good approximation for the culling.

To help with memory, I remove meshes that aren’t needed and add them back when they are. So the culling actually removes them from and adds them to the scene, rather than just skipping rendering them in that pass. That works fine for my purposes; they don’t need to actually exist for anything other than the visuals.

But I then ran into performance issues. If, say, you spin in a circle and it removes and adds a large number of meshes, it starts to chug. So I chunk them and just load, say, 20 per frame.
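The per-frame chunking described above can be sketched as a simple queue that drains at most N items each frame. This is plain JS; in a real project the `tick()` call would sit in the render loop (for example, Babylon's `scene.onBeforeRenderObservable`).

```javascript
// Sketch of chunked loading: drain a work queue N items per frame so a
// burst of adds/removes doesn't stall a single frame.
function createChunkedLoader(processItem, itemsPerFrame = 20) {
  const queue = [];
  return {
    enqueue(item) { queue.push(item); },
    // Call once per frame from the render loop.
    tick() {
      for (let i = 0; i < itemsPerFrame && queue.length > 0; i++) {
        processItem(queue.shift());
      }
    },
    get pending() { return queue.length; },
  };
}
```

With 20 items per frame, a burst of 50 adds spreads over 3 frames instead of stalling one.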

Your comment about the lack of threading making the octree tough, I totally understand that now! It takes a fair amount of effort to scale to large scenes in JavaScript!

Is there a way to get a lighter version of an instanced mesh that uses less memory and CPU? Something along the lines of what a particle system does? I think that would speed up my use case. Happy to lose a lot of features; it’s really just the visuals I want.

Did you check the SPS?

Yeah, I sure did. I’m just not sure how to add and remove shapes from it without creating a new SPS. Perhaps if I make lots of smaller SPSs, I can rebuild them without too much cost and still keep the draw calls reasonable. They can be culled that way too.

Pinging @jerome to get more insights :)

In short, the SPS performance is directly related to the number of objects to animate multiplied by their number of vertices…
The SPS itself is drawn in a single draw call.

This means that:

  • if you have a lot of static objects (not moving, not changing scale, texture or color) in an SPS, you can mark them as “dead” (alive = false) and they’ll be frozen, not computed any longer until you wake them back up. This can speed up SPS performance.
  • you can even decide to update only part of the SPS, say from particle i to particle j, meaning that the rest isn’t computed at all on the next call to setParticles(). This greatly improves SPS performance on very large pools.
  • SPS creation is a heavy operation, so it’s not recommended to create/destroy several SPS within the render loop. It’s better to create them before it starts anyway.
  • Don’t forget that you can create more particles than initially needed and set them “invisible” (isVisible = false), so they won’t be computed either. You can also apply the “update only some range” approach described before to the invisible ones.

Note: dead particles aren’t computed and are frozen in their current state, whether visible or invisible.
Invisible particles aren’t computed and aren’t rendered.
Note 2: invisible particles are temporarily reduced to a single point outside the camera frustum (they aren’t culled as part of the SPS global mesh, though), but they cost the GPU almost nothing to draw (no triangle, no rasterization, just a point outside screen space). The VBO therefore keeps a fixed size.

The difference between invisible or dead particles and the range update is this:

  • invisible/dead: the global loop iterates over all the particles but doesn’t compute their rotation/scaling/position/color/texture. This is needed in order to change their current state on demand: setting them from visible/alive to invisible/dead or vice versa.
  • range: the global loop iterates only over the requested particle range. The rest stays untouched.

So the range method, if you can use it, is the fastest one.
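The three mechanisms above can be illustrated with a plain-JS approximation of what a `setParticles(start, end)` call does. This is a sketch of the semantics described in this thread, not the actual Babylon.js implementation; the per-particle update is reduced to a single position change.

```javascript
// Sketch of the SPS update semantics: range limits which particles are
// iterated at all; dead (alive = false) and invisible (isVisible = false)
// particles are iterated but not computed, so their flags can be flipped
// back on demand.
function setParticles(particles, start = 0, end = particles.length - 1) {
  for (let i = start; i <= end; i++) {
    const p = particles[i];
    if (!p.alive || !p.isVisible) continue; // skipped, but state stays editable
    p.position.y += 1; // stand-in for the real rotation/scaling/position update
  }
}
```

Particles outside the range are never even iterated, which is why the range method is the fastest of the three.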

Adding an expandable SPS feature is on my todo list for some day.

Thanks so much.

Your reply gave me some fresh thoughts!

I think I could get it to work with a fixed number of objects, keeping some spares invisible and making them visible/invisible again as needed. I could also recycle objects by marking them alive, moving them, then marking them dead again.

I discovered the garbage collector doesn’t play nicely with creating and destroying too many objects anyway. Like you say, better to create everything before the render loop starts.

The SPS really sounds very flexible!

It’ll be interesting to see the performance comparison between a large number of instanced meshes vs SPS.

You got it : recycling objects in a SPS is the best way to go.

Please have a look at this example : Test Babylon SP Terrain
There’s an SPS managing “only” 6000 objects on the terrain, while the map actually holds almost 70K objects. As they can’t all be seen at once, the SPS recycles visible objects within the currently visible part of the terrain.
The recycled objects share only the same geometry (say, box, sphere or tetrahedron here) and are individually colored, scaled, positioned and textured according to the data of the related map objects.
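The recycling approach discussed in this thread can be sketched as a fixed-size pool: allocate every particle up front, keep spares dead and invisible, and flip flags instead of creating and destroying objects (which is what churns the garbage collector). This is a plain-JS sketch of the pattern, not Babylon.js code; `alive`/`isVisible` mirror the SPS flags described above.

```javascript
// Sketch of a fixed-size recycling pool. Nothing is allocated after
// creation: acquire/release only toggle flags on pre-built particles.
function createPool(size) {
  const particles = Array.from({ length: size }, () => ({
    alive: false,
    isVisible: false,
  }));
  return {
    particles,
    acquire() {
      const p = particles.find((q) => !q.alive);
      if (!p) return null; // pool exhausted; caller decides what to do
      p.alive = true;
      p.isVisible = true;
      return p;
    },
    release(p) {
      p.alive = false;
      p.isVisible = false;
    },
  };
}
```

Because the pool never grows or shrinks, memory use stays flat and the garbage collector has nothing to reclaim during the render loop.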