How to optimise updating a large octree?

Hey Everyone,

I’ve got a scene with a lot of meshes. Performance wasn’t great, and I only actually need a few meshes on screen at any one time. So I created an octree for the scene using scene.createOrUpdateSelectionOctree(), which worked wonderfully!

The issue I have now is that whenever I add or remove a mesh from the scene (a static one, not a dynamic one) I have to call scene.createOrUpdateSelectionOctree() again, which takes about 100ms to complete, and the scene skips.

I need to keep these meshes as static to keep the performance gains of the octree over time.

Is there a better way to do this?

As a side question: is there also a better way to handle dynamic content in the scene octree? I am adding and removing meshes from scene.selectionOctree.dynamicContent all the time. I add a mesh and it doesn’t show, and then I remember I have to add it to the dynamic content array too, or rebuild the octree.
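To make the bookkeeping explicit, here is a minimal sketch of the "add to the scene AND to the dynamic content array" pattern described above. It uses plain objects in place of Babylon.js types; in a real scene the array in question is `scene.selectionOctree.dynamicContent`, which the octree checks every frame for moving meshes.

```javascript
// Sketch only: `scene` and `mesh` are plain-object stand-ins for the
// Babylon.js Scene and Mesh types.
function addDynamicMesh(scene, mesh) {
  scene.meshes.push(mesh);
  // Without this second step the mesh is never selected for rendering,
  // because the octree replaces the default full-scene selection.
  scene.selectionOctree.dynamicContent.push(mesh);
}

function removeDynamicMesh(scene, mesh) {
  scene.meshes = scene.meshes.filter((m) => m !== mesh);
  scene.selectionOctree.dynamicContent =
    scene.selectionOctree.dynamicContent.filter((m) => m !== mesh);
}
```

Wrapping both updates in one helper like this is one way to stop forgetting the second step.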

So it really comes back to the topic question: is there a way to optimise the octree for updates, in both performance and API?

This is a tough question as the octree is mostly a static structure.

But I’m sure I can optimize it if you only add new meshes and do not remove existing ones

Can you share a repro in the Playground?

Hmm… it will be removing and adding. Is there a better way to approach this?

Just imagine a scene with 50000 “objects”. Then put them into an octree. Then start adding and removing some around the place. And just keep increasing that number of objects :slight_smile:

I can put a little demo together in the playground. But I thought it was quite a simple use case?

Are you gonna have 50000 objects that are unique? How many draw calls are you keeping it down to? With correct instancing and with the normal culling methods what kind of performance are you getting?

It is a simple use case but not a simple algorithm :slight_smile:
Remember that JavaScript has only ONE thread, so what could easily be done with threads is tough to do in JS

Here is an easy test environment for what you are doing and messing around with Octrees and performance.

Thanks so much guys.

I appreciate the effort it takes.

I’ll pop together a demo with the instanced spheres with an octree, with a timer adding and removing spheres.

I am using instancing to get the draw calls down.

For me the bigger issue is actually removing objects from the octree. I can add items as dynamic without a big performance hit for a while, then rebuild the octree just once. But I can’t remove objects. I guess I could just hide them and then do the same thing, batch-updating the octree. But it still isn’t optimal; there will be a bit of a “skip” as it updates.
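The batching idea above can be sketched as a tiny helper: queue up changes (hiding or adding meshes cheaply in the meantime) and pay the rebuild cost once per batch instead of on every change. `rebuild` here is a stand-in for a call like `scene.createOrUpdateSelectionOctree()`; the helper itself is plain JS.

```javascript
// Sketch of batched octree rebuilds. Instead of rebuilding on every
// add/remove (~100ms each), count pending changes and rebuild once.
function createOctreeBatcher(rebuild) {
  let pending = 0;
  return {
    // Call whenever a mesh is added/hidden; the octree stays untouched.
    queueChange() { pending++; },
    // Call once per batch, e.g. on a timer or after a burst of edits.
    flush() {
      if (pending > 0) {
        rebuild();
        pending = 0;
      }
    },
  };
}
```

This doesn’t remove the “skip” during the rebuild, but it caps it at one rebuild per batch rather than one per mesh.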

With your demo I’ll be able to improve that situation :)

I’ve put together a small demo.

Can you create an issue on the repo for it? I’ll try to work on it asap

Thanks so much!!! :slight_smile:

Here we are:

I also added octree.removeMesh()

Hope you like it!

Thanks! That’s awesome. Should keep me going for the next 6 months :smiley:

So. I’ve found out it isn’t the best idea.

If you start adding too many meshes, even very simple ones that are instanced, you quickly use up the system memory. Also, adding and removing meshes starts causing issues with the garbage collector.

I ended up adding the d3 octree library and using that for the octree. It only handles points, but it seems to perform well, so it gives a pretty good approximation for the culling.

To help with memory, I remove meshes that aren’t needed and add them back when they are. So the culling actually removes them from and adds them to the scene, rather than just skipping rendering them in that pass. That works fine for my purposes; they don’t need to actually exist for anything other than the visuals.

But I then ran into performance issues. If, say, you spin in a circle and it removes and adds a large number of meshes, it starts to chug. So I chunk them and just load, say, 20 per frame.
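The per-frame chunking described above can be sketched as a simple queue that drains at most N items each frame. This is plain JS; in a real project the `tick()` call would sit in the render loop (for example, Babylon's `scene.onBeforeRenderObservable`).

```javascript
// Sketch of chunked loading: drain a work queue N items per frame so a
// burst of adds/removes doesn't stall a single frame.
function createChunkedLoader(processItem, itemsPerFrame = 20) {
  const queue = [];
  return {
    enqueue(item) { queue.push(item); },
    // Call once per frame from the render loop.
    tick() {
      for (let i = 0; i < itemsPerFrame && queue.length > 0; i++) {
        processItem(queue.shift());
      }
    },
    get pending() { return queue.length; },
  };
}
```

With 20 items per frame, a burst of 50 adds spreads over 3 frames instead of stalling one.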

Your comment about the lack of threading making the octree tough, I totally understand that now! It takes a fair amount of effort to scale to large scenes in JavaScript!

Is there a way to get a lighter version of an instanced mesh that uses less memory and CPU? Something along the lines of what a particle system does? I think that would speed up my use case. Happy to lose a lot of features; it’s really just the visuals I want.

Did you check the SPS?

Yeah, I sure did. I’m just not sure how to add and remove shapes from it without creating a new SPS. Perhaps if I make lots of smaller SPSs, I can rebuild them without too much cost and still keep the draw calls reasonable. They can be culled that way too.

Pinging @jerome to get more insights :)

In short, the SPS performance is directly related to the number of objects to animate multiplied by their number of vertices…
The SPS itself is drawn in a single draw call.

This means that:

  • if you have a lot of static objects (not moving, not changing scale, texture or color) in an SPS, you can mark them as “dead” (alive = false) and they’ll be frozen, not computed any longer until you wake them back up. This can speed up SPS performance.
  • you can even decide to update only part of the SPS, say from particle i to particle j, meaning that the rest isn’t computed at all on the next call to setParticles(). This greatly improves SPS performance on very large pools.
  • SPS creation is a heavy operation, so it’s not recommended to create/destroy several SPS within the render loop. It’s better to create them before it starts anyway.
  • Don’t forget that you can create more particles than initially needed and set them “invisible” (isVisible = false), so they won’t be computed either. You can also apply the “update only some range” approach described before to the invisible ones.

Note: dead particles aren’t computed and are frozen in their current state, whether visible or invisible.
Invisible particles aren’t computed and aren’t rendered.
Note 2: invisible particles are temporarily reduced to a single point outside the camera frustum (they aren’t culled as part of the SPS global mesh, though), but they cost the GPU almost nothing to draw (no triangle, no rasterization, just a point outside screen space). The VBO therefore keeps a fixed size.

The difference between invisible or dead particles and the range update is this:

  • invisible/dead: the global loop iterates over all the particles but doesn’t compute their rotation/scaling/position/color/texture. This is needed in order to change their current state on demand: setting them from visible/alive to invisible/dead or vice versa.
  • range: the global loop iterates only over the requested particle range. The rest stays untouched.

So the range method, if you can use it, is the fastest one.
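The three mechanisms above can be illustrated with a plain-JS approximation of what a `setParticles(start, end)` call does. This is a sketch of the semantics described in this thread, not the actual Babylon.js implementation; the per-particle update is reduced to a single position change.

```javascript
// Sketch of the SPS update semantics: range limits which particles are
// iterated at all; dead (alive = false) and invisible (isVisible = false)
// particles are iterated but not computed, so their flags can be flipped
// back on demand.
function setParticles(particles, start = 0, end = particles.length - 1) {
  for (let i = start; i <= end; i++) {
    const p = particles[i];
    if (!p.alive || !p.isVisible) continue; // skipped, but state stays editable
    p.position.y += 1; // stand-in for the real rotation/scaling/position update
  }
}
```

Particles outside the range are never even iterated, which is why the range method is the fastest of the three.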

Adding an expandable SPS feature is on my todo list for some day.

Thanks so much.

Your reply gave me some fresh thoughts!

I think I could get it to work with a fixed number of objects, keeping some spares invisible and making them visible/invisible again as needed. I could also recycle objects by marking them alive, moving them, then marking them dead again.

I discovered the garbage collector doesn’t play nicely with creating and destroying too many objects anyway. Like you say, better to create everything before the render loop starts.

The SPS really sounds very flexible!

It’ll be interesting to see the performance comparison between a large number of instanced meshes vs SPS.

You got it : recycling objects in a SPS is the best way to go.

Please have a look at this example : Test Babylon SP Terrain
There’s an SPS managing “only” 6000 objects on the terrain, while the map actually holds almost 70K objects. As they can’t all be seen at once, the SPS recycles visible objects within the currently visible part of the terrain.
The recycled objects share only the same geometry (say, box, sphere or tetrahedron here) and are individually colored, scaled, positioned and textured according to the data of the related map objects.
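The recycling approach discussed in this thread can be sketched as a fixed-size pool: allocate every particle up front, keep spares dead and invisible, and flip flags instead of creating and destroying objects (which is what churns the garbage collector). This is a plain-JS sketch of the pattern, not Babylon.js code; `alive`/`isVisible` mirror the SPS flags described above.

```javascript
// Sketch of a fixed-size recycling pool. Nothing is allocated after
// creation: acquire/release only toggle flags on pre-built particles.
function createPool(size) {
  const particles = Array.from({ length: size }, () => ({
    alive: false,
    isVisible: false,
  }));
  return {
    particles,
    acquire() {
      const p = particles.find((q) => !q.alive);
      if (!p) return null; // pool exhausted; caller decides what to do
      p.alive = true;
      p.isVisible = true;
      return p;
    },
    release(p) {
      p.alive = false;
      p.isVisible = false;
    },
  };
}
```

Because the pool never grows or shrinks, memory use stays flat and the garbage collector has nothing to reclaim during the render loop.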