Hi all! We have an application that loads a collection of GLTF files, and each GLTF contains about 50-100 parts. From there, we swap those root-level GLTF models in and out by setting their enabled flag to false. The swaps are extremely fast (good job!) but we’re seeing an issue related to GPU memory consumption.
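For reference, this is roughly what the load/swap looks like on our side (a simplified sketch; the function and file names are just placeholders):

```ts
import { Scene } from "@babylonjs/core/scene";
import { SceneLoader } from "@babylonjs/core/Loading/sceneLoader";
import "@babylonjs/loaders/glTF"; // registers the glTF file loader

async function loadModel(scene: Scene, rootUrl: string, fileName: string) {
    // Import the whole glTF; meshes[0] is the "__root__" node the loader creates
    const result = await SceneLoader.ImportMeshAsync("", rootUrl, fileName, scene);
    const root = result.meshes[0];

    // "Swapping out" a model is just disabling its root; the children follow
    root.setEnabled(false);
    return root;
}
```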
Every time we add a new GLTF to the collection, GPU memory usage rises if that model has textures applied, and it is spiraling out of control. To combat this, we run a sanity check that removes any orphan textures or materials, and we can verify it is working by using the Inspector to watch the number of meshes, vertices, materials, and textures. Although those counts remain static and we are correctly cleaning up unused textures, GPU consumption still climbs to more than 2 GB.
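The sanity check is roughly along these lines (simplified; a real version would probably also want to skip scene-level textures such as an environment texture):

```ts
import { Scene } from "@babylonjs/core/scene";
import { Material } from "@babylonjs/core/Materials/material";

function disposeOrphans(scene: Scene) {
    // Materials that are still attached to at least one mesh
    const usedMaterials = new Set<Material>();
    for (const mesh of scene.meshes) {
        if (mesh.material) {
            usedMaterials.add(mesh.material);
        }
    }

    // Drop materials nothing references any more
    for (const mat of scene.materials.slice()) {
        if (!usedMaterials.has(mat)) {
            mat.dispose();
        }
    }

    // Drop textures that no remaining material references
    for (const tex of scene.textures.slice()) {
        const stillUsed = scene.materials.some((mat) => mat.getActiveTextures().indexOf(tex) !== -1);
        if (!stillUsed) {
            tex.dispose();
        }
    }
}
```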
So I have some general questions:
When a mesh (and its children) is disabled, the textures are not unbound from the GL context, correct? Does that mean disabling a mesh has no effect on overall texture memory?
Is there a similar approach for textures, i.e. loading a texture and then "disabling" it so it is unbound or otherwise temporarily removed from the GPU while a model is not visible?
Failing that, are there other suggestions for keeping GPU memory usage low? To be clear, we can demonstrate that this issue is 100% due to textures.
We also saw some issues when cloning textures to be used on different meshes. Does Babylon do some sort of caching when a single texture is assigned to multiple materials? Cloning made the problem much worse.
Thanks for any answers and if this isn’t clear, I’d be happy to try and whip up a playground to demonstrate what we’re seeing.
EDIT:
Oh, and this is seen mostly on headless Chromium on Linux. I’m sure that has a lot to do with why it’s hard to reproduce on other devices.
A Playground would be helpful to pinpoint your issue.
Seems you may have a memory leak somewhere.
What is the heap memory size and does it grow with the same dynamics as GPU consumption?
It’s kind of slow to rise in this contrived version (1 mesh, tiny texture?), but it does happen for me in Chrome. It started at around 820 MB for the GPU process but is now over 1 GB.
The basic flow here is that we update a texture, take a screenshot, then update again. I added a cache buster so it’s a fresh fetch of each texture file.
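In rough form, the loop looks something like this (a simplified sketch, assuming a PBR material with a single albedo texture; the URL and screenshot size are placeholders):

```ts
import { Engine } from "@babylonjs/core/Engines/engine";
import { Scene } from "@babylonjs/core/scene";
import { Camera } from "@babylonjs/core/Cameras/camera";
import { PBRMaterial } from "@babylonjs/core/Materials/PBR/pbrMaterial";
import { Texture } from "@babylonjs/core/Materials/Textures/texture";
import { Tools } from "@babylonjs/core/Misc/tools";

async function updateAndCapture(engine: Engine, scene: Scene, camera: Camera, material: PBRMaterial, url: string): Promise<string> {
    // Cache buster so every iteration is a fresh fetch of the texture file
    const texture = new Texture(`${url}?cb=${Date.now()}`, scene);

    // Wait for the texture to load, then swap it onto the material
    await new Promise<void>((resolve) => texture.onLoadObservable.addOnce(() => resolve()));
    material.albedoTexture = texture; // note: the previous texture is not disposed here

    // Take the screenshot; the caller then repeats with the next texture
    return Tools.CreateScreenshotAsync(engine, camera, { width: 512, height: 512 });
}
```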
Our system is basically a lazy loader in that we don’t know ahead of time which textures we need to load, so the same meshes can end up with an almost infinite number of texture combinations. This works great until we add more models to the system, which is where we see the GPU spiral out of control. If we dispose the textures, updates are slower (since many parts don’t actually change textures and keep a “default” texture), but even just retaining the default textures still results in context-lost events because the GPU runs out of memory. We can run nvidia-smi and watch memory climb until it crashes the browser.
Ok, we think we’ve solved the issue. As far as we can tell, cloning materials has some small effect that was causing the memory bloat (not really a leak) to accumulate faster. We can’t prove that, so it may just be a fluke.
However, the primary solution for us was to dispose of textures that are not currently shown and re-apply them as needed. This makes our updates slightly slower, but GPU memory is now completely stable.
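For anyone hitting the same thing, here’s a rough sketch of that approach (simplified, assuming PBR materials with a single albedo texture loaded from a URL; the names are just illustrative):

```ts
import { Scene } from "@babylonjs/core/scene";
import { AbstractMesh } from "@babylonjs/core/Meshes/abstractMesh";
import { PBRMaterial } from "@babylonjs/core/Materials/PBR/pbrMaterial";
import { Texture } from "@babylonjs/core/Materials/Textures/texture";

// Remember which URL each material's texture came from so it can be rebuilt later
const textureUrls = new Map<PBRMaterial, string>();

function hideModel(root: AbstractMesh): void {
    root.setEnabled(false);
    for (const mesh of root.getChildMeshes()) {
        const mat = mesh.material as PBRMaterial | null;
        if (!mat || !mat.albedoTexture) {
            continue;
        }
        const tex = mat.albedoTexture as Texture;
        textureUrls.set(mat, tex.url ?? tex.name);
        tex.dispose();            // free the GPU copy while the model is hidden
        mat.albedoTexture = null;
    }
}

function showModel(root: AbstractMesh, scene: Scene): void {
    for (const mesh of root.getChildMeshes()) {
        const mat = mesh.material as PBRMaterial | null;
        const url = mat ? textureUrls.get(mat) : undefined;
        if (mat && url) {
            mat.albedoTexture = new Texture(url, scene); // re-create on demand
        }
    }
    root.setEnabled(true);
}
```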