Hi all! We have an application that loads a collection of GLTF files, and each GLTF contains about 50-100 parts. From there, we swap those root-level GLTF models in and out by setting their enabled flag to false. The swaps are extremely fast (good job!) but we’re seeing an issue related to GPU memory consumption.
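For reference, this is roughly what the load/swap looks like on our side (a simplified sketch; the function and file names are just placeholders):

```ts
import { Scene } from "@babylonjs/core/scene";
import { SceneLoader } from "@babylonjs/core/Loading/sceneLoader";
import "@babylonjs/loaders/glTF"; // registers the glTF file loader

async function loadModel(scene: Scene, rootUrl: string, fileName: string) {
    // Import the whole glTF; meshes[0] is the "__root__" node the loader creates
    const result = await SceneLoader.ImportMeshAsync("", rootUrl, fileName, scene);
    const root = result.meshes[0];

    // "Swapping out" a model is just disabling its root; the children follow
    root.setEnabled(false);
    return root;
}
```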
Every time we add a new GLTF to the collection, GPU memory usage rises if that model has textures applied, and it is spiraling out of control. To combat this, we run a sanity check that removes any orphan textures or materials, and we can verify it is working by using the Inspector to watch the number of meshes, vertices, materials, and textures. Although those counts remain static and we are correctly cleaning up unused textures, GPU consumption still climbs to more than 2 GB.
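The sanity check is roughly along these lines (simplified; a real version would probably also want to skip scene-level textures such as an environment texture):

```ts
import { Scene } from "@babylonjs/core/scene";
import { Material } from "@babylonjs/core/Materials/material";

function disposeOrphans(scene: Scene) {
    // Materials that are still attached to at least one mesh
    const usedMaterials = new Set<Material>();
    for (const mesh of scene.meshes) {
        if (mesh.material) {
            usedMaterials.add(mesh.material);
        }
    }

    // Drop materials nothing references any more
    for (const mat of scene.materials.slice()) {
        if (!usedMaterials.has(mat)) {
            mat.dispose();
        }
    }

    // Drop textures that no remaining material references
    for (const tex of scene.textures.slice()) {
        const stillUsed = scene.materials.some((mat) => mat.getActiveTextures().indexOf(tex) !== -1);
        if (!stillUsed) {
            tex.dispose();
        }
    }
}
```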
So I have some general questions:
When a mesh (and its children) is disabled, the textures are not unbound from the GL context, correct? Does that mean disabling a mesh has no effect on overall texture memory?
Is there a similar approach for textures, i.e. loading a texture and then "disabling" it so it is unbound or otherwise temporarily removed from the GPU while a model is not visible?
Failing that, are there other suggestions for keeping GPU memory usage low? To be clear, we can demonstrate that this issue is 100% due to textures.
We also saw some issues when cloning textures to be used on different meshes. Does Babylon do some sort of caching when a single texture is assigned to multiple materials? Cloning made the problem much worse.
Thanks for any answers and if this isn’t clear, I’d be happy to try and whip up a playground to demonstrate what we’re seeing.
EDIT:
Oh, and this is seen mostly on headless Chromium on Linux. I’m sure that has a lot to do with why it’s hard to reproduce on other devices.
A Playground would be helpful to pinpoint your issue.
Seems you may have a memory leak somewhere.
What is the heap memory size and does it grow with the same dynamics as GPU consumption?
It’s kind of slow to rise in this contrived version (1 mesh, tiny texture?), but it does happen for me in Chrome. It started at around 820 MB for the GPU process but is now over 1 GB.
The basic flow here is that we update a texture, take a screenshot, then update again. I added a cache buster so it’s a fresh fetch of each texture file.
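In rough form, the loop looks something like this (a simplified sketch, assuming a PBR material with a single albedo texture; the URL and screenshot size are placeholders):

```ts
import { Engine } from "@babylonjs/core/Engines/engine";
import { Scene } from "@babylonjs/core/scene";
import { Camera } from "@babylonjs/core/Cameras/camera";
import { PBRMaterial } from "@babylonjs/core/Materials/PBR/pbrMaterial";
import { Texture } from "@babylonjs/core/Materials/Textures/texture";
import { Tools } from "@babylonjs/core/Misc/tools";

async function updateAndCapture(engine: Engine, scene: Scene, camera: Camera, material: PBRMaterial, url: string): Promise<string> {
    // Cache buster so every iteration is a fresh fetch of the texture file
    const texture = new Texture(`${url}?cb=${Date.now()}`, scene);

    // Wait for the texture to load, then swap it onto the material
    await new Promise<void>((resolve) => texture.onLoadObservable.addOnce(() => resolve()));
    material.albedoTexture = texture; // note: the previous texture is not disposed here

    // Take the screenshot; the caller then repeats with the next texture
    return Tools.CreateScreenshotAsync(engine, camera, { width: 512, height: 512 });
}
```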
Our system is basically a lazy loader in that we don’t know ahead of time which textures we need to load, so the same meshes can end up with an almost infinite number of texture combinations. This works great until we add more models to the system, which is where we see the GPU spiral out of control. If we dispose the textures, updates are slower (since many parts don’t actually change textures and keep a “default” texture), but even just retaining the default textures still results in context-lost events because the GPU runs out of memory. We can run nvidia-smi and watch memory climb until it crashes the browser.
Ok, we think we’ve solved the issue. As far as we can tell, cloning materials has some small effect that was causing the memory bloat (not really a leak) to accumulate faster. We can’t prove that, so it may just be a fluke.
However, the primary solution for us was to dispose of textures that are not currently shown and re-apply them as needed. This makes our updates slightly slower, but GPU memory is now completely stable.
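For anyone hitting the same thing, here’s a rough sketch of that approach (simplified, assuming PBR materials with a single albedo texture loaded from a URL; the names are just illustrative):

```ts
import { Scene } from "@babylonjs/core/scene";
import { AbstractMesh } from "@babylonjs/core/Meshes/abstractMesh";
import { PBRMaterial } from "@babylonjs/core/Materials/PBR/pbrMaterial";
import { Texture } from "@babylonjs/core/Materials/Textures/texture";

// Remember which URL each material's texture came from so it can be rebuilt later
const textureUrls = new Map<PBRMaterial, string>();

function hideModel(root: AbstractMesh): void {
    root.setEnabled(false);
    for (const mesh of root.getChildMeshes()) {
        const mat = mesh.material as PBRMaterial | null;
        if (!mat || !mat.albedoTexture) {
            continue;
        }
        const tex = mat.albedoTexture as Texture;
        textureUrls.set(mat, tex.url ?? tex.name);
        tex.dispose();            // free the GPU copy while the model is hidden
        mat.albedoTexture = null;
    }
}

function showModel(root: AbstractMesh, scene: Scene): void {
    for (const mesh of root.getChildMeshes()) {
        const mat = mesh.material as PBRMaterial | null;
        const url = mat ? textureUrls.get(mat) : undefined;
        if (mat && url) {
            mat.albedoTexture = new Texture(url, scene); // re-create on demand
        }
    }
    root.setEnabled(true);
}
```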