How to release resources for KTX2 textures

Hi there,

I have an application which has to handle lots of different KTX2 compressed textures.
When providing all textures in one GLB file (about 20 MB), many devices crash while loading the GLB.
The error messages differ between browsers, but the crash always seems to be related to memory size.
This is what Chrome reports:
image

So I tried splitting the textures across several GLB files and loading only the ones that are required for the current state of the app.
Before switching to a different GLB I want to clear the scene, so that the previously loaded textures get disposed and there is enough memory for the new ones.
However, this doesn’t seem to work that way, because after some loading cycles the devices crash again with the same error messages.
It seems like the memory is still occupied by previously loaded textures.

I made a repro which just adds a GLB to the scene and immediately disposes the scene afterwards.
When running that code in a loop I would expect it to go on forever, since all the resources of the old scene are disposed before new textures are added.
https://playground.babylonjs.com/#KNCTDK
The loop runs about 5 times on my machine in Chrome.
iOS devices already crash on the 2nd or 3rd iteration.
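
For readers who cannot open the playground, the load/dispose cycle described above can be sketched roughly as follows. This is not the playground's actual code: the function name, its parameters, and passing BABYLON in as an argument are assumptions made for illustration. It only relies on SceneLoader.AppendAsync and Scene.dispose.

```javascript
// Sketch of a load/dispose cycle. `BABYLON` is passed in (an assumption of
// this sketch) so the helper works with a global script or an import.
async function runLoadDisposeCycles(BABYLON, engine, rootUrl, fileName, cycles) {
  let completed = 0;
  for (let i = 0; i < cycles; i++) {
    const scene = new BABYLON.Scene(engine);
    // Append the meshes, materials and textures from the GLB to the new scene.
    await BABYLON.SceneLoader.AppendAsync(rootUrl, fileName, scene);
    // Dispose the scene together with the meshes, materials and textures it owns.
    scene.dispose();
    completed++;
  }
  return completed;
}
```

If disposal released everything, the number of completed cycles should be limited only by patience, which is the expectation the repro is testing.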

Am I missing something, or is there a better way to release the scene resources?

Do you see the same behaviour with regular (non-KTX2) textures?

@labris It’s definitely also an issue for “standard” textures.

Typically GLBs with standard textures are much larger than GLBs which contain the same textures KTX2 compressed.
At the beginning of the project we used these standard textures and had really big issues with loading times and even crashes.
KTX2 solved a lot of these issues. But as the project progresses, more and more textures get added, and now KTX2 also seems to be at its limit.

You are out of memory - https://playground.babylonjs.com/#KNCTDK#2
image

Here is a version with better disposal (still unstable in Chrome, no problem in Edge) - https://playground.babylonjs.com/#KNCTDK#4
image

Hi TheHOF,

I haven’t been able to repro this crash locally; modern Edge, Chrome Canary, and Firefox all seem to be able to sustain loop iterations up into the thirties on my Windows 11 laptop. (Firefox memory gets huge, but never crashes, presumably just turning up GC when memory pressure gets high.) Can you provide some more information about the repros you’ve experienced? In particular, what Babylon versions, browsers, and platforms have you seen this repro on? Thanks!

Running your PG I see no leak:
image

Last snapshot was made 5 minutes after the first one

On my machine with 64 GB of RAM, it repros, but only after 134 iterations. :open_mouth:

EDIT1: Well, it doesn’t repro the crash. The wasm module fails with an exception.
EDIT2: This is with Edge.

EDIT3: The browser task manager definitely indicates some kind of memory leaking.


8GB and rising :open_mouth:

EDIT4: Second try, similar result. After 137 tries, it dies with same error as before.

With this PG - https://playground.babylonjs.com/#KNCTDK#4 - there is no problem with Chrome on Mac (32 GB RAM) or Edge on Windows (8 GB RAM), but Chrome on Windows crashes after the 5th render.

@labris Your PG dies for me exactly the same way.

We’re still investigating, but it seems like there is a leak in WASM memory somewhere. If we reduce the number of workers to 1, then it dies at 3.5 GB of memory instead of 10 GB, which seems to imply that each worker is running out of memory.

After further investigation, there definitely appears to be a leak, but it’s over in the Web Workers that do the WASM work, so it looks like it doesn’t show up in the performance.memory traces from the main thread. Tracking that down and fixing it will be one workstream. We’re also considering another feature that would allow you to tell the Engine type to let go of its loader resources altogether. That might be helpful for memory consumption in the long term, and it can certainly be used as a workaround for the leak in the immediate term (by periodically releasing and recreating the KTX2 loader workers, which gets rid of the leaked memory).

ThinEngine.MinimizeMemoryFootprint by syntheticmagus · Pull Request #11442 · BabylonJS/Babylon.js (github.com)

Using this as a workaround, I was able to confirm that I could run for 25 minutes and over 300 iterations without the overall memory footprint continuing to grow. A little off the wall, but just wanted to keep you updated since this is the thread that got us looking at this. :slight_smile:

Thank you all for your amazing support. I’m very glad if I could help find memory leaks here.

@syntheticmagus Is there already a test version of your workaround which I could test in my project?

I changed the model to the GLTF with “standard” textures - https://playground.babylonjs.com/#KNCTDK#11 - and it runs endlessly on my poor old laptop :slight_smile:
What I would suggest as possible (or temporary) solutions:

  1. Your model has a lot of 4K textures. Are they really needed for the visuals, or could you downscale them to 2048 or even 1024?
  2. You could load only the geometry and materials from the GLB and then load the textures you need asynchronously.
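
To put rough numbers on the first suggestion, here is a small estimate of how much memory one square texture takes once it is on the GPU. It is only a sketch: the function name is made up for this example, it assumes 32 bits per pixel for uncompressed RGBA8 and 8 bits per pixel for BC7 or ASTC 4x4, and it ignores block padding on the smallest mip levels.

```javascript
// Approximate GPU memory for one square texture, in bytes.
// Hypothetical helper for illustration; ignores block padding on tiny mips.
function estimateTextureBytes(size, bitsPerPixel, withMipmaps = true) {
  let total = 0;
  for (let s = size; s >= 1; s = Math.floor(s / 2)) {
    total += (s * s * bitsPerPixel) / 8;
    if (!withMipmaps) break; // only count the base level
  }
  return total;
}

const MiB = 1024 * 1024;
console.log(estimateTextureBytes(4096, 32, false) / MiB); // 64
console.log(estimateTextureBytes(2048, 32, false) / MiB); // 16
console.log(estimateTextureBytes(1024, 32, false) / MiB); // 4
console.log((estimateTextureBytes(4096, 32, true) / MiB).toFixed(1)); // 85.3
console.log(estimateTextureBytes(4096, 8, false) / MiB); // 16
```

Each halving of the resolution cuts the footprint to a quarter, so going from 4K to 2048 saves about as much as switching an uncompressed 4K texture to an 8-bit-per-pixel GPU format.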

To have the KTX2 decoder do its work on the main JS thread, set BABYLON.KhronosTextureContainer2.DefaultNumWorkers = 0; this disables the worker thread pool.
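
As a minimal sketch, the setting can be wrapped in a small helper. The helper name and the validation are assumptions for this example; only the DefaultNumWorkers static comes from the post above. My understanding is that the value is read when the decoder is first initialised, so it most likely needs to be set before the first KTX2 texture is loaded.

```javascript
// Hypothetical helper: choose how many workers the KTX2 decoder may use.
// 0 means "decode on the main thread", which makes heap snapshots taken
// from the page include the transcoder's memory.
function setKtx2WorkerCount(BABYLON, count) {
  if (!Number.isInteger(count) || count < 0) {
    throw new RangeError("worker count must be a non-negative integer");
  }
  // Assumed to take effect only if set before the first KTX2 texture is decoded.
  BABYLON.KhronosTextureContainer2.DefaultNumWorkers = count;
  return count;
}
```

Calling setKtx2WorkerCount(BABYLON, 0) reproduces the main-thread setup used for the snapshots, and a count of 1 matches the single-worker experiment mentioned earlier in the thread.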

Doing so, we can indeed see that the snapshots are growing:
image

The culprit is the msc_basis_transcoder module:

It seems the heap used by the component is growing indefinitely…

We already had some problems in the past with this module (see "Cannot perform Construct on a detached ArrayBuffer" error when using msc_basis_transcoder - Giters).

Are you able to encode your textures in UASTC instead of ETC1S? We use some special wasm modules to handle UASTC->BC7/ASTC/RGBA so it could be a workaround (also, they are way faster to transcode than msc_basis_transcoder).

We may not end up adopting it, or the form may change; but if you really want to try my current workaround you can pull the branch from my PR and use the Engine's new MinimizeMemoryFootprint() method periodically to cause the Engine to discard static resources, including the KTX2 worker pool. However, there’s a lot of other great advice that’s been added to this thread, too, so I’d probably only try that as a last resort. :slight_smile:
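
The "periodically" part can be sketched independently of the PR. The helper below is hypothetical and does not call any Babylon API itself: the release callback is a placeholder for whatever call the PR branch ends up exposing (named MinimizeMemoryFootprint() above), and the interval is an arbitrary choice.

```javascript
// Hypothetical helper: run `release` once every `every` completed load cycles.
// `release` stands in for the call that discards the loader's static
// resources; its exact form depends on the PR branch and is not assumed here.
function createPeriodicReleaser(release, every) {
  if (!Number.isInteger(every) || every < 1) {
    throw new RangeError("every must be a positive integer");
  }
  let cycles = 0;
  return function onCycleFinished() {
    cycles++;
    if (cycles % every === 0) {
      release();
      return true; // resources were released after this cycle
    }
    return false;
  };
}
```

In an app that swaps GLBs, onCycleFinished would be called after each old scene has been disposed, so the worker pool is only torn down while no textures are being decoded.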

Thx again for all your great tips.
First we will try adjusting the compression mode (UASTC), since we already have another issue that could be related to the ETC compression:

I will keep you informed.
