Hi, this is a weird one, but here is the scenario we’re encountering. In BabylonJS 5.x (after the alpha release) we are seeing a GPU memory leak when running CreateScreenshotUsingRenderTargetAsync continually. The issue appears most prominently when running on Linux, for some reason. We do not see this issue when running any alpha version. I think there is an issue with this commit here, which aims to force a render using CreateScreenshotUsingRenderTarget when the following is true:
The scene’s active camera is not the camera used for the render.
The original call is to CreateScreenshot.
This is our scenario, only we’re calling CreateScreenshotUsingRenderTargetAsync with a free camera that is not the active Scene camera.
Just looking at this code, it looks like it’s possible that calling CreateScreenshotUsingRenderTargetAsync with a new camera without swapping the scene’s active camera may direct the renderer to call itself twice, and then some object is not being disposed by the GPU?
I’m happy to help set up a Playground next week but I wanted to see if any of this sounds plausible.
Actually, I think I lied there. It looks like that change only affects normal screenshots. We’re using this method here, which does not appear to force the render target texture method. We’re still trying to figure out why there’s a leak when we switch from 5.0.0-alpha.60 to 5.41.0.
No, but since it’s a GPU leak we tried a few things to isolate the issue. First, we looked at the Scene graph in the Inspector and confirmed that we aren’t growing textures or materials. Those counts are stable and nothing is being added there. To really confirm that theory, we skipped all Scene updates and did only the screenshot operation, and the leak was still there. If we swap that around, and do all the updates but return a single pixel from the screenshot operation, there is no leak, so it seems to be somewhere in that screenshot code.
We then reverted to the Alpha version and saw the issue disappear. We tried this several times and can reproduce it every time. Switch to latest, GPU leak, switch back, no GPU leak.
This is using the latest Chromium/Puppeteer in a Linux environment, if that helps at all. I’m not sure we’d be able to see it just using a PG, but I’d guess the way to set it up is to run an async render operation in a loop that just does them one after the other and see if your GPU memory climbs. I don’t see it happening on a Mac, either, which just makes it even more of a mystery!
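For reference, the loop described above could be sketched roughly as follows. This is only a minimal sketch, not the code from the actual project: `runSequentialCaptures` and its parameters are hypothetical names, and the screenshot size in the commented wiring is an arbitrary assumption. The capture function is passed in so the loop itself does not depend on a running engine.

```typescript
// Hypothetical helper (not part of Babylon.js): runs `capture` a fixed number of
// times, strictly one after the other, so no two screenshot operations overlap.
type Capture = () => Promise<string>;

async function runSequentialCaptures(
    capture: Capture,
    iterations: number,
    onProgress?: (completed: number) => void
): Promise<number> {
    let completed = 0;
    for (let i = 0; i < iterations; i++) {
        // Wait for the current capture to finish before starting the next one.
        await capture();
        completed++;
        if (onProgress) {
            onProgress(completed);
        }
    }
    return completed;
}

// Possible wiring inside a page that already has `engine` and a non-active `camera`
// (assumed to exist; the 1024x1024 size is an arbitrary choice):
//
// await runSequentialCaptures(
//     () => BABYLON.Tools.CreateScreenshotUsingRenderTargetAsync(
//         engine, camera, { width: 1024, height: 1024 }),
//     10000,
//     (n) => { if (n % 100 === 0) console.log(`captures done: ${n}`); }
// );
```

GPU memory would still have to be watched from outside the page (with whatever GPU monitoring tool the machine provides), since the loop itself has no way to read it.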
One thing that would probably help narrow down the search would be to test different 5.x versions and locate the one that introduces the problem: there are too many changes between 5.0.0-alpha.60 and 5.41.0 to do a code comparison and try to pinpoint the exact change that leads to the behavior you are experiencing.
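If many versions have to be checked, the search can be organised as a bisection. The following is only a sketch under stated assumptions: `firstLeakingVersion` and `leaks` are hypothetical names, the version list is supplied by the caller, and it assumes that once the leak is introduced every later version also leaks.

```typescript
// Hypothetical helper: binary search for the first version that leaks.
// Assumptions: `versions` is sorted oldest to newest, the first entry is known
// to be good, the last entry is known to leak, and the leak never disappears
// again in a later version.
async function firstLeakingVersion(
    versions: string[],
    leaks: (version: string) => Promise<boolean>
): Promise<string> {
    let good = 0;                   // index of a version known not to leak
    let bad = versions.length - 1;  // index of a version known to leak
    while (bad - good > 1) {
        const mid = Math.floor((good + bad) / 2);
        if (await leaks(versions[mid])) {
            bad = mid;   // the leak was introduced at or before `mid`
        } else {
            good = mid;  // the leak was introduced after `mid`
        }
    }
    return versions[bad];
}
```

Here `leaks` stands for whatever check is practical, most likely a manual one: install the version, run the screenshot loop for a while, and report whether GPU memory kept growing. Versions that cannot run the test at all would have to be left out of the list.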
We tried to ascertain this but ran into an issue in which 5.0.0 through 5.12.0 would not load our models. So, jumping to 5.21, we were able to see the leak. Strangely, the leak is there but not as pronounced in 5.21 as it is in 5.41. It leaks, but more slowly.
Thanks, I’m going to have a look, but why are you calling engine.endFrame by hand? The user is not supposed to do that when the render loop is handled by the engine. I’m not sure whether that could cause some side effects…
Can you test by removing the 2x scene.render() and 2x engine.endFrame() calls to be sure the problem is not related to that?
Correct. Are you testing on Windows or Linux? For us, the problem does not exist on macOS, but on Linux (Amazon Optimized Linux, to be exact) GPU memory goes from 600 MB to 700, 800, 900 MB and up until it uses the full 24 GB. Nothing else is running on this machine.
In the new Tools.DumpData function, we are creating a new texture each time the function is called but we are also disposing it, so I don’t understand why it would leak… Indeed, it seems it leaks only on your test machine but not on others.
Can you test this PG and see if it is leaking:
It’s basically what we are doing in the DumpData function…
Note that, if it helps, you can keep the overridden BABYLON.DumpTools.DumpData method I provided in the PG above, as long as the screenshots it produces look fine to you. Our new implementation takes care of the canvas pre-multiply setting, which made data with an alpha channel look wrong, but screenshots don’t have an alpha channel, so that should be fine.