Proposed Solution for Stuck Screenshot for Render Target Textures

We’ve been fighting a similar issue to this one:

In our case, we KNOW that the scene is ready to render and we can do our own checks that confirm this. Could we add a custom ready function as a param?

Basically, it’s “at your own risk” but just add an optional customReadyFunc param to the signature and change this line to:

customReadyFunc ? customReadyFunc : () => texture.isReadyForRendering() && camera.isReady(true)
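To make the idea concrete, here is a minimal sketch of the fallback logic, not the actual Babylon code. `pickReadyCheck` is a hypothetical helper name, and the `texture` and `camera` objects are plain stand-ins for the real render target texture and camera.

```typescript
type ReadyCheck = () => boolean;

// Hypothetical helper: use the caller's check if one is given,
// otherwise fall back to the default readiness check.
function pickReadyCheck(defaultCheck: ReadyCheck, customReadyFunc?: ReadyCheck): ReadyCheck {
    return customReadyFunc ? customReadyFunc : defaultCheck;
}

// Stand-ins (assumptions) for texture.isReadyForRendering() and camera.isReady(true).
const texture = { isReadyForRendering: (): boolean => false };
const camera = { isReady: (_completeCheck: boolean): boolean => true };

const defaultCheck: ReadyCheck = () => texture.isReadyForRendering() && camera.isReady(true);

// Without a custom function, the default check decides (stuck on "not ready" here).
const withDefault = pickReadyCheck(defaultCheck)();
// With a custom function, the caller takes responsibility ("at your own risk").
const withCustom = pickReadyCheck(defaultCheck, () => true)();
```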

Any thoughts on this?

I’m not sure we need a new parameter for this. You can use the customizeTexture callback to override the isReadyForRendering function:

await BABYLON.Tools.CreateScreenshotUsingRenderTargetAsync(
    ...,
    (texture: RenderTargetTexture) => {
        texture.isReadyForRendering = () => true;
    }
)

However, I’m not sure why this function would return false if everything is ready…

Oh, I didn’t know we could do that!

To answer your question, it seems to happen intermittently when we clone a Node Material and assign different textures to the blocks. Somehow, even when you clone with “shareEffect” as false, it still shares the same effect because it appears to use the vertex, fragment, and #defines as the lookup key. When that happens, the “ready” state can be corrupted and never fire. We are just logging the scene state when this happens and it’s always the same cloned Node Material that is affected.

However, there is one case in which no materials are being logged as not ready and yet the ready check still hangs. That one is still a mystery.

Thanks for the context. I guess it would be hard for you to set up a repro somewhere?

The problem is that we have only been able to repro in production and we can only identify the context through logs.

This seems to be a case where mesh.isReady(true) returns false despite all individual readiness checks passing: mesh.isReady(false) returns true (geometry ready), material.isReady(mesh) returns true (material reports ready for this specific mesh), and all submesh.effect.isReady() checks return true. This occurs with both PBRMaterial and NodeMaterial meshes. The submeshes are correctly using the same material as the mesh (submesh.getMaterial() === mesh.material), all textures report ready, and there are no geometry issues. Yet mesh.isReady(true) still returns false, creating a false negative that blocks scene readiness detection.
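The checks listed above can be gathered into one report so the mismatch shows up in a log. This is only a sketch: the `*Like` interfaces are minimal stand-ins I wrote for illustration (they cover just the calls mentioned in this post), and `describeReadiness` is a hypothetical helper, not a Babylon function.

```typescript
// Minimal stand-in shapes (assumptions), limited to the calls named in the post.
interface EffectLike { isReady(): boolean; }
interface MaterialLike { name: string; isReady(mesh: MeshLike): boolean; }
interface SubMeshLike { effect: EffectLike | null; getMaterial(): MaterialLike | null; }
interface MeshLike {
    name: string;
    material: MaterialLike | null;
    subMeshes: SubMeshLike[];
    isReady(completeCheck: boolean): boolean;
}

interface ReadinessReport {
    mesh: string;
    complete: boolean;
    geometry: boolean;
    material: boolean;
    subMeshEffects: boolean[];
    subMeshesShareMaterial: boolean;
    unexplained: boolean;
}

// Hypothetical helper: runs each individual check and flags the case where
// every part reports ready but the complete check still fails.
function describeReadiness(mesh: MeshLike): ReadinessReport {
    const complete = mesh.isReady(true);
    const geometry = mesh.isReady(false);
    const material = mesh.material ? mesh.material.isReady(mesh) : true;
    const subMeshEffects = mesh.subMeshes.map((s) => (s.effect ? s.effect.isReady() : false));
    const subMeshesShareMaterial = mesh.subMeshes.every((s) => s.getMaterial() === mesh.material);
    const allPartsReady = geometry && material && subMeshEffects.every((ready) => ready);
    return {
        mesh: mesh.name,
        complete,
        geometry,
        material,
        subMeshEffects,
        subMeshesShareMaterial,
        unexplained: !complete && allPartsReady,
    };
}

// Mock mesh reproducing the reported symptom.
const pbr: MaterialLike = { name: "pbr", isReady: () => true };
const stuckMesh: MeshLike = {
    name: "stuck",
    material: pbr,
    subMeshes: [{ effect: { isReady: () => true }, getMaterial: () => pbr }],
    isReady: (completeCheck) => !completeCheck,
};
const report = describeReadiness(stuckMesh);
```

When `unexplained` is true, the cause is somewhere these checks do not look.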

Do you use a shadow generator and/or LOD meshes? There are some checks associated with these features in Mesh.isReady:

Also, if delayLoadState == Constants.DELAYLOADSTATE_LOADING, Mesh.isReady will return false.

Yes, we do use a shadow generator! We are not using delayed loading.

I copied over some of this code so it’s easier to step through and debug. This bit here seems like a clue:

for (const subMesh of mesh.subMeshes) {
    const subMeshMaterial = subMesh.getMaterial();
    const needsAlphaBlending = subMeshMaterial ? subMeshMaterial.needAlphaBlendingForMesh(mesh) : false;
    const generatorIsReady = generator.isReady(subMesh, hardwareInstancedRendering, needsAlphaBlending);
    if (!generatorIsReady) {
        engine.currentRenderPassId = currentRenderPassId;
        debugger;
        return false;
    }
}

There’s a mesh with 1 submesh and this submesh never returns ready.

It gives me:

generatorIsReady: false
needsAlphaBlending: false
subMeshMaterial: PBRMaterial

We’re making progress!

Can you step inside generator.isReady() to see what exactly isn’t ready? You can put some dummy code before this call and add a breakpoint on the console.log line:

for (const subMesh of this.subMeshes) {
    if (this.name === "name_of_bad_mesh") {
        console.log(this);
    }
    if (!generator.isReady(subMesh, hardwareInstancedRendering, subMesh.getMaterial()?.needAlphaBlendingForMesh(this) ?? false)) {
        engine.currentRenderPassId = currentRenderPassId;
        return false;
    }
}

It’s hanging on this part:

if (needAlphaTesting || material.needAlphaBlendingForMesh(mesh)) {
    if (this.useOpacityTextureForTransparentShadow) {
        this._opacityTexture = (material as any).opacityTexture;
    } else {
        this._opacityTexture = material.getAlphaTestTexture();
    }
    if (this._opacityTexture) {
        // HANGS HERE!!!!!
        if (!this._opacityTexture.isReady()) {
            return false;
        }

        const alphaCutOff = (material as any).alphaCutOff ?? ShadowGenerator.DEFAULT_ALPHA_CUTOFF;

        defines.push("#define ALPHATEXTURE");
        if (needAlphaTesting) {
            defines.push(`#define ALPHATESTVALUE ${alphaCutOff}${alphaCutOff % 1 === 0 ? "." : ""}`);
        }
        if (mesh.isVerticesDataPresent(VertexBuffer.UVKind)) {
            attribs.push(VertexBuffer.UVKind);
            defines.push("#define UV1");
            uv1 = true;
        }
        if (mesh.isVerticesDataPresent(VertexBuffer.UV2Kind)) {
            if (this._opacityTexture.coordinatesIndex === 1) {
                attribs.push(VertexBuffer.UV2Kind);
                defines.push("#define UV2");
                uv2 = true;
            }
        }
    }

The opacity texture is the albedoTexture on a PBRMaterial.

Looking at that texture, it appears to have been disposed: the internal _texture object is null.

Yes, disposing the texture sets this._texture to null.

So, the problem is that a material is using a texture that was previously disposed. You should try to find out why the texture was disposed; that is probably a bug if the texture is used by a material that is still used by a mesh.

But it’s really mysterious. When we dispose of any texture we’re always setting the texture and internal ._texture to null. I’m scouring the code to see anywhere this might be missing, but I’m not seeing it. I wonder if something is hanging on to a cached reference somehow?

It looks like you don’t set mat.albedoTexture = null for the material in error when the albedo texture has been disposed.
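As a rough illustration of that point, a dispose helper can clear the material slot in the same step, so the material never keeps a disposed texture. `disposeAlbedoTexture` and the two interfaces are hypothetical names used only for this sketch; a real material has more texture slots than the one shown.

```typescript
// Minimal stand-in shapes (assumptions) for a texture and a PBR-like material.
interface DisposableTexture { dispose(): void; }
interface MaterialWithAlbedo { albedoTexture: DisposableTexture | null; }

// Hypothetical helper: detach the texture from the material first, then dispose it.
function disposeAlbedoTexture(material: MaterialWithAlbedo): void {
    const albedo = material.albedoTexture;
    if (albedo === null) {
        return;
    }
    material.albedoTexture = null;
    albedo.dispose();
}

// Usage with a mock texture that counts dispose calls.
let disposeCalls = 0;
const mat: MaterialWithAlbedo = { albedoTexture: { dispose: () => { disposeCalls++; } } };
disposeAlbedoTexture(mat);
disposeAlbedoTexture(mat); // second call is a no-op because the slot is already null
```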

Look at this PG:

After 500ms, it logs the return value of texture.isReady(), disposes the texture, and logs the result again: you will see that the output in the console is true and then false.

The root cause seems to be that the shadow generator is hanging on to its _opacityTexture reference even after I dispose the texture on a material. I have a cleanup function that I use to find textures that are not used by any material and I expanded that to include the shadow generator and now it’s no longer hanging. I wonder if that’s a bug, though?
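Roughly, the expanded cleanup looks like the sketch below. This is a simplified illustration, not our production code: the interfaces are stand-ins, `releaseUnusedOpacityTextures` is a hypothetical name, and `_opacityTexture` is a private field on the real shadow generator, so touching it from application code is at your own risk.

```typescript
// Minimal stand-in shapes (assumptions).
interface TextureLike { name: string; }
interface ShadowGeneratorLike { _opacityTexture: TextureLike | null; }

// Hypothetical helper: drop any opacity texture reference that no material uses anymore.
// Returns the names of the textures that were released, for logging.
function releaseUnusedOpacityTextures(
    generators: ShadowGeneratorLike[],
    texturesInUse: Set<TextureLike>,
): string[] {
    const released: string[] = [];
    for (const generator of generators) {
        const opacity = generator._opacityTexture;
        if (opacity !== null && !texturesInUse.has(opacity)) {
            generator._opacityTexture = null;
            released.push(opacity.name);
        }
    }
    return released;
}

// Usage: one generator still points at a texture that no material references.
const live: TextureLike = { name: "live" };
const stale: TextureLike = { name: "stale" };
const generators: ShadowGeneratorLike[] = [{ _opacityTexture: live }, { _opacityTexture: stale }];
const released = releaseUnusedOpacityTextures(generators, new Set([live]));
```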

Just some thoughts, here.

  1. Should the ready function skip over textures that have been disposed even if they are attached to a material?
  2. Should the shadow generator null out its opacity reference when a texture is disposed?

This is clearly a bug on our end but it feels like a bit of a smell to do this much orchestration around a texture that we no longer need.

The use of a deleted texture is not supported. It would be too complicated and too costly in terms of performance to add code that checks whether a texture has been deleted every time we use it: it is up to the user not to use a deleted texture (our policy is not to check entries in order to optimize performance).

One thing that would be (really) nice to have, however, would be a debug version of Babylon, in which we could add code to make debugging easier (a bit like a debug version in C++). I think we should discuss this with the team and see if it’s possible to do.

I was thinking the same thing! It’s been really, really difficult to figure out which thing was “not ready” and why. Even in this case, I still had no idea that the shadow generator might be a place to look!

But, I will slightly disagree here about the ready checking. If you are already doing the check on the shadow generator to see if its _opacityTexture is ready, then all we need is a flag to show state on the texture:

[LOADING, READY, DISPOSED]

This way, we could quickly skip checks on disposed textures.
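Something like this sketch is what I have in mind. To be clear, the `state` field and `allTexturesReady` are hypothetical; this is the proposal, not something Babylon exposes today as far as this thread shows.

```typescript
// Hypothetical state flag on a texture (the proposal, not an existing API).
type TextureState = "LOADING" | "READY" | "DISPOSED";

interface StatefulTexture {
    name: string;
    state: TextureState;
}

// Disposed textures are skipped; every remaining texture must be READY.
function allTexturesReady(textures: StatefulTexture[]): boolean {
    return textures
        .filter((t) => t.state !== "DISPOSED")
        .every((t) => t.state === "READY");
}

const readyDespiteDisposed = allTexturesReady([
    { name: "albedo", state: "DISPOSED" },
    { name: "normal", state: "READY" },
]);
const stillLoading = allTexturesReady([
    { name: "albedo", state: "DISPOSED" },
    { name: "normal", state: "LOADING" },
]);
```

With this, a disposed texture can no longer block the check forever; only a texture that is still loading holds it back.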

That’s the way my brain processes the problem :wink:

I guess we agree to disagree here :slight_smile: .

Continuing to use a texture that has been deleted is, in my opinion, an application bug (similar to using a pointer to memory that has been deleted in C/C++). The best thing to do would be to have a “debug version” that would warn you when this happens, to make it easier to fix the application.

The way I look at it is that an application using Babylon is basically a “distributed system” and the screenshot methods, custom shader methods, etc. are service endpoints. If you make a request to a service endpoint and your input is incorrect, you would never expect it to just never return. Instead, it ought to return a specific error that can be used to diagnose and fix the problem. In Babylon, this is already true if you attempt to load a broken GLTF, missing textures, or a bad shader. It is only for the screenshot code that it’s possible to get NO response except an infinite delay.

To go with your perspective, if I have an application bug and I delete some key part of a request, the service will usually do some version of this flow:

  1. Request validation
    1. Rejection of invalid request
    2. Accept valid request
  2. Request fulfillment

I think the screenshot code is breaking this contract by not running the diagnostics and rejecting invalid requests early. I can see how the async nature of “is ready” makes this difficult, but that’s why I was imagining some sort of state flag that helps to understand if a texture or shader is initialized, ready, or has been disposed.
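As a sketch of that flow, validation runs first and rejects with a specific message, and only a valid request reaches the render step. Everything here is hypothetical: the `disposed` flag, the request shape, and the function names are assumptions made for illustration, and the render callback stands in for the real screenshot work.

```typescript
// Hypothetical request shape; the disposed flag is the state flag imagined above.
interface RequestTexture { name: string; disposed: boolean; }
interface ScreenshotRequest { textures: RequestTexture[]; }

// Step 1: request validation. Returns one message per problem found.
function validateRequest(request: ScreenshotRequest): string[] {
    return request.textures
        .filter((t) => t.disposed)
        .map((t) => `texture "${t.name}" has been disposed`);
}

// Step 2: request fulfillment, reached only when validation passes.
function fulfillRequest(request: ScreenshotRequest, render: () => string): string {
    const problems = validateRequest(request);
    if (problems.length > 0) {
        throw new Error(`Invalid screenshot request: ${problems.join("; ")}`);
    }
    return render();
}

// Usage: the caller gets either a result or a diagnosable error, never a hang.
function tryScreenshot(request: ScreenshotRequest): string {
    try {
        return fulfillRequest(request, () => "image-data");
    } catch (error) {
        return error instanceof Error ? error.message : String(error);
    }
}

const accepted = tryScreenshot({ textures: [{ name: "albedo", disposed: false }] });
const rejected = tryScreenshot({ textures: [{ name: "albedo", disposed: true }] });
```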

Anyway, time to go enjoy this week off :slight_smile: