ScreenshotTools.CreateScreenshot produces images with scene elements missing/incomplete only on Windows

Hello folks!

I’ve been having a real struggle trying to understand why rendering the scene to an image produces inconsistent results. Specifically, when I call ScreenshotTools.CreateScreenshotAsync(), sometimes (~50% of the time) the generated images don’t contain the whole scene, due to what looks like a race condition: the image is drawn before the scene has fully rendered. The generated image may be blank, may show the meshes with their materials but without their edges rendered (edges should always be on), or may be missing all the UIObjects expected in the scene (length measurement labels).

The issue has never reproduced on any Linux or macOS machine we’ve tried (across some hundreds of attempts). I’ve only seen it on Windows machines, even ones with the latest updates and a dedicated GPU (but not only those). It seems particularly prominent on a client’s Surface Book, for whatever reason.

The fact that no console errors are produced, and that this only happens sometimes, makes it really frustrating to understand and debug. I’ve found that triggering a render after every scene change decreases the likelihood of the bug’s appearance (but doesn’t kill it)…

Below I’m pasting the screenshot.ts file

import { Engine } from '@babylonjs/core/Engines/engine'
import { ScreenshotTools } from '@babylonjs/core/Misc/screenshotTools'

import { CameraController } from './camera'
import { ALL_ELEMENT_TYPES } from './constants'
import { SceneProvider, createScene } from './scene'
import { Blueprint, ElementType, TactElement, isPristineTactElement } from './types'
import { UIProvider } from './ui'
import { UnreachableCaseError } from './utils/UnreachableCaseError'

const SCREENSHOT_IMG_WIDTH_IN_PX = 3000
const SCREENSHOT_IMG_HEIGHT_IN_PX = SCREENSHOT_IMG_WIDTH_IN_PX

/**
 * Generates a PNG screenshot of the provided blueprint with the provided tactElements
 * visible at full opacity, including length measurements for those elements. The
 * first element's type is used as the active type and all other types are visible.
 *
 * @param blueprint blueprint to render
 * @param tactElements elements to be shown as planned
 * @param format preferred format of the returned image. Formats correspond to:
 * - [`<canvas>.toBlob()`](https://developer.mozilla.org/en-US/docs/Web/API/HTMLCanvasElement/toBlob)
 * - [`<canvas>.toDataURL()`](https://developer.mozilla.org/en-US/docs/Web/API/HTMLCanvasElement/toDataURL)
 *
 * @returns screenshot depending on the provided `format`:
 * - a string of base64-encoded characters which can be assigned to the src
 * attribute of an <img> element to display it
 * - a binary blob
 */
export async function createScreenshot<F extends 'dataUrl' | 'blob'>({
  blueprint,
  tactElements,
  format,
}: {
  blueprint: Blueprint
  tactElements: TactElement[]
  format: F
}): Promise<F extends 'dataUrl' ? string : F extends 'blob' ? Blob : never>
export async function createScreenshot({
  blueprint,
  tactElements,
  format,
}: {
  blueprint: Blueprint
  tactElements: TactElement[]
  format: 'dataUrl' | 'blob'
}): Promise<string | Blob> {
  const canvas = document.createElement('canvas')
  canvas.width = SCREENSHOT_IMG_WIDTH_IN_PX
  canvas.height = SCREENSHOT_IMG_HEIGHT_IN_PX
  canvas.style.visibility = 'hidden'
  document.body.appendChild(canvas)

  const engine = new Engine(canvas, false, {
    useHighPrecisionFloats: false,
    preserveDrawingBuffer: true,
    stencil: true,
    alpha: true,
    desynchronized: true,
    powerPreference: 'high-performance',
  })
  const scene = createScene(engine)
  const cameraController = new CameraController(scene, false)
  const sceneProvider = new SceneProvider(
    scene,
    new UIProvider(cameraController.orthoCamera),
    cameraController,
  )

  sceneProvider.addBlueprint({ blueprint, renderEdges: 'sync' })
  scene.render()
  cameraController.resetZoom(sceneProvider.rootMesh, false, true)
  scene.render()
  sceneProvider.activeElementType = getActiveElementType(tactElements, sceneProvider)
  sceneProvider.shownElementTypes = ALL_ELEMENT_TYPES
  tactElements && sceneProvider.setTactElements(tactElements)
  sceneProvider.setSelectedElements(
    tactElements
      .flatMap(tactElement => {
        return isPristineTactElement(tactElement)
          ? tactElement.id
          : tactElement.parts.map(part => part.color && part.guid)
      })
      .filter(Boolean) as string[], // we do `as string[]` because TS fails to infer it
  )
  scene.render()

  const screenshotAsDataUrl = await ScreenshotTools.CreateScreenshotAsync(
    engine,
    cameraController.orthoCamera,
    {
      width: SCREENSHOT_IMG_WIDTH_IN_PX,
      height: SCREENSHOT_IMG_HEIGHT_IN_PX,
    },
    'image/png',
  )

  canvas.remove()
  engine.dispose()

  switch (format) {
    case 'dataUrl':
      return screenshotAsDataUrl
    case 'blob': {
      const screenshotWithoutDataUrlPrefix = screenshotAsDataUrl.replace(
        'data:image/png;base64,',
        '',
      )
      const screenshotAsBufferedBinaryInBase64 = Buffer.from(
        screenshotWithoutDataUrlPrefix,
        'base64',
      )
      const screenshotAsBlob = new Blob([screenshotAsBufferedBinaryInBase64])

      return screenshotAsBlob
    }
    default:
      throw new UnreachableCaseError(format)
  }
}

function getActiveElementType(
  tactElements: TactElement[],
  sceneProvider: SceneProvider,
): ElementType {
  const firstElement = tactElements[0]

  if (firstElement) {
    const sceneElement = sceneProvider.findElementWithGUID(
      firstElement.id || (firstElement.parentId as string),
    )
    if (sceneElement) {
      return sceneElement.type
    }
  }

  return ElementType.Wall
}
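One aside on the blob branch above: it relies on Buffer, which only exists in the browser when the bundler provides a Node polyfill. If that assumption ever breaks, the same data-URL-to-Blob conversion can be done with standard browser globals — a minimal sketch (the helper name is mine):

```typescript
// Converts a PNG data URL to a Blob using only standard globals (atob,
// Uint8Array, Blob), i.e. without a Node Buffer polyfill in the bundle.
function dataUrlToBlob(dataUrl: string): Blob {
  const base64 = dataUrl.replace('data:image/png;base64,', '')
  const binary = atob(base64) // base64 -> binary string
  const bytes = new Uint8Array(binary.length)
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i) // one byte per character
  }
  return new Blob([bytes], { type: 'image/png' })
}
```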

Here’s a link with some generated images showcasing the different ways the generation fails and the successful expected result:

And an attached image of the expected result (as a new user I can only attach max one image here):

Please ask me to share any additional imported files, which I’ve obviously skipped here to avoid spam, since this is a non-trivial industrial application using Babylon.js.

Thank you for this amazing library, your insightful blog posts, and for the (hopefully!) help in killing this nasty bug I’ve been fighting on and off for months!

The thing is that the copy of the canvas is done at the time you call the screenshot function, so all objects may not have been rendered yet at that time.

You can try to call:

engine.flushFramebuffer();
ScreenshotTools.CreateScreenshotAsync(...);

In scene.onAfterRenderObservable or in Tools.SetImmediate() and see if it helps.

The problem is that we don’t have a clear point where we can be sure that all drawing commands for the current frame have been sent to the GPU and are displayed BUT before the next frame has begun, so that we can copy the canvas buffer at the right time.

What should be more reliable is CreateScreenshotUsingRenderTarget(), because it renders the scene to a texture, and reading pixels from that texture to make the screenshot ensures that all draw calls have executed before returning.
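To make the first suggestion concrete — a minimal, framework-free sketch of the pattern (`onceAfterRender` is an assumed helper name, not a Babylon.js API):

```typescript
// Sketch only: wraps "register a one-shot callback, then do async work" into a
// promise, mirroring scene.onAfterRenderObservable.addOnce + CreateScreenshotAsync.
function onceAfterRender<T>(
  addOnce: (cb: () => void) => void, // e.g. cb => scene.onAfterRenderObservable.addOnce(cb)
  work: () => Promise<T>,            // e.g. () => ScreenshotTools.CreateScreenshotAsync(...)
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    addOnce(() => {
      // runs right after the frame is rendered, before resolving to the caller
      work().then(resolve, reject)
    })
  })
}
```

With this shape, the screenshot call (and engine.flushFramebuffer()) lives inside `work`, and the caller simply awaits the promise after triggering scene.render().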


Hey Evgeni, thank you so much for your help (and please excuse my late response, while yours was instant)!

I did my best to incorporate and evaluate your hints. So here’s my follow-up report:

  1. the exact code in my original post never reproduces the bug on a Thinkpad T470 running Windows (the one I was recently debugging against)

  2. the exact code in my original post never reproduces the bug on a Thinkpad T480s running Linux Mint

  3. the exact code in my original post never reproduces the bug on a MacBook Pro running macOS

  4. the exact code in my original post sometimes reproduces the bug on a Thinkpad E480 (sometimes means some of the generated screenshots have the artifacts posted in the imgur gallery above, but some screenshots are generated as expected)

  5. the exact code in my original post sometimes reproduces the bug on a Surface Book

  6. deleting the “extraneous” render calls from the code in my original post

       sceneProvider.addBlueprint({ blueprint, renderEdges: 'sync' })
    -  scene.render()
       cameraController.resetZoom(sceneProvider.rootMesh, false, true)
    -  scene.render()
       sceneProvider.activeElementType = getActiveElementType(tactElements, sceneProvider)
       sceneProvider.shownElementTypes = ALL_ELEMENT_TYPES
       tactElements && sceneProvider.setTactElements(tactElements)
    

    results in the bug being reproduced every time (even on my linux machine that otherwise doesn’t repro it) with the following console warnings:

    WebGL: INVALID_VALUE: getProgramParameter: attempt to use a deleted object


    Also, setting antialiasing and preserveDrawingBuffer to true for the Engine ends up making the bug reproducible sometimes, even on Linux. It may have to do with this explanation here. The interesting thing is that the official BabylonJS guide for screenshots suggests setting preserveDrawingBuffer: true.

  7. Creating a lot of screenshots in quick succession (~20) ends up throwing the following warning and killing the main canvas:
    main.js?c88f:1 WARNING: Too many active WebGL contexts. Oldest context will be lost.

    The disposable canvas used for each single screenshot keeps working even after the main canvas stops rendering. Am I doing something wrong in cleaning up after each screenshot?

  8. Using onAfterRenderObservable seems to fix it! I cannot reproduce the bug on the T470 nor the E480! I don’t have access to the Surface Book, but I’ll forward this speculative fix and come back with an update.

    
           })
           .filter(Boolean) as string[], // we do `as string[]` because TS fails to infer it
       )
    -  scene.render()
     
    -  const screenshotAsDataUrl = await ScreenshotTools.CreateScreenshotAsync(
    -    engine,
    -    cameraController.orthoCamera,
    -    {
    -      width: SCREENSHOT_IMG_WIDTH_IN_PX,
    -      height: SCREENSHOT_IMG_HEIGHT_IN_PX,
    -    },
    -    'image/png',
    -  )
    +  const screenshotAsDataUrlPromised = new Promise((resolve: (value: string) => void) => {
    +    scene.onAfterRenderObservable.addOnce(async () => {
    +      engine.flushFramebuffer()
    +
    +      const screenshot = await ScreenshotTools.CreateScreenshotAsync(
    +        engine,
    +        cameraController.orthoCamera,
    +        {
    +          width: SCREENSHOT_IMG_WIDTH_IN_PX,
    +          height: SCREENSHOT_IMG_HEIGHT_IN_PX,
    +        },
    +        'image/png',
    +      )
    +
    +      canvas.remove()
    +      engine.dispose()
    +
    +      resolve(screenshot)
    +    })
    +
    +    scene.render()
    +  })
     
    -  canvas.remove()
    -  engine.dispose()
    +  const screenshotAsDataUrl = await screenshotAsDataUrlPromised
     
       switch (format) {
         case 'dataUrl':
    
  9. worth noting that using SetImmediate had no positive effect. That is, using
    TimingTools.SetImmediate(async () => { instead of
    scene.onAfterRenderObservable.addOnce(async () => { seen in the code above

  10. using CreateScreenshotUsingRenderTarget (the callback version) blows up due to reaching the call-stack limit

       )
     
       const screenshotAsDataUrlPromised = new Promise((resolve: (value: string) => void) => {
    -    scene.onAfterRenderObservable.addOnce(async () => {
    +    scene.onAfterRenderObservable.addOnce(() => {
           engine.flushFramebuffer()
     
    -      const screenshot = await ScreenshotTools.CreateScreenshotAsync(
    +      ScreenshotTools.CreateScreenshotUsingRenderTarget(
             engine,
             cameraController.orthoCamera,
             {
               width: SCREENSHOT_IMG_WIDTH_IN_PX,
               height: SCREENSHOT_IMG_HEIGHT_IN_PX,
             },
    -        'image/png',
    +        screenshot => {
    +          console.log('resolving...') // TODO: maninak delete
    +          resolve(screenshot)
    +          canvas.remove()
    +          engine.dispose()
    +        },
           )
    -
    -      canvas.remove()
    -      engine.dispose()
    -
    -      resolve(screenshot)
         })
     
         scene.render()
    

    Am I doing something wrong here, or is this a bug in Babylon.js?

  11. using CreateScreenshotUsingRenderTargetAsync (the promisified version) never resolves

         scene.onAfterRenderObservable.addOnce(async () => {
           engine.flushFramebuffer()
     
    -      const screenshot = await ScreenshotTools.CreateScreenshotAsync(
    +      const screenshot = await ScreenshotTools.CreateScreenshotUsingRenderTargetAsync(
             engine,
             cameraController.orthoCamera,
             {
               width: SCREENSHOT_IMG_WIDTH_IN_PX,
               height: SCREENSHOT_IMG_HEIGHT_IN_PX,
             },
    -        'image/png',
           )
     
    -      canvas.remove()
    -      engine.dispose()
    -
    +      console.log('resolving...') // TODO: maninak delete
           resolve(screenshot)
         })
    

    Am I doing something wrong here, or is this a bug in Babylon.js?

Conclusions:

  1. As seen in (8) there is a promising lead that I still need to verify thoroughly
  2. I’d appreciate any of your insightful input regarding (6), (7), (10), (11)
  3. What you described regarding CreateScreenshotUsingRenderTargetAsync is nowhere to be found in the docs nor in the guide. Perhaps it should be added?
  4. Anything else you would like to comment on (on what I posted, on the topic in general, further hints/leads you’ve thought of since your last post, …)

I deeply appreciate your time @Evgeni_Popov! You seem to be a gift for this community. :clap:

Have you tried not creating a new engine/canvas each time? It may help with 6/ and 7/; I fear there are still some operations pending when the canvas/engine are disposed. That would also explain why doing things in onAfterRenderObservable helps in that matter (but flushFramebuffer may also help here).
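For future readers, the engine-reuse suggestion can be reduced to a small caching pattern — a sketch with assumed names; the commented Engine line is illustrative only:

```typescript
// Sketch: cache the result of an expensive factory so it runs only once.
// Reusing one hidden canvas + Engine this way keeps the WebGL context count
// at one, instead of creating (and later losing) a context per screenshot.
function createOnce<T>(factory: () => T): () => T {
  let instance: T | undefined
  return () => {
    if (instance === undefined) {
      instance = factory() // first call creates; later calls reuse
    }
    return instance
  }
}

// Illustrative usage (requires a DOM and Babylon.js, not runnable standalone):
// const getScreenshotEngine = createOnce(
//   () => new Engine(hiddenCanvas, false, { preserveDrawingBuffer: true, stencil: true }),
// )
```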

For 10/ and 11/, I would need a repro to see what’s going on, as I don’t see why we would end up in an infinite loop…

Regarding other thoughts or comments: once the WebGPU code is merged with the main branch, it should help with screenshot operations, because the method is already wrapped inside an onAfterRenderObservable observer (it is even in an onEndFrame observer, but this observer does not exist yet in the current main branch)! Also, CreateScreenshotUsingRenderTargetAsync is not a panacea, because it does not take some specific effects into account (post-processes, for example). So, if your scene uses them, a screenshot taken with CreateScreenshotUsingRenderTargetAsync will not match a call to CreateScreenshotAsync.

Thanks!

For future readers: what fixed it for me, as explained in my post above (section 8), was creating the screenshot within an onAfterRenderObservable callback.

We haven’t received any reports/complaints regarding screenshots for the last two weeks, so it’s plausible that it’s an actual fix.

@Evgeni_Popov thank you for your help! Please let me know which version/branch/PR to watch until “the WebGPU code is merged with the main branch” as you said, if you could share that info.

Once we have merged we will announce the (early) WebGPU support in this forum so you won’t be able to miss it :wink: