Why does adding a post-process improve the render framerate on a Mac retina screen?

Hello all,

I have noticed an interesting phenomenon during my rendering experiments. When rendering a large number of quads with instances, I observed that adding a PostProcess significantly increases (not decreases :joy:) the framerate on a Mac retina screen, but not on a standard 1080p screen.

You can replicate this by running the following two playgrounds on a Mac retina screen:

If your device is powerful enough to handle both at 60 FPS, you can increase the numInstances variable at the beginning to slow down rendering, making the performance gap more evident.

const numInstances = 10000; /// increase to slow down rendering
const numQuadsPerInstance = 128;

The only distinction between these two playgrounds is that the latter utilizes a post-process:

const postProcess = new BABYLON.PassPostProcess("Scene copy", 1.0, camera);

I want to understand why this is happening. I thought a PostProcess could only slow down rendering, but the result is exactly the opposite. I would also like to know how I can achieve the same framerate without using a PostProcess.

Thank you all!


P.S. This strange thing was noticed when I attempted to increase the framerates of Babylon’s Gaussian Splatting to match those of PlayCanvas.


Pure guess here, since I can't repro on my setup (you should test with Spector.js: https://chromewebstore.google.com/detail/spectorjs/denbgaamihkadbghdceggmchnflmhpmk?hl=en&pli=1): Macs with a retina display have a DPI scale. In short, with a post-process, the scene is rendered into a texture that is smaller than the backbuffer.

As the scene is heavy on overdraw and bandwidth, having a smaller target will decrease pressure on bandwidth.
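
To put rough numbers on that guess, here is a minimal sketch that counts the pixels in a render target at full device resolution versus one at CSS resolution. The canvas size and pixel ratio are made-up example values, and `pixelCount` is a hypothetical helper, not a Babylon.js API.

```javascript
// Hypothetical helper: number of pixels in a render target for a canvas of
// the given CSS size, rendered at `scale` device pixels per CSS pixel.
function pixelCount(cssWidth, cssHeight, scale) {
    return Math.round(cssWidth * scale) * Math.round(cssHeight * scale);
}

// Assumed example: an 800x600 CSS-pixel canvas on a display with devicePixelRatio = 2.
const fullRes = pixelCount(800, 600, 2); // 1600 x 1200 target
const cssRes = pixelCount(800, 600, 1);  // 800 x 600 target

console.log(fullRes / cssRes); // 4
```

If the guess holds, each blended layer of quads would touch about four times fewer pixels in the smaller target, which is where the bandwidth saving would come from.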

cc @sebavan

EDIT: I see it’s the core of the discussion in the other thread.


Thanks! I used Spector.js to capture the PG and I see a viewport (1788x1886) under drawElementsInstanced(), which is exactly the retina resolution of a Mac.

Draw State
DITHER: true
VIEWPORT: 0, 0, 1788, 1886
FRONT_FACE: CCW
FRAGMENT_SHADER_DERIVATIVE_HINT_OES: Extension OES_standard_derivatives is unavailable.
RASTERIZER_DISCARD: false
FRAGMENT_SHADER_DERIVATIVE_HINT: 4352

Does this confirm that the scene is rendered at full size? Also, if the actual rendering size were somehow halved, I would expect the rendering quality to change after adding the PostProcess, but it does not get any lower.

Also, it could be because MSAA is disabled by default when you create a post process.

Try setting postProcess.samples = 4 to activate it and see the impact on performance.

Setting samples = 4; in the second PG makes the fps drop from 60 to 20.

But in the first PG that does not use PostProcess, I also set the antialias to false, so it should not be doing any multisampling, right?

Yes, it shouldn’t, but maybe there’s a bug somewhere. You should compare the first PG when antialias is false/true and verify there’s a difference. You should also compare the output with your second PG.


Thank you for the reminder! Setting antialiasing in the first Playground does not affect framerates or output quality. As described here, antialiasing only becomes effective when an additional pass is introduced and the number of samples is set to a value greater than 1.

Regarding quality, both Playgrounds are identical. As a comparison, when adaptToDeviceRatio is set to false during Engine creation, the quality decreases while the framerates increase.
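
For reference, a minimal sketch of the arithmetic behind adaptToDeviceRatio, as I understand it: when the flag is true, the engine uses a hardware scaling level of 1 / devicePixelRatio, and the render size is the canvas CSS size divided by that level. `renderSize` is a hypothetical helper that only mirrors this arithmetic (it ignores options such as limitDeviceRatio), and the CSS size is an assumption chosen so that the result matches the 1788x1886 viewport from the Spector.js capture.

```javascript
// Hypothetical helper mirroring how the render size is assumed to be derived.
function renderSize(cssWidth, cssHeight, devicePixelRatio, adaptToDeviceRatio) {
    // Assumption: scaling level is 1 / devicePixelRatio when adapting, else 1.
    const hardwareScalingLevel = adaptToDeviceRatio ? 1 / devicePixelRatio : 1;
    return {
        width: Math.round(cssWidth / hardwareScalingLevel),
        height: Math.round(cssHeight / hardwareScalingLevel),
    };
}

// Assumed canvas CSS size of 894x943 on a display with devicePixelRatio = 2.
const adapted = renderSize(894, 943, 2, true);     // { width: 1788, height: 1886 }
const notAdapted = renderSize(894, 943, 2, false); // { width: 894, height: 943 }

console.log(adapted, notAdapted);
```

Under these assumptions, turning adaptToDeviceRatio off renders a quarter of the pixels, which is consistent with the lower quality and higher framerate observed.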

This is because the post-process is ultimately scaled up to fit the canvas, which compensates for the loss of precision.

@xiasun let's keep only one thread open please, as I can see the question is being discussed there:

and a bit there Possible performance bottleneck on high resolution devices (MacOS) - #17 by Deltakosh

Can you try turning the depth test on in your first PG?

I tried on a Mac with a retina screen and cannot repro either. What browser/version are you using? It is really weird that we cannot repro on similar devices.

I am on an old MacBook Pro 2019 (Intel CPU + AMD GPU) with Chrome 127.0.6533.100. Do you mean both PGs are running at 60 FPS on your machine? If that is the case, you can increase numInstances at the beginning so that at least one of them drops below the full framerate; otherwise the performance difference may not be visible.

I thought about this too. Is there some sort of flag or config I could use to verify and compare? The two PGs share the same rendering quality, and it does not look like some sort of supersampling.
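
One possible way to compare the two PGs is to log the sizes the engine and the post-process report. This is only a sketch: `describeRenderSizes` is a hypothetical helper, and it assumes the engine exposes getHardwareScalingLevel(), getRenderWidth() and getRenderHeight(), and that the post-process exposes width, height and samples, which is my understanding of the Babylon.js API.

```javascript
// Hypothetical helper: gathers the sizes relevant to comparing both playgrounds.
function describeRenderSizes(engine, postProcess) {
    const info = {
        hardwareScalingLevel: engine.getHardwareScalingLevel(),
        // Passing true is assumed to request the screen (canvas) size rather
        // than the size of whatever render target is currently bound.
        renderWidth: engine.getRenderWidth(true),
        renderHeight: engine.getRenderHeight(true),
    };
    if (postProcess) {
        // Assumed to hold the size of the texture the scene is rendered into
        // once the post-process has been activated at least once.
        info.postProcessWidth = postProcess.width;
        info.postProcessHeight = postProcess.height;
        info.postProcessSamples = postProcess.samples;
    }
    return info;
}

// In a playground, after a few frames have rendered:
//   console.log(describeRenderSizes(engine, postProcess)); // second PG
//   console.log(describeRenderSizes(engine, null));        // first PG
```

If the post-process size matches the render size, the scene is not being rendered into a smaller texture.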

Thank you @sebavan. I tried setting depth to true, but the frame rate did not change.

var createEngine = function() {
    var engine = new BABYLON.Engine(canvas, false, {
        antialias: false,
        depth: true, ///// here
        stencil: false,
        xrCompatible: false,
        preserveDrawingBuffer: true,
        powerPreference: 'high-performance'
    }, true);
    return engine;
}

In this case, the quads are actually transparent and will be blended during rendering. This is to simulate splats of Gaussian Splatting.

I have also added a PostProcess locally to test a real GS. The frame rate did increase, and the quality did not change, as in this PG. If there is something like a depth test, I guess the rendering result of GS would be completely messed up.


Sorry, I started a new thread because I thought this PostProcess issue seemed like an independent one. I will not update the other two threads then.


@Cedric do you have an Intel Mac to test on?

I am wondering if it could be this as I am running out of ideas.

I can reproduce this on a 2017 MacBook Pro - Intel i7 - Retina - Chrome Version 127.0 :x: (~3X FPS for 2nd PG)

I cannot reproduce this on a 2021 MacBook Pro - M1 Max - Liquid Retina XDR - Chrome Version 127.0 :white_check_mark: (same FPS for both PG’s)

Is there any way I can help with Spector.js? I can look into this later today.


Nope, this would point to an internal driver issue :frowning: so there is not much we could do or see on our end. You could try reporting it to the Chromium team.


@xiasun If you can share your observations with PostProcess in a Chromium issue, I can also add my 2 data points there to help.


Thank you, I will try to file an issue next week and post it here. (I will try to compare it on other browsers or different ANGLE backends before submitting.)

By the way, I noticed you mentioned that on your M1 Max machine the framerates are the same. I’m curious whether they are both running at a full framerate (perhaps 60 FPS). If they are, would you consider increasing the numInstances at the beginning of the Playground to make it run at a lower framerate? If both Playgrounds are running at a full framerate, the performance difference may not be visible.


Sharing some FPS results below:


2021 MacBook Pro - M1 Max - Liquid Retina XDR - Chrome Version 127.0

  • ANGLE Backend: OpenGL (default numInstances)
    • PG 1: 83 FPS
    • PG 2: 81 FPS
  • ANGLE Backend: OpenGL (numInstances = 50000)
    • PG 1: 16 FPS
    • PG 2: 15 FPS
  • ANGLE Backend: Metal (numInstances = 50000)
    • PG 1: 27 FPS
    • PG 2: 27 FPS

With default numInstances, Metal reaches the FPS cap of 120.


2017 MacBook Pro - Intel i7 - Retina - Chrome Version 127.0

  • ANGLE Backend: OpenGL (default numInstances)
    • PG 1: 11 FPS
    • PG 2: 38 FPS
  • ANGLE Backend: Metal (default numInstances)
    • PG 1: 10 FPS
    • PG 2: 40 FPS
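
As a quick worked example, the speedup of PG 2 over PG 1 on the 2017 Intel machine can be computed from the numbers above. The variable names are only illustrative.

```javascript
// FPS values copied from the 2017 MacBook Pro (Intel) results above.
const intelResults = [
    { backend: "OpenGL", pg1: 11, pg2: 38 },
    { backend: "Metal", pg1: 10, pg2: 40 },
];

// Speedup of the post-process playground (PG 2) relative to PG 1.
const speedups = intelResults.map(({ backend, pg1, pg2 }) => ({
    backend,
    speedup: pg2 / pg1,
}));

for (const { backend, speedup } of speedups) {
    console.log(`${backend}: ${speedup.toFixed(2)}x`);
}
// OpenGL: 3.45x
// Metal: 4.00x
```

On the M1 Max, the same ratio stays close to 1 on both backends.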

@sebavan @regna I have submitted an issue to the Chromium issue tracker (Google Issue Tracker).

Thank you all.
