Better FPS with mesh cloning than mesh instancing - a bug in Babylon?

Hi everyone,

I am looking to optimize a scene with several occurrences of a heavy mesh.
I looked at cloning an imported mesh versus creating instances of it and I would expect that creating instances would yield better performance, but it’s actually the opposite. And I am not sure why.

Please see the following playground:
https://www.babylonjs-playground.com/#N3EK4B#64

When I use line 16 (cloning) and comment out line 17 (instancing) I get about 9-10 fps on my machine when rotating the scene. Inspector shows 10 draw calls, which makes sense.
When I use line 17 and comment out line 16, my framerate drops to 4-5 fps on scene rotation. Here the inspector shows 1 draw call, which also makes sense. What doesn’t make sense is why a single draw call results in lower FPS than 10 draw calls of the same mesh.
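
For readers who don't have the playground open, the two lines being toggled amount to something like the sketch below. `clone` and `createInstance` are the Babylon.js mesh methods; the helper name `duplicateMesh`, the naming scheme, and the spacing are assumptions made for illustration, not the playground's actual code.

```javascript
// Hypothetical helper: makes `count` copies of `source`, either as clones or as instances.
// `source` is expected to be a BABYLON.Mesh (any object exposing name/clone/createInstance works).
function duplicateMesh(source, count, useInstances) {
    const copies = [];
    for (let i = 0; i < count; i++) {
        const name = source.name + "_" + i;
        // clone(): a separate mesh that shares the geometry but is drawn with its own draw call.
        // createInstance(): an InstancedMesh that is drawn together with the source mesh.
        const copy = useInstances ? source.createInstance(name) : source.clone(name);
        copy.position.x = (i + 1) * 10; // spacing is an arbitrary assumption
        copies.push(copy);
    }
    return copies;
}
```

Switching `useInstances` between `false` and `true` corresponds to switching between the cloning line and the instancing line described above.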

Does the mesh have too many vertices/faces, and is that what’s causing a single draw call for 10 instances to take so long?

Any insight is appreciated!
Thanks.

That’s strange, there’s a problem in Chrome but not in Firefox, where I am at 60 fps in both cases (1 or 10 draw calls should not change anything regarding fps).

In Chrome, I’m only at 30 fps and there’s a strange long first frame each time I record a performance run. It always lasts around 1.3s and all the time is spent in “System (children)”… The other frames then last around 16ms each, so I should be seeing 60 fps, which is not the case.

[…] Ok, the isPickable property is the culprit. Set it to true and you should be fine:

https://www.babylonjs-playground.com/#N3EK4B#65

Note I have to wait a number of seconds before the fps rises from 30 to 60. Sometimes I have to set the focus to the display area (by clicking on it), and during my testing I crashed the Chrome tab several times… Also, when switching to another tab and coming back, the fps drops back to 30 and takes some time to rise to 60 again.

So I do think there’s still something going bad with Chrome and the PG… Maybe the big asset is the problem?

Everything is smooth in Firefox, so I don’t think isPickable has anything to do with the problem, after all…

Hmm, this is interesting. I tried the playground in the latest version of Firefox on my system. And I get the same low framerates I do on Chrome. What version of Firefox / system are you running? Mine is Firefox 68.9.0esr (32-bit) / Windows 10 Pro 64 bit. My Chrome version is 83.0.4103.97 (64-bit).

If I look at chrome://gpu/ in my browser I have hardware acceleration enabled, so it’s not that. I wonder what else might cause your Firefox to be able to get 60 fps…

I have made some updates to the playground: started from a clean new playground, made sure it’s using the latest draco decoder, played with passing the attributes parameter to the decodeMeshAsync function, made sure that constantlyUpdateMeshUnderPointer is false for the scene:

https://playground.babylonjs.com/#LR3TB7#2

Unfortunately, none of that made a difference in performance. @Evgeni_Popov, can you try the above playground on your Firefox please?
(The Draco decoder is very spotty on my Firefox: I get various errors from it on both the old playground and this latest one, depending on which URLs I use for the decoder (preview.babylon, cdn.babylon or cdn.jsdelivr).)

UPDATE: Ok, looks like my version of Firefox was fairly outdated; I just installed the latest one, 77.0.1 (64-bit). The good news is that I don’t get any of the Draco decoder errors anymore. The bad news is that it still gives the same low fps as Chrome does. So there must be something about your Firefox…

Firefox version is 77.0.1 (64-bit). Your PG is still ok in Firefox.

In Chrome, I have to wait some seconds (sometimes 15 / 20s) before it goes from 30 fps to 60 fps. Maybe Chrome is doing some background work, like releasing memory used for loading. As the asset is big, it may take some time. Try waiting up to 30 / 40s without doing anything and see if the fps rises.

I waited for about a minute on both Chrome and Firefox and, sadly, the framerate still hangs around 9-10 fps when using cloning (still better than 4-5 fps when using instancing).

Are your browsers using a dedicated high-performance GPU such as NVIDIA? My laptop has NVIDIA Quadro P3200 and also an integrated Intel GPU. Through NVIDIA Control Panel I see that my Preferred graphics processor is set to “Auto-select” and I don’t have access to set it to “High-performance NVIDIA processor” (likely because of my company’s security settings).

In my chrome if I go to chrome://gpu and search for “GL_RENDERER” it shows

ANGLE (Intel(R) UHD Graphics P630 Direct3D11 vs_5_0 ps_5_0)

Pretty sure your system is using high-performance GPU for Chrome and Firefox, that’s why you are seeing way higher fps counts on both. Am I correct?
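
As a side note, the same renderer string can usually be read from script instead of chrome://gpu. The sketch below uses the `WEBGL_debug_renderer_info` WebGL extension; the function name `getGpuRenderer` is made up for illustration, and some browsers restrict or mask this extension, in which case the function returns null.

```javascript
// Returns the unmasked GPU renderer string for a WebGL context,
// or null if the debug extension is not exposed by the browser.
function getGpuRenderer(gl) {
    const ext = gl.getExtension("WEBGL_debug_renderer_info");
    if (!ext) {
        return null; // extension hidden or disabled
    }
    return gl.getParameter(ext.UNMASKED_RENDERER_WEBGL);
}

// Usage in a browser console (assumes a WebGL context can be created):
// const gl = document.createElement("canvas").getContext("webgl");
// console.log(getGpuRenderer(gl));
```

If the string names the integrated Intel GPU, the browser is not running on the dedicated card, whatever the driver control panel is set to.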

Indeed, I’m on a desktop and I have a GTX 1080:

ANGLE (NVIDIA GeForce GTX 1080 Direct3D11 vs_5_0 ps_5_0)

Unfortunately, I don’t have any idea to explain your poorer fps with instancing compared to cloning…

Maybe the driver for your graphics card is emulating instances for some reasons ???

Thank you, guys. I will speak to our IT so that I can choose to use high performance GPU first, and then come back here with the results of new tests of cloning versus instancing. :crossed_fingers:

With things this large, it might be a good idea to do some profiling. Something that is trivial for normal-sized content, maybe the bounding box calculation, might blow up hugely in this context.
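
As a crude first step before opening the browser profiler, per-frame times can be collected and summarized to see whether a few very long frames or a uniformly slow frame rate is the issue. This is only a sketch: `summarizeFrameTimes` is a made-up helper, and feeding it from `scene.onAfterRenderObservable` and `engine.getDeltaTime()` is one possible way to collect the samples, not something taken from the playground.

```javascript
// Summarizes a list of frame durations (in milliseconds).
// Returns null for an empty list.
function summarizeFrameTimes(deltasMs) {
    if (deltasMs.length === 0) {
        return null;
    }
    const total = deltasMs.reduce((sum, d) => sum + d, 0);
    const averageMs = total / deltasMs.length;
    return {
        averageFps: 1000 / averageMs,          // frames per second implied by the mean frame time
        worstFrameMs: Math.max(...deltasMs)    // the single longest frame
    };
}

// Possible usage inside a Babylon.js scene (assumes `scene` and `engine` exist):
// const deltas = [];
// scene.onAfterRenderObservable.add(() => deltas.push(engine.getDeltaTime()));
// ...later: console.log(summarizeFrameTimes(deltas));
```

This does not say where the time goes, so it complements a real profiling run rather than replacing it.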

Was able to test with 2 laptops both on Chrome:

  • ThinkPad T470p: ANGLE (Intel(R) HD Graphics 630 Direct3D11 vs_5_0 ps_5_0), similar to @Anton. The playground remains at 10 FPS after loading.
  • MacBook Pro (15-inch, 2018): AMD Radeon Pro 560X OpenGL Engine. The playground remains at 25 FPS after loading.

A bit unexpected, since the ThinkPad achieves higher FPS (usually 1.5 - 2X higher than MacBook Pro) for all my other projects.

Hey @gbz, so this is interesting too. Your ThinkPad looks to be using integrated Intel GPU and yet still achieves a higher FPS than MacBook Pro with a Radeon. Probably the overall system on ThinkPad is more powerful and makes up for the GPU?

@Anton, I’m not too knowledgeable on machine specs :stuck_out_tongue: though it is strange that the ThinkPad is not performing as well as the MacBook Pro when it comes to instances.

For my other projects on Chrome, the ThinkPad easily exceeds 60 FPS (>100 FPS on frame-rate-limit-disabled browsers), but the MacBook Pro doesn’t surpass 40 FPS.

Ok, I am finally able to use my high-performance GPU in browsers and I see way higher FPS on the old playground, so I decided to bump up the number of meshes a bit :slight_smile:

https://playground.babylonjs.com/#LR3TB7#4

On this playground I am getting the same 20 FPS for both cloning and instancing. Same result in Firefox using high-performance GPU.
So even though the performance is way better with NVIDIA, the results are now the same for instancing versus cloning, instead of the former being faster.

Hi @Anton, @Evgeni_Popov, @sebavan

I got exactly the same situation. I’m creating, cloning, or making instances.

So which is faster in principle? Like 1st, 2nd, and 3rd place in speed in Theory?

I’m now using BabylonJS v5.4, BTW.

Regards
Peter

Instancing should in theory be way faster, as it only requires one draw call, and relying on thin instances would be even better :slight_smile:

Hi @sebavan

And the other ones? How do they look at the speed podium?

Assuming the materials are the same, I guess the podium would be:

ThinInstance>Instance>Cloning>Create
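
For the thin-instance end of that ranking, a minimal sketch is shown below. It builds one 4x4 translation matrix per copy in the flat layout Babylon.js expects (translation at indices 12 to 14) and hands the whole buffer to the mesh with `thinInstanceSetBuffer`. The helper name `buildTranslationMatrices` and the positions are assumptions for illustration, and thin instances need Babylon.js 4.2 or later.

```javascript
// Builds a Float32Array holding one 4x4 translation matrix (16 floats) per position.
// Layout: identity matrix with the translation stored at offsets 12, 13, 14.
function buildTranslationMatrices(positions) {
    const data = new Float32Array(positions.length * 16);
    positions.forEach(([x, y, z], i) => {
        const o = i * 16;
        data[o] = 1;       // diagonal of the identity matrix
        data[o + 5] = 1;
        data[o + 10] = 1;
        data[o + 15] = 1;
        data[o + 12] = x;  // translation
        data[o + 13] = y;
        data[o + 14] = z;
    });
    return data;
}

// Possible usage (assumes `mesh` is a BABYLON.Mesh; positions are arbitrary):
// const matrices = buildTranslationMatrices([[0, 0, 0], [10, 0, 0], [20, 0, 0]]);
// mesh.thinInstanceSetBuffer("matrix", matrices, 16);
```

Unlike `createInstance`, this creates no per-copy mesh objects on the JavaScript side, which is where most of the extra saving is expected to come from.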
