Best Practices for Optimizing Babylon.js Scenes (not just) on Lower-End Devices

Hey!

  1. When creating the engine (use WebGPU if available), ask for a high-performance device. It drains more battery power, but who cares if it’s a game?! :smiley:
new WebGPUEngine(canvas, {
    powerPreference: "high-performance"
})

or

new Engine(canvas, true, {
    powerPreference: "high-performance",
});

WebGPU:
GPU/requestAdapter

WebGL:
WebGLRenderingContext/getContextAttributes

If this fails, fall back to powerPreference: "default" or just omit it.
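
A minimal sketch of that fallback, written without importing Babylon.js so it stays self-contained. `createWithFallback` and the `factory` callback are hypothetical names; in a real project the callback would wrap something like `new Engine(canvas, true, { powerPreference })`. The `"default"` value is a WebGL context attribute; for WebGPU the equivalent fallback is to omit the option.

```typescript
// Values of the WebGL `powerPreference` context attribute.
type PowerPreference = "high-performance" | "default" | "low-power";

// Hypothetical helper: tries each preference in order and returns the first
// engine the factory manages to create. `factory` stands in for something like
// `(p) => new Engine(canvas, true, { powerPreference: p })`.
function createWithFallback<T>(
    factory: (preference: PowerPreference) => T,
    preferences: PowerPreference[] = ["high-performance", "default"],
): T {
    let lastError: unknown = new Error("No power preference was tried");
    for (const preference of preferences) {
        try {
            return factory(preference);
        } catch (error) {
            lastError = error; // remember why this attempt failed, then try the next one
        }
    }
    throw lastError; // every attempt failed: surface the last failure
}
```

The WebGPU path is asynchronous in practice, so an async variant of the same loop would be needed there.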

  1. Use instances or, even better, thin instances wherever possible
  2. WebGPU - Use snapshot rendering
  3. Use mesh.freezeWorldMatrix() on your static meshes
  4. Use mesh.doNotSyncBoundingInfo = true on your static meshes
  5. Freeze your static materials with material.freeze()
  6. Remove faces from your meshes which will never become visible (for example back side of the buildings on a street)
  7. Join the meshes in your 3D software or merge with Mesh.mergeMeshes. Be sure to dispose the old meshes and allow 32 bit indices if needed otherwise the meshes might fail to merge. Check the console for warnings. See post #9 by @Joe_Kerr below.
  8. If you use shadows and your IShadowLight is not moving render the shadows only once:
const shadowGenerator = new ShadowGenerator(4096, shadowLight);
const shadowMap = shadowGenerator.getShadowMap();
if (shadowMap) {
   shadowMap.refreshRate = RenderTargetTexture.REFRESHRATE_RENDER_ONCE;
}
  1. In case of dynamic shadows use a smaller RT texture and set the shadows quality to low: shadowGenerator.filteringQuality = ShadowGenerator.QUALITY_LOW;. Try to reduce the number of shadow casters and receivers as much as possible.
  2. Don’t use the SSAO2 pipeline, bake your AOs
  3. Try to avoid rendering pipelines unless necessary
  4. Aggressively memoize everything you can to free up CPU cycles - What the heck is memoization?
  5. Use scene.autoClear = false when your meshes always cover the whole viewport
  6. Use scene.autoClearDepthAndStencil = false if you don’t use the depth buffer
  7. Use scene.skipPointerMovePicking = true if you don’t need hover interactivity
  8. Stick to the left handed system
  9. Set scene.blockMaterialDirtyMechanism = true if your materials don’t need to be synced
  10. Do not use alpha blended materials
  11. Use LODs. You can pre-create your own or use the LOD features of the framework to create your LOD meshes. For additional info see post #22 by @Joe_Kerr
    You can try out the auto-LOD feature as well.
  12. Use low poly meshes ( quite obvious isn’t it? :slight_smile: ) or you can try mesh.simplify
  13. …smaller textures - 4k → 2k → 1k → smaller - make them smaller until they start to look shi*tty
  14. Don’t overuse reflections
  15. Complex GUIs could be faster in plain HTML overlaid on the render canvas. Stick to one HTML layer. If you use React, you can end up with tens of layers or even more, each of which needs to be individually composited by the browser’s rendering engine, hurting performance especially on mobile devices
  16. Limit camera.maxZ
  17. Use compressed textures (KTX2 / Basis / ASTC) - you can use this online tool called GLB Batch Optimizer by @labris
  18. Load your assets asynchronously for better startup times (LoadAssetContainerAsync or your proprietary solution). See Asset Manager
  19. Remove all console.logs in production
  20. Use a bundler with treeshaking to reduce bundle sizes and thus loading times (don’t include the inspector in your prod build)
  21. Try to play with engine.setHardwareScalingLevel() - explanation here
  22. Bake your animations
  23. Use SolidParticleSystem aka SPS - see post #14 by @Pryme8
  24. If you use particles prefer GPU particles
  25. Use Imposters - see post #22 by @Joe_Kerr
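
The three "static" items in the list (freezeWorldMatrix, doNotSyncBoundingInfo, material.freeze) can be applied in one pass. This is only a sketch: `freezeStaticMeshes` is a hypothetical helper, and the two interfaces are minimal stand-ins that list just the Babylon.js members used, so the block runs without the library.

```typescript
// Minimal structural types covering only the Babylon.js members used below,
// so a real Mesh / Material can be passed in without importing the library here.
interface FreezableMaterial {
    freeze(): void;
}
interface StaticMeshLike {
    material: FreezableMaterial | null;
    doNotSyncBoundingInfo: boolean;
    freezeWorldMatrix(): void;
}

// Hypothetical helper: applies the "static mesh" items from the list to every
// mesh the caller considers static. Returns how many distinct materials were frozen.
function freezeStaticMeshes(meshes: StaticMeshLike[]): number {
    const frozen: FreezableMaterial[] = [];
    for (const mesh of meshes) {
        mesh.freezeWorldMatrix();          // stop re-evaluating the world matrix
        mesh.doNotSyncBoundingInfo = true; // bounding info no longer kept in sync
        const material = mesh.material;
        if (material !== null && frozen.indexOf(material) === -1) {
            material.freeze();             // a shared material is frozen only once
            frozen.push(material);
        }
    }
    return frozen.length;
}
```

Which meshes count as static is up to the caller; anything that moves or changes material properties later should be left out.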

This is a list of the optimization approaches I’ve thought of so far. I’ll update it as new ideas come to mind.

Try them out one by one and see what works, not every optimization fits every situation.
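
On the memoization item: a minimal sketch of the idea, not tied to Babylon.js. `memoize` and `parseHexColor` are hypothetical names used only for illustration; the point is that a pure function called repeatedly with the same input only does its work once.

```typescript
// Generic single-argument memoizer: caches results by a string key so the
// wrapped function runs at most once per distinct key.
function memoize<A, R>(fn: (arg: A) => R, keyOf: (arg: A) => string): (arg: A) => R {
    const cache = new Map<string, R>();
    return (arg: A): R => {
        const key = keyOf(arg);
        if (cache.has(key)) {
            return cache.get(key) as R; // cache hit: skip the computation
        }
        const value = fn(arg);
        cache.set(key, value);
        return value;
    };
}

// Example of a pure function worth caching: parsing "#rrggbb" into RGB bytes.
function parseHexColor(hex: string): [number, number, number] {
    const value = parseInt(hex.slice(1), 16);
    return [(value >> 16) & 255, (value >> 8) & 255, value & 255];
}

// Repeated calls with the same string return the same cached array,
// so callers should treat the result as read-only.
const cachedParseHexColor = memoize(parseHexColor, (hex) => hex);
```

Memoization trades memory for CPU time, so it fits best where the set of distinct inputs is small and the function has no side effects.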

Happy coding!

Additional link: Optimizing Your Scene | Babylon.js Documentation

Great list @roland! Can you expand on “Stick to the left handed system”?

I was considering using the right handed system as my last project had to use it, and in my current project I’d prefer not to have imported GLB meshes parented to a __root__ transform. But if there’s a significant performance impact, I might need to reconsider …

There is no perf difference; it’s more of a learning curve, as all of our docs / examples are LH.

For simplicity’s sake we recommend LH, but from a perf standpoint: no difference

Thanks for clarifying @Deltakosh

Right-handed mode introduces extra matrix adjustments under the hood in some cases, which can add a bit of overhead. If you’re curious, you can check the Babylon.js source code around useRightHandedSystem and see for yourself.

On modern, fast devices the performance impact is really minimal, but if every CPU cycle counts (especially on mobile), it’s safer to stick with the default LHS.

Also, as @Deltakosh pointed out, left-handed is the native coordinate system in Babylon.js, so it’s generally more intuitive and consistent to work with.

Happy to see an updated comprehensive list of optimizations to consider. I have a few followup questions that may or may not help with this list.

Where does SPS come into the equation here? I prefer them to thin instances when I can use them.

Is the BabylonjsGUI not an exception to this rule? The docs mention: "Possibly some performance considerations" This may be the place for elaboration. I’ve been getting great performance on low end mobile with my GUI and it’s fairly dense/animated.

I wish I could find the thread, but I remember in a discussion it was stated when meshes are not merged in their original 3D software they will still take multiple draw calls if merged in BabylonJS. Any elaboration here?

Addendum:
Any notes about armatures? (minimizing bone count, pre baked animations)
Any notes about 2D particles? (pooling maybe?)

SPS is an amazing and really powerful tool in the hands of a Babylon.js developer. I hadn’t used it in a while, so I completely forgot about it. I was pretty annoyed :zany_face: when I discovered SPS after I had already spent a noticeable amount of time writing my own simpler version for this demo: BabylonJS Dude Particle Shooter Demo

I’ll definitely add it to the list. Thanks for reminding me!

As for choosing between SPS and thin instances: they’re quite different.
SPS creates one big mesh by combining the “particles.” Thin instances, on the other hand, use a single prefab geometry and multiple transform matrices.

SPS offers a lot of helper functions to manage particles. With thin instances, you have to manage everything manually. It’s more low-level, but with no overhead and honestly, I kinda like that.
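
To make the "single prefab geometry and multiple transform matrices" part concrete, here is a small sketch of the manual bookkeeping involved. `buildTranslationMatrices` is a hypothetical helper; it only builds the flat buffer, and the Babylon.js call that would consume it is shown as a comment because the library is not imported here.

```typescript
// Builds the flat matrix buffer thin instances work with: 16 floats per instance.
// Each entry is an identity matrix plus a translation, which Babylon.js
// stores in elements 12, 13 and 14 of the 4x4 matrix.
function buildTranslationMatrices(positions: Array<[number, number, number]>): Float32Array {
    const buffer = new Float32Array(positions.length * 16); // zero-filled
    positions.forEach(([x, y, z], i) => {
        const o = i * 16;
        buffer[o] = 1;      // diagonal of the identity matrix
        buffer[o + 5] = 1;
        buffer[o + 10] = 1;
        buffer[o + 15] = 1;
        buffer[o + 12] = x; // translation
        buffer[o + 13] = y;
        buffer[o + 14] = z;
    });
    return buffer;
}

// Three instances of the same geometry, spaced along the x axis.
const instanceMatrices = buildTranslationMatrices([[0, 0, 0], [2, 0, 0], [4, 0, 0]]);

// With Babylon.js loaded, the buffer would be handed to the source mesh roughly like this:
//   mesh.thinInstanceSetBuffer("matrix", instanceMatrices, 16);
```

Rotation and scaling would need full matrices instead of the identity-plus-translation shortcut used here.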

To summarize from my point of view: SPS is more convenient. Thin instances require more effort to manage but are faster on the CPU side. For static meshes thin instances are the best option in my opinion. I’m still using my Thinnizator for converting meshes to thin instances in my projects. I’m currently about to release a new Thinnizator Playground and I will also open source the newest Thinnizator class which takes care of “thinnizing” the whole scene.

I’m not sure which is faster on the GPU side though.

If anyone has insights on that, feel free to share your two cents!

I used to use the Babylon.js 2D GUI for rendering a lot of highly decorated badge elements linked to meshes. But I ended up switching to plain <div>s because they performed better, especially when the engine adapts to the device pixel ratio (DPR). That might not always be the case, depending on the use case, but if you’re experiencing low FPS due to lots of GUI controls, trying the <div> approach is definitely worth it.

I’m surprised that Mesh.mergeMeshes still produces one draw call per mesh — Playground. If I recall correctly, I used this function in the past and it worked properly, just one draw call at the end.

EDIT: This is not the case. See post #16.

As a workaround, you can use the mergeMeshes function from the docs to achieve single draw call rendering — Playground. It seems a bit buggy. I’m not the author, but I might look into it later, unless someone else with more free time wants to take care of it. :grinning_face_with_smiling_eyes:

EDIT: working custom merge function. See post #16.

Buggy custom mergeMeshes function result:

TBH I don’t have much experience with optimizing armatures but the docs should be useful on this topic.
2D particles - don’t have a clue but I think GPU particles perform pretty fast already

@cyborg_ean Thank you!

Thanks for this write-up, I was literally just going through my project to optimise performance so it came at the right time :star_struck:

Maybe they were referring to multi-materials? If you merge a mesh with more than 1 material, it will still be mesh * numMaterial = draw calls. Merging in the 3D software would then also entail baking materials into 1 texture per channel (color, normal, metal, etc.) so that you can reduce the material count to 1.
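
If baking everything down to one material is not an option, one possible approach is to group meshes by material first and merge each group on its own, so every merged mesh ends up with a single material. A sketch under that assumption; `groupByMaterial` and `HasMaterial` are hypothetical names, and the Babylon.js merge call is only shown as a comment.

```typescript
// Only the member needed for grouping; a Babylon.js Mesh has a compatible `material`.
interface HasMaterial {
    material: object | null;
}

// Groups meshes by material identity (same object reference = same group).
// Meshes without a material end up together in their own group.
function groupByMaterial<T extends HasMaterial>(meshes: T[]): T[][] {
    const byMaterial = new Map<object | null, T[]>();
    for (const mesh of meshes) {
        const group = byMaterial.get(mesh.material);
        if (group === undefined) {
            byMaterial.set(mesh.material, [mesh]);
        } else {
            group.push(mesh);
        }
    }
    const groups: T[][] = [];
    byMaterial.forEach((group) => {
        groups.push(group);
    });
    return groups;
}

// With Babylon.js loaded, each group could then be merged separately, roughly:
//   for (const group of groupByMaterial(staticMeshes)) {
//       Mesh.MergeMeshes(group, true, true); // dispose sources, allow 32 bit indices
//   }
```

The result is one merged mesh per material instead of one mesh overall.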

I am on a low end laptop. I am using the default Mixamo skeleton with like ~70 bones. FPS drops fast the more characters are on screen!

If your game is on a high enough zoom level, I would definitely use the smaller Mixamo skeleton (I think there is one with 30 bones). You know, what is the point of hands having 30 bones, if they occupy 1 pixel of screen space.

On the other hand: Say, you opt for the no-finger-bones skeleton. You accumulate like 100 animations, some of them edited. Hours of work. And then you get the idea: I want a first person aim preview. So suddenly you are all zoomed in on your characters and you see that the axe guy is holding his weapon with an open flat hand… (is it glued or what :squinting_face_with_tongue:)

Oh cool. Can we have that in Babylon something like BABYLON.Mesh.MergeMeshes → BABYLON.Mesh.ThinizateMeshes?


Come to think of it. About merged meshes again. Are “loose” meshes a performance problem? I mean two separate cubes cost 2 draw calls. Now I merge them and suddenly there is only 1 draw call. But on the geometry level it is still two separate or “loose” meshes, right? So what does the GPU actually see/do here? Are we maybe just saving the overhead costs of the draw calls? Or can the GPU do some things better or worse with many individual loose meshes versus one merged loose-meshes-mesh?

To put this to another extreme, I have seen models where each individual triangle was a loose mesh. I usually merge vertices by distance, then, to get the poly count down to bearable levels. But what if this is for performance?

Unfortunately, this also happens when you’re not using multimaterials.

Cool idea! @deltakosh ?

You can join your game scene’s static meshes, for example. It saves CPU time on evaluating all the loose meshes. You can also freeze the mesh, the materials… etc. IMHO it’s more practical to join the meshes in the modelling SW. However, there are some cases when you build the meshes dynamically, so mergeMeshes could be quite useful.

From a GPU perspective, meshes that are merged and using the same material become one set of attributes, uniforms, and matrices. Merged meshes that share vertices, if optimized, could share the vertex and instead repeat index (saving 12 or 16 bytes per shared vertex – I don’t recall if vertex buffer is required to be 3 floats or 4 floats per vertex, and it might be different on WebGL2 vs. WebGPU).
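
The "share the vertex and repeat the index" idea can be sketched as a small welding pass. This assumes positions stored as 3 floats per vertex and only welds exact position matches; `weldByPosition` and `IndexedGeometry` are hypothetical names, not Babylon.js API.

```typescript
interface IndexedGeometry {
    positions: Float32Array; // 3 floats per unique vertex
    indices: Uint32Array;    // one entry per original vertex
}

// Welds vertices with identical positions: each unique position is stored once
// and the triangles refer to it through the index buffer.
function weldByPosition(triangleSoup: Float32Array): IndexedGeometry {
    const vertexCount = triangleSoup.length / 3;
    const lookup = new Map<string, number>();
    const unique: number[] = [];
    const indices = new Uint32Array(vertexCount);
    for (let v = 0; v < vertexCount; v++) {
        const x = triangleSoup[v * 3];
        const y = triangleSoup[v * 3 + 1];
        const z = triangleSoup[v * 3 + 2];
        const key = x + "," + y + "," + z;
        let index = lookup.get(key);
        if (index === undefined) {
            index = unique.length / 3; // first time this position is seen
            unique.push(x, y, z);
            lookup.set(key, index);
        }
        indices[v] = index; // shared vertices just repeat the index
    }
    return { positions: new Float32Array(unique), indices };
}

// A quad given as two separate triangles: 6 vertices in, 4 unique vertices out.
const quadSoup = new Float32Array([
    0, 0, 0,  1, 0, 0,  1, 1, 0,
    0, 0, 0,  1, 1, 0,  0, 1, 0,
]);
const weldedQuad = weldByPosition(quadSoup);
```

Real meshes usually carry normals and UVs too, and vertices can only be shared when all of their attributes match, so the savings depend on the model.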

It would be interesting to classify Babylon visual objects in terms of their use of resources.

Here’s my take on the relevant resources. Careful use of ArrayBuffers can reduce CPU memory footprint (and CPU time) by avoiding a (kind of hidden) copy. Careful updates can avoid copying entire buffers when only a small portion changes per frame. Processing and transfers add up quickly when they need to occur per frame.

  • CPU memory
  • CPU processing
  • GPU (Buffer) memory
  • GPU processing
  • CPU-to-GPU transfer (initial / update)
  • GPU-to-CPU transfer (initial / update)
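
A rough illustration of the two points about ArrayBuffers and partial updates. The names here (`viewOf`, `DirtyRange`, `FLOATS_PER_OBJECT`) are hypothetical; the sketch only shows the CPU side and assumes 16 floats per object.

```typescript
const FLOATS_PER_OBJECT = 16;

// Returns a view onto one object's floats. `subarray` shares the underlying
// ArrayBuffer, so no bytes are copied and writes go straight into `buffer`.
function viewOf(buffer: Float32Array, objectIndex: number): Float32Array {
    const start = objectIndex * FLOATS_PER_OBJECT;
    return buffer.subarray(start, start + FLOATS_PER_OBJECT);
}

// Tracks the smallest float range touched this frame, so only that range
// needs to be transferred instead of the whole buffer.
class DirtyRange {
    start = Number.POSITIVE_INFINITY;
    end = 0;
    mark(objectIndex: number): void {
        const first = objectIndex * FLOATS_PER_OBJECT;
        this.start = Math.min(this.start, first);
        this.end = Math.max(this.end, first + FLOATS_PER_OBJECT);
    }
    get length(): number {
        return Math.max(0, this.end - this.start); // 0 when nothing was marked
    }
    reset(): void {
        this.start = Number.POSITIVE_INFINITY;
        this.end = 0;
    }
}

const objectData = new Float32Array(1000 * FLOATS_PER_OBJECT);
const dirty = new DirtyRange();
viewOf(objectData, 10)[12] = 5; // edits object 10 in place, no copy
dirty.mark(10);
viewOf(objectData, 12)[12] = 7;
dirty.mark(12);

// By contrast, `slice` allocates a new buffer and copies the range into it.
const copiedRange = objectData.slice(dirty.start, dirty.end);
```

The upload call (for example WebGL2's bufferSubData) can then be limited to the dirty range, which is the "careful update" part.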

Sorry I can’t because back compat :smiley:

The SPS is one of the most powerful things we have. You can do sooo much with it: smart particles, custom shaders per mesh instance, metadata per particle and much more.

It was the saving grace for several projects and made things like Saint Jude’s sequin forest a possibility.

This is really old, so some of the functionality seems to be gone, as the backend might not be live anymore.

Very nice!

@cyborg_ean
This is not the case. Mesh.mergeMeshes works correctly. You just have to dispose the old meshes :smiley: and be sure to set allow32BitsIndices if the merged mesh has more than 65535 vertices.

Working simplified and optimized mergeMeshes function using preallocated typed arrays:
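
The linked function is not reproduced in this thread. As a rough illustration of the general idea only (not the poster's actual code), merging geometry into preallocated typed arrays could look like the sketch below. `mergeGeometries` and `Geometry` are hypothetical names, only positions and indices are handled, and world transforms are assumed to be already baked into the positions.

```typescript
interface Geometry {
    positions: Float32Array;            // 3 floats per vertex
    indices: Uint16Array | Uint32Array; // triangle list
}

// Merges several geometries into one by sizing the output arrays up front
// and copying each part in at its offset, instead of growing arrays piecemeal.
function mergeGeometries(parts: Geometry[]): Geometry {
    let totalFloats = 0;
    let totalIndices = 0;
    for (const part of parts) {
        totalFloats += part.positions.length;
        totalIndices += part.indices.length;
    }
    const totalVertices = totalFloats / 3;
    const positions = new Float32Array(totalFloats);
    // 16 bit indices can only address vertices 0..65535, so switch to 32 bit beyond that.
    const indices = totalVertices > 65536
        ? new Uint32Array(totalIndices)
        : new Uint16Array(totalIndices);

    let floatOffset = 0;
    let indexOffset = 0;
    for (const part of parts) {
        positions.set(part.positions, floatOffset);
        const vertexOffset = floatOffset / 3; // indices must point past earlier parts
        for (let i = 0; i < part.indices.length; i++) {
            indices[indexOffset + i] = part.indices[i] + vertexOffset;
        }
        floatOffset += part.positions.length;
        indexOffset += part.indices.length;
    }
    return { positions, indices };
}
```

Normals, UVs and other vertex attributes would follow the same copy-at-offset pattern.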

We can’t add a new function? :zany_face:

I had to try at least :smiley:

I’m really confused now :smiley:

That was my polite way to say NO WAY :smiley:
