How does the rendering process of Babylon work under the hood?

Hello Forum :slight_smile:

Is there some piece of documentation about the inner workings of the Babylon rendering pipeline? I don’t necessarily mean the “Default Rendering Pipeline” (that has some documentation) but rather the default behavior of Babylon when rendering a Scene. As an example, there is a nice book where real-time render pipelines are explained like this:

I could not find a similar description of the functionality of Babylon. For research purposes, I’d love to know more about the individual steps performed in the pixel processing stage of Babylon (if it exists in such a form :smiley:).

  • How is the Base Color calculated?
  • How is the effect of Shadows calculated?
  • How is the effect of AO calculated?

Does a visual (or textual :confused:) representation of those questions exist?

Thank you once again for all the Help!

Hi @Sarrus4x4 - that’s a good and in depth question :slightly_smiling_face::sunglasses:

There are a number of different ways and places this has been, and can be, documented. Since BJS is a wrapper over WebGL/GPU (shhh! Native is another discussion!), the WebGL specification would naturally have insights into the pixel processing stage, as that mostly happens on the GPU in the form of pixel shaders. You could also go straight to the source (as it were) and browse the BJS source repos to see for yourself how the rendering pipeline works internally. The code is fairly well documented with comments, but that could be too “low-level” for what you want.

In my book, I’ve got a couple of chapters devoted to exploring topics like how lighting calculations work, what goes on inside materials, and how rendering pipelines work with different types of shaders.

Since you’re asking for very specific info on the BJS rendering pipeline, I’m unsure if my book would serve to answer your questions. It could be a start though!

Ed: updated BJS repos link to point at engine src


Thank you for the fast answer!

So I guess the WebGL documentation on pixel processing and pixel shaders will answer my question :slight_smile:

There is a lot of visually pleasing graphics on how this all works :heart_eyes:

As I take it from the documentation, the answer to all of my questions…

…is the Fragment shader, aka. the Pixel Shader.

ChatGPT, your buddy in all questions:

How does the rendering process of Babylon work under the hood?

Babylon.js is a powerful 3D game engine and WebGL library that enables developers to create immersive, interactive 3D experiences on the web. The rendering process in Babylon.js involves several steps, including:

  1. Scene setup: The first step is to create a 3D scene and add objects to it. This involves defining the position, rotation, and scale of each object, as well as its material and texture.
  2. Camera setup: Next, a camera is set up to view the scene. The camera’s position and orientation determine what parts of the scene will be visible on the screen.
  3. Rendering pipeline: The Babylon.js rendering pipeline involves several steps that are executed for each frame:
     a. Culling: Objects that are outside the camera’s view frustum are culled, or removed from the rendering pipeline, to save processing time.
     b. Depth sorting: Objects that are closer to the camera are rendered first, so that objects farther away do not appear in front of them.
     c. Shader programs: Babylon.js uses WebGL to render 3D graphics on the web. WebGL relies on shaders, which are small programs that run on the GPU to calculate the appearance of each object. Babylon.js comes with a set of built-in shaders, but developers can also write their own custom shaders.
     d. Lighting: Babylon.js supports several types of lighting, including point lights, directional lights, and spotlights. Lighting is calculated for each object in the scene based on its position, orientation, and material properties.
     e. Texturing: Textures are applied to objects to give them a more realistic appearance. Textures can be loaded from image files or generated dynamically using procedural techniques.
     f. Post-processing: After the scene is rendered, post-processing effects can be applied to the final image. These effects include things like bloom, motion blur, and depth of field.
  4. Display: Finally, the rendered image is displayed on the screen. This involves transferring the image data from the GPU to the display buffer, which is then presented on the screen. The process is repeated for each frame, typically at a rate of 60 frames per second or higher, to create the illusion of motion.
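The per-frame flow described above can be sketched in plain JavaScript. This is only an illustration of the ideas, not Babylon.js source; all names (`canSee`, `sortKey`, `postProcesses`, etc.) are hypothetical stand-ins:

```javascript
// Hedged sketch of the per-frame flow above, with plain objects standing in
// for engine types. Not actual Babylon.js API.
function renderFrame(scene, camera) {
  // a. Culling: drop meshes outside the camera's view frustum.
  const visible = scene.meshes.filter((m) => camera.canSee(m));
  // b. Sorting into the desired render order (note: the exact rules are
  // corrected later in this thread).
  visible.sort((a, b) => a.sortKey - b.sortKey);
  // c.-e. For each draw call the GPU runs vertex/fragment shaders that apply
  // lighting and texturing; here we just record the resulting draw order.
  const drawOrder = visible.map((m) => m.name);
  // f. Post-processing: each effect transforms the previous output image.
  return scene.postProcesses.reduce((img, pp) => pp(img), { draws: drawOrder });
}
```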

How is the Base Color calculated?

The Base Color of a material in 3D graphics represents the diffuse color of the object, or the color that is perceived when light reflects off the surface of the material. In Babylon.js, the Base Color is calculated based on several factors, including the material’s texture and its physical properties.

If the material has a texture, the Base Color is typically determined by sampling the texture at the pixel coordinates corresponding to the surface of the object. This means that the color of each pixel on the object’s surface is determined by the corresponding pixel in the texture. This allows for a wide range of visual effects, such as adding patterns, images, or even animations to the material.

If the material does not have a texture, the Base Color is typically defined by the material’s physical properties, such as its reflectance, absorbance, and transmittance. These properties describe how the material interacts with light, and can be used to determine the color of the material based on the lighting conditions and the viewer’s perspective.

In some cases, the Base Color may be calculated using a combination of both texture and physical properties. For example, a material may have a texture that defines a pattern or image, but the physical properties of the material may affect the way that light interacts with the texture, resulting in a different perceived color.

Overall, the Base Color calculation in Babylon.js depends on the specific material properties and textures used, as well as the lighting and viewing conditions in the 3D scene.
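The texture-or-flat-color decision described above can be expressed as a tiny sketch. This runs in the fragment shader in practice; it is written in JavaScript here only for readability, and the property names are merely illustrative:

```javascript
// Illustrative sketch (not shader source): pick the base/diffuse color for a
// fragment. With a texture, sample it at the fragment's UV coordinates;
// otherwise fall back to the material's flat diffuse color.
function baseColor(material, uv) {
  if (material.diffuseTexture) {
    return material.diffuseTexture.sample(uv);
  }
  return material.diffuseColor;
}
```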

How is the effect of Shadows calculated?

Shadows in Babylon.js are calculated using a technique called shadow mapping, which involves rendering the scene from the perspective of a light source to create a depth map, and then using that depth map to determine which parts of the scene should be in shadow.

The shadow mapping process involves several steps:

  1. Render the depth map: First, the scene is rendered from the perspective of the light source, creating a depth map that stores the distance between each pixel and the light source.
  2. Project the depth map: The depth map is then projected onto the scene from the viewer’s perspective, using the depth values to determine which parts of the scene are in shadow.
  3. Apply the shadow effect: The shadow effect is applied to the objects that are in shadow, using a technique called shadow masking. This involves comparing the depth of each pixel on the object to the depth values in the depth map, and if the pixel is farther away from the light source than the corresponding depth value, it is considered to be in shadow.
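The depth comparison in step 3 can be sketched like this (plain JavaScript with hypothetical names; in a real engine this comparison runs in the fragment shader):

```javascript
// Hedged sketch of the shadow-map depth test. `shadowMap[v][u]` holds the
// depth stored when the scene was rendered from the light's point of view.
function isInShadow(fragmentDepthFromLight, shadowMap, u, v, bias = 0.005) {
  const storedDepth = shadowMap[v][u];
  // If the fragment is farther from the light than the stored depth, something
  // occludes it, so it is in shadow. The small bias avoids "shadow acne".
  return fragmentDepthFromLight - bias > storedDepth;
}
```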

The shadow mapping technique in Babylon.js is highly configurable, allowing developers to control various parameters such as the resolution of the depth map, the type of shadow filtering used, and the range of the shadow effect. Shadow mapping can be used with various types of lights, including point lights, directional lights, and spotlights.

In addition to shadow mapping, Babylon.js also supports other shadowing techniques, such as cascaded shadow maps, which are used to improve the quality and performance of shadow rendering for large scenes.

How is the effect of AO calculated?

Ambient Occlusion (AO) is a rendering technique that approximates the way that light is occluded or blocked by nearby objects, resulting in darker areas in the scene where objects are close together. In Babylon.js, the effect of AO is calculated using a technique called screen-space ambient occlusion (SSAO).

The SSAO algorithm in Babylon.js involves several steps:

  1. Render a depth map: First, a depth map is rendered from the viewer’s perspective, which stores the distance from the camera to each pixel in the scene.
  2. Generate a random sample pattern: A set of random samples are generated in a hemisphere around each pixel, representing the directions in which ambient light is sampled.
  3. Sample the depth map: For each sample, the algorithm samples the depth map at a distance proportional to the sample’s distance from the pixel, and compares it to the depth of the current pixel. If the sample is occluded, the algorithm records this information.
  4. Calculate the occlusion factor: The occlusion information is used to calculate an occlusion factor for the pixel, which represents the amount of ambient light that is occluded or blocked by nearby objects.
  5. Apply the AO effect: The occlusion factor is used to darken the pixel’s color, resulting in the AO effect.
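Steps 3–5 boil down to counting occluded samples and darkening the pixel accordingly. A minimal sketch (not the actual SSAO2 shader, just the core idea):

```javascript
// Illustrative sketch of steps 3-5. A sample is considered occluded when the
// geometry at its position is closer to the camera than the pixel itself.
function ssaoFactor(pixelDepth, sampleDepths) {
  const occluded = sampleDepths.filter((d) => d < pixelDepth).length;
  // Occlusion factor in [0, 1]: 1 = fully lit, 0 = fully occluded.
  return 1 - occluded / sampleDepths.length;
}

// Step 5: apply the factor by darkening the pixel's color channels.
function applyAO(color, factor) {
  return color.map((c) => c * factor);
}
```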

The SSAO algorithm can be optimized by using various techniques such as reducing the number of samples, or using lower resolution depth maps. Additionally, different SSAO algorithms can be used depending on the desired visual effect and performance requirements, such as horizon-based ambient occlusion (HBAO) or multi-scale voxel-based ambient occlusion (MSSVO).

Overall, the SSAO technique in Babylon.js provides a fast and efficient way to simulate the effect of ambient occlusion in 3D scenes, resulting in more realistic and visually appealing images.


Not bad!

This part is wrong, though:

b. Depth sorting: Objects that are closer to the camera are rendered first, so that objects farther away do not appear in front of them

Opaque meshes are sorted by their material id (and rendered in that order), while transparent meshes are sorted by alphaIndex + their distance from the camera (the farthest meshes are rendered first).
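In pseudocode, that sorting rule looks roughly like this (field names are illustrative, not the exact engine internals):

```javascript
// Hedged sketch of the sorting rules described above.
function sortForRendering(opaque, transparent) {
  // Opaque meshes: sort by material id (minimizes state changes).
  const opaqueSorted = opaque.slice().sort((a, b) => a.materialId - b.materialId);
  // Transparent meshes: sort by alphaIndex first, then back-to-front by
  // distance to the camera (farthest rendered first, so blending is correct).
  const transparentSorted = transparent.slice().sort((a, b) =>
    a.alphaIndex !== b.alphaIndex
      ? a.alphaIndex - b.alphaIndex
      : b.distanceToCamera - a.distanceToCamera
  );
  // Opaque meshes are drawn before transparent ones.
  return [...opaqueSorted, ...transparentSorted];
}
```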


Wow! :smiley:
Thank you for pointing that possibility out to me :slight_smile:


Worth pointing out too that depth sorting is one of the most expensive operations in the pipeline, being that it is CPU-bound and blocking


Do you know why this is done CPU-side?

Good question. There are a few reasons, but I’ll try to summarize as best I can.

Sending data from the CPU to the GPU is cheaper than the converse, but it’s not free. Culling meshes in the CPU-bound Scene is better than allowing those meshes to be flagged and sent for rendering. But in order to cull, you need to have a depth order, so you need to sort the meshes into the desired rendering order. As well, the GPU doesn’t retain state in shaders, so depth information needs to be calculated or provided in some fashion at at least one point in the process.

All of these are ways to prevent over-rendering, where a given pixel is set to values that will never actually reach the screen, in addition to other ancillary but still important functions.



Thank you for the context! However, I still don’t understand why you can’t store all of the scene’s geometry in uniforms that the shaders would use to compute the z-order sorting per frame

Sorting on the GPU and retrieving the result is probably slower than sorting on the CPU (not to mention that reading from the GPU is asynchronous). To improve performance, we should sort on the GPU and make the draw calls directly from the GPU, without going through the CPU. This requires compute shaders and some structural changes in the engine. This might be possible in WebGPU, but I think it is not easy.


[why not use uniforms…]

How do you think the mesh info gets to the GPU in the first place?

Joking aside, collision detection generally is done on the CPU, which is another reason to have meshes on the CPU. Having sorted meshes in depth order speeds up collision detection by a lot!

I realize this may not be the place for discussion on the babylon.js engine architecture, but I’ve been thinking about this for a bit:

Modern gpu merge sorts are insanely fast.

Evgeni_Popov : To improve performance, we should sort on the GPU and make the draw calls directly from the GPU, without going through the CPU

This would give some great performance boosts but also enable other types of performance optimizations that don’t work well right now, such as densely placed transparent meshes like those used in a forest of billboarded trees:
Spruce Forest / Fichtenwald - Buy Royalty Free 3D model by VIS-All-3D (@VIS-All)

Or in this case that I’m currently dealing with; a tree model that uses transparent image textures for the foliage:

Whenever I’m rendering a scene that uses even 10 of these trees on my macbook, I see a significant fps drop.

Joking aside, collision detection generally is done on the CPU, which is another reason to have meshes on the CPU. Having sorted meshes in depth order speeds up collision detection by a lot!

I personally think Babylon.js should set its primary focus on the rendering side of things. Separate processes or libraries can be used for collision detection. For me right now, raycasting and collision detection in general are better handled by Ammo.js and execute many times faster. I see no reason why the functionality provided by Ammo.js couldn’t be generalized to some smaller wasm modules that just handle collisions.


As I see, you are still talking about the computation :smiley:

May I ask a follow-up question to my initial one? Since Babylon has a rendering pipeline like WebGL’s, does this mean that the ‘defaultRenderingPipeline’ feature is just a fancy grouping of post-processes and therefore also just a post-process? This would mean that all effects of the defaultRenderingPipeline (e.g. Bloom) are computed AFTER the initial rendering of the image and added to it afterward.

Did I get that right? :confused:

Thank you for all your help, you are awesome :heart:

Please excuse my spamming

I think I found my answer (the answer is yes, all those effects are post-processes and get computed and added for every frame AFTER it was initially created by the actual rendering pipeline). BUT those effects are added via shaders (so shaders are used in the actual pipeline during the pixel calculation AND during the post-processing?)

But now I saw that AO in Babylon is also part of the chained post-processing, which means that AO is also calculated and added for every frame after it was initially produced by the actual render pipeline.

Did I get that right?

Yes, if you use SSAO / SSAO2. It is generated dynamically at each frame by one (or several) post-processes.

It should not be confused with the AO texture that you can add to your PBR material and which is static in essence.


I don’t disagree that the GPU is a better place for many types of operations like the ones we’ve been discussing, but there’s almost always going to be the need to perform mesh intersection/collision checks and other similar tasks in-engine on the CPU-side. Sometimes you don’t want or need a full physics engine and your core game or business logic just needs to know when two meshes intersect. Small wasm modules aren’t a bad way to approach it, but by introducing and requiring an asynchronous interface for mesh collisions it seems to me like it goes against the idea of simplicity.


the ‘defaultRenderingPipeline’ feature is just a fancy grouping of post-processes and therefore also just a post-process? This would mean that all effects of the defaultRenderingPipeline (e.g. Bloom) are computed AFTER the initial rendering of the image and added to it afterward.

I know you answered this in the following posts, but maybe I can add a bit more to help –

A shader is just a small program that runs on the GPU and that performs a very specific task, whether that’s figuring out what triangles should be rendered and how, or computing the color of a single pixel. It doesn’t know where its input data comes from, nor does it know how its output value will be used.

This allows chains of post-processes that take as their input the output from the previous process(es), in effect building up the final render layer by layer. Some processes require multiple passes (e.g., @Evgeni_Popov mentioned SSAO as one of them), which means the process feeds itself its own output one or more times before producing the final result. Techniques like ray-marching will use “ping-pong buffers” that are built up over many frames of rendering, and which function fundamentally the same as the aforementioned SSAO or default post-processing pipeline does.
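The chaining idea can be boiled down to a few lines: each post-process is just a function from image to image, fed the previous stage’s output. A minimal sketch (the effect names are made up for illustration):

```javascript
// Minimal sketch of a post-process chain: fold the list of effects over the
// initial render, each stage consuming the previous stage's output.
const chain = (postProcesses) => (initialRender) =>
  postProcesses.reduce((img, pp) => pp(img), initialRender);

// "bloom" and "tonemap" stand in for real effects; here they just tag a string
// so the application order is visible.
const bloom = (img) => img + "+bloom";
const tonemap = (img) => img + "+tonemap";
const pipeline = chain([bloom, tonemap]);
// pipeline("scene") applies bloom first, then tonemap to bloom's output.
```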


Fwiw, I think this question can be generalized to how drawing works in general, and what limitations Babylon / WebGL / WebGPU have. Post-processing is done on the output image, which may actually be a composite of several images. See Deferred shading - Wikipedia for a good visual. I don’t know exactly, but I think PBR rendering has way more channels and complications for transparency. There is a good native example on the nvpro GitHub if you’re actually interested.

When considering how to draw, probably the simplest approach is to submit a job on the CPU, render on the GPU, read back on the CPU, and resubmit until some condition is met. But resubmitting from the CPU is slow (this is why draw calls are the limiting factor in CPU-bound engines).

However, there has been “indirect” drawing for over a decade in OpenGL and of course in Vulkan. The native graphics APIs with “Indirect” in the name just mean that the input data comes from the GPU. WebGL2 has something like this called transform feedback, but with limited capabilities: WebGLTransformFeedback - Web APIs | MDN

WebGPU has the foundations of draw indirect and multi-draw indirect, but I think it’s currently limited to 1 iteration, which is hot garbage. We do have render bundles, which kind of help by saving the output of previous computes whose input hasn’t changed, but it’s not really the same. If indirect drawing is ever available, draw calls won’t necessarily be the limiting factor any more; rather, the vertex count will be. This is what enables Unreal to do its thing.

So… maybe asking how Babylon does it isn’t the best form of the question, because Babylon supports multiple backends. That is the purpose of the “engine”: to be a factory for adapting to the hardware. There must be several permutations of the engine’s workflow graph.


Hi @Evgeni_Popov

Does this sorting also include thin instance meshes? If I have tens of thousands of billboard trees like this, will I see increased CPU bound computation time?

No, the thin instances are not sorted. The goal is to be as fast as possible, so it is up to the user to rearrange the order of the matrices if it doesn’t fit: the engine simply sends them to the GPU in the order they are stored in the matrix buffer.
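Since the engine uploads thin instance matrices in buffer order, one way to handle transparent billboards is to reorder the per-instance data back-to-front yourself before refreshing the buffer. A hedged sketch (the `instances` shape with a `position` per matrix is hypothetical, not the actual buffer layout):

```javascript
// Hedged sketch: reorder thin-instance data back-to-front relative to the
// camera so transparent billboards blend correctly, since the engine itself
// does not sort them.
function sortThinInstances(instances, cameraPos) {
  const dist2 = (p) =>
    (p.x - cameraPos.x) ** 2 + (p.y - cameraPos.y) ** 2 + (p.z - cameraPos.z) ** 2;
  // Farthest first; squared distance is enough for ordering (no sqrt needed).
  return instances.slice().sort((a, b) => dist2(b.position) - dist2(a.position));
}
```

The sorted array would then be flattened into the matrix buffer in that order and re-uploaded, paying the CPU sort cost only when it is actually needed.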