Path-tracing in BabylonJS

Hello all,

Here are the step-by-step instructions for locating the Sun in your 3D scene when the background and its Sun disk come from an HDR image. The first set of instructions is what I think most people will find useful in general 3D graphics programming. It just requires that you be able to open up your HDR image inside PhotoShop, GIMP, or a similar image manipulation program.

The second set of instructions, which will be a separate post after this one, will show how I load in any arbitrary HDR image, examine the pixels manually in JavaScript, and algorithmically calculate where the Sun is located when the 2D image is wrapped around the final 3D scene, without having to open up PhotoShop/GIMP at all. This 2nd method requires more work and algos up front, but might be useful nonetheless, especially if you don't know ahead of time which image will be used as the background. The second method flows directly from the first, easier method, so let's start with the first method:

Just so we're on the same page with all this HDR stuff - these images are typically in RGBE format: the usual Red, Green, and Blue channels, plus a fourth channel where the A/Alpha channel would normally be. That fourth channel is unique to HDR - it stores an exponent (E) that scales the RGB channels when the image is unpacked, which usually happens automatically. If you open your image in PhotoShop/GIMP, etc., use the eyedropper tool, and hover over various pixels, then by the time the image has been unpacked and displayed on the monitor, the RGB values might look the same as any other texture image, and the A/Alpha channel might just read 1.0 or 255 for every pixel. That is not very helpful for our 2nd method in the upcoming post, which is why we will need to read some of the pixel data manually. But more on that later.
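
To make that E channel a little more concrete, here is a tiny sketch (in plain JavaScript, with made-up byte values; the exact scale constant varies slightly between loaders, but the idea is the same) of roughly how one RGBE pixel unpacks into linear floating-point RGB:

// rough sketch of unpacking one RGBE pixel into linear floating-point RGB
// r8, g8, b8, e8 are the four raw bytes (0-255) of a single pixel - hypothetical values, just for illustration
function decodeRGBE(r8, g8, b8, e8) {
	if (e8 === 0) return [0, 0, 0]; // an exponent byte of 0 means pure black
	// the shared exponent scales all three mantissas at once
	const scale = Math.pow(2, e8 - 128) / 256;
	return [r8 * scale, g8 * scale, b8 * scale];
}

// a 'normal' sky pixel vs. a Sun-disk pixel with a huge exponent byte:
console.log(decodeRGBE(200, 180, 150, 128)); // roughly 0.78, 0.70, 0.59
console.log(decodeRGBE(255, 250, 240, 145)); // well over 100,000 per channel

That's why the eyedropper numbers explode once you hit the Sun disk - the E byte there is much larger than anywhere else in the image.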

So once you've opened up your HDR image, get to the 'info' tab/window in your image manipulation program, which displays the value of each pixel (the x,y coordinates of the mouse pointer, the pixel's RGBA values, etc.). Then simply hover over the center of the Sun disk and note the x,y coordinates. If you are using this for path tracing and direct light sampling from the Sun, simply eyeballing this center might not be enough - switch the 'info' readout to pixel (linear) mode so that the numbers are shown as floating point, and then slowly move the mouse around the Sun. You'll see the floating-point RGB numbers land anywhere from 0.1 to 2.0, which is the range that 99.9% of the 'normal' pixels in the rest of the image fall under. However, if you get the pointer just right (the center of the 'hot' pixel area of the Sun disk), these raw linear floating-point RGB numbers will suddenly jump all the way past 50,000.0 for each color channel, and maybe even past 100,000.0! Congrats, you have located the Sun, lol! Seriously, that's where the E exponent channel is highest in the raw image pixel data, which scales up its associated RGB channels accordingly. By the time this all gets to your monitor, everything gets scaled (tone-mapped) back down to the 0-255, or 0.0-1.0, range for all RGB channels, for normal display purposes.

So here are the steps in detailed form (later I’ll list the steps in short algo form):

  1. Find the hottest group of pixels in your image and write down the mouse's x,y pixel coordinates at that point. For example, the symmetrical_garden_2k.hdr image that I was using has dimensions of 2048x1024, and the Sun disk's hot spot was right at coordinates (396,174). At this location the RGB channels go past 100,000.0 in pixel info mode.

  2. Normalize these integer pixel coordinates into floating-point texture UV representation (range: 0.0-1.0). This is accomplished by simply dividing the pixel's integer X coordinate by the width of the HDR, and similarly dividing the pixel's integer Y coordinate by the height of the HDR. So, again using my example: (396 / 2048, 174 / 1024) = (u,v) = (0.193359375, 0.169921875).

  3. Since this HDR will be wrapped around the scene in spherical fashion, we need these magic Sun-disk coordinates in terms of Polar Coordinates - or in our 3D case, Spherical Coordinates. Spherical Coordinates require 2 angles and a radius of the sphere. We simplify things by making the sphere unit size, i.e. a radius of 1. That leaves the 2 angles to calculate, traditionally called phi and theta. Theta goes from 0 to 2PI (a complete revolution, 360 degrees) and controls the angle that you rotate/look right and left in the 3D scene. Phi covers half that range, going from 0 to PI, and controls the angle that you look up and down in the 3D scene. We don't need the full 2PI range for the Phi angle because it isn't mathematically helpful to rotate the view up past the zenith, which would flip the camera upside down. Now we just map the uv into (phi, theta) with the following simple formulas:

phi = HDRI_bright_v * PI        (note: V is used for phi)
theta = HDRI_bright_u * 2PI     (note: U is used for theta)

So, again using my earlier example, I get:
phi = 0.169921875 * PI = 0.533825314 (remember to use the V float number)
theta = 0.193359375 * 2 * PI = 1.21491278 (remember to use the U float number)

I am a visual person - I have to be able to 'see' it to understand - so putting aside all these meaningless numbers for the moment, this all sort of makes intuitive sense when you realize that most HDR images are twice as wide as they are tall. If you imagine pointing your camera right down the center of the image (mouse pointer in the middle), theta has to span 2 PI (a complete circle) because we can move the mouse left and right and wind up back where we started in the image. The image is only half as tall because if we move the mouse up and down, we reach either the zenith (top) or the floor/ground beneath us within 1 PI (half a circle).
It wouldn't make sense to have a perfectly square HDR, because then you would need to be able to flip the camera upside down when you got past the top of the image, which confuses things mathematically once we get to 3D. So that's why we must map the angles to a 2 PI range for left/right movement and just 1 PI for up/down movement inside the texture image.
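
Before the final step, here is a quick sketch of steps 2 and 3 in code form (plain JavaScript, using the example numbers from above):

// steps 2 and 3, using my example image size and hot-spot pixel
const hdrWidth = 2048, hdrHeight = 1024;
const sunPixelX = 396, sunPixelY = 174;

// step 2: pixel coordinates -> (u,v) in the 0.0-1.0 range
const u = sunPixelX / hdrWidth;   // 0.193359375
const v = sunPixelY / hdrHeight;  // 0.169921875

// step 3: (u,v) -> spherical angles
const phi   = v * Math.PI;        // 0.533825...
const theta = u * 2 * Math.PI;    // 1.214912...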

  4. Final step, almost there! Now that we have converted the original hot-pixel x,y coordinates into Phi and Theta angles for Spherical coordinates, we must do one more conversion. This time we go from Spherical coordinates to 3D Cartesian coordinates, which magically maps our polar angles into a normalized direction vector in 3D. When I say 'magically', I mean that I used the handy three.js math library routine (I'm sure Babylon has something similar) to do this conversion task for me. Here's an example in three.js:

let lightTargetVector = new THREE.Vector3();
lightTargetVector.setFromSphericalCoords(1, phi, theta); // the first argument is the radius - 1 here, so a unit sphere

For those of you more mathematically inclined, this is how the math library function does its sorcery:

// the 'this' below is a blank Vector3 (x,y,z)
setFromSphericalCoords( radius, phi, theta ) {
	const sinPhiRadius = Math.sin( phi ) * radius;
	this.x = sinPhiRadius * Math.sin( theta );
	this.y = Math.cos( phi ) * radius;
	this.z = sinPhiRadius * Math.cos( theta );
	return this;
}
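
And if you're following along in Babylon.js instead, here is a minimal sketch that just applies the same formulas directly to build a BABYLON.Vector3 (no special helper assumed; note that Babylon defaults to a left-handed coordinate system while three.js is right-handed, so you may still need the negation discussed below):

// same spherical-to-Cartesian math, written out by hand for a Babylon.js scene
function sphericalToDirection(phi, theta) {
	const sinPhi = Math.sin(phi); // radius is 1 (unit sphere), so it drops out
	return new BABYLON.Vector3(
		sinPhi * Math.sin(theta),
		Math.cos(phi),
		sinPhi * Math.cos(theta)
	);
}

const lightTargetVector = sphericalToDirection(0.533825314, 1.21491278);
// depending on handedness you may still need lightTargetVector.x *= -1 (or z *= -1) - see step 5 below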

As with a lot of my math code in the PathTracingCommon file, I don't really understand how the above math function does its conversion or how it is 'derived'. But what I've become good at in these more recent years of programming is knowing what pieces of the puzzle I will need, and then putting these various (sometimes disparate) pieces together to solve a particular problem. As mentioned, I am a very visual learner, so the above function doesn't really register in my brain. But when we are doing the conversions and mappings, and talking about cameras and rotating the view and stuff like that, I can eventually see the 'big' picture in my mind and then go out and locate the necessary components/algos needed for the job.

Now if I console.log the resulting lightTargetVector (x must be negated, due to three.js’ R-handed coordinate system, I think), I get:
x: -0.47694634304269057, y: 0.8608669386377673, z: 0.17728592673600096

And we are done! We now have a 3D normalized x,y,z vector (lightTargetVector) pointing exactly at the Sun when the HDR is wrapped around the scene. As an interesting side note, remember that before I knew how to do all this, I had naively set up an infinite red metal cylinder in the scene as a pointer helper to find the Sun in 3D. Well, here is my old naive result:
(x: -0.4776591005752502, y: 0.8606470280635138, z: 0.17643264075302031)
and with the new mathematically sound version:
(x: -0.4769463430426905, y: 0.8608669386377673, z: 0.17728592673600096)

Not bad for my old eyeballs, LOL! But that naive sun finder took me almost half an hour to pin down the exact Sun disk center (ha), so we wouldn't want to use that tedious method every time we open up a new HDR file! At least this provides some reassurance that the above algo/steps are doing the job correctly.
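
If you're curious just how close the eyeballed guess really was, here is a quick sketch that measures the angle between the two directions (both vectors are already normalized, so the dot product gives the cosine of the angle between them):

// sanity check: angle between the eyeballed direction and the computed one
const oldGuess = { x: -0.4776591005752502, y: 0.8606470280635138, z: 0.17643264075302031 };
const computed = { x: -0.4769463430426905, y: 0.8608669386377673, z: 0.17728592673600096 };

const dot = oldGuess.x * computed.x + oldGuess.y * computed.y + oldGuess.z * computed.z;
const angleDegrees = Math.acos(Math.min(dot, 1)) * 180 / Math.PI;
console.log(angleDegrees); // a tiny fraction of a degree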

Here are the steps in abbreviated form without all the commentary:

  1. Get the (x,y) pixel location of the hot spot in the HDR image with an image manipulation program of your choice.
  2. Divide this (x,y) by (hdrWidth, hdrHeight) to get (u,v). Range: 0.0-1.0 floating point.
  3. Convert v to spherical angle phi → v * PI, and u to spherical angle theta → u * 2PI.
  4. Convert spherical coordinates to 3D Cartesian coordinates using a math helper library.
  5. Depending on your coordinate system (L-Hand or R-Hand), you might need to negate either the x or the z component: x *= -1, or possibly z *= -1.

I still don't fully understand this last tweak in step 5, but for example, my newly calculated direction was pointing right when my HDR image clearly had the Sun on the left, and it should therefore be pointing left in the 3D scene. This could possibly only be an issue for us path tracing folk, because we are sending out viewing rays into the scene. I remember having to do some negations when path tracing in three.js' R-Hand system, such as z *= -1 for camera rays, to get everything to look right. But fortunately, this is a simple trial-and-error 1-liner fix for most HDR projects.
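
To wrap it all up, here is a sketch of the whole pipeline glued together in plain JavaScript (the image size and pixel coordinates are the ones from my example; the negation in step 5 is the part you may have to flip by trial and error):

// all 5 abbreviated steps in one place
// step 1 (finding the hot-spot pixel in an image editor) supplies these numbers:
const hdrWidth = 2048, hdrHeight = 1024;
const sunPixelX = 396, sunPixelY = 174;

// step 2: pixel -> uv
const u = sunPixelX / hdrWidth;
const v = sunPixelY / hdrHeight;

// step 3: uv -> spherical angles
const phi   = v * Math.PI;
const theta = u * 2 * Math.PI;

// step 4: spherical -> 3D Cartesian direction (unit sphere, radius of 1)
const sinPhi = Math.sin(phi);
let sunDirection = {
	x: sinPhi * Math.sin(theta),
	y: Math.cos(phi),
	z: sinPhi * Math.cos(theta)
};

// step 5: depending on your engine's handedness, one of these may be needed:
sunDirection.x *= -1;    // this is what I needed in my three.js scene
// sunDirection.z *= -1; // or possibly this one instead - trial and error

console.log(sunDirection); // approximately (-0.4769, 0.8609, 0.1773)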

Later I'll post the second, more complex method: loading in any HDR file without having to open it up at all to see where the Sun's (x,y) is, looping through the RGBE pixel data (mainly just the 'E' parts) in JavaScript, and having it spit out the (x,y) coordinates that you would otherwise have had to find with your mouse and eyes in PhotoShop/GIMP.

Hopefully all this made some sort of sense! I’ll follow up soon with the more robust second method.

Cheers,
-Erich
