Path-tracing in BabylonJS

Thank you David! Having fun working on this project! :smiley:

1 Like

Hello all,

Just an update of what I’ve been doing this last week: if you check out my three.js renderer repo, the latest commits show the improvements to the path traced Stanford Dragon foreground with HDRI background environment demo.

What I’ve recently learned and applied to that renderer will allow me to bring those changes and improvements over to our Babylon.js renderer when we want to start loading HDRI (RGBE) images to optionally use in the background as realistic natural lighting for our scenes.

A quick technical note: this all started when a three.js renderer user opened up a GitHub issue saying that when he tried to load his own HDRI image, he was getting tiny white sun spots (noise) all over the place. When I initially created my HDRI environment demo, I used primarily one outdoor garden scene that I downloaded for free from HDR Haven.

I am embarrassed to say that up until yesterday I didn’t know how to locate the sun in the picture when it got wrapped around the path traced scene in spherical fashion, ha! Being the wacky person I am, I had initially devised an outside-the-box solution (but not robust at all) of placing an infinitely tall thin red metal cylinder that acted like a pointing pole to the sky in the final scene. I then tweaked and rotated this long debug pole until it poked the sun(ha) in the scene when the 2d HDRI image was wrapped around the scene in 3d. When I eventually found the winning lottery combo of 3d vector x,y,z components, I wrote them down, and hard coded them into the scene shader. I’m not proud of that naive approach, but it worked temporarily, ha ha.

When the issue’s original poster said that he wanted to be able to load in any image, I knew I had some learnin’ to do, and thus started down this rabbit hole. You’d think that finding the darn sun in a wrapped image and pointing your surfaces at it for lighting would be trivial! But not so, as I soon found out. I won’t belabor this post with the algo/math details, but I’m happy to report that I took a good step forward in being able to load any outdoor scene: first algorithmically finding the sun’s pixels in the 2d pixel coordinates of the original RGBE image, then converting that 2d representation into spherical coordinates (two angles), then, spoiler alert: converting that into 3d Cartesian coordinates (an x,y,z vector) that points right at the brightest part of the image when it’s ultimately wrapped around the scene as an infinitely distant sphere, whew!

It never ceases to amaze me just how much basic math (less than calculus) and how many computer algorithm skills you need to have under your belt to do seemingly simple stuff like find the Sun in a 2d image! Same goes for representing a flat 1d array as a 2d texture with width and height, then going backwards: starting with the texture and figuring out how to get back to the flat array - stuff like that. When I finally ‘get’ it, I’m saying to myself, “Why didn’t I do it this way from the beginning, duh!”

At any rate, the next TODO with this part of the renderer is being able to load an indoor HDRI with artificial human-made light sources and possibly many different arbitrary light types in the image’s scene. That will take more thoughtful algos and finesse, but at least we now have the capability to find the Sun! ;-D

-Erich

8 Likes

I think this guy would be interested in the computation to find the sun direction in an HDRI file:

4 Likes

Yes, I think you will find a lot of topics on this forum concerning that issue. I remember having posted one too :grinning_face_with_smiling_eyes:
This is not only ray-tracing related, but it’s still an impressive job, so I’m also very curious to know what the computation is.

1 Like

Love the Blender click method! Lol Yeah a year or so ago I opened up my initial HDRI outdoor garden image in PhotoShop/GIMP and I took my mouse and held the pointer over the center of the sun, noted the pixel coordinates, and was saying to myself, “Come on Erich, here is the Sun right here, this can’t be that hard”! Lol. Well, turns out that although it is just a handful of simple multiplies, divides, and a sin() and cos() thrown in here and there (which the CPU doesn’t even blink an eye at, ha), the algorithm and steps taken are not obvious at all.

Glad to find I’m not the only one who was stumped trying to find the Sun, ha ha!

Since there are others out there wanting to do this type of calculation (even if their project isn’t related to path tracing, but to more general 3d graphics programming), I’ll be back real soon with a step-by-step algo and accompanying source code snippets in Javascript. Just have to put my notes together…

See you soon!

5 Likes

Hello all,

Here are the step-by-step instructions for locating the Sun in your 3D scene when the background and its Sun disk came from an HDR image that you wanted to use. The first set of instructions is what I think most people will find useful in general 3D graphics programming. It just requires that you be able to open up your HDR image inside PhotoShop, GIMP, or similar image manipulation program.

The second set of instructions, which will be a separate post after this one, will show how I load in any arbitrary HDR image, examine the pixels manually in JavaScript, and algorithmically calculate where the Sun is located when the 2D image is wrapped around the final 3D scene, all while not having to open up PhotoShop/GIMP at all. This 2nd method requires more work and algos up front, but might be useful nonetheless, especially if you don’t know ahead of time which image will be used as the background. This second method flows directly from the first easier method, so let’s start with the first method:

Just so we’re on the same page with all this HDR stuff: these images are typically in RGBE format, so they have the usual Red, Green, and Blue channels - however, the last channel, where the A/Alpha channel would normally be, is unique to HDR in that it specifies an exponent (E) which scales the final output of the RGB channels when the unpacking/calculations are done (usually automatically). When you open your image in PhotoShop/GIMP, etc., use the eyedropper tool, and hover over various pixels, by the time the data is unpacked and displayed on the monitor, the RGB might look the same as in any other texture image, and the A/Alpha channel might just read 1.0 or 255 for all pixels. This is not very helpful for our 2nd method in the upcoming post, which is why we need to read some of the pixel data manually. But more on that later.
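To make that E channel a bit more concrete, here is a minimal sketch of the standard Radiance-style RGBE decoding (illustrative only - not the exact code of any particular loader, and I’m leaving out a couple of edge cases):

// minimal sketch of Radiance-style RGBE decoding (illustrative, not from any specific loader)
// r, g, b, e are the raw 0-255 bytes of one pixel in the HDR file
function decodeRGBE(r, g, b, e) {
        if (e === 0) return [0, 0, 0];            // an all-zero pixel stays black
        const scale = Math.pow(2, e - 136);       // same as 2^(e - 128) / 256
        return [r * scale, g * scale, b * scale]; // linear floating-point RGB, can go way past 1.0
}

So a pixel with a large E byte can decode to RGB values in the tens of thousands, which is exactly what we’ll be hunting for.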

So once you’ve opened up your HDR image, get to the ‘info’ tab/window in your image manipulation program, which will display the value of each pixel (it shows the x,y coordinates of the mouse pointer and pixel, its RGBA values, etc.). Then simply hover over the center of the Sun disk and note the x,y coordinates. If you are using this for path tracing and direct light sampling from the Sun, simply eyeballing this center might not be enough - put the ‘info’ mode into pixel (linear) so that the numbers are floating point, and then slowly move the mouse around the Sun. You’ll see the floating point RGB numbers go anywhere from 0.1 to 2.0 (which is the range that 99.9% of the ‘normal’ pixels in the rest of the image fall under). However, if you get the pointer just right (center of the ‘hot’ pixel area of the Sun disk), these raw linear floating-point RGB numbers will suddenly jump all the way past 50,000.0 for each color channel, and maybe even past 100,000.0! Congrats, you have located the Sun, lol! Seriously, that’s where the E exponent channel is the highest in the raw image pixel data, which scales up its associated RGB channels accordingly. By the time this all gets to your monitor though, everything gets scaled (tone-mapped) back to the 0-255, or 0.0-1.0, range for all RGB channels, for normal display purposes.

So here are the steps in detailed form (later I’ll list the steps in short algo form):

  1. Find the hottest group of pixels in your image and write down the mouse’s x,y pixel coordinates at that point. For example, my symmetrical_garden_2k.hdr image that I was using has dimensions of 2048x1024 and the Sun’s disk hot spot was right at coordinates (396,174). At this location the RGB channels go past 100,000.0 in pixel info mode.

  2. Normalize these pixel integer coordinates into floating point texture UV representation (range:0.0-1.0). This is accomplished by simply dividing the pixel’s integer X coordinate by the width of the HDR and similarly dividing the pixel’s integer Y coordinate by the height of the HDR. So, again using my example - (396 / 2048, 174 / 1024) = (u,v) = (0.193359375, 0.169921875).

  3. Since this HDR will be wrapped around the scene in spherical fashion, we need these magic sun disk coordinates to be in terms of Polar Coordinates, or in our 3D case - Spherical Coordinates. Spherical Coordinates require 2 angles and a radius of the sphere. We simplify things by making the sphere unit size, or a radius of 1. That leaves the 2 angles to calculate. These are traditionally called phi and theta. Theta goes from 0 to 2PI (so a complete revolution, 360 degrees, etc.). Theta controls the angle that you are rotating/looking right and left in the 3D scene. Phi is half of that range, going from 0 to PI, and controls the angle that you look up and down in the 3D scene. We don’t need the full 2PI range for this Phi angle because it’s not mathematically helpful to rotate the view up past the zenith, which would flip the camera upside down. Now we just map the uv into (phi, theta) with the following simple formulas:

phi = HDRI_bright_v * PI        (note: v is used for phi)
theta = HDRI_bright_u * 2PI     (note: u is used for theta)

So, again using my earlier example, I get:
phi = 0.169921875 * PI = 0.533825314 (remember to use the V float number)
theta = 0.193359375 * 2 * PI = 1.21491278 (remember to use the U float number)

I am a visual person - I have to be able to ‘see’ it to understand, so putting aside all these meaningless numbers for the moment, this all sort of makes intuitive sense when you realize that most HDR images are twice as wide as they are tall. If you imagine pointing your camera right down the center of the image (mouse pointer in the middle), theta has to cover 2 PI (a complete circle) because we can move the mouse left and right and wind up back where we started in the image. The image is only half as tall because if we move the mouse up and down, we either reach the zenith (top) or the floor/ground beneath us within 1 PI (half a circle).
It wouldn’t make sense to have a perfectly square HDR because then you would need to be able to flip the camera upside down when you got past the top of the image, which confuses things mathematically when we get to 3D. So that’s why we must map the angles to a 2 PI range for left/right movement and just 1 PI for up and down movement inside the texture image.

  4. Final step, almost there! Now that we have converted the original hot pixel x,y coordinates into Phi and Theta angles to be used in Spherical coordinates, we finally must do one more conversion. This time we go from Spherical coordinates to 3D Cartesian coordinates, which will magically map our polar angles into a normalized direction vector in 3D. When I say ‘magically’, I mean that I used the handy three.js math library routine (I’m sure Babylon has something similar) to do this conversion task for me. Here’s an example in three.js:

let lightTargetVector = new THREE.Vector3();
lightTargetVector.setFromSphericalCoords(1, phi, theta); // 1 means radius, so a unit sphere with a radius of 1

For those of you more mathematically inclined, this is how the math library function does its sorcery:

// the 'this' below is a blank Vector3 (x,y,z)
setFromSphericalCoords( radius, phi, theta ) {
		const sinPhiRadius = Math.sin( phi ) * radius;
		this.x = sinPhiRadius * Math.sin( theta );
		this.y = Math.cos( phi ) * radius;
		this.z = sinPhiRadius * Math.cos( theta );
		return this;
}

As with a lot of my math code in the PathTracingCommon file, I don’t really understand how the above math function does its conversion or how it is ‘derived’, but what I’ve become good at these more recent years of programming is knowing what pieces of the puzzle I will need, and then putting these various (sometimes disparate) pieces together to solve a particular problem. As mentioned, I am a very visual learner, so the above function doesn’t really register in my brain, but when we are doing the conversions and mappings and talking about cameras and rotating the view and stuff like that, I can eventually see the ‘big’ picture in my mind and then go out and locate the necessary components/algos needed for the job.

Now if I console.log the resulting lightTargetVector (x must be negated, due to three.js’ R-handed coordinate system, I think), I get:
x: -0.47694634304269057, y: 0.8608669386377673, z: 0.17728592673600096

And we are done! We now have a 3D normalized x,y,z vector (lightTargetVector) pointing exactly at the Sun when the HDR is wrapped around the scene. As an interesting side note, remember that before I knew how to do all this, I had naively set up an infinite red metal cylinder in the scene as a pointer helper to find the Sun in 3D. Well, here is my old naive result:
(x: -0.4776591005752502, y: 0.8606470280635138, z: 0.17643264075302031)
and with the new mathematically sound version:
(x: -0.4769463430426905, y: 0.8608669386377673, z: 0.17728592673600096)

Not bad for my old eyeballs, LOL! But that naïve sun finder took me almost half an hour to find the exact sun disk center (ha), so we wouldn’t want to use that tedious method every time we opened up a new HDR file! At least this provides some reassurance that the above algo/steps are doing the job correctly.

Here are the steps in abbreviated form without all the commentary:

  1. Get the (x,y) pixel location of the hot spot in the HDR image with an image-manipulation program of your choice.
  2. Divide this (x,y) by (hdrWidth, hdrHeight) to get (u,v). Range: 0.0-1.0 floating point
  3. Convert v to spherical angle phi → v * PI and u to spherical angle theta → u * 2PI
  4. Convert spherical coordinates to 3D Cartesian coordinates using a math helper library
  5. Depending on your coordinate system (L-Hand or R-Hand), you might need to negate either the x or the z component. x *= -1, or possibly z *= -1

I still don’t fully understand this last tweak in step 5, but for example, my newly calculated direction was pointing right when my HDR image clearly had the Sun on the left and should therefore have been pointing left in the 3D scene. This could possibly only be an issue for us path tracing folk, because we are sending out viewing rays into the scene. I remember having to do some negations when path tracing in three.js’ R-Hand system, such as z *= -1 for camera rays, to get everything to look right. But fortunately, this is a simple trial-and-error 1-liner fix for most HDR projects.
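For anyone who wants steps 2 through 5 all in one place, here is a small Javascript sketch (variable names are my own, not from the renderer repo), using the example numbers from above:

// consolidated sketch of steps 2-5 (names are illustrative, not from the repo)
const brightestPixelX = 396;   // example hot-spot coordinates from symmetrical_garden_2k.hdr
const brightestPixelY = 174;
const hdrImgWidth  = 2048;
const hdrImgHeight = 1024;

// step 2: normalize to 0.0-1.0 texture UV space
const u = brightestPixelX / hdrImgWidth;   // 0.193359375
const v = brightestPixelY / hdrImgHeight;  // 0.169921875

// step 3: map (u,v) to spherical angles
const phi   = v * Math.PI;        // up/down angle, range 0 to PI
const theta = u * 2.0 * Math.PI;  // left/right angle, range 0 to 2PI

// step 4: spherical to 3D Cartesian (same math as three.js' setFromSphericalCoords, radius = 1)
const sinPhi = Math.sin(phi);
let sunDirectionX   = sinPhi * Math.sin(theta);
const sunDirectionY = Math.cos(phi);
const sunDirectionZ = sinPhi * Math.cos(theta);

// step 5: flip a component if your handedness requires it (trial and error, as noted above)
sunDirectionX *= -1;

console.log(sunDirectionX, sunDirectionY, sunDirectionZ);
// roughly (-0.477, 0.861, 0.177), matching the numbers earlier in this post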

Later I’ll post the second, more complex method: loading in any HDR file without having to open it up at all to see where the Sun’s (x,y) is, looping through the RGBE pixel data (mainly just the ‘E’ parts) in Javascript, and having it spit out the (x,y) coordinates that you would otherwise have had to find with your mouse and eyes in PhotoShop/GIMP.

Hopefully all this made some sort of sense! I’ll follow up soon with the more robust second method.

Cheers,
-Erich

8 Likes

Hello again!

Here’s part 2 of the posts about finding a 3d sun direction vector in the scene when the background and sun came from a 2d HDR image. Wouldn’t it be nice if we didn’t have to open up every single HDR image in PhotoShop/GIMP to hunt around for the Sun disk’s brightest pixels with our mouse? Well, the following second method (which actually just replaces step 1 in the previous post) is more robust in that you should be able to load any outdoor HDR image with the Sun visible (or even partly visible behind light cloud cover) and it will automatically return the brightest pixel coordinates, which correspond to the Sun disk’s center. This ability comes at the expense of being slightly more complicated, but hopefully I can walk everyone through the process so that the steps will be clearer.

By the way, this entire post takes the place of step 1 in the preceding post (the step where you have to go inside PhotoShop/GIMP, etc.), which means that when you’re done with this part, you can just continue with steps 2 through 5 in the above method and you’ll have everything automated. :slight_smile:

So from a global overview, this is what we’d like to do: first, load in an arbitrary outdoor HDR with arbitrary resolution, then find the Sun (the natural light source) in the image (which could have been placed anywhere, depending on the original photographer’s camera setup), then return that brightest pixel’s (or one of the brightest, if they’re in a group) x and y coordinates. Then, as mentioned, we continue on from step 2 in the preceding post and we’re good to go.

In WebGL when we’re reading or storing image pixel data, it most often has to be to and from a JavaScript flat array, such as:
const data_rgba = new Uint8Array( 4 * imgWidth * imgHeight );
We replace the rgba above with rgbe and that’s pretty much what we’re dealing with. Notice that it’s (imgWidth * imgHeight * 4) because we have to take each pixel and spread its 4 channels among 4 unique array elements. So what we like to think of as pixel0.rgbe, pixel1.rgbe, pixel2.rgbe (which would initially be pixels [0,1,2] in the image) becomes a spread-out flat list of numbers:
p0.r, p0.g, p0.b, p0.e, p1.r, p1.g, p1.b, p1.e, p2.r, p2.g, p2.b, p2.e, and so forth, which is represented in typed array form as pixel_data[0,1,2,3, 4,5,6,7, 8,9,10,11, etc.]. Note that for those 3 pixels in the original HDR image, the Javascript flat array now has 12 elements, which is 3 pixels * 4 channels. This will come back around soon.
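In other words, the index math looks roughly like this (a quick illustrative sketch, with made-up variable names):

// quick sketch of the flat-array index math (illustrative names)
// going from a 2D pixel coordinate (x, y) to its 4 flat-array slots...
const flatIndexOfR = (y * imgWidth + x) * 4;  // red channel of pixel (x, y)
const flatIndexOfE = flatIndexOfR + 3;        // that same pixel's exponent (E) channel
// ...and later we'll need to go backwards, from a whole-pixel index back to (x, y):
const pixelX = wholePixelIndex % imgWidth;
const pixelY = Math.floor(wholePixelIndex / imgWidth);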

As mentioned in the previous post, the RGB data doesn’t really help us find the Sun because there may be snow or white clouds everywhere in the scene and so most pixels would be 255,255,255 including the Sun’s pixels. The key is every 4th array element: the E channel, or exponent. If we can somehow iterate over every pixel’s E channel and keep a running record for the highest exponent we have found so far, by the end we will have successfully located the brightest pixel in the arbitrary image. And since the Sun is the brightest thing by far in an outdoor HDR image, the highest exponent will max out near or at the very center of the Sun disk.

So if the data is laid out like p0[0],p0[1],p0[2],p0[3],etc. for each pixel’s p0[r],p0[g],p0[b],p0[e], channels etc., then we’re only concerned with index [3], which is the pixel’s E channel. Then we just add 4 to this running index on each loop iteration as we scan the whole HDR image. Here’s some code from my three.js version:

hdrTexture = hdrLoader.load( hdrPath, function ( texture, textureData ) 
{
        texture.encoding = THREE.LinearEncoding;
        texture.minFilter = THREE.LinearFilter;
        texture.magFilter = THREE.LinearFilter;
        texture.generateMipmaps = false;
        texture.flipY = false;

        hdrImgWidth = texture.image.width;
        hdrImgHeight = texture.image.height;
        hdrImgData = texture.image.data;

        // initialize record variables
        highestExponent = -Infinity;
        highestIndex = 0;

        for (let i = 3; i < hdrImgData.length; i += 4) // every 4th element, starting at [3]
        {
                if (hdrImgData[i] >= highestExponent)
                {
                        // record the new winner
                        highestExponent = hdrImgData[i];
                        highestIndex = i;
                }
                        
        }
        console.log("highestIndex: " + highestIndex); // for debug
} ); // close the load() callback

Above we start at pixel data index [3], which is the first pixel’s E channel, and then just add 4 to this index on every loop iteration, checking each and every pixel for the highest exponent as we go. Once the last pixel is reached, we should have a flat array index that corresponds to the winning brightest pixel. Some of you may have noticed a shortcoming in my algo: this will indeed find the brightest pixel, but the winner stops being recorded at the edge of the disk, where the pixels return to normal, less-bright sky as we scan from left to right. To find the true center (and maybe I will add this, or someone else can take a shot at it for me - see the rough sketch below!), you have to remember where you reached the first winning bright pixel and the last winning bright pixel, then take the average of those 2 values to get the exact center of the winning group of brightest pixels. This is straightforward in 1D as we scan from left to right, but what complicates matters is that we have to do the same thing going up and down and find that average as well. Therefore I simply left my above loop in place and it seems to do ok with the HDRs that I have tested so far. Where it would fail is if we somehow had an alien-sky HDR from another planet’s viewpoint, and the HDR had 2 or more Suns. This algo would only find and sample the brightest light source when path tracing.
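Here is one possible (untested) sketch of that centering idea: a second pass that averages the coordinates of every pixel whose exponent is at, or very near, the recorded maximum. Variable names follow the loop above.

// untested sketch of the disk-centering idea: average the coordinates of all pixels
// whose E channel is at (or within 1 step of) the recorded maximum exponent
let sumX = 0, sumY = 0, count = 0;
for (let i = 3; i < hdrImgData.length; i += 4)
{
        if (hdrImgData[i] >= highestExponent - 1) // tolerance of 1 exponent step (a factor of 2 in brightness)
        {
                const wholePixelIndex = (i - 3) / 4;               // back to whole-pixel indexing
                sumX += wholePixelIndex % hdrImgWidth;             // this pixel's x
                sumY += Math.floor(wholePixelIndex / hdrImgWidth); // this pixel's y
                count++;
        }
}
const sunCenterX = Math.round(sumX / count);
const sunCenterY = Math.round(sumY / count);
// (note: this simple average doesn't handle a Sun disk that straddles the left/right image seam)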

So from here on, it’s a matter of turning this desired flat-array brightest-pixel index into something more meaningful and useful, namely an HDR image x,y coordinate corresponding to the exact Sun disk pixel. Once more, here’s the high-level overview plan:

  1. Loop through every 4th element in the large flat array of HDR rgbe pixel data, so [3] and onwards, adding 4 to the loop index each time - [3],[7],[11],[15], etc. Find and record the winning highest exponent (E channel).
  2. Take this winning exponent’s flat array index number and subtract 3 from it to get back to the r channel, or [0]'th place, of that pixel - that way we are back on the 4-number boundary of each pixel. Our result is now guaranteed to be divisible by 4 (because of how the data was spread out and laid out for WebGL in the first place, initially multiplying by 4).
  3. Divide this index number by 4 - now we’ll be back in the range of the original 2048x1024 (or whatever) resolution. This will still be a large number because it is still 1d flat, but at least it is a little smaller than it was before! Its range will be anywhere from 0 to (width*height) - 1.
  4. Now, to make this correct-range flat number into something useful, we have to map this large number back into the 2D space of the HDR image (width location, height location), or (x,y), which needs to be in the range (0 to width-1, 0 to height-1). Note: the ‘-1’ reflects the fact that our data is always 0-based. Through trial and error and some experimenting (ha) I came up with the following formulas for getting the final x and y coordinates:
    X = highestIndex % imgWidth
    Y = floor(highestIndex / imgWidth)

At the end of all this, we will have the same thing we had in step 1 of the previous post: an exact X,Y pixel location corresponding to the brightest part of the HDR image.

Here’s some code that follows the steps outlined above:

// step 1: the 'for (let i = 3...)' loop printed above, which results in 'highestIndex'
highestIndex -= 3;  // step 2: get back to the 0 boundary of the winning pixel's 4 channels
highestIndex /= 4;  // step 3: guaranteed to be evenly divisible by 4, by design
brightestPixelX = highestIndex % hdrImgWidth;             // step 4: range: 0 to imgWidth-1
brightestPixelY = Math.floor(highestIndex / hdrImgWidth); // range: 0 to imgHeight-1, essentially
// how many times the large flat number has exceeded/wrapped the imgWidth x dimension

We have now automatically completed step 1 in the first post, avoiding opening up an image manipulation program and hunting around the image with a mouse to try and find the brightest pixels. Now we can just continue on as normal with steps 2-5 from my earlier post and we should have a quick and robust way to find the Sun in 3D from a 2D HDR image!

If you look at all the math, it’s not that involved, especially for the CPU, which will hardly blink during this whole affair. But just getting the algo steps right in my mind - man, it wasn’t pretty! Lol. At one point I was scribbling numbers from 0-100 in a square 2d matrix on a little scrap of paper like a madman, to see how a 1d list of numbers gets mapped to a 2d array with a width boundary and height dimension (ha). But I eventually understood why step 4 has to do what it does. :slight_smile:

Now, a caveat with all this is that this method will only work for a single, dominant light source, like the Sun outdoors. Scanning an indoor HDR image and returning multiple bright areas coming from human-made artificial lights will require a more sophisticated approach, so that’s still a big TODO. But hopefully this method will at least let us efficiently render and sample all outdoor scenes with the Sun visible, and even if you’re not doing path tracing, I hope that this technique will assist you in locating the light source in your general 3D Babylon scene if you are using an HDR image.

If there’s any confusion about these HDR posts (there were a lot of words back there, ha ha!), please feel free to post here on this thread. I will do my best to clarify anything that was not clear, either algo-wise, code-wise, or both.

Best of luck and happy HDR rendering! :smiley:
-Erich

5 Likes

Hello @Necips

Yes, I’m sure there are better, more robust ways to do light source detection in images. It’s interesting that you mentioned shadow detection, as that could be used to trace the light back towards its source, even if the source is off camera in the image. And your earlier post about finding the real-world Sun angle based on geographic coordinates and time of day is intriguing. This could be used in a number of different applications, path tracing outdoor scenes in real time being just one of them!

My previous 2 posts about doing this light detection only work if the Sun is visible (or partially covered by a thin cloud layer), and only if the image was taken outdoors, because the Sun dominates all other human-made light sources and is therefore easier to separate and identify algorithmically.

Although my simple approach does the trick for now, we will need a more sophisticated approach for detecting light sources in indoor HDR images, where there could be multiple arbitrarily shaped lights, or in some images I’ve encountered, no visible light source at all, just ambient room light coming from a window off camera!

To help me get started figuring out some of the math in my posted algos, I followed the PBR book light sampling link that was suggested by a three.js renderer user and forum participant. In the PBR book (the 3rd edition, which is now free online and is pretty much the bible for CG graphics), they explain the x,y-coordinates-to-spherical-angles conversion, and then the spherical-to-Cartesian conversion, that I used in my first post about HDR light detection. I wouldn’t have figured that math out on my own! Ha. But the reason I mentioned this book is that later in the same chapter, it gives a technique to loop through all the pixels in any HDR image (indoor or outdoor, lights visible or lights off camera), and it actually builds a lighting probability density distribution as it goes from pixel to pixel. When you get to the end of the loop, you have sort of an importance ‘light’ map that you can directly importance-sample from when path tracing. Because they use this more sophisticated approach (with math and probability algos that are still beyond my understanding), the end result is still considered unbiased rendering, which is really cool. In other words, if you actually placed a real world scene in that HDR spherical environment, we can expect the rendered outcome to match reality to the best of our human ability.
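Just to give a rough flavor of the general idea (this is a heavily simplified sketch - it leaves out the 2D marginal/conditional distribution machinery the book actually uses, and it is definitely not their code): build a running sum of per-pixel brightness, normalize it into a CDF, and then pick pixels with probability proportional to their brightness.

// heavily simplified sketch of brightness-proportional pixel sampling (not the PBR book's code)
// 'luminances' is assumed to be a flat array of one brightness value per pixel
function buildCDF(luminances) {
        const cdf = new Float32Array(luminances.length);
        let sum = 0;
        for (let i = 0; i < luminances.length; i++) {
                sum += luminances[i];
                cdf[i] = sum;
        }
        for (let i = 0; i < cdf.length; i++) cdf[i] /= sum;  // normalize to the 0..1 range
        return cdf;
}

// pick a pixel index with probability proportional to its brightness (randomValue is 0..1)
function samplePixel(cdf, randomValue) {
        let lo = 0, hi = cdf.length - 1;
        while (lo < hi) {                   // binary search for the first cdf entry >= randomValue
                const mid = (lo + hi) >> 1;
                if (cdf[mid] < randomValue) lo = mid + 1;
                else hi = mid;
        }
        return lo;  // convert back to (x,y) with the same % and floor math as in my earlier post
}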

I would like to incorporate this ability/technique as well as your earlier ideas, but I will have to continue to study these approaches until I can visually ‘see’ the overall picture and be able to say in non-math speak what is going on under the hood (like I did hopefully in my previous 2 posts :smiley: ).

Thanks for sharing and the inspiring info!

4 Likes

Hi All,

Just checking in with a quick update. I recently figured out how to create and use Babylon’s notion of an empty 3D transform. The equivalent in three.js was THREE.Object3D() - it didn’t have any geometry or materials associated with it, it was just an empty transform, kind of like a gizmo object in a 3D editor. Browsing the Babylon source code on GitHub, from what I can gather, I think the equivalent in Babylon is BABYLON.TransformNode(). I was able to assign an individual TransformNode to each of the spheres in our test scene, and then on the Javascript setup side, that allowed me to perform simple operations on the transform using familiar Babylon commands, e.g. LeftSphereTransformNode.position.set(x,y,z), LeftSphereTransformNode.scaling.set(x,y,z), and LeftSphereTransformNode.rotation.set(x,y,z).

This lets the user-side code be much more flexible, rather than having to hardcode those object parameters in each path tracing shader. The flipside is that in the ever-growing path tracing library of shader includes, I had to create a special sphere-ray intersection routine called UnitSphereIntersect(ray), which doesn’t take any scaling (sphere radius), rotation, or translation (sphere position) into account, but rather intersects a ray with an untransformed unit sphere (radius of 1) centered at the origin (0,0,0). One of ray tracing’s greatest abilities is that you can have such a simplified ray-sphere function and instead transform the ray by the inverse of the desired sphere object’s transformation. It resembles what we already do with the camera’s inverse to correctly display the transformed objects out in the scene.
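Here is a rough Javascript sketch of that inverse-ray idea (illustrative only - the real work happens inside the path tracing shader, and names like sphereTransformNode, rayOrigin, and rayDirection are just assumptions for this example):

// rough sketch: intersect a ray with a transformed unit sphere by moving the ray
// into the sphere's object space instead of transforming the sphere itself
const worldMatrix = sphereTransformNode.computeWorldMatrix(true);
const invMatrix = worldMatrix.clone().invert();  // inverse of the object's transform

// bring the world-space ray into unit-sphere object space
const localOrigin = BABYLON.Vector3.TransformCoordinates(rayOrigin, invMatrix);
const localDir    = BABYLON.Vector3.TransformNormal(rayDirection, invMatrix); // intentionally NOT normalized

// standard quadratic intersection with a unit sphere (radius 1) centered at the origin
function unitSphereIntersect(ro, rd) {
        const a = BABYLON.Vector3.Dot(rd, rd);
        const b = 2.0 * BABYLON.Vector3.Dot(ro, rd);
        const c = BABYLON.Vector3.Dot(ro, ro) - 1.0;
        const disc = b * b - 4.0 * a * c;
        if (disc < 0.0) return Infinity;      // ray misses the sphere
        const sqrtDisc = Math.sqrt(disc);
        let t = (-b - sqrtDisc) / (2.0 * a);  // try the nearest root first
        if (t < 0.0) t = (-b + sqrtDisc) / (2.0 * a);
        return t > 0.0 ? t : Infinity;
}

const t = unitSphereIntersect(localOrigin, localDir);
// because localDir was not normalized, this same t is valid along the original world-space ray:
// worldHitPoint = rayOrigin + rayDirection * t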

That’s why I have been working on getting this TransformNode business up and running for the last few days. Hopefully very soon I’ll have a 2nd (similar room) demo to show the new easy transforming abilities on the end-user’s JS side. I’ll also add the most commonly encountered general quadrics - unit sphere, cylinder, cone, paraboloid, box, disk, and rectangle. I might or might not include the hyperboloid (hour-glass) and the hyperbolic paraboloid (saddle), as we don’t really come across those very often and they tend to be a little more finicky mathematically when it comes to analytically intersecting with rays. The torus (doughnut or ring) requires a different approach altogether when it comes to finding ray intersections, because it is quartic (4 solutions max) as opposed to the easier quadric (2 solutions max) shapes listed above, whose intersections can be found either geometrically or with the famous old quadratic formula. But eventually I’ll add that too, because ring shapes come up more often than hyperbolic paraboloids, ha. :smiley:

Will return soon with some new capabilities for our rendering system!

6 Likes

New demo at the Babylon PathTracing Renderer repo on GitHub!

Transformed Quadric Geometry Demo

Additional keyboard controls for this demo:
R,T,Y keys: place the transform operation into different modes,
R: Rotation mode
T: Translation mode
Y: Scaling mode
(default is R-Rotation mode, when the demo begins)
F/G keys: decrease/increase X value of chosen operation
H/J keys: decrease/increase Y value of chosen operation
K/L keys: decrease/increase Z value of chosen operation

This new demo showcases ray/path tracing’s really useful ability to intersect unit (radius of 1) quadric shapes, which simplifies the library intersection routines considerably. The shapes are then transformed geometrically by instead transforming the intersecting rays by the inverse matrix of the classical transform operations (rotation, translation, scaling) being applied to the shapes.

I have so far implemented the most common shapes - sphere(ellipsoid), cylinder, cone, and paraboloid. Soon I will add disk, rectangle, box, and pyramid/frustum. The cone and pyramid will soon have a ‘k’ parameter that defines the radius of the top opening of the shape. This way we can have truncated cones and truncated pyramids (which are basically frustums, like the ones typically used for perspective cameras).

I have 2 questions for the more seasoned Babylon developers/users out there:
One transformation that I did not find in the Babylon source code is the Skew (a.k.a. Shear) operation. This is where you can shear the Y and Z coordinates of the shape by the X coordinate value, for instance (rough sketch below). If Babylon does not have this, would it be beneficial for other developers to add this ability? So in the end, we would have Rotation, Translation, Scaling, and Shearing (Skew) operations available to us.
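For reference, here’s roughly what I mean by that shear, sketched in plain Javascript (illustrative only, not Babylon API, with shY and shZ as the shear factors):

// rough sketch of shearing Y and Z by the X value (column-vector convention):
//   | 1    0  0  0 |        x' = x
//   | shY  1  0  0 |   =>   y' = y + shY * x
//   | shZ  0  1  0 |        z' = z + shZ * x
//   | 0    0  0  1 |
function shearYZbyX(point, shY, shZ) {
        return {
                x: point.x,
                y: point.y + shY * point.x,
                z: point.z + shZ * point.x
        };
}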

The second question relates to UI management for the demos. As I keep adding features/capabilities to our renderer, I want to let the end user control the new parameters in real time. So far, I have added a bunch of keyboard key checks, which, although fast and helpful, are getting kind of clunky and hard to remember. In three.js, the preferred UI library was dat.gui, which adds a clickable/touchable menu interface to the top-right portion of the webpage, and has sliders and expanding/collapsing folders and stuff like that. Is there a similar third-party interface library that Babylon.js prefers, or would simply dropping dat.gui into the mix be the way to go?

I’ll be back hopefully soon with more quadric shapes and features in the new demo.
Enjoy!

5 Likes

Excellent work you are doing.

Suggest you ask this again in the Features Request section so that the question is not restricted to just those following this topic.

There is a preference not for a third-party library but for one belonging to Babylon.js itself.

This has the advantage of also being available in VR environments

2 Likes

@JohnK

Will do on both counts! Thank you for the suggestions. I’ll head over to the feature request area and see if anyone wants to add Shear operation capabilities to Babylon. I kind of understand, at the matrix level, what’s supposed to happen with the various matrix element multiplications, but I’m not yet comfortable enough with the core Babylon.js TypeScript math library/matrix source code to add it myself as a PR. But maybe someone who understands the core a little better can tackle it.

I will read up on the Babylon.js in-house GUI that you referred to, thanks! At first glance, this looks like the cleanest and best way forward for showing user-controlled parameters as they are added.

Thanks again!
-Erich

2 Likes

Note that using dat.gui is perfectly doable and may be the easiest thing for you, as you already have your GUI built with dat.gui. Also, note that drop-down lists are not supported in Babylon GUI yet, so if you are using them you should probably stick with dat.gui anyway.

2 Likes

@Evgeni_Popov

Thank you for the additional info and advice. I didn’t have any particular allegiance to dat.gui, but it was used all throughout the three.js demos, so I just ended up following their code examples.

I might need drop down menus at some point - I guess I’ll have to take a close look at both solutions and see what would be the easiest option to get something up and running. I’m sort of leaning towards dat.gui, because like you mentioned, I already have experience and code in place with that system.

Thanks to all for the input! :slight_smile:

1 Like

Hello!

Just a quick update: I added support for the Hyperboloid and the Pyramid/Frustum. Also I added a shapeK parameter to control the opening/width of the Cone, Hyperboloid, and Pyramid/Frustum. This new K parameter is controlled by the Z,X keys (Z to decrease and X to increase) and the final range is 0.0 - 1.0.

Added shapes and K parameter

At this point we have support for the following quadric/analytic shapes: Sphere, Cylinder, Cone, Paraboloid, Hyperboloid, Box, Pyramid/Frustum, Disk, and Rectangle. Last will be the Torus/Ring, which requires a different intersection approach altogether. But I think that the ones we have now are the most common and/or useful, especially since they can be easily transformed with the familiar Babylon operations: Rotate, Translate, and Scale. I made a feature request in the appropriate forum for the Shear (skew) operation to be added to the Babylon core. If that gets added, we will have support for that transform as well, giving maximum flexibility in transforming all of the shapes.

Still working on getting dat.gui integrated into the demos for more streamlined and visual control over scene/shape parameters. Hopefully I’ll have something up and running soon, as well as the final math shape, Torus/Ring, implemented.

Enjoy! :slight_smile:

2 Likes

Hi again, I just added torus/ring and capsule to the list of mathematical 3D shapes that can be intersected by our renderer!

new torus, capsule, flattened ring shape support

(Note: remember to try the Z and X keys to decrease/increase the K parameter that some shapes require)

Sorry this took a while, but the torus/ring is the most notoriously difficult and finicky shape to ray trace. A week ago in our Babylon renderer, I first tried Inigo Quilez’s (iq of ShaderToy) recently updated (2019) analytic torus intersection method, which is similar to the simpler quadric shapes like sphere, cylinder, cone, etc. in that it solves for the ray’s t by root finding. But since the torus can have up to 4 possible points of intersection with a ray, it is quartic in nature, requiring the solution of a quartic equation (degree 4) instead of the much simpler quadratic equation (degree 2) used for the other classic shapes. Root solving of a 4th degree quartic equation historically has numerical instability and precision issues, especially when the shape is transformed or if the desired thickness of the torus is increased greatly. Because of these instabilities and precision problems, numerous black bands/stripes as well as thin striped holes/gaps appeared in the torus in my test scene. Since this is unacceptable, I scrapped the whole thing and moved to the ray marching approach, which I’m grateful to ‘koiava’ (also on ShaderToy) for suggesting as an alternative. Now when you view and manipulate the torus and its transform and thickness, you should not see those bad artifacts.

With the torus, no approach is perfect, and although the ray marching approach gives better results up front than analytic ray tracing by root finding, it can still give artifacts of its own ray-marching kind (slight gaps on the silhouette edges) and cause some slight slow-down of the framerate, especially if you were to create a giant thick torus that engulfs most of the scene. Also, if you create a large glass torus and try to fly the camera through it, the ray marching falls apart and large gaps in the shape appear. Even with these inconsistencies, I can live with our current torus intersection system, and hope it will at least be of use to those needing that shape in their ray/path-tracing scenes.
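To give a flavor of the ray marching approach, here is an illustrative Javascript sketch of sphere-tracing a torus via its signed distance function (this just shows the general technique - it is not the actual GLSL in the renderer):

// illustrative sketch of sphere-tracing a torus via its signed distance function (SDF)
// R is the torus's major radius, r is its tube thickness
function sdTorus(p, R, r) {
        const q = Math.hypot(Math.hypot(p.x, p.z) - R, p.y);
        return q - r;  // signed distance from point p to the torus surface
}

// march along the ray, each time stepping by the distance to the nearest surface
function marchTorus(rayOrigin, rayDir, R, r) {
        let t = 0.0;
        for (let i = 0; i < 128; i++) {
                const p = { x: rayOrigin.x + rayDir.x * t,
                            y: rayOrigin.y + rayDir.y * t,
                            z: rayOrigin.z + rayDir.z * t };
                const d = sdTorus(p, R, r);
                if (d < 0.001) return t;  // close enough to the surface: call it a hit
                t += d;                   // safe to step this far without passing through the surface
                if (t > 100.0) break;     // give up after a reasonable distance
        }
        return Infinity;                  // miss
}

The silhouette-edge gaps I mentioned come from marching steps that keep shrinking near grazing angles but never quite reach the surface within the step budget.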

At this point, we now have a fairly exhaustive and complete shape intersection library. All of these mathematical shapes (with the exception of triangles, which will be covered along with efficient AABBs when we eventually start doing BVHs for glTF triangular models) are defined in their unit-size object space. Some have an additional K parameter controlling the opening or thickness of the shape. Then, through the use of Babylon’s familiar transform operations (translation, rotation, scaling), they are positioned, rotated, and sized in terms of the scene’s world coordinates.

Through trial and error, I have found this process of defining/intersecting shapes in unit-radius space, followed by transforming with the object’s inverse matrix, to be more efficient (especially when mathematically tracing unit-size shapes in the intersection library) and easier for the end user, who can use familiar Babylon commands to manipulate their desired scene objects (as they normally would for traditional Babylon WebGL scenes). The next step (cherry on top) in terms of the user experience would be allowing the user to just call ‘new Sphere(transform);’ from Javascript, and then the fragment shader would magically ‘include’ and compile everything needed to place the desired sphere into the scene. In other words, the end user wouldn’t have to touch the fragment shader code at all, only the Javascript setup file normally required for traditional Babylon scenes. This will require more thought and planning, but I believe it is possible.

Still planning on adding dat.gui to some demos so that users can see examples of how to add visual gui controls to their own demos.

p.s. I just thought of one last shape that I could try adding - a flattened ring (flat on top and bottom). This is essentially a solid cylinder with a smaller cylinder hole cut out of its middle. The closest analogy I can think of is a stack of old CD disks with user-specified variable-sized holes cut out of their middles. I might try adding that as the final math shape for fun (my three.js version doesn’t even have that :slight_smile: ). Will be back with more soon!

p.p.s. edit 6/26/21: Successfully added the above-mentioned ‘flattened ring’ shape. It’s the light blue shape on the right side of the Cornell Box. The Z and X keys control the radius of the inner hole that runs through the vertical cylinder. Enjoy!

9 Likes

I love clicking on your example links. Every time, I’m amazed at how much better your tool has become!

4 Likes

@erichlof This was such an amazing thread to read through, I don’t have much to add but I am really impressed with the work you are doing and can’t wait to see what you come up with next :slight_smile:

2 Likes

@erichlof agreed, amazing work!! :slight_smile: Forgive my ignorance, but how far off is this from working on a triangle mesh with a PBR material?

1 Like

Thank you! - having fun working on this. :slight_smile:

As to when we will have triangle models path traced with PBR materials, that is definitely on my radar and something I would like to get up and running soon. It’s difficult to predict the timeline because, although I understand what needs to be in place in terms of BVH acceleration structures on the GPU and the finicky nature of WebGL with its quirks and limitations, I have only ever done that complete system with three.js as the host API. In other words, I’m sort of learning Babylon.js as I go through these recent steps here for our renderer.

By far the most challenging Javascript code that I ever had to put together was the acceleration structure builder, which takes as input an unordered triangle soup from your desired model, constructs a hierarchy of axis-aligned bounding boxes over all the various triangles, and then condenses that hierarchy into a compact GPU data texture to be traversed in the path traced scene. I know it can be done, but I have to first get comfortable with how Babylon.js stores raw model data under the hood after it is loaded with the glTF loader, so I can hook into that process and use the triangle data for our ray tracing purposes.
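Just to show the very first baby step of that process, here’s a small illustrative sketch (hypothetical names - not the actual builder code) that computes an axis-aligned bounding box for each triangle in a flat positions array:

// illustrative sketch: compute an axis-aligned bounding box (AABB) for each triangle
// 'positions' is assumed to be a flat array of vertex x,y,z values, 3 vertices per triangle
function buildTriangleAABBs(positions) {
        const aabbs = [];
        for (let i = 0; i < positions.length; i += 9) {  // 9 numbers = 1 triangle
                const box = { min: [Infinity, Infinity, Infinity],
                              max: [-Infinity, -Infinity, -Infinity] };
                for (let v = 0; v < 3; v++) {            // 3 vertices per triangle
                        for (let axis = 0; axis < 3; axis++) {
                                const value = positions[i + v * 3 + axis];
                                box.min[axis] = Math.min(box.min[axis], value);
                                box.max[axis] = Math.max(box.max[axis], value);
                        }
                }
                aabbs.push(box);
        }
        return aabbs; // these later get partitioned into a hierarchy and packed into a data texture
}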

I plan on adding dat.gui to the quadric geometry demo so that users can see how to integrate a basic menu and controls into the path tracing web page. Once a simple demonstration of that is working (which shouldn’t take long), then I’ll start working on the BVH code, so that we can start loading in triangle models.

Please check back here every so often for progress updates. Hopefully we’ll have a more robust system relatively soon!

3 Likes