Path-tracing in BabylonJS

Hello everyone,
I added some fun, new controls to the red and blue Cornell Box scene with spheres. Now when you press number keys 1-6, the quad area light instantly teleports to a different side of the room! Also by pressing [ and ] (open and close bracket keys), you can decrease and increase the size of the quad light and the path tracer will adjust its sampling strategy automatically in real time.

Some new quad light controls

I’ve been having fun experimenting with different combinations of light placement and size! :slight_smile:

4 Likes

Hello again!

Major update to the shader side of this project: my 1st attempt at an edge detector and a denoiser to go along with it. Actually, it’s more of a ‘noise-smoother’, but it still makes a very noticeable difference!

Inspired by NVIDIA’s recent efforts to bring path tracing into real time, like Path Traced Quake, Quake II RTX, and Minecraft RTX (which all feature edge detection and their proprietary deep-learning A.I. denoising technology), I recently set out to create my own edge detection and simple denoiser (or noise smoother, if you like) that can run in real time in the browser, even on a smartphone!

Using a car analogy, when I started my path tracing journey in 2015, we were riding around in a station wagon. When I learned the tricks of the trade like cosWeighted importance sampling, blueNoise sampling, and wavefront pathtracing, we stepped up to a Camaro. Now when we use edge detection and simple denoising strategies, we have built and strapped ourselves into a Formula One car and are burning up the race track! :slight_smile: This is about as fast as I can get WebGL to path trace a scene so far.

Demo with edge detection and denoising

BTW I changed the sphere on the right from glass to metal mirror so that you can get the full effect. If you drag the camera around, the scene follows with you, almost noise-free and much less distracting, and if you do let the camera sit still, it instantly converges to a photo-realistic result!

Although my 1st attempt at a noise-smoother and its edge detector shows promise, there is always room for improvement and a convergence boost. My small attempt is nowhere near the sophistication of NVIDIA’s (nor will it probably ever be), but who knows - we may someday find out how to build a rocket and leave the Formula One race track far below! :slight_smile:

14 Likes

Man you rock so much! This is impressive!

3 Likes

Each commit is looking better and better! I’m sorry I haven’t pushed anything to the offscreen rendering this week; I’ve been working on a very similar problem for another project and that’s gotten me bogged down a bit. If it’s useful, here’s the input abstraction and input processing scaffolding.

The onCommandObservable would be hooked up to the cross-process message pump to pass input (e.g., ‘MOVE_LEFT’) to the offscreen rendering process, which in turn is responsible for deciding what to do with that input, and so forth. @PichouPichou’s example upthread is very useful, and adapting the specific offscreen semantics from there into the above PG would be the next step before integrating it into the current (RT)^2 repos.

4 Likes

To all,
Sorry for the brief pause in the github repo updates. I frequently and randomly get new ideas in my head to try, and so I’ve been experimenting with all of my established test scenes from the three.js repo on my own machine in order to see if the new ideas will come to fruition.

The latest idea is increasing the radius of the box filter for diffuse surfaces, which gives even faster, almost immediate convergence, but must be handled with care so as not to mess up the areas that need sharper detail, such as corners and object silhouette edges.

Everything’s going smoothly so far, and hopefully in the next few days I will bring those most recent updates to the Babylon.js PathTracing Renderer here. I just didn’t want to pollute our repo with my little experiments that might or might not stand the test of time and quality control.

Hopefully will have something soon!

7 Likes

Don’t worry there is no pressure @erichlof.

It’s already super fun and promising to see the progress made. We know it is always very important to start with a good base so take the time you need :wink:

1 Like

Hello everyone!

Back with more updates! A couple of weeks ago, an idea popped into my head: why not try increasing the radius of the blur kernel in the screenOutput shader?

The old kernel was 3×3, so 9 texture taps in the shader to effectively blend each pixel with its immediate neighbors. The new denoising kernel has an increased dimension of 5×5. Although this means 25 texture taps in the screenOutput shader, the GPU doesn’t seem to mind, because all pixel threads are doing this same task, so there should be no GPU divergence. And the results are definitely worth the extra work - diffuse and clearCoat diffuse surfaces now converge nearly instantly!
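For anyone curious what those 25 taps amount to, here’s a CPU-side JavaScript sketch of the same idea - a plain 5×5 box average over a flat grayscale buffer. The real denoiser runs as the screenOutput fragment shader on the GPU and is edge-aware; this stripped-down version (the function name is my own) just shows the gather loop:

```javascript
// CPU sketch of a 5x5 box filter over a flat grayscale buffer.
// (The actual denoiser does the equivalent 25 texture taps per pixel
// in the screenOutput shader; this is just the same math on the CPU.)
function boxBlur5x5(src, width, height) {
  const dst = new Float32Array(width * height);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let sum = 0, count = 0;
      // gather the 25 neighbors, clamping at the image borders
      for (let dy = -2; dy <= 2; dy++) {
        for (let dx = -2; dx <= 2; dx++) {
          const nx = Math.min(width - 1, Math.max(0, x + dx));
          const ny = Math.min(height - 1, Math.max(0, y + dy));
          sum += src[ny * width + nx];
          count++;
        }
      }
      dst[y * width + x] = sum / count; // simple average of the 5x5 window
    }
  }
  return dst;
}
```

Because every pixel thread performs the identical 25 reads, the GPU version of this loop stays coherent, which is why the jump from 9 to 25 taps is nearly free.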

Also, for added fun I hooked up the right sphere’s material choice to the 7, 8, 9, and 0 keys. It starts out as Metal, but if you try pressing those keys, it immediately switches to the new material without the path tracer missing a beat.

Demo with improved denoiser and quick material switcher

It’s fun to just play with all the quad area light configurations as well as the different material choices.

If you want to see the full range of scenes and how the new denoiser adapts to different environments and objects, check out my three.js version github repo.

In the meantime I’ll keep adding to the Babylon.js path tracing library file so we can utilize more of these features. Enjoy!

-Erich

9 Likes

Man you are really rocking it!!! I’m amazed

1 Like

Thank you David! Having fun working on this project! :smiley:

1 Like

Hello all,

Just an update of what I’ve been doing this last week: if you check out my three.js renderer repo, the latest commits show the improvements to the path traced Stanford Dragon foreground with HDRI background environment demo.

What I’ve recently learned and applied to that renderer will allow me to bring the same changes and improvements to our Babylon.js renderer when we want to start loading HDRI (RGBE) images to optionally use in the background as realistic natural lighting for our scenes.

A quick technical note: this all started when a three.js renderer user opened a github issue saying that when he tried to load his own HDRI image, he was getting tiny white sun spots (noise) all over the place. When I initially created my HDRI environment demo, I used primarily one outdoor garden scene that I downloaded for free from HDR Haven.

I am embarrassed to say that up until yesterday I didn’t know how to locate the sun in the picture when it got wrapped around the path traced scene in spherical fashion, ha! Being the wacky person I am, I had initially devised an outside-the-box (but not at all robust) solution: placing an infinitely tall, thin, red metal cylinder that acted like a pole pointing to the sky in the final scene. I then tweaked and rotated this long debug pole until it poked the sun (ha) when the 2d HDRI image was wrapped around the scene in 3d. When I eventually found the winning lottery combo of 3d vector x,y,z components, I wrote them down and hard-coded them into the scene shader. I’m not proud of that naive approach, but it worked temporarily, ha ha.

When the original poster of the issue said that he wanted to be able to load in any image, I knew I had some learnin’ to do, and thus started down this rabbit hole. You’d think that finding the darn sun in a wrapped image and pointing your surfaces at it for lighting would be trivial! But not so, as I soon found out. I won’t belabor this post with the algo/math details, but I’m happy to report that I took a good step forward in being able to load any outdoor scene: first algorithmically finding the sun’s pixels in the 2d pixel coordinates of the original RGBE image, then converting that 2d representation into 2d spherical coordinates, then (spoiler alert) converting that into 3d Cartesian coordinates - an x,y,z vector that points right at the brightest part of the image when it’s ultimately wrapped around the scene as an infinitely distant sphere, whew!

It never ceases to amaze me just how much basic math (less than calculus) and computer algorithm skills you need to have under your belt to be able to do seemingly simple stuff like find the Sun that was in a 2d image! Also stuff like representing a flat 1d array as a 2d texture with width and height, then going backwards, you start with the texture and figure out how to get back to the flat array - stuff like that. When I finally ‘get’ it, I’m saying to myself, “Why didn’t I do it this way from the beginning, duh!”

At any rate, the next TODO with this part of the renderer is being able to load an indoor HDRI with artificial human-made light sources and possibly many different arbitrary light types in the image’s scene. That will take more thoughtful algos and finesse, but at least we now have the capability to find the Sun! ;-D

-Erich

8 Likes

I think this guy would be interested in the computation to find the sun direction in an HDRI file:

4 Likes

Yes, I think you will find a lot of topics on this forum concerning that issue. I remember having posted one too :grinning_face_with_smiling_eyes:
This is not only ray-tracing related, but it’s still an impressive job, so I’m also very curious to know what the computation is.

1 Like

Love the Blender click method! Lol Yeah a year or so ago I opened up my initial HDRI outdoor garden image in PhotoShop/GIMP and I took my mouse and held the pointer over the center of the sun, noted the pixel coordinates, and was saying to myself, “Come on Erich, here is the Sun right here, this can’t be that hard”! Lol. Well, turns out that although it is just a handful of simple multiplies, divides, and a sin() and cos() thrown in here and there (which the CPU doesn’t even blink an eye at, ha), the algorithm and steps taken are not obvious at all.

Glad to find I’m not the only one who was stumped trying to find the Sun, ha ha!

Since there are others out there wanting to do this type of calculation (even if their project isn’t related to path tracing, but more to general 3d graphics programming), I’ll be back real soon with a step-by-step algo and accompanying source code snippets in Javascript. Just have to put my notes together…

See you soon!

5 Likes

Hello all,

Here are the step-by-step instructions for locating the Sun in your 3D scene when the background and its Sun disk came from an HDR image that you wanted to use. The first set of instructions is what I think most people will find useful in general 3D graphics programming. It just requires that you be able to open up your HDR image inside PhotoShop, GIMP, or similar image manipulation program.

The second set of instructions, which will be a separate post after this one, will show how I load in any arbitrary HDR image, examine the pixels manually in JavaScript, and algorithmically calculate where the Sun is located when the 2D image is wrapped around the final 3D scene, all while not having to open up PhotoShop/GIMP at all. This 2nd method requires more work and algos up front, but might be useful nonetheless, especially if you don’t know ahead of time which image will be used as the background. This second method flows directly from the first easier method, so let’s start with the first method:

Just so we’re on the same page with all this HDR stuff - these images are typically in RGBE format: the usual Red, Green, and Blue channels, plus a fourth channel where the A/Alpha channel would normally be. This fourth channel is unique to HDR in that it specifies an exponent (E) which scales the final output of the RGB channels when the unpacking/calculations are done (usually automatically). When you open your image in PhotoShop/GIMP, etc., use the eyedropper tool, and hover over various pixels, by the time each pixel is unpacked and displayed on the monitor, the RGB might look the same as any other texture image, and the A/Alpha channel might just read 1.0 or 255 for all pixels. This is not very helpful for our 2nd method in the upcoming post, which is why we need to read some of the pixel data manually. But more on that later.
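To make the E channel’s role concrete, here’s a simplified decode of one RGBE pixel into linear RGB floats. The bias of 128, the divide by 255, and the e = 0 black special case follow the common convention used by HDR loaders (e.g., three.js’ RGBELoader); treat this as a sketch of the format, not the exact code of any particular loader:

```javascript
// Simplified RGBE -> linear float decode (hedged sketch of the common
// loader convention). The stored (r,g,b) bytes are mantissas and e is a
// shared exponent: the bigger e is, the brighter the pixel - which is
// exactly why scanning the E channel alone is enough to find the Sun.
function decodeRGBE(r, g, b, e) {
  if (e === 0) return [0, 0, 0];            // convention: exponent 0 means true black
  const scale = Math.pow(2, e - 128) / 255; // shared exponent, biased by 128
  return [r * scale, g * scale, b * scale];
}
```

For example, a fully saturated mantissa with the neutral exponent, decodeRGBE(255, 255, 255, 128), comes out at roughly (1.0, 1.0, 1.0), while the same mantissa with a higher exponent blows far past 1.0 - those are the Sun’s 50,000+ values.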

So once you’ve opened up your HDR image, get to the ‘info’ tab/window in your image manipulation program, which will display the value of each pixel (the x,y coordinates of the mouse pointer, the pixel’s RGBA values, etc.). Then simply hover over the center of the Sun disk and note the x,y coordinates. If you are using this for path tracing and direct light sampling from the Sun, simply eyeballing this center might not be enough - put the ‘info’ mode into pixel (linear) so that the numbers are floating point, and then slowly move the mouse around the Sun. You’ll see the floating point RGB numbers range anywhere from 0.1 to 2.0, below which 99.9% of the ‘normal’ pixels in the rest of the image will fall. However, if you get the pointer just right (the center of the ‘hot’ pixel area of the Sun disk), these raw linear floating point RGB numbers will suddenly jump past 50,000.0 for each color channel, and maybe even past 100,000.0! Congrats, you have located the Sun, lol! Seriously, that’s where the E exponent channel is highest in the raw image pixel data, which scales up its associated RGB channels accordingly. By the time this all gets to your monitor, though, everything gets scaled (tone-mapped) back down to the 0-255, or 0.0-1.0, range for all RGB channels for normal display purposes.

So here are the steps in detailed form (later I’ll list the steps in short algo form):

  1. Find the hottest group of pixels in your image and write down the mouse’s x,y pixel coordinates at that point. For example, my symmetrical_garden_2k.hdr image that I was using has dimensions of 2048x1024 and the Sun’s disk hot spot was right at coordinates (396,174). At this location the RGB channels go past 100,000.0 in pixel info mode.

  2. Normalize these pixel integer coordinates into floating point texture UV representation (range:0.0-1.0). This is accomplished by simply dividing the pixel’s integer X coordinate by the width of the HDR and similarly dividing the pixel’s integer Y coordinate by the height of the HDR. So, again using my example - (396 / 2048, 174 / 1024) = (u,v) = (0.193359375, 0.169921875).

  3. Since this HDR will be wrapped around the scene in spherical fashion, we need these magic sun disk coordinates to be in terms of Polar Coordinates, or in our 3D case - Spherical Coordinates. Spherical Coordinates require 2 angles and a radius of the sphere. We simplify things by making the sphere unit size, or a radius of 1. That leaves the 2 angles to calculate. These are traditionally called phi and theta. Theta goes from 0 to 2PI (so a complete revolution, 360 degrees, etc.). Theta controls the angle that you are rotating/looking right and left in the 3D scene. Phi is half of that range, going from 0 to PI, and controls the angle that you look up and down in the 3D scene. We don’t need the full 2PI range for this Phi angle because it’s not mathematically helpful to rotate the view up past the zenith, which would flip the camera upside down. Now we just map the uv into (phi, theta) with the following simple formulas:

phi = HDRI_bright_v * PI      (note: V is used for phi)
theta = HDRI_bright_u * 2PI   (note: U is used for theta)

So, again using my earlier example, I get:
phi = 0.169921875 * PI = 0.533825314 (remember to use the V float number)
theta = 0.193359375 * 2 * PI = 1.21491278 (remember to use the U float number)

I am a visual person - I have to be able to ‘see’ it to understand, so putting aside all these meaningless numbers for the moment, this all sort of makes intuitive sense when you realize that most HDR images are twice as wide as they are tall. If you imagine pointing your camera right down the center of the image (mouse pointer in the middle), theta has to be 2 PI (a complete circle) because we can move the mouse left and right and wind up back where we started in the image. The image is only half as tall because if we move the mouse up and down, we either reach the zenith (top) or the floor/ground beneath us within 1 PI (half a circle).
It wouldn’t make sense to have a perfect square HDR because then you would need to be able to flip the camera upside down when you got past the top of the image, which confuses things mathematically when we get to 3D. So that’s why we must map the angles to 2 PI range for left / right movement and just 1 PI for up and down movement inside the texture image.

  4. Final step, almost there! Now that we have converted the original hot pixel x,y coordinates into Phi and Theta angles to be used in Spherical coordinates, we finally must do one more conversion. This time we go from Spherical coordinates to 3D Cartesian coordinates, which will magically map our polar angles into a normalized direction vector in 3D. When I say ‘magically’, I mean that I used the handy three.js math library routine (I’m sure Babylon has something similar) to do this conversion task for me. Here’s an example in three.js:

let lightTargetVector = new THREE.Vector3();
lightTargetVector.setFromSphericalCoords(1, phi, theta); // 1 means radius, so a unit sphere with a radius of 1

For those of you more mathematically inclined, this is how the math library function does its sorcery:

// the 'this' below is a blank Vector3 (x,y,z)
setFromSphericalCoords( radius, phi, theta ) {
		const sinPhiRadius = Math.sin( phi ) * radius;
		this.x = sinPhiRadius * Math.sin( theta );
		this.y = Math.cos( phi ) * radius;
		this.z = sinPhiRadius * Math.cos( theta );
		return this;
}

As with a lot of my math code in the PathTracingCommon file, I don’t really understand how the above math function does its conversion or how it is ‘derived’, but what I’ve become good at these more recent years of programming is knowing what pieces of the puzzle I will need, and then putting these various (sometimes disparate) pieces together to solve a particular problem. As mentioned, I am a very visual learner, so the above function doesn’t really register in my brain, but when we are doing the conversions and mappings and talking about cameras and rotating the view and stuff like that, I can eventually see the ‘big’ picture in my mind and then go out and locate the necessary components/algos needed for the job.

Now if I console.log the resulting lightTargetVector (x must be negated, due to three.js’ R-handed coordinate system, I think), I get:
x: -0.47694634304269057, y: 0.8608669386377673, z: 0.17728592673600096

And we are done! We now have a 3D normalized x,y,z vector (lightTargetVector) pointing exactly at the Sun when the HDR is wrapped around the scene. As an interesting side note, remember that before I knew how to do all this, I had naively set up an infinite red metal cylinder in the scene as a pointer helper to find the Sun in 3D. Well, here is my old naive result:
(x: -0.4776591005752502, y: 0.8606470280635138, z: 0.17643264075302031)
and with the new mathematically sound version:
(x: -0.4769463430426905, y: 0.8608669386377673, z: 0.17728592673600096)

Not bad for my old eyeballs, LOL! But that naïve sun finder took me almost half an hour to find the exact sun disk center (ha), so we wouldn’t want to use that tedious method every time we opened up a new HDR file! At least this provides some reassurance that the above algo/steps are doing the job correctly.

Here are the steps in abbreviated form without all the commentary:

  1. Get the (x,y) pixel location of the hot spot in the HDR image with an image manipulation program of your choice.
  2. Divide this (x,y) by (hdrWidth, hdrHeight) to get (u,v). Range: 0.0-1.0 floating point
  3. Convert v to spherical angle phi → v * PI and u to spherical angle theta → u * 2PI
  4. Convert spherical coordinates to 3D cartesian coordinates using math helper library
  5. Depending on your coordinate system (L-Hand or R-Hand), you might need to negate either the x or the z component. x *= -1, or possibly z *= -1

I still don’t fully understand this last tweak on step 5., but for example, my newly calculated direction was pointing right when my HDR image clearly had the Sun on the left and should therefore be pointing left in the 3D scene. This could possibly only be an issue for us path tracing folk, because we are sending out viewing rays into the scene. I remember having to do some negations when path tracing in three.js’ R-Hand system, such as z *= -1 for camera rays to get everything to look right. But fortunately, this is a simple trial and error 1-liner fix for most HDR projects.
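Putting steps 2 through 5 together, the whole conversion fits in one small JavaScript helper. The function name and the negateX flag are my own (adjust the negation for your engine’s handedness, per step 5); the math inside is the same uv-to-spherical-to-Cartesian mapping described above:

```javascript
// Steps 2-5 in one helper (hypothetical name, not from the repo).
// Given the Sun hot spot's (x, y) pixel coords and the HDR's dimensions,
// returns a unit-length 3D direction vector pointing at the Sun once the
// image is wrapped around the scene as an infinitely distant sphere.
function sunDirectionFromPixel(x, y, hdrWidth, hdrHeight, negateX = true) {
  // step 2: pixel coords -> uv in the 0.0-1.0 range
  const u = x / hdrWidth;
  const v = y / hdrHeight;
  // step 3: uv -> spherical angles (phi spans PI up/down, theta spans 2PI around)
  const phi = v * Math.PI;
  const theta = u * 2 * Math.PI;
  // step 4: spherical -> Cartesian on a unit sphere (same math as
  // three.js' Vector3.setFromSphericalCoords with radius = 1)
  const sinPhi = Math.sin(phi);
  let vx = sinPhi * Math.sin(theta);
  const vy = Math.cos(phi);
  const vz = sinPhi * Math.cos(theta);
  // step 5: handedness fix - you may need x (or z) negated for your engine
  if (negateX) vx *= -1;
  return { x: vx, y: vy, z: vz };
}
```

Feeding in the (396, 174) hot spot from the 2048x1024 garden image returns approximately (-0.4769, 0.8609, 0.1773), matching the lightTargetVector printed earlier.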

Later I’ll post the second, more complex method: loading in any HDR file without having to open it up at all to see where the Sun’s (x,y) is. It loops through the RGBE pixel data (mainly just the ‘E’ parts) in Javascript, then spits out the (x,y) coordinates that you would otherwise have had to find with your mouse and eyes in PhotoShop/GIMP.

Hopefully all this made some sort of sense! I’ll follow up soon with the more robust second method.

Cheers,
-Erich

9 Likes

Hello again!

Here’s part 2 of the posts about finding a 3d sun direction vector in the scene when the background and sun came from a 2d HDR image. Wouldn’t it be nice if we didn’t have to open up every single HDR image in PhotoShop/GIMP to hunt around for the Sun disk’s brightest pixels with our mouse? Well, the following second method (which actually just replaces step 1 in the previous post) is more robust in that you should be able to load any outdoor HDR image with the Sun visible (or even partly visible behind light cloud cover) and it will automatically return the brightest pixel coordinates that correspond with the Sun’s disk center. This ability comes at the expense of being slightly more complicated, but hopefully I can walk everyone through the process so that the steps will be more clear.

By the way, this entire post takes the place of step 1 in the preceding post (the step where you have to go inside PhotoShop/GIMP, etc.), which means that when you’re done with this part, you can just continue from steps 2 through 5 in the above method and you’ll have everything automated. :slight_smile:

So from a global overview, this is what we’d like to do: first, load in an arbitrary outdoor HDR with arbitrary resolution; then find the Sun, the natural light source in the image (which could have been placed anywhere, depending on the original photographer’s camera setup); then return this brightest (or one of the brightest, if they’re in a group) pixel’s x and y coordinates. Then, as mentioned, we continue on from step 2 in the preceding post and we’re good to go.

In WebGL when we’re reading or storing image pixel data, it most often has to be to and from a JavaScript flat array, such as:
const data_rgba = new Uint8Array( 4 * imgWidth * imgHeight );
We replace the rgba above with rgbe and that’s pretty much what we’re dealing with. Notice that the size is (image dimensions * 4) because we have to take each pixel and spread its 4 channels among 4 unique array elements. So what we like to think of as pixel0.rgbe, pixel1.rgbe, pixel2.rgbe (which would initially be pixels [0,1,2] in the image) becomes a spread-out flat list of numbers:
p0.r, p0.g, p0.b, p0.e, p1.r, p1.g, p1.b, p1.e, p2.r, p2.g, p2.b, p2.e, and so forth, which is represented in typed-array form as pixel_data[0,1,2,3, 4,5,6,7, 8,9,10,11, etc.]. Note that the original image had 3 pixels, while the JavaScript flat array has 12 elements - 3 pixels * 4 channels. This will come back around soon.
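In formula form, the pixel-to-flat-array mapping just described looks like this (helper names are my own, for illustration):

```javascript
// Each pixel occupies 4 consecutive array elements, so pixel N's red
// channel lives at index N * 4, and its E channel at N * 4 + 3.
function flatIndexOfE(x, y, imgWidth) {
  return (y * imgWidth + x) * 4 + 3; // flat index of pixel (x, y)'s E channel
}

// The inverse: recover a pixel's 2D coordinates from any flat index.
function pixelFromFlatIndex(flatIndex, imgWidth) {
  const pixelNumber = Math.floor(flatIndex / 4); // which pixel, 0-based
  return { x: pixelNumber % imgWidth, y: Math.floor(pixelNumber / imgWidth) };
}
```

For example, pixel 2 on the top row has its E channel at flat index 11, matching the 12-element layout above, and feeding that index back through pixelFromFlatIndex returns (2, 0).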

As mentioned in the previous post, the RGB data doesn’t really help us find the Sun because there may be snow or white clouds everywhere in the scene and so most pixels would be 255,255,255 including the Sun’s pixels. The key is every 4th array element: the E channel, or exponent. If we can somehow iterate over every pixel’s E channel and keep a running record for the highest exponent we have found so far, by the end we will have successfully located the brightest pixel in the arbitrary image. And since the Sun is the brightest thing by far in an outdoor HDR image, the highest exponent will max out near or at the very center of the Sun disk.

So if the data is laid out like p0[0],p0[1],p0[2],p0[3],etc. for each pixel’s p0[r],p0[g],p0[b],p0[e], channels etc., then we’re only concerned with index [3], which is the pixel’s E channel. Then we just add 4 to this running index on each loop iteration as we scan the whole HDR image. Here’s some code from my three.js version:

hdrTexture = hdrLoader.load( hdrPath, function ( texture, textureData ) 
{
        texture.encoding = THREE.LinearEncoding;
        texture.minFilter = THREE.LinearFilter;
        texture.magFilter = THREE.LinearFilter;
        texture.generateMipmaps = false;
        texture.flipY = false;

        hdrImgWidth = texture.image.width;
        hdrImgHeight = texture.image.height;
        hdrImgData = texture.image.data;

        // initialize record variables
        highestExponent = -Infinity;
        highestIndex = 0;

        for (let i = 3; i < hdrImgData.length; i += 4) // every 4th element, starting at [3]
        {
                if (hdrImgData[i] >= highestExponent)
                {
                        // record the new winner
                        highestExponent = hdrImgData[i];
                        highestIndex = i;
                }
        }
        console.log("highestIndex: " + highestIndex); // for debug
});

Above, we start at pixel data index [3], which is the first pixel’s E channel, and then just add 4 to this index on every loop iteration, searching each and every pixel for the highest exponent as we go. Once the last pixel is reached, we have a flat array index that corresponds to the winning brightest pixel. Some of you may have noticed a shortcoming in my algo: this will indeed find the brightest pixel, but the winning brightest pixel stops being recorded at the edge of the Sun disk, when the pixels return to normal, less-bright sky as we scan from left to right. To find the true center (and maybe I will add this, or someone else can take a shot at it for me!), you have to remember when you reached the first winning bright pixel and the last winning bright pixel, then take the average of those 2 values to get the exact center of the winning group of brightest pixels. This is straightforward in 1D as we scan from left to right, but what complicates matters is that we have to do the same thing going up and down and find that average as well. Therefore I simply left my above loop in place, and it seems to do ok with the HDRs that I have tested so far. Where it would fail is if we somehow had an alien-sky HDR with a view from another planet, where there were 2 or more Suns; this algo would only find and sample the single brightest light source when path tracing.
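For anyone who wants to take a shot at that ‘true center’ improvement, here is one possible sketch: collect every pixel whose E channel ties the winning exponent, then average their 2D coordinates. This is a hypothetical helper (not part of the actual repo code), and it still shares the multiple-Suns limitation above:

```javascript
// Sketch of the 'true center' idea: instead of keeping only the last-seen
// brightest pixel, collect every pixel that ties the winning exponent and
// average their (x, y) positions to land in the middle of the bright disk.
function brightestPixelCentroid(hdrImgData, imgWidth) {
  let highestExponent = -Infinity;
  let winners = [];
  for (let i = 3; i < hdrImgData.length; i += 4) { // every pixel's E channel
    const e = hdrImgData[i];
    if (e > highestExponent) {
      highestExponent = e;
      winners = [i];     // strictly brighter: start a fresh winners list
    } else if (e === highestExponent) {
      winners.push(i);   // tie: part of the same bright disk
    }
  }
  // average the 2D coordinates of all winning pixels
  let sumX = 0, sumY = 0;
  for (const i of winners) {
    const pixelNumber = (i - 3) / 4;            // back to a 0-based pixel count
    sumX += pixelNumber % imgWidth;             // x coordinate
    sumY += Math.floor(pixelNumber / imgWidth); // y coordinate
  }
  return {
    x: Math.round(sumX / winners.length),
    y: Math.round(sumY / winners.length)
  };
}
```

This handles both the left/right and up/down averaging in one pass, at the cost of keeping a winners list; in practice the Sun disk is a tiny fraction of the image, so the list stays small.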

So from here on, it’s a matter of getting this desired flat array brightest index number into something more meaningful and useful, namely an HDR image x,y coordinate corresponding to the exact Sun disk pixel. Once more, here’s the high overview plan:

  1. loop through every 4th element in the large flat array of hdr rgbe pixel data, so [3] and onwards, adding 4 to each loop index - [3],[7],[11],[15],etc. Find and record winning highest exponent, E channel.
  2. take this winning exponent flat array index number and subtract 3 from it to get back to the r channel, or [0]'th place of each pixel - that way we are back on the boundary of every 4 numbers of each pixel. Our result will now be guaranteed to be divisible by 4 (because of how it was spread and laid out for WebGL in the first place, initially multiplying by 4)
  3. Divide this index number by 4 - now we’ll be back in the range of the original 2048x1024 (or whatever) resolution. This will still be a large number because it is still 1d flat, but at least it’s a little smaller than it was before! Its range will be anywhere from 0 to (width*height)-1.
  4. Now to make this correct-range flat number into something useful, we have to map this large number back into the 2D space of the HDR image (width location, height location), or (x,y) which needs to be in the range: (0 to width-1, 0 to height-1) . Note: the ‘-1’ reflects the fact that our data is always 0-based. Through trial and error and some experimenting (ha) I came up with the following formulas for getting the final x and y coordinates:
    X = highestIndex modulo with imgWidth
    Y = highestIndex divided by imgWidth, then Floored to get the integer portion

At the end of all this, we will have the same thing we had in step 1 of the previous post: an exact X,Y pixel location corresponding to the brightest part of the HDR image.

Here’s some code that follows the above outlined steps

// Step 1: the 'for (let i = 3...)' loop printed above, which results in 'highestIndex'
highestIndex -= 3; // Step 2: get back to the [0] boundary (r channel) of the winning pixel
highestIndex /= 4; // Step 3: guaranteed to be evenly divisible by 4, by design
brightestPixelX = highestIndex % hdrImgWidth;             // Step 4: range 0 to imgWidth-1
brightestPixelY = Math.floor(highestIndex / hdrImgWidth); // range 0 to imgHeight-1 - essentially,
// how many times the large flat number has exceeded/wrapped the imgWidth x dimension

We have now automatically completed step 1 in the first post, avoiding opening up an image manipulation program and hunting around the image with a mouse to try and find the brightest pixels. Now we can just continue on as normal with steps 2-5 from my earlier post and we should have a quick and robust way to find the Sun in 3D from a 2D HDR image!

If you look at all the math, it’s not that involved, especially for the CPU, which will hardly even blink during this whole affair. But just to get the algo steps straight in my mind - man, it wasn’t pretty! Lol. At one point I was scribbling numbers from 0-100 in a square 2d matrix on a little scrap of paper like a madman, to see how a 1d list of numbers gets mapped to a 2d array with a width boundary and a height dimension (ha). But I eventually understood why step 4 has to do what it does. :slight_smile:

Now a caveat with all this is that this method will only work for a single, dominant light source, like the Sun outdoors. To scan an indoor HDR image and return multiple bright areas coming from human-made artificial lights will require a more sophisticated approach. So that’s still a big TODO. But hopefully this method will at least let us efficiently render and sample all outdoor scenes with Sun visible, and even if you’re not doing path tracing, I hope that this technique will assist you in locating the light source in your general 3D Babylon scene if you are using an HDR image.

If there’s any confusion about these HDR posts (there were a lot of words back there, ha ha!), please feel free to post here on this thread. I will do my best to clarify anything that was not clear, either algo-wise, code-wise, or both.

Best of luck and happy HDR rendering! :smiley:
-Erich

6 Likes

Hello @Necips

Yes I’m sure there are better, more robust ways to do light source detection in images. It’s interesting when you mentioned shadow detection, as that could be used to trace the light back towards the source, even if the source is off camera in the image. And your earlier post about finding the real-world Sun angle based on geographic coordinates and time of day is intriguing. This could be used in a number of different applications, path tracing outdoor scenes in real time being just one of them!

My previous 2 posts about doing this light detection only work if the Sun is visible (or partially covered by a thin cloud layer), and only if the image was taken outdoors, because the Sun dominates all other human-made light sources and is therefore easier to separate and identify algorithmically.

Although my simple approach does the trick for now, we will need a more sophisticated approach for detecting light sources in indoor HDR images, where there could be multiple arbitrarily shaped lights, or in some images I’ve encountered, no visible light source at all, just ambient room light coming from a window off camera!

To help me get started figuring out some of the math in my posted algos, I followed the pbr book light sampling link that was suggested by a three.js renderer user and forum participant. The pbr book (the 3rd edition, which is now free online and is pretty much the bible of CG rendering) explains the pixel x,y coordinates to spherical angles conversion, and then the spherical-to-Cartesian conversion, that I used in my first post about HDR light detection. I wouldn’t have figured that math out on my own! Ha.

But the reason I mentioned this book is that later in the same chapter, it gives a technique to loop through all the pixels in any HDR image (indoor or outdoor, lights visible or lights off camera), building a lighting probability density distribution as it goes from pixel to pixel. When you reach the end of the loop, you have a sort of importance ‘light’ map that you can directly importance-sample from while path tracing. Because this more sophisticated approach is used (with math and probability algos that are still beyond my understanding), the end result is considered unbiased rendering, which is really cool. In other words, if you actually placed a real-world scene in that HDR spherical environment, we could expect the rendered outcome to match reality to the best of our human ability.
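The pbr-book technique described above can be sketched in a simplified 1D form: build a cumulative distribution over per-pixel luminances, then invert it with a uniform random number, so bright pixels (lights) get sampled proportionally more often. This plain JavaScript sketch ignores the sin(theta) solid-angle correction and the 2D marginal/conditional split the book actually uses, and the function names are mine:

```javascript
// Build a normalized cumulative distribution function (CDF) over pixel
// luminances: cdf[i] = (sum of luminances 0..i) / (total sum).
function buildCDF(luminances) {
  const cdf = [];
  let sum = 0;
  for (const lum of luminances) {
    sum += lum;
    cdf.push(sum);
  }
  return cdf.map(v => v / sum); // last entry is exactly 1
}

// Invert the CDF with a uniform random number u in [0, 1) via binary search.
// Returns the index of the chosen pixel; bright pixels own a wider slice of
// [0, 1) and so are chosen more often -- that's importance sampling.
function sampleCDF(cdf, u) {
  let lo = 0, hi = cdf.length - 1;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (cdf[mid] <= u) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// A dim 5-pixel "image" with one bright light at index 2: u = 0.5 lands on it,
// because pixel 2 owns 96% of the distribution.
const cdf = buildCDF([1, 1, 96, 1, 1]);
console.log(sampleCDF(cdf, 0.5)); // 2
```

To keep the estimate unbiased, each sample would then be weighted by 1/pdf of the chosen pixel, exactly as the book describes.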

I would like to incorporate this ability/technique as well as your earlier ideas, but I will have to continue to study these approaches until I can visually ‘see’ the overall picture and be able to say in non-math speak what is going on under the hood (like I did hopefully in my previous 2 posts :smiley: ).

Thanks for sharing and the inspiring info!

4 Likes

Hi All,

Just checking in with a quick update. I recently figured out how to create and use Babylon’s notion of an empty 3D transform. The equivalent in three.js was THREE.Object3D() - it didn’t have any geometry or materials associated with it; it was just an empty transform, kind of like a gizmo object in a 3D editor. Browsing the Babylon source code on GitHub, from what I can gather, the equivalent in Babylon is BABYLON.TransformNode(). I was able to assign an individual TransformNode to each of the spheres in our test scene, which then allowed me, on the Javascript setup side, to perform simple operations on the transform using familiar Babylon commands, e.g. LeftSphereTransformNode.position.set(x,y,z), LeftSphereTransformNode.scaling.set(x,y,z), and LeftSphereTransformNode.rotation.set(x,y,z).

This lets the user-side code be much more flexible, rather than having to hardcode those object parameters in each path tracing shader. The flip side is that in the ever-growing path tracing shader include library, I had to create a special sphere-ray intersection routine called UnitSphereIntersect(ray), which doesn’t take any scaling (sphere radius), rotation, or translation (sphere position) into account, but rather intersects a ray with an untransformed unit sphere (radius of 1) centered at the origin (0,0,0). One of ray tracing’s greatest strengths is that you can keep such a simplified ray-sphere function and instead transform the ray by the inverse of the desired sphere object’s transformation. This mirrors how we use the camera’s inverse (view) matrix on 3D scene objects to correctly display the transformed objects out in the scene.
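The inverse-transform trick can be sketched without any library code. Here’s a minimal JavaScript example handling translation and uniform scale only (a full implementation would multiply the ray by the object’s inverse 4x4 matrix; the function names are mine, not the actual library routines):

```javascript
// Intersect a ray with the unit sphere at the origin: solve |o + t*d|^2 = 1,
// a quadratic in t. Returns the nearest positive t, or -1 on a miss.
function unitSphereIntersect(ox, oy, oz, dx, dy, dz) {
  const a = dx * dx + dy * dy + dz * dz;
  const b = 2 * (ox * dx + oy * dy + oz * dz);
  const c = ox * ox + oy * oy + oz * oz - 1;
  const disc = b * b - 4 * a * c;
  if (disc < 0) return -1;
  const sq = Math.sqrt(disc);
  const t0 = (-b - sq) / (2 * a);
  const t1 = (-b + sq) / (2 * a);
  if (t0 > 0) return t0;
  if (t1 > 0) return t1;
  return -1;
}

// To hit a sphere of radius r centered at (cx, cy, cz), transform the RAY
// by the sphere's inverse transform instead of transforming the sphere:
// subtract the center (inverse translation), divide by r (inverse scale).
function sphereIntersect(ox, oy, oz, dx, dy, dz, cx, cy, cz, r) {
  // Because the direction is scaled too, the returned t is valid in world space.
  return unitSphereIntersect((ox - cx) / r, (oy - cy) / r, (oz - cz) / r,
                             dx / r, dy / r, dz / r);
}

// Ray from (0,0,5) looking down -Z at a radius-2 sphere at the origin:
console.log(sphereIntersect(0, 0, 5, 0, 0, -1, 0, 0, 0, 2)); // 3 (hit at z = 2)
```

Rotation works the same way: rotate the ray origin and direction by the inverse rotation, and the one unit-sphere routine handles every sphere in the scene.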

That’s why I have been working on getting this TransformNode business up and running for the last few days. Hopefully very soon I’ll have a 2nd (similar room) demo to show off the new easy transforming abilities on the end user’s JS side. I’ll also add the most-encountered general quadrics - unit sphere, cylinder, cone, paraboloid, box, disk, and rectangle. I might or might not include the hyperboloid (hourglass) and the hyperbolic paraboloid (saddle), as we don’t really come across those very often and they tend to be a little more finicky mathematically when it comes to analytically intersecting with rays. The torus (doughnut or ring) requires a different approach altogether when it comes to finding ray intersections, because its equation is quartic (4 solutions max), as opposed to the easier quadric shapes listed above (2 solutions max), whose intersections can be found either geometrically or with the famous old quadratic formula. But eventually I’ll add the torus too, because ring shapes come up more often than hyperbolic paraboloids, ha. :smiley:

Will return soon with some new capabilities for our rendering system!

6 Likes

New demo at the Babylon PathTracing Renderer repo on GitHub!

Transformed Quadric Geometry Demo

Additional keyboard controls for this demo:
R,T,Y keys: place the transform operation into different modes,
R: Rotation mode
T: Translation mode
Y: Scaling mode
(default is R-Rotation mode, when the demo begins)
F/G keys: decrease/increase X value of chosen operation
H/J keys: decrease/increase Y value of chosen operation
K/L keys: decrease/increase Z value of chosen operation

This new demo showcases ray/path tracing’s really useful ability to intersect unit (radius of 1) quadric shapes, which simplifies the library’s intersection routines considerably. The shapes are then transformed geometrically, not by transforming the shapes themselves, but by transforming the intersecting rays by the inverse matrix of the classical transform operations (rotation, translation, scaling) applied to the shapes.
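One subtlety worth noting with this approach: surface normals do not transform by the object’s matrix but by its inverse transpose. With non-uniform scaling (stretching the unit sphere into an ellipsoid), pushing the normal through the plain matrix gives visibly wrong shading. A minimal JavaScript sketch for the diagonal-scale case (the function name is mine):

```javascript
// On the unit sphere, the object-space normal at a surface point IS the point.
// After scaling by (sx, sy, sz) into an ellipsoid, the correct world normal is
// the object-space normal transformed by the inverse transpose of the scale
// matrix -- for a diagonal scale, just a component-wise divide -- then renormalized.
function ellipsoidNormal(px, py, pz, sx, sy, sz) {
  const nx = px / sx, ny = py / sy, nz = pz / sz;
  const len = Math.hypot(nx, ny, nz);
  return { x: nx / len, y: ny / len, z: nz / len };
}

// A sphere stretched 4x along X: at the 45-degree object-space point, the
// world normal tilts strongly toward the Y axis rather than staying at 45.
const s = Math.SQRT1_2; // the point (s, s, 0) lies on the unit sphere
const n = ellipsoidNormal(s, s, 0, 4, 1, 1);
console.log(n); // x ~ 0.2425, y ~ 0.9701, z = 0
```

Intuitively, stretching the surface along X flattens it there, so the normal leans away from the stretch axis, which is exactly what the inverse transpose produces.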

I have so far implemented the most common shapes - sphere (ellipsoid), cylinder, cone, and paraboloid. Soon I will add the disk, rectangle, box, and pyramid/frustum. The cone and pyramid will also get a ‘k’ parameter that defines the radius of the top opening of the shape. This way we can have truncated cones and truncated pyramids (which are basically frustums, like the ones typically used for perspective cameras).

I have 2 questions for the more seasoned Babylon developers/users out there:
One transformation that I did not find in the Babylon source code is the Skew (a.k.a. Shear) operation. This is where you can shear, for instance, the Y and Z coordinates of the shape by the X coordinate value. If Babylon does not have this, would it be beneficial for other developers to add this ability? In the end, we would then have Rotation, Translation, Scaling, and Shearing (Skew) operations available to us.
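For reference, a shear is just an off-diagonal term in an otherwise-identity transform matrix. A minimal JavaScript sketch of an X-by-Y shear applied to a point, using a plain row-major 4x4 array rather than any Babylon matrix class (the function names are mine):

```javascript
// A shear matrix adds a multiple of one coordinate to another. Here,
// x' = x + k * y (shear X by Y); y and z pass through unchanged. As a
// row-major 4x4 matrix, this is the identity with the (row 0, col 1) entry = k.
function shearXbyY(k) {
  return [
    1, k, 0, 0,
    0, 1, 0, 0,
    0, 0, 1, 0,
    0, 0, 0, 1
  ];
}

// Apply a row-major 4x4 matrix to a point (implicit w = 1)
function transformPoint(m, x, y, z) {
  return {
    x: m[0] * x + m[1] * y + m[2]  * z + m[3],
    y: m[4] * x + m[5] * y + m[6]  * z + m[7],
    z: m[8] * x + m[9] * y + m[10] * z + m[11]
  };
}

// Shearing the point (1, 2, 3) with k = 0.5 slides x by 0.5 * y = 1
console.log(transformPoint(shearXbyY(0.5), 1, 2, 3)); // { x: 2, y: 2, z: 3 }
```

For the path tracer’s purposes, the inverse shear is just the same matrix with -k in place of k, so shearing slots neatly into the same transform-the-ray-by-the-inverse pipeline as rotation, translation, and scaling.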

The second question relates to UI management for the demos. As I keep adding features/capabilities to our renderer, I want to let the end user control the new parameters in real time. So far, I have added a bunch of keyboard key checks, which, although fast and helpful, are getting kind of clunky and hard to remember. In three.js, the preferred UI library was dat.gui, which adds a clickable/touchable menu interface to the top right portion of the webpage, with sliders and expanding/collapsing folders and the like. Is there a similar third-party interface library that Babylon.js prefers, or would simply dropping dat.gui into the mix be the way to go?

I’ll be back hopefully soon with more quadric shapes and features in the new demo.
Enjoy!

6 Likes

Excellent work you are doing.

Suggest you ask this again in the Features Request section so that the question is not restricted to just those following this topic.

There is a preference not for a third-party library but for one belonging to Babylon.js itself.

This has the advantage of also being available in VR environments

3 Likes

@JohnK

Will do on both counts! Thank you for the suggestions. I’ll head over to the feature request area and see if anyone wants to add Shear operation capabilities to Babylon. I understand, at the matrix level, roughly what’s supposed to happen with the various matrix element multiplications, but I’m not yet comfortable enough with the core Babylon.js TypeScript math library/matrix source code to add it myself as a PR. But maybe someone who understands the core a little better can tackle it.

I will read up on the Babylon.js in-house GUI that you referred to, thanks! At first glance, this looks like the cleanest and best way forward for showing user-controlled parameters as they are added.

Thanks again!
-Erich

2 Likes