Does AR work on windows?!

Hi guys! I just came up with a really interesting question (assuming it hasn’t been asked before).

Would it be possible to detect a window when working with AR? Wouldn’t it be hard, since glass is transparent? I’m not sure how a device could distinguish between a window and just empty air.

If anybody has some idea, please share~!

Always happy coding~! :slight_smile:

Are you asking in general, or are you asking whether babylon can support this?

A quick answer to the latter: the underlying system (in most cases an Android phone) does the window detection for you, and a window will simply not show up in the planes/meshes it provides you.

I meant to ask if Babylon could do it, but now that you ask, I’m curious whether other engines could do it too.

Though, as a big fan of Babylon and since I started with Babylon.js, I’d like to stick with it haha

My next goal is to build some AR :smiley:

Back to the original topic, window detection sounds awesome! I thought it would be really hard to achieve.

Thanks for your answer!

It’s less “window detection” and more “there is a hole in this plane that I can’t account for” kind of thing :slight_smile:

Most SLAM algorithms won’t process more than a few meters (unless using dedicated hardware like LiDAR), so they will just see it as a hole. This is the case here:

Matterport does plane detection itself and fills in missing plane pieces using the RGB data:

The more interesting case is what happens when it finds a mirror :wink:

My guess is that @syntheticmagus can elaborate a lot more than me on this one :slight_smile:

But back to Babylon - the plane detection we use comes from OpenXR (well, WebXR to be exact). The system (ARCore on Android) decides to tell us: this is a plane, it looks like this, and these are its boundaries. We don’t run any further processing (apart from converting it to a left-handed system because… well… we are a left-handed engine :-)).
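For anyone curious what that handedness conversion boils down to, here’s a minimal sketch (plain Python/NumPy for illustration, not Babylon’s actual source): mirroring a right-handed WebXR pose across the z = 0 plane means negating the position’s z component and the quaternion’s x and y components.

```python
import numpy as np

def quat_to_mat(q):
    """Rotation matrix from a unit quaternion (x, y, z, w)."""
    x, y, z, w = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - z*w),     2*(x*z + y*w)],
        [2*(x*y + z*w),     1 - 2*(x*x + z*z), 2*(y*z - x*w)],
        [2*(x*z - y*w),     2*(y*z + x*w),     1 - 2*(x*x + y*y)],
    ])

def rh_to_lh(position, quaternion):
    """Convert a right-handed (WebXR-style) pose to a left-handed one
    by mirroring across the z = 0 plane."""
    px, py, pz = position
    qx, qy, qz, qw = quaternion
    return (px, py, -pz), (-qx, -qy, qz, qw)
```

You can check this is the right quaternion rule by conjugating the rotation matrix with the mirror M = diag(1, 1, -1): quat_to_mat applied to the converted quaternion should equal M · R · M.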

Even LiDAR won’t really help you with this one. :smiley: While window glass is (usually) designed to be more or less transparent to the visible spectrum of light, in infrared it varies dramatically and can be anywhere from transparent to reflective to opaque. For this reason, any technique for dealing with windows that works outside the visible spectrum is liable to have some rather unpredictable behavior.

I can think of something you can try for this, but it actually has less to do with AR than with image processing in general. Train a CNN or other classifier to recognize windows (or, perhaps more usefully, window frames), then feed that classifier occasional frames from your camera feed as you track in AR. When your classifier tells you some portion of an image it saw was a window/frame, reproject that labeling back into 3D using the tracking data for that image and your knowledge of the structure of the world. Over time, this data should accumulate into a consensus about the general location of a window in 3D space; from this consensus, it should be pretty easy to make guesses about what features in the world are and aren’t windows.

Not particularly easy, but at least off the top of my head I think it should work. Hope this helps, and best of luck!
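To make the reprojection step a bit more concrete, here’s a rough sketch (plain Python/NumPy; every name and number here is made up for illustration): given a pixel the classifier labeled “window”, the camera intrinsics, an estimated depth, and the camera pose from tracking, back-project the pixel into world space and cast a vote into a coarse voxel grid. Cells that accumulate many votes over many frames form the consensus window region.

```python
import numpy as np
from collections import Counter

def backproject(pixel, depth, K, R, t):
    """Lift a labeled pixel (u, v) at an estimated depth into world space.
    K: 3x3 camera intrinsics; R, t: camera-to-world rotation and translation."""
    u, v = pixel
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    point_camera = depth * ray        # point in camera coordinates
    return R @ point_camera + t       # point in world coordinates

def vote(grid, point, cell_size=0.25):
    """Accumulate a consensus vote in a coarse voxel grid."""
    grid[tuple(np.floor(point / cell_size).astype(int))] += 1

# Hypothetical usage: a 640x480 camera with made-up intrinsics, identity pose.
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
grid = Counter()
for _ in range(10):  # ten frames where the classifier saw a window at (400, 200)
    p = backproject((400, 200), 2.0, K, np.eye(3), np.zeros(3))
    vote(grid, p)
# the most-voted cell is the current best guess for where the window is
window_cell, votes = grid.most_common(1)[0]
```

In a real session the pose would change every frame and the depth would be a guess (e.g. from the hole in the detected plane), so the votes would spread out a little; the voxel grid is what turns those noisy per-frame labels into a stable consensus.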


WOW. That is quite a brilliant method!!

Woah that…is… a VERY interesting topic… Why didn’t I ask that earlier haha

All these are so intriguing!! but I think I may need some time to absorb haha xD

OMG haha seriously, all these sound so fun and awesome. I would love to try, but I’ve never worked with ML before. I am not sure if I can pull this off haha

Thank you guys for explaining!!

Glad I got some more stuff to look at for this week xD
