Documentation on XR "near interaction"

Hi @RaananW,

I didn’t find “near interaction” to be a general WebXR/human-computer interaction concept, but looking at babylon code and language used around it, it seems to be a first class citizen in babylon xr.

Can you explain the concept, perhaps with some code examples if it makes sense, etc? (and if it is a thing outside babylon, maybe a link to an explanation elsewhere can go most of the way)
If there is something special (or commonly used/suggested to use) going on between “near interaction” and babylon GUI, that’d be nice to include.

If you’d like I can format and add content to WebXRSelectedFeatures or elsewhere under featuresDeepDive/webXR based on what you write here, let me know.

As a follow-up, some other things that might make sense to include. Just to give you more of an idea of what I and others may be wondering about (if and when you get a chance to look into this)…

  • How is “near interaction” in babylon related to MRTK-inspired stuff (I suspect significantly)? How alive/relevant is MRTK in general in 2024? Is MRTK-inspired stuff in babylon going to remain based on MRTK 2, or are there any plans [or what would that mean] to do anything related to MRTK 3?

    • Edit: How much of MRTK and how much of MRTK-stuff-in-babylon is AR-only vs AR/VR vs VR-only? And specifically what features/classes/methods are AR-only vs AR/VR vs VR-only?
  • Is there something at the intersection of “near interaction” and Apple Vision OS that needs to be touched on? I suspect that not much is implemented yet, but perhaps something can be said at a conceptual level regarding the different modes of interactions that Apple is pushing and how that relates to the concept of near interaction in babylon?

    • Edit: Same for other platforms/vendors, if applicable. Any notable differences in how other vendors implement the WebXR spec (and its “extensions” that hopefully mostly Apple is guilty of) that affect babylon’s “near interaction” implementation?
  • Edit: Is “near interaction” a strictly-XR-only concept in babylon or…? If not, does babylon have features that allow sharing code between immersive and non-immersive implementations of certain interactions? (notably when babylon GUI is involved, but also otherwise?..)

    • In other words, here I am talking about typical patterns where “write code once, use it in both immersive and non-immersive modes of the application” is possible. This is not specific to “near interaction”. In fact, I’m thinking, if there was a page dedicated to just that somewhere under featuresDeepDive/webXR, that’d be useful for anybody and everybody who wants their application to work both on-screen and immersive, and considers/uses babylon for that, no?..

Near interaction, in general, is what we call interaction that is not done using ray picking.
Take, for example, interaction with a 3D scene on a desktop. When you click on the screen, Babylon converts this click to a 3D ray projected forward “into” the scene. If it hits something that is pickable, it triggers an interaction. Any interaction with the scene (pointer interaction) works this way. You can’t “touch” the elements, because you are converting from the canvas’ 2D space to Babylon’s 3D space.
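Here is a minimal sketch of that 2D-to-3D conversion (assuming an existing `scene` with an active camera and an attached canvas):

```typescript
scene.onPointerDown = () => {
    // the current 2D pointer position on the canvas is turned into a ray...
    const ray = scene.createPickingRay(
        scene.pointerX,
        scene.pointerY,
        BABYLON.Matrix.Identity(),
        scene.activeCamera
    );
    // ...and the ray is cast "into" the 3D scene to find a pickable mesh
    const pickResult = scene.pickWithRay(ray);
    if (pickResult?.hit && pickResult.pickedMesh) {
        console.log("picked", pickResult.pickedMesh.name);
    }
};
```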
When in XR, things work a little differently. First, you have your controllers (as a concept). They “project” a ray of their own that is already in 3D space. The difference from the desktop scenario is that the origin is not the screen, it is an object in the 3D world. Now, a wonderful thing that XR can bring to the experience is actually touching the object (with your controllers/hands/headset). Touching an object requires a different way of defining an interaction with the scene’s objects. It is no longer a ray that is cast when an action is triggered, it is a constant inspection of proximity to the objects, detecting the interaction and triggering the right event based on the state of your controller against the scene.

This is basically what the near interaction module does. It lets you trigger pointer events based on touching objects. It also allows specific elements (like a 3D button) to be “pressed”, because we constantly check for collisions and proximity. So it does add a bit of value when it comes to, for example, the 3D button of the 3D GUI (or the MRTK elements).
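To make that concrete, here is a sketch of enabling the near interaction feature explicitly and adding a touchable 3D GUI button (the default XR experience helper already enables near interaction for you, and option names may vary slightly between versions):

```typescript
// inside an async setup function, with an existing `scene`
const xr = await scene.createDefaultXRExperienceAsync();

// explicitly (re-)enable the near interaction feature - the default
// experience helper normally does this for you already
xr.baseExperience.featuresManager.enableFeature(
    BABYLON.WebXRFeatureName.NEAR_INTERACTION,
    "latest",
    {
        xrInput: xr.input,
        // check proximity on every controller/hand, not just the preferred one
        enableNearInteractionOnAllControllers: true,
    }
);

// an MRTK-style 3D GUI button that can be "pressed" by touching it
const manager = new BABYLON.GUI.GUI3DManager(scene);
const button = new BABYLON.GUI.TouchHolographicButton("touchButton");
manager.addControl(button);
button.text = "Touch me";
button.onPointerClickObservable.add(() => {
    console.log("button pressed (near touch or far ray)");
});
```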
We try, as much as we can, to make you write less code for all scenarios. What I mean here is that Babylon converts interactions (and near interactions) to pointer events. The reason for that is that the entire framework works with pointer events - from GUI to Gizmos, they are all based on pointer events passed from a supported device to the scene. This is why XR interactivity and near interactivity emulate pointer events - to get everything that works in the desktop scenario to work in XR as well. This is probably the answer to the MRTK question. MRTK was made for, well, mixed reality :slight_smile: However, if you want to add it to your desktop scene, nothing stops you from doing that. It will work just as well as it does in XR. It might not make sense, but it is not XR exclusive. At least not in Babylon.
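As a rough illustration of the “everything is a pointer event” idea, a single observer like this reacts the same way whether the event came from a mouse click, an XR controller ray, or a near interaction touch:

```typescript
// one handler for mouse, touch, XR ray selection and near interaction alike -
// they all arrive through the same pointer-event pipeline
scene.onPointerObservable.add((pointerInfo) => {
    switch (pointerInfo.type) {
        case BABYLON.PointerEventTypes.POINTERDOWN:
            if (pointerInfo.pickInfo?.hit) {
                console.log("pointer down on", pointerInfo.pickInfo.pickedMesh?.name);
            }
            break;
        case BABYLON.PointerEventTypes.POINTERUP:
            console.log("pointer up");
            break;
    }
});
```

Because the GUI and Gizmos listen to these same pointer events, they keep working once you enter XR without any extra code.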

We are device agnostic. We don’t support specific devices, so I don’t have a specific answer regarding the Vision Pro. What I can say about the Vision Pro is that Apple has decided to work with a different paradigm that we need to support. We do support whatever they do in the emulator, but that seems to not translate very well to a real device. I don’t have a real device, so I can’t quite test it, but I am doing everything I can to get interaction to work as expected on the Vision Pro. Regarding near interaction - this should work just as expected, as near interaction does not require any WebXR-specific events or triggers to work. All you need are your hands, which are there and are working. Touch an interactable GUI, and it will work.

I hope I answered everything! To your last question (not sure I answered it before) - yes, near interaction is XR exclusive, because this is the only scenario where it makes sense. It also only makes sense in a scenario that has controllers. An AR session on a mobile device will gain nothing from enabling near events, because they will never trigger - no controller exists to check proximity against. However, it will not hurt performance in this case, because near interactivity only enables itself when it finds a supporting controller.
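If you still want to opt out of it explicitly - say, for a mobile AR experience where near events can never fire anyway - the default experience helper exposes an option for that (option name as I understand the current API, so double-check against the docs):

```typescript
// mobile AR session with near interaction explicitly disabled
const xr = await scene.createDefaultXRExperienceAsync({
    disableNearInteraction: true,
    uiOptions: {
        sessionMode: "immersive-ar",
    },
});
```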
