Babylon Native for visionOS

Hello!

I’m working on bringing Babylon Native to Apple Vision Pro. We are almost done migrating the underlying rendering engine (bgfx), and I’ve already started the integration into Babylon Native (here is the PR: feat: add visionOS support by okwasniewski · Pull Request #1384 · BabylonJS/BabylonNative · GitHub).

The Playground app for iOS has an interesting pattern which I’m trying to understand. First, we pass an (MTKView*)view that’s used to initialize the graphics config and then serves as the layer that receives the “2D rendering” inside a window.

But we also pass another MTKView as a raw pointer called xrView, which is then used to initialize the NativeXr plugin. Once the session state changed callback is called with isXrActive as true, we toggle the isHidden property to reveal the second MTKView.

As far as I understand, the underlying MTKView is still rendering(?), which sounds like something that could lead to performance issues (I might be wrong here).

When we render to the xrView, the graphics config still “thinks” that we are rendering to the now-covered MTKView.

- (void)init:(MTKView*)view screenScale:(float)inScreenScale width:(int)inWidth height:(int)inHeight xrView:(void*)xrView
{
    screenScale = inScreenScale;
    float width = inWidth;
    float height = inHeight;

    // Init graphics config (passed down to bgfx) with the primary MTKView
    Babylon::Graphics::Configuration graphicsConfig{};
    graphicsConfig.Window = view;
    graphicsConfig.Width = static_cast<size_t>(width);
    graphicsConfig.Height = static_cast<size_t>(height);

    device.emplace(graphicsConfig);
    update.emplace(device->GetUpdate("update"));

    device->StartRenderingCurrentFrame();
    update->Start();

    runtime.emplace();

    runtime->Dispatch([xrView](Napi::Env env)
    {
        device->AddToJavaScript(env);

        // [...]

        Babylon::Plugins::NativeEngine::Initialize(env);

        // Init the NativeXr layer with the second MTKView
        nativeXr.emplace(Babylon::Plugins::NativeXr::Initialize(env));
        nativeXr->UpdateWindow(xrView);
        nativeXr->SetSessionStateChangedCallback([](bool isXrActive){ ::isXrActive = isXrActive; });

        // [...]
    });

    // [...]
}
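For clarity, the pattern described above (the callback flips a global flag, and the host app reveals the second MTKView once XR activates) can be sketched roughly like this. All names here are illustrative placeholders, not the actual Playground code, and hiding the covered main view is just one possible variation:

```cpp
#include <atomic>
#include <cassert>

// Global flag written by the NativeXr session state callback
// (mirrors the `::isXrActive = isXrActive;` assignment above).
static std::atomic<bool> isXrActive{false};

// Hypothetical host-app view state (stand-in for the two MTKViews).
struct ViewState {
    bool xrViewHidden = true;
    bool mainViewHidden = false;
};

// Hypothetical per-frame step: reveal the XR view once the session
// becomes active, e.g. `xrView.isHidden = !active` in Objective-C.
void updateViewVisibility(ViewState& views) {
    const bool active = isXrActive.load();
    views.xrViewHidden = !active;
    views.mainViewHidden = active; // optionally hide the covered view
}
```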

On visionOS we have a problem: there is no way to get two MTKView equivalents (layer renderers), so we only have one Metal layer to draw into.

I’m trying to understand this design decision and whether it’s possible to initialize Babylon Native with only one MTKView. In the case of visionOS, rendering without the XR plugin doesn’t make much sense.

Thanks for your help in advance.


cc @docEdub

Interesting. We may need to make a new VisionPro playground app to address this without breaking the iOS playground app. @ryantrem will know more about this and may have a better solution.

I’ll do my best to page back in 4 years of history on this code :slight_smile:

If memory serves, there were a few things that contributed to the current design:

  1. In the browser, you may be rendering to a canvas element that is just part of the screen, but when using WebXR for an AR experience (at least on mobile devices, where most of the XR effort has been), the browser takes over and renders full screen. In the context of Babylon Native, the idea was to pass control of the rendering to the underlying XR system (though it is still the host app that configures and provides the rendering surface / window as you mention above, and in the BN Playground (and in BRN) we don’t currently force the XR view to be full screen).
  2. On mobile devices, we thought it made more sense to delegate the composition of the camera feed and the babylon rendering to the XR layer. In truth they are both contributing, as the regular Babylon rendering layer provides the texture, then the XR layer renders the camera feed onto it, then Babylon JS renders the scene onto that texture, then the XR layer presents it to the screen in its own view/surface.
  3. On Android specifically, sharing a rendering surface between XR and the regular Babylon scene rendering didn’t work well basically because we ended up with two different systems (BGFX and the XR layer) trying to share an OpenGL resource. More detail in this PR: Fix ARCore double render (and rework NativeXr lifetime management) by ryantrem · Pull Request #631 · BabylonJS/BabylonNative (github.com)

For OpenXR (specifically on HoloLens), the “window provider” is basically ignored and it does its own thing for rendering the XR view immersively. On HL, an app might be rendering on a 2D slate, and then separately enter immersive mode.

From what you mentioned above, it sounds like the VisionOS model is different in that it is immersive only, and so there is only one MTKView?

Also I believe Babylon JS stops rendering the main scene when XR is active, but my memory is fuzzy on this. You could try stacking the two MTKViews vertically so you can see what is rendering in the main view while XR is rendering. I think it should just be the clear color or maybe the last frame that was rendered. I think I tested this in the past but I can’t remember.

For the VisionOS scenario, would it be good enough to just provide the same MTKView for both the main view and the XR view?

Hopefully this adds some context to help understand the current design decisions. All of that said, I don’t think we are attached to the current design, we just need a design that works well for both mobile devices and immersive HMDs, and is conceptually consistent with WebXR to the degree that the same JS code will work as expected in either a browser context or a native context. I know you guys have been deep in this space recently, so if you have a different perspective on the design, please feel free to share your thoughts or try out different approaches in the existing code!


Hey @ryantrem, @docEdub

Thank you for providing the context on this topic.

Answering your question @docEdub, I’m working on adding a separate playground app for visionOS because it’s not possible to reuse the iOS code there. I just wasn’t sure how to make it work together with Babylon Native.

Ah okay, this makes a lot of sense!

Exactly! The user requests to enter immersive mode by clicking a button, and we should always go into immersive mode. Just a small side note: the MTKView equivalent for visionOS is LayerRenderer (cp_layer_renderer in the Apple Developer Documentation).

Just double-checked that by putting those views side by side on iOS and you are right. The regular MTKView just stops rendering on the last frame.

That’s something I’ve tried already, but it looks like I’m missing an important step that should stop the regular rendering when we hand rendering over to the XR layer. When I tried it, both the regular renderer and the XR renderer tried to render into the same surface, causing Metal to crash with an error that too many frames were queried in one render.
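The crash is consistent with how a CAMetalLayer manages its drawables: the pool is small and fixed (typically two or three in flight), so two renderers acquiring drawables from the same layer without a matching present quickly exhaust it. A toy model of that exhaustion, with illustrative names rather than real Metal API, could look like:

```cpp
#include <cassert>
#include <optional>

// Toy model of a CAMetalLayer drawable pool (Metal typically allows
// only 2-3 drawables in flight). Names are illustrative, not Metal API.
struct DrawablePool {
    int available;
    explicit DrawablePool(int n) : available(n) {}

    // Stand-in for nextDrawable(): hands out a drawable "id" if one
    // is free. The real API would block or assert instead of
    // returning nullopt.
    std::optional<int> acquire() {
        if (available == 0) return std::nullopt;
        return --available;
    }

    // Presenting a drawable returns it to the pool.
    void present() { ++available; }
};
```

With two systems (bgfx and the XR layer) each calling acquire() per frame but only one of them presenting, the pool empties within a frame or two, which matches the behavior we saw.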

In fact, on visionOS, we don’t need to do much rendering work in the XR layer. We mostly need to transform the views so content “stays in place” when the user moves their head (6DOF); currently, everything is just anchored to the user’s head. We also need to supply the anchors/head tracking data from ARKit.
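To make the “stay in place” part concrete: world-locking essentially means composing each frame’s view transform with the inverse of the head pose reported by ARKit, so an anchored object’s apparent position changes as the head moves, instead of staying constant as it does when head-locked. A deliberately minimal sketch (translation only, no rotation, hypothetical names):

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// World-locked content: its position in head-relative (view) space is
// the anchor position minus the current head position, re-evaluated
// every frame. A head-locked object would skip the subtraction and
// use a constant offset, which is the behavior we currently have.
Vec3 worldLockedOffset(const Vec3& anchorPos, const Vec3& headPos) {
    return { anchorPos.x - headPos.x,
             anchorPos.y - headPos.y,
             anchorPos.z - headPos.z };
}
```

A full implementation would of course use the complete 4x4 head pose (rotation included) and invert it, but the per-frame re-evaluation against tracking data is the key difference from the current head-anchored behavior.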

The camera feed is provided when we clear the background of a frame (which I think works the same on HoloLens).

I think on visionOS we are encountering a similar issue to the one you described for Android: “sharing a rendering surface between XR and the regular Babylon scene rendering didn’t work well basically because we ended up with two different systems (BGFX and the XR layer) trying to share an OpenGL resource”.

I’ll look into the resources that you linked. And also check out how HoloLens does this.

Once again thank you for taking the time to explain this. I’ll share my findings in this thread once I figure this out / have some additional questions :slight_smile:


Keep in mind that on HoloLens there is no camera feed, so that part will be different compared to VisionOS.


Hey @ryantrem,

Unfortunately, we still haven’t found a proper way to initialize Babylon Native on visionOS, but there is some progress on windowed rendering!

In immersive space rendering, we found that Babylon Native always binds the default frame buffer, which causes issues with presenting drawables twice. When Babylon creates a new frame buffer, bgfx should stop submitting to the default one; however, that is not what we’ve observed. Three frame buffers are created (default, left eye, right eye). Once we enter NativeXR mode, we should stop binding the default one, which is unused.

Only the default frame buffer has a swap chain, and this swap chain calls visionOS functions (querying frames and presenting). Currently, for some reason, we have frame interference: bgfx is rendering and querying frames, and NativeXR is also doing it.

We observed that odd frames are rendered by bgfx and even frames by NativeXr. This causes exhaustion of all available drawables from the pool.

While checking iOS, we found a very similar issue. The method to present drawables is constantly being called in the background. It’s not reflected on screen because it’s frozen on the last frame, but it’s still calling:

m_commandBuffer.presentDrawable(frameBuffer.m_swapChain->m_drawable);

from bgfx’s renderer_mtl.mm. (You can reproduce it by entering NativeXR mode and placing a breakpoint in renderer_mtl.mm.)

We are working on a fix, but suggestions/hints on how to solve it are welcome.
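The rough shape of the guard we have in mind is something like the predicate below. This is only a sketch of the condition, not the actual renderer_mtl.mm change; `xrSessionActive` and both parameter names are hypothetical, and the real fix may need to live at a different layer:

```cpp
#include <cassert>

// Sketch: while NativeXr owns presentation, bgfx must not also call
// presentDrawable on the default swap chain. Only the default frame
// buffer has a swap chain, so the left/right eye frame buffers never
// present through this path.
bool shouldPresentDefaultFrameBuffer(bool xrSessionActive, bool isDefaultFrameBuffer) {
    return isDefaultFrameBuffer && !xrSessionActive;
}
```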

We then decided to implement a windowed rendering mode using CAMetalLayer, which can be shown before going into NativeXR mode (working around the issue similarly to how the two MTKViews did).

However, this approach is not ideal, as we should render to the CAMetalLayer on (preferably) the main thread, which is what the iOS implementation does with MTKView. For immersive space, Apple requires creating a separate render thread, which conflicts with Babylon previously rendering on the main thread.

Is there an option to move rendering execution to a different thread? How would you approach this?
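For reference, one common way to move rendering off the main thread is a dedicated render thread draining a job queue, with the main thread posting work to it. This is a generic sketch of that pattern under our stated constraint, not Babylon Native’s actual threading model:

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

// Generic dedicated render thread: jobs posted from any thread are
// executed in order on one worker thread (where the immersive-space
// rendering would happen). Pending jobs are drained before shutdown.
class RenderThread {
public:
    RenderThread() : worker_([this] { run(); }) {}

    ~RenderThread() {
        {
            std::lock_guard<std::mutex> lk(m_);
            done_ = true;
        }
        cv_.notify_one();
        worker_.join();
    }

    void post(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lk(m_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [&] { return done_ || !jobs_.empty(); });
                if (jobs_.empty()) return; // done_ set and queue drained
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    bool done_ = false;
    std::thread worker_; // declared last so other members init first
};
```

The hard part in our case is not the queue itself but that bgfx’s submit/present path would also need to run on (or marshal to) that thread, which is exactly the question above.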


Thanks for sharing the update, looks like you are still making progress!

The method to present drawables is constantly being called in the background. It’s not reflected on screen because it’s frozen on the last frame

You are seeing bgfx rendering to the render texture / offscreen frame buffer, but also trying to present to the screen / default frame buffer, is that right? We might need @bghgary for this, but he is out for a couple weeks. @Cedric or @srzerbetto might have insights.

Is there an option to move rendering execution to a different thread? How would you approach this?

I think we will need @bghgary for this one as well. I believe this ties into some bigger rendering changes Gary wants to make that includes being able to separate out the rendering thread.


Now that the PR is merged and I am back, is this still an issue?

@bghgary Yes, it’s still an issue.

For the first PR we’ve introduced windowed rendering; a follow-up PR will introduce Immersive Space Rendering (VR/XR) once we figure out the above issues.


Thanks, I’ll re-read some of the posts. :slight_smile:

I unfortunately don’t know this code well either. If we are continuing to present to the default back buffer, that certainly sounds like a mistake.

My understanding is that Apple requires rendering to the MTKView on the main thread along with the UI, but maybe this is no longer required. If immersive requires rendering on a different thread, we need to figure out how to get bgfx to render from different threads which doesn’t sound easy. I know you have been making changes to bgfx, is it already possible to use bgfx with visionOS immersive without BabylonNative in the picture?