Marker Tracking and Webpiling

Hi MarianG,

Thanks for trying it out, looks awesome!

I think there actually is a way to make CreateFromWebCam() work from other cameras. It looks like it might be undocumented, but CreateFromWebCam’s “constraints” argument takes an optional deviceId parameter, which I believe corresponds to an element of the output of MediaDevices.enumerateDevices(). Definitely not very easy to use, but I believe that that’s how to choose which camera you get your video texture from.
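Something along these lines is what I had in mind. Completely untested sketch: the exact constraints shape may vary between Babylon versions, and picking the last video input is just a placeholder heuristic, not a recommendation:

```typescript
// Untested sketch: choose a specific camera by deviceId. Assumes a scene
// already exists; grabbing the last "videoinput" is purely a placeholder.
const createTextureFromChosenCamera = async (scene: BABYLON.Scene) => {
    const devices = await navigator.mediaDevices.enumerateDevices();
    const cameras = devices.filter((d) => d.kind === "videoinput");
    const chosenCamera = cameras[cameras.length - 1];

    BABYLON.VideoTexture.CreateFromWebCam(
        scene,
        (videoTexture) => {
            // Use the texture once the stream is live, e.g. on a background plane.
            console.log("Got video texture from", chosenCamera.label);
        },
        {
            minWidth: 640,
            maxWidth: 1920,
            minHeight: 480,
            maxHeight: 1080,
            deviceId: chosenCamera.deviceId, // the undocumented bit
        }
    );
};
```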

The plane was just put in as a cheap way to render the texture behind the tracked object, and I do mean cheap. It’s definitely not the right way to do that for anything more than a demo, so I’m super happy to hear you’re investigating alternatives.

I hadn’t seen layers before; pretty cool, but they don’t appear to support video textures, sadly. That might be a worthwhile feature to add, though, especially if we want to enable more camera-based experiences powered by Babylon. :grin:

tl;dr: The next three paragraphs are all about camera intrinsics alignment for AR scenarios. I wrote them all before it occurred to me that I don’t really know why your screen-filling plane didn’t work, though I do have a theory. If you’re already familiar with AR camera concepts, you may just want to skip ahead a bit. :upside_down_face:

The trick with trying to make a plane that fits the screen is that, for the marker tracking illusion to work, your virtual camera’s intrinsic parameters (most importantly field-of-view and aspect ratio) have to match those of your real-world camera. Canonically, most AR experiences have required users to “calibrate” their cameras in order to learn their intrinsics; this can yield high-quality experiences, but requires a lot of overhead for the user. To avoid that, in my Playground I simply “guessed” the intrinsics and hand-tuned the plane in the scene to look correct with the output it was receiving.
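Concretely, the “guess” part is nothing fancier than something like this. The 60-degree horizontal FOV and the 4:3 aspect are numbers I made up for illustration, not anything read from a device, and `camera` is assumed to already exist:

```typescript
// Hedged sketch: hand-guessed intrinsics. Both numbers below are made up;
// real webcams vary, so treat them as starting points for hand-tuning.
const guessedHorizontalFovDegrees = 60;
const videoAspect = 4 / 3; // assume the webcam delivers 4:3 frames

// Babylon's camera.fov is vertical by default (FOVMODE_VERTICAL_FIXED),
// so convert the guessed horizontal FOV to a vertical one.
const horizontalFov = BABYLON.Tools.ToRadians(guessedHorizontalFovDegrees);
const verticalFov = 2 * Math.atan(Math.tan(horizontalFov / 2) / videoAspect);
camera.fov = verticalFov;
```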

Guessing the field-of-view may or may not be viable depending on the experience you’re going for (most cameras are pretty similar, so a good middle-of-the-road guess will probably work unless you are trying to do something very precise). Where it gets trickier is aspect ratio. Real-world cameras are much more constrained than virtual ones vis-à-vis the aspect ratios they can output, so generally you’ll have to make your virtual camera account for the idiosyncrasies of your real one. Ideally, that just means rendering to an output that has the same aspect ratio as your camera’s input, which will allow you to essentially “crop” your rendering to the edges of the plane you see in my original Playground. If you can’t do this, the second most common approach is to render an area that fits “inside” the image from your camera. For example, if your camera returns a 4:3 image but you have to render to a 16:9 output, most approaches would have you crop out the top and bottom of the image from your camera, the aim still being to fill your rendering background with the image from your camera even if you can’t fit all of it in. The downside to this second approach is that, while it does give the appearance of a “full screen AR experience,” the math required to keep your camera parameters synchronized across multiple aspect ratios is not trivial.
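To make the “crop out the top and bottom” idea concrete, one hypothetical way to do it is by adjusting the video texture’s UVs. This assumes a 4:3 camera feeding a wider render target and that the video texture covers the whole background; it’s a sketch of the cropping math, not anything from my Playground:

```typescript
// Hedged sketch of option two: crop the camera image vertically so it fills
// a wider render target. Assumes default UV behavior on the video texture.
const videoAspect = 4 / 3; // what the camera delivers
const renderAspect = engine.getRenderWidth() / engine.getRenderHeight(); // e.g. 16:9

if (renderAspect > videoAspect) {
    // Render target is wider than the video: show only a vertical slice of it.
    const visibleFraction = videoAspect / renderAspect; // 0.75 for 4:3 -> 16:9
    videoTexture.vScale = visibleFraction;
    videoTexture.vOffset = (1 - visibleFraction) / 2; // center the crop
}
```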

To avoid fighting with that math, and to allow the experience to work properly in the Playground where I don’t have much control over the aspect ratio, I took a third option: I rendered the video texture to a constant-size plane that was allowed to fit completely inside the virtual camera’s view frustum. In other words, for simplicity and use in the Playground, I fully decoupled the real and virtual cameras. This, like I said, is almost certainly not the best approach for any production deployment. But I really wanted to be able to simply and easily show the behavior in the Playground; so to make that happen, I chose this approach.
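A rough sketch of that idea looks like the following. This isn’t my actual Playground code, and every number here is arbitrary and hand-tuned by eye; it just shows a fixed plane sized to sit inside the frustum at a chosen distance:

```typescript
// Hedged sketch of option three: a fixed-size plane carrying the video
// texture, placed far enough away to sit entirely inside the view frustum.
// All numbers are hand-tuned; nothing principled about them.
const planeDistance = 10;
const planeHeight = 2 * planeDistance * Math.tan(camera.fov / 2) * 0.9; // ~90% of frustum height
const planeWidth = planeHeight * (4 / 3); // match the assumed video aspect

const backgroundPlane = BABYLON.MeshBuilder.CreatePlane(
    "videoBackground",
    { width: planeWidth, height: planeHeight },
    scene
);
backgroundPlane.position = camera.position.add(
    camera.getDirection(BABYLON.Vector3.Forward()).scale(planeDistance)
);

const backgroundMaterial = new BABYLON.StandardMaterial("videoBackgroundMat", scene);
backgroundMaterial.emissiveTexture = videoTexture; // from CreateFromWebCam above
backgroundMaterial.disableLighting = true;
backgroundPlane.material = backgroundMaterial;
```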

You didn’t specify how your screen-fitting plane doesn’t work, so please correct me if my guess is wrong; but I’m guessing it’s because pegging your virtual camera to the render target’s aspect ratio caused the camera’s intrinsics to get out of sync with the real-world camera. This problem would manifest as an apparent and varying “offset” between your video texture background and your 3D-rendered foreground: you move the marker, and the object that’s supposed to track it moves in the right general direction, but it’s in the wrong place and it doesn’t move the right distance. The easiest way to address this problem will be to force your render window to have the same aspect ratio as your webcam’s output; that would allow you to hand-tune your plane to go from edge to edge of your render target and still match up with the motions it’s tracking in the video. This is the first option I described two paragraphs ago; options two and three are also viable, but they have definite downsides.
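If you go the forced-aspect-ratio route, a sketch of the idea (assuming you control the canvas styling and have the video texture from `CreateFromWebCam`) might look like:

```typescript
// Hedged sketch: lock the render canvas to the webcam's aspect ratio so a
// hand-tuned, edge-to-edge plane stays aligned with the real camera.
const video = videoTexture.video; // the HTMLVideoElement behind the texture
const videoAspect = video.videoWidth / video.videoHeight;

const canvas = engine.getRenderingCanvas()!;
const displayWidth = canvas.clientWidth;
canvas.style.width = `${displayWidth}px`;
canvas.style.height = `${Math.round(displayWidth / videoAspect)}px`;
engine.resize(); // pick up the new client size
```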

After so long a reply, this is probably the wrong point to ask, but…did any of that address your question? I think I covered all the “magic” about that plane; but if you have any more questions – about that or anything else in this arena – please ask! After ten paragraphs this is probably obvious, but I love talking about this stuff. :smile: