So, this is a little tool I made some time ago that has been sitting here gathering dust. =)
It uses MediaPipe for face mesh capture (plus a trick of mine to convert it into a textured 3D face model on the fly =) ) and Babylon.js for 3D rendering. It can export the captured face as glTF.
This is amazing! I believe this technique could be used for live conference calls in virtual spaces, by mapping webcam feeds onto this face model and then onto avatar meshes. Current approaches only display a flat square of the webcam feed beside the characters (possibly for performance reasons).
Close to that, but not so straightforward. MediaPipe did not provide the 3D mesh directly, so I had to extract it manually. Also, the coordinates were in screen space and the video was not directly mapped to the 3D mesh (there was no UV mapping). I needed to generate an unwrapped mesh and remap the face ROI to the UV map to make it possible to generate the final 3D face. All of that happens in real time.
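
To give a rough idea of the remapping step, here is a minimal TypeScript sketch. It is not the tool's actual code: it assumes landmarks whose x and y are normalized to [0, 1] by the image size with z as a relative depth, and it uses a simple planar projection of the face ROI as the "unwrap", which is only one possible approach. The names `Landmark`, `FaceGeometry`, and `buildFaceGeometry` are hypothetical.

```typescript
// Hypothetical landmark shape (assumption): x, y normalized to [0, 1]
// by image width/height, z a relative depth value.
interface Landmark { x: number; y: number; z: number; }

interface FaceGeometry {
  positions: number[]; // flat [x, y, z, ...], centered on the face ROI
  uvs: number[];       // flat [u, v, ...], addressing the cropped face ROI
  roi: { minX: number; minY: number; maxX: number; maxY: number };
}

function buildFaceGeometry(landmarks: Landmark[]): FaceGeometry {
  if (landmarks.length === 0) throw new Error("no landmarks");

  // 1. Face ROI: bounding box of all landmarks in normalized image space.
  let minX = Infinity, minY = Infinity, maxX = -Infinity, maxY = -Infinity;
  for (const p of landmarks) {
    minX = Math.min(minX, p.x); maxX = Math.max(maxX, p.x);
    minY = Math.min(minY, p.y); maxY = Math.max(maxY, p.y);
  }
  const w = (maxX - minX) || 1; // guard against a degenerate ROI
  const h = (maxY - minY) || 1;
  const cx = (minX + maxX) / 2;
  const cy = (minY + maxY) / 2;

  // 2. Per landmark: a 3D position centered on the ROI, and a UV that
  //    points at the same pixel inside the cropped ROI texture.
  const positions: number[] = [];
  const uvs: number[] = [];
  for (const p of landmarks) {
    // Image y grows downward, so flip it. The z sign is an assumption
    // and depends on the handedness of the target scene.
    positions.push(p.x - cx, -(p.y - cy), -p.z);
    // Flip v as well, assuming a texture origin at the bottom-left.
    uvs.push((p.x - minX) / w, 1 - (p.y - minY) / h);
  }
  return { positions, uvs, roi: { minX, minY, maxX, maxY } };
}
```

The returned `positions` and `uvs` arrays are flat, so they could be fed into a mesh's vertex buffers together with a fixed triangle index list for the face topology, with the ROI crop of the current video frame used as the texture. Recomputing them per frame is cheap, since it is a single pass over the landmarks.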