Animating MorphTargets

Ever since the Blender Exporter added a MorphTarget Manager, I’ve been wondering what it might be capable of. A few years back, when I was looking at Unity 3D, I created this little test:

Unity & Speech

It used a free script to drive the morph animation (Metamorph), which I gather is not needed anymore.

Anyway, I still have the old files, and I wondered if it could be done with BJS and the morphs exported from Blender. So here is what I have managed so far:

To be, or not to be

Now, while the BJS Blender exporter will export the actual morphs (shape keys), it will not export the morph animation - so I had to try to figure out a way to script the animation! And, to put it mildly, that is not exactly my strong point :grin:

So I started out with a sound file for which I had a text file of the visemes and phonemes that looks like this (151 frames that go 1, 2, 3, 4, etc.):

[0.0, 0.0, 0.0]
[0.0804, 0.0385, 0.0461]
[0.303, 0.1519, 0.1813]
[0.5683, 0.3236, 0.3858]


[0.1341, 0.0, 0.0]
[0.0333, 0.0, 0.0]
[0.0, 0.0, 0.0]

So I had to create a script that played these frames synced with the sound file. This is what I came up with:

function createMotion() {

    // Stop once the last frame has been reached
    if (index === (activities.length - 1)) {
        clearTimeout(id);
        console.log(index);
        return;
    }
    else {
        // Apply the three influences for the current frame
        aHead.morphTargetManager.getTarget(0).influence = activities[index][0];
        aHead.morphTargetManager.getTarget(1).influence = activities[index][1];
        aHead.morphTargetManager.getTarget(2).influence = activities[index][2];
        ++index;

        // Schedule the next frame roughly 30 ms from now
        id = setTimeout(createMotion, 30);
    }
};

The per-frame influence values for the morphs/shape keys are loaded into an array variable, “activities”.
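
For completeness, here is roughly how it is wired up (a trimmed sketch - only the first few frames are shown, and “sound” stands in for however the audio clip is actually loaded):

// per-frame influences for the three shape keys (151 rows in the real file)
var activities = [
    [0.0, 0.0, 0.0],
    [0.0804, 0.0385, 0.0461],
    [0.303, 0.1519, 0.1813],
    [0.5683, 0.3236, 0.3858]
    // ... remaining frames
];

var index = 0;    // current frame
var id = null;    // setTimeout handle used by createMotion

// start the audio and the frame loop together
sound.play();
createMotion();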

This works pretty well if the FPS is around sixty - but the animation and the sound drift out of sync if the FPS drops.

I can also create a morph file that looks like this, where the first number is the frame # and the 3 numbers in the brackets are the influences of the three morphs/shape keys:

1, [1,0.840,0.947]
8, [0.067,0.2,0.4]
23, [0.962,0,0.227]
37, [0.760,0,0]
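
One way I could turn that keyed format into the same per-frame “activities” array (just a sketch - the linear interpolation between keys is my assumption about how the in-between frames should look, not anything the exporter produces):

// keys: [frame, [influence0, influence1, influence2]]
var keys = [
    [1,  [1, 0.840, 0.947]],
    [8,  [0.067, 0.2, 0.4]],
    [23, [0.962, 0, 0.227]],
    [37, [0.760, 0, 0]]
];

// Expand to one entry per frame by linearly interpolating between keys
function expandKeys(keys) {
    var frames = [];
    for (var k = 0; k < keys.length - 1; k++) {
        var f0 = keys[k][0],  f1 = keys[k + 1][0];
        var v0 = keys[k][1],  v1 = keys[k + 1][1];
        for (var f = f0; f < f1; f++) {
            var t = (f - f0) / (f1 - f0);
            frames.push([
                v0[0] + t * (v1[0] - v0[0]),
                v0[1] + t * (v1[1] - v0[1]),
                v0[2] + t * (v1[2] - v0[2])
            ]);
        }
    }
    frames.push(keys[keys.length - 1][1]);  // keep the last key as-is
    return frames;
}

var activities = expandKeys(keys);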

So any thoughts about how I may create the animation so that it stays synced with the sound?

And just a couple of other thoughts. Each viseme/phoneme is based upon just the same three morphs/shape keys (see the makeup of the letters/sounds at ~24 secs of the video above). And when I look at the Morph Target Manager in my .babylon file, I see just “positions” - nothing about “uvs, normals, tangents”, which I gather can be a limitation of morph targets.

Do I have to worry about them with the babylon file exported from Blender?

cheers, gryff :slight_smile:

This is a question for @JCPalmer

Ohh, I forgot to mention, hit the “u” key once to play the animation.

cheers, gryff :slight_smile:

Ok, first I think a scene-level beforeRender, which runs exactly once every frame, is the best way to initiate this, if you are not already running inside of one. Just putting this right inside the render loop code is fine as well, but not nearly as portable to other scenes.

scene.registerBeforeRender( () => {
        blah, blah, blah;
    }
);

Btw, you could also register before & after renders on a mesh, but do not do this. If you ever have multiple cameras, e.g. WebVR or WebXR, they will execute multiple times per frame, or not at all if the mesh is not in the frustum. I almost consider the double run to be a bug, but not enough to do a PR about. @Deltakosh would probably not want me mucking around so deep, anyway.

The real problem you are having is that you are not directly measuring wall-clock time from the moment the sound started, like:

sound.play();
const startTime = BABYLON.Tools.Now;
scene.registerBeforeRender( () => {
        // Tools.Now is in milliseconds; convert the elapsed time to seconds
        const elapsed = (BABYLON.Tools.Now - startTime) / 1000;
        if (elapsed < 0.2) return;            // 0.2 seconds
        else if (elapsed < 0.475) index = 0;
        ...
        else index = 3;
    }
);

I left out unregistering. You probably should not use an anonymous function as the registered function.
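
If you give the callback a name, stopping it is just one more call (sketch only - speechFrame and clipLength are made-up names):

const clipLength = 4.5;   // length of the audio clip, in seconds (made up here)
const speechFrame = () => {
    const elapsed = (BABYLON.Tools.Now - startTime) / 1000;
    if (elapsed > clipLength) {
        // finished; stop this from being called every frame
        scene.unregisterBeforeRender(speechFrame);
        return;
    }
    // ... set index / influences as above
};
scene.registerBeforeRender(speechFrame);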

Also, this assumes the sound is actually ready to play. That flag is not publicly exposed in Sound.ts, but this is one of the ways I check in my work:

if (sound["_isReadyToPlay"]) {
    sound.play();
    const startTime = BABYLON.Tools.Now;
    scene.registerBeforeRender( () => {
            // Tools.Now is in milliseconds; convert the elapsed time to seconds
            const elapsed = (BABYLON.Tools.Now - startTime) / 1000;
            if (elapsed < 0.2) return;            // 0.2 seconds
            else if (elapsed < 0.475) index = 0;
            ...
            else index = 3;
        }
    );
}
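
For 151 evenly spaced frames you can also skip the chain of thresholds altogether and compute the frame index straight from the elapsed time (sketch only - FRAME_MS is a made-up name for whatever spacing the frames were authored at; 30 ms here to match the setTimeout in your script):

const FRAME_MS = 30;
sound.play();
const startTime = BABYLON.Tools.Now;
scene.registerBeforeRender( () => {
        const elapsed = BABYLON.Tools.Now - startTime;    // milliseconds
        // clamp so the last frame is held once the data runs out
        const frame = Math.min(Math.floor(elapsed / FRAME_MS), activities.length - 1);
        aHead.morphTargetManager.getTarget(0).influence = activities[frame][0];
        aHead.morphTargetManager.getTarget(1).influence = activities[frame][1];
        aHead.morphTargetManager.getTarget(2).influence = activities[frame][2];
    }
);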

If I want no external dependency, like when putting a sound in a library, I convert the sound to base64 JavaScript via: https://palmer-jc.github.io/scenes/QueuedInterpolation/audio_recorder/index.html


@JCPalmer : TY for the thoughts and ideas Jeff. Not had a chance to try anything out yet, but I did find out something new from your suggestions - scene.unregisterBeforeRender. Not sure how to use it yet.

Will try out your ideas :slight_smile:

Ohh, and thanks for the addition to the Blender 2.80 exporter - lights only illuminating collections. This is why I like using your exporter - it helps me eliminate coding :slight_smile:

cheers, gryff :slight_smile:

PS: ever been to the Holy Sepulchre Cemetery in Rochester?

Yes, your before- or after-render function will continue to run until you stop it. It might start to error if you did not take this into consideration. Your use has a short shelf life.

No, but I have heard of it. I think Susan B. Anthony & Frederick Douglass are at Mount Hope, another big cemetery in the city.


That cemetery is where Francis Tumblety is buried - a quack doctor who is a major suspect for “Jack the Ripper”. He spent time in my city in Canada - knew the mayor and one of the local newspaper editors.

He traveled under a number of aliases - one of which got him arrested as a co-conspirator in the murder of Abe Lincoln…

Amazing what you might learn when you should be coding :grin:

cheers, gryff :slight_smile:


@gryff thanks for sharing your approach - the sample looks quite good. Did you ever make any further progress on this? If so, is there a summary of a Mixamo → Blender → Babylon.js pipeline which includes lip-sync animation of a wav file within Babylon.js, based on the code that you’ve posted above?


@Jason_Daly : Welcome to the forum :slight_smile: Not done anything with Mixamo and morph targets, but here is something with a walking animation and morph targets. I find Mixamo has too many bones for my liking.

A Pirate

Here is the forum thread “A pirate’s Arghh and all that”.

Hope that helps.

Stay Safe, gryff :slight_smile:


Not sure if you’ve already seen this, but whenever this subject comes up I always link this awesome page by the brilliant @thomlucc: Mixamo to Blender to Babylon.js | Babylon.js Documentation (babylonjs.com)

Edit: his page is linked down below; both are great!

@Jason_Daly : I create the lip-sync animation and the walking animation in Blender, and I get the phonemes for the speech morph animations from the .wav file using a programme called “Papagayo”.

If you want the audio file decoded automatically to phonemes in your JavaScript code, maybe @JCPalmer has some suggestions.

I can send you the blend file if that would help.

Stay Safe, gryff :slight_smile:

Thanks @DarraghBurke, but I cannot take credit for this page :-). The one I did is “Animating Characters”, where I basically exported to glTF to combine several animations made with Mixamo.


@gryff, as you know I use MakeHuman as well, not Mixamo. I was doing lip-sync / voice-sync years ago. Syncing kind of sucks. All this fiddling to get things just perfect. I do not know of any automated audio file decoding to phonemes other than Papagayo, as you mentioned.

Here is the only thing I published using sync, QI.Automaton. The top-right Talk button is where to click.

Even then, though, I was using the Carnegie Mellon / DOD Arpabet data file & turning it into a little JS “database” to get the phonemes. This implies that you have to type in a transcript of the wave file & retrieve the phonemes from dictionary look-ups. A database look-up DOES NOT give you any times.
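
Roughly, the look-up side amounts to no more than this (toy illustration - the entries & the getPhonemes name are made up; note there is no timing in the result):

// toy Arpabet "database": word -> phoneme list (no timing information at all)
const ARPABET = {
    "to":  ["T", "UW"],
    "be":  ["B", "IY"],
    "or":  ["AO", "R"],
    "not": ["N", "AA", "T"]
};

// transcript typed in by hand, then looked up word by word
function getPhonemes(transcript) {
    return transcript.toLowerCase().split(/\s+/)
        .map(word => ARPABET[word] || []);
}

console.log(getPhonemes("To be or not to be"));
// [["T","UW"], ["B","IY"], ["AO","R"], ["N","AA","T"], ["T","UW"], ["B","IY"]]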


Now I am making voice fonts (not very good to date), so that both the animation & audio are generated. With this setup, a common timeline is determined as the first step. Both audio & animation code use it, so sync is implicit.

I still need to key in the text and search, but have made many improvements to my Arpabet database. The main one was adding a syllable delimiter for all 44k of the words I decided to use. That was extremely painful, but the payoff is much better quality for longer words. The code in the database now also determines the times.
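
The common timeline itself ends up being little more than an ordered list of phoneme events with start times & durations, which both the audio generation & the morph animation walk through (illustrative shape only - the field names & numbers are invented):

// one shared timeline drives both the synthesized audio and the morph animation
const timeline = [
    { phoneme: "T",  startMs:   0, durMs:  60 },
    { phoneme: "UW", startMs:  60, durMs: 140 },
    { phoneme: "B",  startMs: 200, durMs:  70 },
    { phoneme: "IY", startMs: 270, durMs: 160 }
];

// the animation side: find the event covering the elapsed time
function eventAt(elapsedMs) {
    return timeline.find(e =>
        elapsedMs >= e.startMs && elapsedMs < e.startMs + e.durMs);
}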


How did you get the animation to stay synced with the sound?