Mesh of fingers jitter on mobile

  1. The mesh of the fingers jitters on some mobile devices (Mali-G77 & Mali-G610), but is fine on PC and iPhone.
  2. When I visualize the bones with the Inspector, the bones themselves are not jittering.
  3. The jittering is severe on the fingers, moderate on the forearm, and absent on other parts of the body. Could it be floating-point errors accumulated along the bone chain?

Any suggestions on how to debug this would be helpful.

https://youtube.com/shorts/abRyygeMI7w?feature=share

Welcome aboard!

It looks like it could indeed be a problem with floating-point errors.

You can try to set mesh.computeBonesUsingShaders = false for all meshes with bones, to see whether the problem comes from the GPU. When this property is false, all the skinning calculations are done on the CPU.
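For instance (a minimal sketch; scene is your Babylon.js Scene):

// Force CPU skinning for every skinned mesh, so the bone matrices are
// applied in JavaScript instead of in the vertex shader.
for (const mesh of scene.meshes) {
    if (mesh.skeleton) {
        mesh.computeBonesUsingShaders = false;
    }
}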


Thank you, that inspired me. It turns out that

skeleton.useTextureToStoreBoneMatrices = false

is enough to solve it. But the size of uniforms is limited, so I tried to fix the shader this way:

mat4 readMatrixFromRawSampler(sampler2D smp, float index)
{
    // each bone matrix is stored in 4 consecutive RGBA texels
    float offset = index * 4.0;
    float dx = boneTextureWidthInv;
    // sample at texel centers (+0.5), stepping by 1/width between texels
    float base = dx * (offset + 0.5);
    vec4 m0 = texture2D(smp, vec2(base, 0.));
    base += dx;
    vec4 m1 = texture2D(smp, vec2(base, 0.));
    base += dx;
    vec4 m2 = texture2D(smp, vec2(base, 0.));
    base += dx;
    vec4 m3 = texture2D(smp, vec2(base, 0.));
    return mat4(m0, m1, m2, m3);
}

The fingers stop jittering, but the console logs many errors (they are basically dumps of GLSL shader code). Is it because readMatrixFromRawSampler() is called in many shaders, not just the skinning shader? And it's weird that the shader can't compile but the game keeps running as if there were no error…

Edit: the compile errors were because I had forgotten to add ";" at the end of a few lines (fixed in the code above)… TypeScript has reshaped my brain…

I also tried this:

vec4 m0 = texture2D(smp, vec2(saturate(dx * (offset + 0.5)), 0.));

No jittering, but the console logs many errors (complaining about something like "Unable to compile effect").

readMatrixFromRawSampler is not used if you set useTextureToStoreBoneMatrices = false, so any change there won’t have any effect.

However, bone matrices are now passed through uniforms, and there is a limit on how many of these you can use in a shader. If you have too many bones, this method won't work as expected and you will get errors in the console log, and it seems that's what you are getting… To be sure, you should post the errors from your console here.
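To get an idea of the limit, you can query it at runtime (a rough sketch, assuming a WebGL context named gl; each mat4 takes 4 uniform vectors, and the skinning shader needs other uniforms besides the bones):

const maxVectors = gl.getParameter(gl.MAX_VERTEX_UNIFORM_VECTORS);
console.log("max vertex uniform vectors:", maxVectors);
console.log("upper bound on bones via uniforms:", Math.floor(maxVectors / 4));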

Yes, it is useTextureToStoreBoneMatrices = false that solved the problem, which narrows it down to:

  1. Passing floats to the GPU through a texture (as opposed to a uniform) is buggy on some devices
  2. readMatrixFromRawSampler is buggy on some devices (a texture2D() driver bug?)

I skip some texture updates with:

if (this.isUsingTextureForMatrices && mesh._transformMatrixTexture && Math.random() > 0.9) {
    // only upload new bone matrices on roughly 10% of frames
    mesh._transformMatrixTexture.update(mesh._bonesTransformMatrices);
}

and when the texture is not updated, the mesh isn't jittering. Does this mean readMatrixFromRawSampler is quite stable, so that speculation 2 can be ruled out?
Any suggestions on how to debug this?

Is mesh._transformMatrixTexture non-null? Are you exporting the model from Max? If so, can you export to a .gltf instead of a .babylon and see if that helps?

Is mesh._transformMatrixTexture non-null?

If useTextureToStoreBoneMatrices = false, then it is null. But on some devices the shader then fails to compile because there are too many uniforms. So I went back to

useTextureToStoreBoneMatrices = true

and tried to fix it.

First try: replace texture2D with texelFetch; still jittering.
Second try: use fixed-point numbers. I use int16 to represent float32.

this._transformMatrixTexture = RawTexture.CreateRGBATexture(
    this.i16transformMatrices,       // Int16Array replacing the Float32Array
    (this.bones.length + 1) * 4,     // 4 texels per bone matrix
    1,
    this._scene,
    false,                           // no mipmaps
    false,                           // no invertY
    Constants.TEXTURE_NEAREST_SAMPLINGMODE,
    Constants.TEXTURETYPE_SHORT
);

I also replaced Constants.TEXTUREFORMAT_RGBA with Constants.TEXTUREFORMAT_RGBA_INTEGER inside CreateRGBATexture. But the avatar does not show on the screen, and the console logs:

GL_INVALID_OPERATION: Mismatch between texture format and sampler type

Try TEXTURETYPE_HALFFLOAT instead of TEXTURETYPE_FLOAT to see if it's a problem with float texture support. A normalized SHORT or INTEGER type won't work, because we need the values to go outside the -1…1 range (the texture stores matrices).

So, you are using a model exported from Max. Can you export this model as a .glTF file (instead of a .babylon) and see if that helps? What's more, exporting to a .glTF file means you can test it with other glTF viewers (such as https://gltf-viewer.donmccurdy.com/).

Try TEXTURETYPE_HALFFLOAT instead of TEXTURETYPE_FLOAT to see if it’s a problem with float texture support

Wow. I get the same jittering on PC after

ToHalfFloat(this._transformMatrices[i])

It's surprising that some devices decide to use half float even when TEXTURETYPE_FLOAT is passed to CreateRGBATexture.
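To see the precision loss outside the engine, here is a round-trip sketch (hypothetical toHalf/fromHalf helpers doing the standard IEEE half-float bit-level conversion, with truncation instead of rounding and denormals flushed):

// Encode a float32 value as a half-float bit pattern.
function toHalf(v: number): number {
    const f32 = new Float32Array(1);
    const u32 = new Uint32Array(f32.buffer);
    f32[0] = v;
    const bits = u32[0];
    const sign = (bits >>> 16) & 0x8000;
    const exp = ((bits >>> 23) & 0xff) - 127 + 15;  // rebias the exponent
    const mant = (bits >>> 13) & 0x3ff;             // keep the top 10 mantissa bits
    if (exp <= 0) return sign;                      // flush tiny values to zero
    if (exp >= 31) return sign | 0x7c00;            // overflow to infinity
    return sign | (exp << 10) | mant;
}

// Decode a half-float bit pattern back to a JS number.
function fromHalf(h: number): number {
    const sign = (h & 0x8000) ? -1 : 1;
    const exp = (h >>> 10) & 0x1f;
    const mant = h & 0x3ff;
    if (exp === 0) return sign * mant * 2 ** -24;   // denormal
    if (exp === 31) return mant ? NaN : sign * Infinity;
    return sign * (1 + mant / 1024) * 2 ** (exp - 15);
}

// Around 200 a half float only resolves steps of 0.125, which is plenty to
// make a fingertip at the end of a long bone chain visibly jitter:
console.log(fromHalf(toHalf(200.06)));              // prints 200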

This is fixed by using TEXTURETYPE_INT.

  1. SpectorJS cannot tell you which internalFormat the hardware is actually using. I captured a frame on the mobile device and it reports the boneTexture as RGBA32F, but my last post shows that the problem is caused by the texture actually using half float.

  2. I read a blog claiming that "type is a hint for precision, but GL can choose any internal precision to store the texture". I'm not sure if the author is right, but on the glTexImage2D - OpenGL ES 3 Reference Pages I found this table:

Sized Internal Format   Format    Type                      Red  Green  Blue  Alpha
GL_RGBA16F              GL_RGBA   GL_HALF_FLOAT, GL_FLOAT   f16  f16    f16   f16
GL_RGBA32F              GL_RGBA   GL_FLOAT                  f32  f32    f32   f32

So both GL_HALF_FLOAT and GL_FLOAT can lead to GL_RGBA16F?

  3. For anyone who comes across the same issue:

First, create the texture with format Constants.TEXTUREFORMAT_RGBA_INTEGER.

Second, turn float32 into int32. In my project, all models are smaller than 2**11 cm, and 2**11 * 2**20 = 2**31 just fits an int32, so I think the scaling below is fine:

this.i32transformMatrices[i] = Math.round(this._transformMatrices[i] * (2 ** 20));

Third, change bonesDeclaration.fx:

#ifdef BONETEXTURE
precision highp isampler2D;
uniform isampler2D boneSampler;
uniform float boneTextureWidthInv;
#else
uniform mat4 mBones[BonesPerMesh];
#ifdef BONES_VELOCITY_ENABLED
uniform mat4 mPreviousBones[BonesPerMesh];
#endif
#endif

#ifdef BONETEXTURE
#define inline
mat4 readMatrixFromRawSampler(isampler2D smp, float index)
{
    // 4 consecutive RGBA32I texels per bone matrix
    int base = int(index * 4.0);
    // texelFetch reads exact texels (no filtering);
    // exp2(20.0) undoes the 2**20 fixed-point scaling applied on the CPU side
    vec4 m0 = vec4(texelFetch(smp, ivec2(base, 0), 0)) / exp2(20.0);
    vec4 m1 = vec4(texelFetch(smp, ivec2(base + 1, 0), 0)) / exp2(20.0);
    vec4 m2 = vec4(texelFetch(smp, ivec2(base + 2, 0), 0)) / exp2(20.0);
    vec4 m3 = vec4(texelFetch(smp, ivec2(base + 3, 0), 0)) / exp2(20.0);
    return mat4(m0, m1, m2, m3);
}
#endif
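Putting the first two steps together on the TypeScript side (a sketch; it assumes the CreateRGBATexture modification from my earlier post, i.e. TEXTUREFORMAT_RGBA_INTEGER, and an Int32Array this.i32transformMatrices of the same length as this._transformMatrices):

// Step two: fixed-point encode; |v| < 2**11 keeps v * 2**20 inside int32.
for (let i = 0; i < this._transformMatrices.length; i++) {
    this.i32transformMatrices[i] = Math.round(this._transformMatrices[i] * 2 ** 20);
}

// Step one: an integer texture, so the isampler2D above reads exact values.
this._transformMatrixTexture = RawTexture.CreateRGBATexture(
    this.i32transformMatrices,
    (this.bones.length + 1) * 4,     // 4 texels per bone matrix
    1,
    this._scene,
    false,                           // no mipmaps
    false,                           // no invertY
    Constants.TEXTURE_NEAREST_SAMPLINGMODE,
    Constants.TEXTURETYPE_INT
);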

There is another bonesDeclaration.fx in the ShadersWGSL folder:

 var boneSampler : texture_2d<f32>;

Should I change it into this?

 var boneSampler : texture_2d<i32>;

The doc says it is for WebGPU. My concern is: will WebGPU be enabled if Babylon detects that the device supports it, so that the shader then fails to execute (because I've messed up i32/f32)?

In my understanding of the spec, if a GPU indicates that it supports FLOAT textures, it shouldn’t be allowed to switch to HALF_FLOAT if you created a texture as FLOAT…
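For what it's worth, in WebGL2 you pass the sized internal format yourself when creating the texture, so the precision is nominally pinned (a raw-WebGL2 sketch; gl, width and matrices are assumptions here):

const tex = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, tex);
// Sized RGBA32F instead of unsized RGBA: the precision is part of the format.
gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA32F, width, 1, 0, gl.RGBA, gl.FLOAT, matrices);
// Float textures must be fetched without filtering here anyway.
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);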

Only if you want to use these changes in WebGPU. Babylon.js will not automatically switch to another engine. If you explicitly create a WebGL engine, you are guaranteed to work with WebGL.

I just found out that highp precision solves this elegantly:

uniform highp sampler2D boneSampler;

I read a blog saying that even if the CPU passes f32 to the GPU, the ALU can still use f16 when doing calculations. In OpenGL, mediump may be the default?

That's the first time I've heard that setting a precision has an actual impact on the output!

I wonder if we should add “highp” on our end, the problem being that if it’s not supported, it will generate an error (according to the spec)…

What do you think @sebavan?

MDN writes:" In WebGL 1, “highp float” support is optional in fragment shaders"

Does it mean, WebGL2.0 always support highp float, and in WebGL1.0 vertexShader support highp float? what about highp sampler2D ?
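One way to check at runtime seems to be gl.getShaderPrecisionFormat (a sketch; gl assumed). There doesn't seem to be an equivalent query for sampler precision:

// precision === 0 means highp float is not supported in that stage.
const fmt = gl.getShaderPrecisionFormat(gl.FRAGMENT_SHADER, gl.HIGH_FLOAT);
if (fmt) {
    console.log("fragment highp float precision bits:", fmt.precision);
    console.log("range:", fmt.rangeMin, fmt.rangeMax);
}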

Should we add an extra mode for it?

Do you mean, just like

bool skeleton.useTextureToStoreBoneMatrices

adding a boolean

bool skeleton.useHighpSampler

that decides whether to add the precision annotation in the GLSL or not?

Could you provide us with a repro? We would like to check on other platforms.

Also, can you check if Babylon.js is started in WebGL1 or WebGL2 mode on the platforms where it fails?
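For instance (a one-line sketch; engine being your Babylon.js Engine instance):

console.log("WebGL version:", engine.webGLVersion); // 1 or 2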

  1. My fix has been merged into master and has taken effect, otherwise I could just give you the URL of this project…

  2. You can reproduce it on a Mali GPU by taking a random character and playing an idle animation (idle makes it more noticeable, as in my YouTube video).

  3. You can reproduce the jittering on PC with RawTexture.CreateRGBATexture(…, Constants.TEXTURETYPE_HALF_FLOAT).

  4. Another, more noticeable way is to scale the Float32Array _transformMatrices. Scaling every element of a transform matrix uniformly makes no visual difference (the common factor cancels out in the perspective divide), unless an element overflows float16 (max finite value about 65504). In the video below, I scale every element by t (t is time). Some bones' matrices have elements around 200, so when t reaches about 300 the mesh of the hand disappears, and other parts of the mesh disappear as t increases… (This is a quick way to tell whether a platform is using f16: on PC, however you scale the elements, nothing happens.)

My phone prints this info in the Chrome console:

{"driver":"ANGLE (ARM, Mali-G610 MC6, OpenGL ES 3.2)","vender":"Google Inc. (ARM)","webgl":"WebGL 2.0 (OpenGL ES 3.0 Chromium)","os":"Android 10"}

Float16 is a newer feature aimed at improving performance, so I guess it is the newer GPUs that have this problem: the GL spec defaults sampler2D to lowp in the vertex shader, and the Mali GPU decides to use float16 for lowp.