Reducing VRAM usage for instances and thin instances

Currently, instances and thin instances use 4 vec4 buffers for the instanced world matrices.
But since the matrices are composed from TRS, or multiplied with other TRS matrices, their last row should always be 0, 0, 0, 1, and this last row is also transferred to the GPU.
If this last row were skipped and reconstructed on the GPU side, ~25% of the VRAM used by instancing could be saved (12 floats per instance instead of 16).
Regarding performance: since OpenGL matrices are column-major, removing the last row would reduce the chance of auto-vectorization by JS engines, but I cannot tell the exact impact unless some benchmarks are made.
For public APIs like thinInstanceSetBuffer, more copying could be needed to strip every 4 vec4 down to 4 vec3 (repacked as 3 vec4 attributes; see the sketch after the shader snippet below).
I know this could break custom shaders with customized handling of instances, and I don't expect this to land right now; I'm just leaving it as an open discussion, so feel free to move it to the "correct" category if I missed something.
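
A rough sketch of the reconstruction in the vertex shader (three vec4 attributes instead of four):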

attribute vec4 world0;
attribute vec4 world1;
attribute vec4 world2;

void main(void) {
    // Rebuild the four columns of the world matrix from the 12 packed floats;
    // the constant 0, 0, 0, 1 components are re-added here.
    vec4 instance0 = vec4(world0.xyz, 0.0);
    vec4 instance1 = vec4(world0.w, world1.xy, 0.0);
    vec4 instance2 = vec4(world1.zw, world2.x, 0.0);
    vec4 instance3 = vec4(world2.yzw, 1.0);
    mat4 instanceWorld = mat4(instance0, instance1, instance2, instance3);
}

It is an interesting idea. I agree that VRAM would be saved, but I'm wondering at what cost from the rendering standpoint, as we need to reconstruct the matrix every frame for all vertices.

I'm not skilled at measuring performance on the GPU, but some AI says “6-10 more instructions per vertex”.

Yeah.. this is where I'm a bit defensive of our approach. Not sure it is worth the cost.

Are we suffering from limited VRAM? Maybe in your use cases?

Mostly on mobile, especially old iOS devices. I can't even measure VRAM usage when Safari crashes.