How to keep buffers on the GPU when using compute shaders for instancing or vertex data generation

CrashMaster · November 26, 2023, 7:34pm

Hello everyone,
I have been learning the ways of compute shaders recently, and I am now using them to generate an instance buffer like in this PG:

The compute shader creates an instance buffer on the GPU, which is then copied to the CPU at the end of the compute dispatch so that I can pass it to thinInstanceSetBuffer, which sends it back to the GPU (that is my understanding, at least).

Copying all this data back and forth is not ideal and there is probably some trick to keep all the data on the GPU which would be a sizeable improvement. Does someone have an idea?

Evgeni_Popov · November 27, 2023, 12:13pm

You can’t use your own buffers with thin instances, so what you want to do is currently not possible.

You could try to hack it by creating your own vertex buffer based on your storage buffer (you can pass storageBuffer.getBuffer() as the second parameter of the VertexBuffer constructor and the kind, e.g. "world", as the third) and setting it on the mesh with mesh.setVerticesBuffer(your_vertex_buffer). However, for thin instances, the matrices are passed as 4 vertex buffers, world0/world1/world2/world3 (worldX corresponds to row X of the matrix). So you would need to change your compute shader to fill 4 storage buffers, and you would have to create 4 vertex buffers.
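
To make that layout concrete, here is a minimal CPU-side sketch of the split: it takes a packed array of 4x4 matrices (16 floats per instance) and produces four arrays holding rows 0 to 3, which is the shape the compute shader would have to write into its four storage buffers. The names `splitWorldMatrices` and `WorldRows` are hypothetical and not part of Babylon.js; in the actual hack this work would happen in the shader, on the GPU.

```typescript
// Hypothetical helper (not a Babylon.js API): illustrates the data layout
// needed for world0 / world1 / world2 / world3, one array per matrix row.
type WorldRows = [Float32Array, Float32Array, Float32Array, Float32Array];

function splitWorldMatrices(matrices: Float32Array): WorldRows {
    const FLOATS_PER_MATRIX = 16;
    const FLOATS_PER_ROW = 4;
    if (matrices.length % FLOATS_PER_MATRIX !== 0) {
        throw new Error("Expected a multiple of 16 floats (one 4x4 matrix per instance)");
    }
    const instanceCount = matrices.length / FLOATS_PER_MATRIX;
    const rows: WorldRows = [
        new Float32Array(instanceCount * FLOATS_PER_ROW),
        new Float32Array(instanceCount * FLOATS_PER_ROW),
        new Float32Array(instanceCount * FLOATS_PER_ROW),
        new Float32Array(instanceCount * FLOATS_PER_ROW),
    ];
    for (let i = 0; i < instanceCount; i++) {
        rows.forEach((row, r) => {
            // Row r of instance i starts at float i * 16 + r * 4 in the packed array.
            const src = i * FLOATS_PER_MATRIX + r * FLOATS_PER_ROW;
            row.set(matrices.subarray(src, src + FLOATS_PER_ROW), i * FLOATS_PER_ROW);
        });
    }
    return rows;
}
```

Each of the four resulting arrays has 4 floats per instance, so each corresponding vertex buffer would use a stride of 4.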

CrashMaster · November 27, 2023, 1:26pm

I see. Is this a technical limitation of WebGPU? I will try to implement the hack in the PG and post it here later. Thanks for the answer!

In the case of VertexData (positions, normals, indices) generated for a simple mesh by a compute shader, what would the hack look like?

Evgeni_Popov · November 27, 2023, 1:34pm

It’s not a limitation of WebGPU: our current implementation of thin instances predates WebGPU, so it does not support storage buffers.

For a simple mesh it is quite easy:

create a storage buffer
create a vertex buffer from this storage buffer: vb = new VertexBuffer(engine, storageBuffer.getBuffer(), "position", ...)
set this vertex buffer on your mesh: mesh.setVerticesBuffer(vb)

You can see an example in the boids compute shader example from the documentation.
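
The three steps above could look roughly like the sketch below. To keep it self-contained, it declares minimal structural stand-ins (`StorageBufferLike`, `MeshLike`, `VertexBufferFactory`) in place of the Babylon.js types; those names and the helper `bindStorageAsVertexBuffer` are hypothetical. As far as I know, the kind ("position") is given to the VertexBuffer constructor and setVerticesBuffer takes the vertex buffer itself. The stride is an assumption about how your compute shader packs its output.

```typescript
// Hypothetical stand-ins for the Babylon.js types, so the sketch type-checks on its own.
interface StorageBufferLike {
    getBuffer(): unknown; // StorageBuffer.getBuffer() returns the underlying GPU buffer
}
interface MeshLike<VB> {
    setVerticesBuffer(buffer: VB): unknown;
}
// In a real scene this would wrap the VertexBuffer constructor, for example:
//   (data, kind, stride) => new BABYLON.VertexBuffer(engine, data, kind, false, false, stride)
type VertexBufferFactory<VB> = (data: unknown, kind: string, stride: number) => VB;

function bindStorageAsVertexBuffer<VB>(
    storage: StorageBufferLike,
    mesh: MeshLike<VB>,
    kind: string,
    stride: number,
    createVertexBuffer: VertexBufferFactory<VB>
): VB {
    // Step 2: build the vertex buffer directly on top of the GPU storage buffer.
    const vertexBuffer = createVertexBuffer(storage.getBuffer(), kind, stride);
    // Step 3: attach it to the mesh; no CPU copy of the vertex data is involved.
    mesh.setVerticesBuffer(vertexBuffer);
    return vertexBuffer;
}
```

Passing the constructor in as a factory is only a device to keep the sketch independent of the library; in a playground you would call the Babylon.js classes directly.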

CrashMaster · November 27, 2023, 1:48pm

That looks fairly easy! I will also create a small PG for that one.

Just to clarify, is using the storage buffer incompatible with a physics engine? I assume that with the physics engine residing on the CPU, while the vertex data resides on the GPU, it may be impossible to account for collisions. I am currently weighing my options here, haha.

Evgeni_Popov · November 27, 2023, 2:39pm

Yes, it’s incompatible with a physics engine, because the physics engine needs to access the matrices of the thin instances on the CPU.

CrashMaster · November 27, 2023, 2:47pm

Vertex data generated by compute shaders:

without skipping the CPU: https://playground.babylonjs.com/#25VLT7#5 (works)
with skipping the CPU: https://playground.babylonjs.com/#25VLT7#8 (does not work)

It almost works! The last problem is with setting the index buffer. I found the topic "Updating Indices of a Mesh: `updateIndices` vs `setVerticesBuffer`", where it is advised to use the updateIndices method instead of setVerticesBuffer.

The issue is that updateIndices takes an IndicesArray (see the Babylon.js API documentation) and, sadly, does not accept my storage buffer. I also saw that the boids compute shader example uses a CPU buffer for the indices.

Is there a workaround to set the index buffer from the GPU storage buffer?

CrashMaster · November 27, 2023, 3:27pm

I am working on the thin instance hack, but I am struggling with setVerticesBuffer. I can indeed get world0, world1, world2 and world3 properly and I can create the associated buffers, but how do I tell my grass blade to create thin instances out of them?

This PG has a boolean to toggle CPU skipping:

Evgeni_Popov · November 27, 2023, 4:07pm

Here’s a PG that works:

You must explicitly set geometry._indexBuffer by hand and create a submesh (this is done for you when you call mesh.setIndices, but you can’t call it in this case).

Also, you must create your storage buffers with the right flags: BUFFER_CREATIONFLAG_READWRITE combined with either BUFFER_CREATIONFLAG_VERTEX or BUFFER_CREATIONFLAG_INDEX, depending on the buffer.
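
Since these creation flags are bit flags, combining them amounts to a bitwise OR (adding them gives the same result as long as the bits are distinct). Below is a minimal sketch with local stand-in constants: the numeric values are assumptions for illustration only, and real code should read them from BABYLON.Constants. The helper name `storageFlagsFor` is hypothetical.

```typescript
// Local stand-ins for BABYLON.Constants.BUFFER_CREATIONFLAG_*.
// ASSUMPTION: the numeric values are illustrative; use the library constants in real code.
const BUFFER_FLAGS = {
    READWRITE: 3,
    VERTEX: 8,
    INDEX: 16,
} as const;

type BufferRole = "vertex" | "index";

// Returns the creation flags for a storage buffer that will also be bound
// as a vertex buffer or as an index buffer.
function storageFlagsFor(role: BufferRole): number {
    const roleFlag = role === "vertex" ? BUFFER_FLAGS.VERTEX : BUFFER_FLAGS.INDEX;
    return BUFFER_FLAGS.READWRITE | roleFlag;
}

// Possible usage with Babylon.js loaded (not executed here):
//   new BABYLON.StorageBuffer(engine, byteSize, storageFlagsFor("vertex"));
```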

Lastly, you can’t use material.wireframe = true, because it triggers an index buffer creation to properly render lines, which expects to be able to read from the existing index buffer.

This PR will simplify things a bit:

Once it’s merged, this PG will work:

Evgeni_Popov · November 27, 2023, 6:01pm

This PR will let you force an instance count for thin instances:

You also have to indicate that your vertex buffer uses instancing by setting the instanced: true option.

Once the PR is merged, this PG will work:

CrashMaster · November 27, 2023, 9:38pm

Wow, that’s awesome! Thank you, I am very excited for the next release, haha.

I will mark this as solved then, thank you again
