How to keep buffers on the GPU when using compute shaders for instancing or vertex data generation

Hello everyone,
I have been learning the ways of compute shaders recently, and I am now using them to generate an instance buffer like in this PG:

The compute shader creates an instance buffer on the GPU. At the end of the compute dispatch, the buffer is copied to the CPU so that I can pass it to thinInstanceSetBuffer, which sends it to the GPU again (that's what I understand, at least).

Copying all this data back and forth is not ideal and there is probably some trick to keep all the data on the GPU which would be a sizeable improvement. Does someone have an idea?

You can’t use your own buffers with thin instances, so what you want to do is currently not possible.

You could try to hack it by creating your own vertex buffer based on your storage buffer (you can pass storageBuffer.getBuffer() as the second parameter of the VertexBuffer constructor) and setting it on the mesh with mesh.setVerticesBuffer(your_vertex_buffer), where the buffer kind (such as "world") is given when you construct the vertex buffer. However, for thin instances, the matrices are passed as 4 vertex buffers: world0, world1, world2 and world3 (worldX corresponds to row X of the matrix). So you would need to change your compute shader to fill 4 storage buffers, and you would have to create 4 vertex buffers.
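To make the world0..world3 layout concrete, here is a small CPU-side sketch of the split the compute shader would have to do on the GPU. It assumes the matrices are packed as 16 consecutive floats per instance, row after row, with the translation in the last row. The function name splitWorldRows is made up for this illustration and is not part of Babylon.js.

```typescript
// Number of floats in one 4x4 world matrix.
const FLOATS_PER_MATRIX = 16;

/**
 * Splits a packed array of 4x4 matrices (16 consecutive floats per instance,
 * stored row after row) into four arrays holding row 0, 1, 2 and 3 of every
 * instance, i.e. the data that would go into world0, world1, world2, world3.
 * Hypothetical helper for illustration only.
 */
function splitWorldRows(matrices: Float32Array): Float32Array[] {
    if (matrices.length % FLOATS_PER_MATRIX !== 0) {
        throw new Error("matrix data must contain a multiple of 16 floats");
    }
    const instanceCount = matrices.length / FLOATS_PER_MATRIX;
    // One output array per matrix row, 4 floats per instance in each.
    const rows = [0, 1, 2, 3].map(() => new Float32Array(instanceCount * 4));
    for (let i = 0; i < instanceCount; i++) {
        for (let r = 0; r < 4; r++) {
            const src = i * FLOATS_PER_MATRIX + r * 4;
            rows[r].set(matrices.subarray(src, src + 4), i * 4);
        }
    }
    return rows;
}

// Usage: an identity matrix followed by a translation to (5, 6, 7).
const packed = new Float32Array([
    1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1,
    1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  5, 6, 7, 1,
]);
const [world0, world1, world2, world3] = splitWorldRows(packed);
```

In the GPU version, each of the four outputs would be its own storage buffer written by the compute shader, and each would back one vertex buffer with 4 floats per instance.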

I see. Is this a technical limitation of WebGPU? I will try to implement the hack in the PG and post it here later. Thanks for the answer!

In the case of VertexData (positions, normals, indices) generated for a simple mesh by a compute shader, what would the hack look like?

It’s not a limitation of WebGPU. Our current implementation of thin instances predates WebGPU, so it does not support storage buffers.

For a simple mesh it is quite easy:

  • create a storage buffer
  • create a vertex buffer by using this storage buffer: vb = new VertexBuffer(engine, storageBuffer.getBuffer(), ...)
  • set this vertex buffer on your mesh: mesh.setVerticesBuffer(vb) (the kind, e.g. "position", is passed to the VertexBuffer constructor)

You can see an example in the boids compute shader example from the documentation.

That looks fairly easy! I will also create a small PG for that one.

Just to clarify, is using the storage buffer incompatible with a physics engine? I assume that with the physics engine residing on the CPU, while the vertex data resides on the GPU, it may be impossible to account for collisions. I am currently weighing my options here, haha.

Yes, it’s incompatible with a physics engine, because the physics engine needs to access the matrices of the thin instances on the CPU.

Vertex data generated by compute shaders:

It almost works! The last problem is with setting the index buffer. I found the thread "Updating Indices of a Mesh: `updateIndices` vs `setVerticesBuffer`", where it is advised to use the updateIndices method instead of setVerticesBuffer.

The issue is that updateIndices takes an IndicesArray and sadly does not accept my storage buffer. I also saw that the boids compute shader example uses a CPU buffer for the indices.

Is there a workaround to set the index buffer from a GPU storage buffer?

I am working on the thin instance hack, but I am struggling with setVerticesBuffer. I can get world0, world1, world2 and world3 properly and I can create the associated buffers, but how do I tell my grass blade to create thin instances out of them?

This PG has a boolean to toggle CPU skipping:

Here’s a PG that works:

You must explicitly set geometry._indexBuffer by hand and create a submesh (this is done for you when you call mesh.setIndices, but you can’t call it in this case).

Also, you must create your storage buffers with the right flags: BUFFER_CREATIONFLAG_READWRITE + BUFFER_CREATIONFLAG_VERTEX or BUFFER_CREATIONFLAG_INDEX, depending on the buffer.
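A minimal sketch of combining those flags, assuming they are bit masks (so a bitwise OR gives the same result as the + above when the bits do not overlap). To keep the snippet runnable without Babylon.js, the constants are passed in as a parameter. The helper storageBufferFlags and the numeric values in placeholderFlags are made up for this illustration and are not claimed to match the real Constants values.

```typescript
// Shape of the constants used below. In Babylon.js they are static members
// of the Constants class, so Constants itself should fit this interface.
interface BufferFlagSource {
    BUFFER_CREATIONFLAG_READWRITE: number;
    BUFFER_CREATIONFLAG_VERTEX: number;
    BUFFER_CREATIONFLAG_INDEX: number;
}

// Hypothetical helper: returns the creation flags for a storage buffer that
// the compute shader writes and that is then bound as vertex or index data.
function storageBufferFlags(flags: BufferFlagSource, usage: "vertex" | "index"): number {
    const usageFlag = usage === "vertex"
        ? flags.BUFFER_CREATIONFLAG_VERTEX
        : flags.BUFFER_CREATIONFLAG_INDEX;
    return flags.BUFFER_CREATIONFLAG_READWRITE | usageFlag;
}

// Placeholder bit values so the snippet runs on its own (assumption, not the
// real Babylon.js values).
const placeholderFlags: BufferFlagSource = {
    BUFFER_CREATIONFLAG_READWRITE: 0b0011,
    BUFFER_CREATIONFLAG_VERTEX: 0b0100,
    BUFFER_CREATIONFLAG_INDEX: 0b1000,
};

const vertexFlags = storageBufferFlags(placeholderFlags, "vertex");
const indexFlags = storageBufferFlags(placeholderFlags, "index");
console.log(vertexFlags.toString(2), indexFlags.toString(2));
```

In a real scene you would pass Babylon's Constants instead of placeholderFlags and hand the result to the StorageBuffer constructor as its creation flags.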

Lastly, you can’t use material.wireframe = true, because it triggers an index buffer creation to properly render lines, which expects to be able to read from the existing index buffer.

This PR will simplify things a bit:

Once it’s merged, this PG will work:

This PR will let you force an instance count for thin instances:

You also have to indicate that your vertex buffer uses instancing by setting the instanced: true option.

Once the PR is merged, this PG will work:

Wow wow that’s awesome! Thank you I am very excited for the next release haha

I will mark this as solved then, thank you again 🙂