I’m not familiar with all that can be done within a vertex shader but I can provide detail on the data needing to be updated.
Focusing on ContactPoints:
All matrices are currently updated because that is how thinInstances are positioned. For ContactPoints, only the position changes, so I create a “Translation Matrix” (an identity matrix plus three translation components, which sit contiguously at indices 12 to 14 of the 16-float array) and then push the entire matrix array for all thinInstances to the GPU. Pushing only the new position vectors would cut this to 3/16 of the data. Even if I abandon the “persistence” aspect of ContactPoints, we’re still left with all-new ContactPoint positions every frame. Further, updating a list of positions within a pre-allocated GPU buffer, along with a count of valid positions (located at the start of that buffer), would minimize GPU buffer re-allocations.
The overall savings I think would be immense.
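To make the positions-only idea concrete, here is a minimal CPU-side sketch of the packing step. `Point3`, `createPositionBuffer` and `packPositions` are hypothetical names, not Babylon.js APIs, and storing the count as a float in slot 0 is just one possible layout. How the buffer is then bound on the GPU side is left open.

```typescript
// Hypothetical helpers for illustration; none of these are Babylon.js APIs.
interface Point3 { x: number; y: number; z: number; }

const HEADER_FLOATS = 1; // slot 0 holds the number of valid positions

// Allocate once, sized for the largest expected number of contact points.
function createPositionBuffer(capacity: number): Float32Array {
    return new Float32Array(HEADER_FLOATS + capacity * 3);
}

// Overwrite the buffer with this frame's positions; returns how many were written.
function packPositions(buffer: Float32Array, points: readonly Point3[]): number {
    const capacity = Math.floor((buffer.length - HEADER_FLOATS) / 3);
    const count = Math.min(points.length, capacity); // points beyond capacity are dropped
    for (let i = 0; i < count; i++) {
        const offset = HEADER_FLOATS + i * 3;
        buffer[offset] = points[i].x;
        buffer[offset + 1] = points[i].y;
        buffer[offset + 2] = points[i].z;
    }
    buffer[0] = count;
    return count;
}

const positionBuffer = createPositionBuffer(1024);
const written = packPositions(positionBuffer, [
    { x: 1, y: 2, z: 3 },
    { x: 4, y: 5, z: 6 },
]);

// Per-frame payload for the valid points, in bytes (4 bytes per float).
const bytesAsPositions = written * 3 * 4;
const bytesAsMatrices = written * 16 * 4;
```

For the two points above, the position payload is 24 bytes against 128 bytes for full matrices, which is the 3/16 ratio.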
It doesn’t have to be thinInstances at all, but could easily be Instances with the same material. All I need to update are a multitude of mesh (sphere) positions, where each mesh represents a single contact point.
Then, if persistence were needed, I could separate the updates into “blocks of positions” where a certain number would remain the same and a certain number would be new. Because the position data in this case is constantly rotating, it’s a little difficult to keep all the data in contiguous blocks, especially when the new data is larger than the data being rotated out. I’m not sure of the capabilities of a vertex shader, but I could further minimize the data sent to only 1) the number of contiguous blocks to be removed and 2) the new block of position data. If the GPU moved all old position data that has not been removed (i.e. data not yet rotated out) to the beginning of the buffer, then new data could always be appended to that old data.
Again, I’m not exactly sure if this fits into what a vertex shader is capable of.
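The "remove the oldest, slide the rest to the front, append the new" scheme can be sketched on the CPU as follows. This is only an illustration of the bookkeeping, assuming positions are packed as x, y, z triples with the oldest first. `rotatePositions` is a hypothetical helper, and whether the same move can be done on the GPU is exactly the open question above. As far as I know, a vertex shader only reads its input buffers, so a GPU version would probably need a different mechanism.

```typescript
// Hypothetical helper: `buffer` holds packed x,y,z triples, oldest first,
// and `count` is the number of valid positions currently stored.
function rotatePositions(
    buffer: Float32Array,
    count: number,
    removeCount: number,
    incoming: Float32Array,
): number {
    const capacity = Math.floor(buffer.length / 3);
    const removed = Math.min(Math.max(removeCount, 0), count);
    const kept = count - removed;

    // Slide the surviving positions to the start of the buffer.
    buffer.copyWithin(0, removed * 3, count * 3);

    // Append as many new positions as still fit after the survivors.
    const added = Math.min(Math.floor(incoming.length / 3), capacity - kept);
    buffer.set(incoming.subarray(0, added * 3), kept * 3);

    return kept + added; // new count of valid positions
}

// Usage: capacity of 4 positions, 3 currently valid.
const store = new Float32Array(4 * 3);
store.set([1, 1, 1, 2, 2, 2, 3, 3, 3]);
const newCount = rotatePositions(store, 3, 1, new Float32Array([4, 4, 4, 5, 5, 5]));
```

Because survivors always end up at the front, the new block can be larger than the block rotated out, as long as the pre-allocated capacity is not exceeded.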
On the CPU, I collect the positions one at a time, one per collision Observable notification, in an Array. Even that could perhaps be optimized if I were able to cycle once through all collisions since the last frame and skip the numerous collision observer notifications.
For your reference, the perFrame update for ContactPoints currently looks like this (in TypeScript).
    perFrame(edata: any, estate: any): void {
        const previousPointCount = this.#contactSphere.thinInstanceCount;
        const newPointCount = this.#contactPoints.length;
        const smaller = Math.min(previousPointCount, newPointCount);
        // Overwrite the matrices of thin instances that already exist.
        for (let i = 0; i < smaller; i++) {
            const p = this.#contactPoints.array[i];
            this.#contactSphere.thinInstanceSetMatrixAt(i,
                BABYLON.Matrix.TranslationToRef(p.x, p.y, p.z, this.#matrix),
                false);
        }
        // Append thin instances for any points beyond the previous count.
        for (let i = smaller; i < newPointCount; i++) {
            const p = this.#contactPoints.array[i];
            this.#contactSphere.thinInstanceAdd(
                BABYLON.Matrix.TranslationToRef(p.x, p.y, p.z, this.#matrix),
                false);
        }
        // Trim the count if there are now fewer points, then push all matrices to the GPU.
        this.#contactSphere.thinInstanceCount = newPointCount;
        this.#contactSphere.thinInstanceBufferUpdated("matrix");
        // Hide the sphere when there are no contact points to draw.
        if (this.#contactSphere.isVisible !== this.#contactSphere.hasThinInstances) {
            this.#contactSphere.isVisible = this.#contactSphere.hasThinInstances;
        }
        this.#contactPoints.elapse(edata.deltaTime); // removes old points
    }
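As an intermediate step that keeps thinInstances and matrices, one option might be to own a pre-allocated matrix array and write only the three translation floats per instance, rather than calling `thinInstanceSetMatrixAt` / `thinInstanceAdd` for each point. The sketch below is untested against a real scene. `Vec3Like`, `createIdentityMatrices` and `writeTranslations` are hypothetical names, and the Babylon.js calls in the trailing comment show assumed usage only.

```typescript
// Hypothetical helpers for illustration; only the commented calls at the end are Babylon.js APIs.
interface Vec3Like { x: number; y: number; z: number; }

// Allocate `capacity` identity matrices once (16 floats each).
function createIdentityMatrices(capacity: number): Float32Array {
    const m = new Float32Array(capacity * 16);
    for (let i = 0; i < capacity; i++) {
        const o = i * 16;
        m[o] = 1; m[o + 5] = 1; m[o + 10] = 1; m[o + 15] = 1;
    }
    return m;
}

// Write only the translation slots (indices 12 to 14 of each matrix).
// Returns the number of instances written.
function writeTranslations(matrices: Float32Array, points: readonly Vec3Like[]): number {
    const capacity = Math.floor(matrices.length / 16);
    const count = Math.min(points.length, capacity);
    for (let i = 0; i < count; i++) {
        const o = i * 16 + 12;
        matrices[o] = points[i].x;
        matrices[o + 1] = points[i].y;
        matrices[o + 2] = points[i].z;
    }
    return count;
}

const matrices = createIdentityMatrices(2);
const instanceCount = writeTranslations(matrices, [{ x: 7, y: 8, z: 9 }]);

// Assumed usage with a Babylon.js mesh (not executed here):
//   once:      sphere.thinInstanceSetBuffer("matrix", matrices, 16, false);
//   per frame: sphere.thinInstanceCount = instanceCount;
//              sphere.thinInstanceBufferUpdated("matrix");
```

This still uploads full matrices, so it does not give the 3/16 saving, but it should avoid growing the buffer as points are added.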