Mesh Intersection in Screen Space

I am looking for an algorithm - or if somebody is familiar with it, a function in Babylon.js - that tells you if two meshes intersect in screen space.

While there may not be intersections in 3D space, depending on the camera viewpoint, meshes can intersect when rendered.

Image on link - the tube connected to the GUI plane does not intersect the mesh in 3D space, but we can see that in “screen space” it actually does.

Imgur

That is an interesting question that could appeal to @syntheticmagus

Hi semirel,

Very interesting use case. I don’t know of any existing utilities for this (though they might still exist under a name I just don’t know to search for), but the idea of how to do this shouldn’t be too tricky. This is actually exactly the sort of thing I would use a compute shader for; but unfortunately we don’t have compute shaders on the Web (yet :smile:), so we’d have to do it a different way.

The core idea behind the approach is just going to be checking geometric overlap after projecting everything to screen space. That part is easy: we just multiply all vertices from both meshes by the appropriate MVP matrix, just as in normal rendering. This will transform all triangles in both meshes to be flat triangles on the same plane, the screen plane. Now we just have to check if they intersect.
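
As a minimal sketch of that projection step in Babylon.js (assuming scene, camera, and engine are in scope, and using BABYLON.Vector3.Project to apply the model-view-projection and viewport transforms; the helper name is made up for illustration), something like this should give you the screen-space positions to work with:

// Minimal sketch: project a mesh's vertices into screen (pixel) coordinates.
// Assumes scene, camera, and engine are in scope.
function projectMeshVerticesToScreen(mesh, scene, camera, engine) {
    var positions = mesh.getVerticesData(BABYLON.VertexBuffer.PositionKind);
    var worldMatrix = mesh.getWorldMatrix();
    var viewport = camera.viewport.toGlobal(engine.getRenderWidth(), engine.getRenderHeight());
    var transform = scene.getTransformMatrix(); // view * projection

    var projected = [];
    for (var i = 0; i < positions.length; i += 3) {
        var v = new BABYLON.Vector3(positions[i], positions[i + 1], positions[i + 2]);
        // Vector3.Project applies world * view * projection and maps the result into the viewport.
        projected.push(BABYLON.Vector3.Project(v, worldMatrix, transform, viewport));
    }
    return projected; // x/y are pixel coordinates; z is depth and can be ignored for 2D overlap.
}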

The core consideration for implementing this will be deciding whether you want to do it on the CPU or on the GPU. With compute shaders, this would almost certainly be an easy choice in favor of the GPU. However, since we don’t have compute shaders and so will have to get a little creative to use the GPU, it’s a bit more of a choice. I’ll describe some top-of-my-head thoughts about both approaches in a minute, but first some wild speculation about what the upsides/downsides of each approach might be:

  • CPU: easier to think about, easier/more familiar to implement, able to take advantage of more clever optimizations, likely a little faster when operating on simple meshes.
  • GPU: trickier to think about and implement, won’t be able to use certain optimizations but also won’t need them, likely to be much faster when operating on complex meshes.

The CPU approach is dead simple: for each triangle in mesh #1, check against every triangle in mesh #2 to see if they overlap. (Note that checking triangle overlap is a surprisingly annoying operation, though not too complicated, because triangles can be long and weird and fiddly to think about.) This is O(mn) in mesh triangle count, which is far from ideal, but there are a lot of optimizations you can do like spatially sorting triangles before you compare them. For extreme cases, you could get quite fancy by building a spatial data structure for one of the meshes, which you could then compare against in an optimized manner; this would be particularly useful if you had one relatively complex mesh you needed to compare a lot of other meshes to, since you could pay the cost of building the data structure once and reap the benefits on every comparison thereafter. Bottom line, the CPU-based approach is likely to be the easiest to implement and the easiest to customize for a particular use case; the downside is that it’ll likely be much less resilient to mesh scale than a GPU-based approach could be.
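
To give a concrete idea of the per-pair check, here is a rough sketch in plain JavaScript, operating on the screen-space x/y coordinates produced by the projection step (the function names are made up for illustration, and it ignores degenerate cases like exactly collinear edges):

// Rough sketch of a 2D triangle-triangle overlap test on projected (screen-space) points.
// Points are objects with x/y; triangles are arrays of three such points.
function sign(p, a, b) {
    // Cross product sign: which side of the segment a->b the point p lies on.
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

function pointInTriangle(p, t) {
    var d1 = sign(p, t[0], t[1]);
    var d2 = sign(p, t[1], t[2]);
    var d3 = sign(p, t[2], t[0]);
    var hasNeg = d1 < 0 || d2 < 0 || d3 < 0;
    var hasPos = d1 > 0 || d2 > 0 || d3 > 0;
    return !(hasNeg && hasPos); // all on the same side (or on an edge) means inside
}

function segmentsIntersect(a, b, c, d) {
    // Proper intersection test via orientation signs (collinear overlaps are ignored).
    var d1 = sign(c, a, b), d2 = sign(d, a, b), d3 = sign(a, c, d), d4 = sign(b, c, d);
    return ((d1 > 0) !== (d2 > 0)) && ((d3 > 0) !== (d4 > 0));
}

function trianglesOverlap2D(t1, t2) {
    // Two triangles overlap if any pair of edges crosses, or one triangle contains
    // a vertex of the other (which covers the fully-contained case).
    for (var i = 0; i < 3; i++) {
        for (var j = 0; j < 3; j++) {
            if (segmentsIntersect(t1[i], t1[(i + 1) % 3], t2[j], t2[(j + 1) % 3])) {
                return true;
            }
        }
    }
    return pointInTriangle(t1[0], t2) || pointInTriangle(t2[0], t1);
}

Comparing mesh #1 against mesh #2 is then just a double loop over their projected triangles calling trianglesOverlap2D, which is exactly where the O(mn) cost and the spatial-sorting optimizations come in.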

On the GPU side, there are a few approaches you could consider. You could, for example, do basically the same thing you’d do on the CPU, but in a massively-parallel way. To accomplish this without compute shaders, you’d probably have to encode your mesh data into textures, then render an m by n texture with the value of the pixel at (i, j) representing whether mesh #1’s i-th triangle intersects with mesh #2’s j-th triangle. That’d be fairly annoying to write, but doable; and you likely wouldn’t need to bother with optimizations like spatial sorting because the GPU wouldn’t really care much about them anyway. The other annoying thing about this approach is that you’d have to scan the resulting texture looking for “true” pixels to get the actual answer as to whether there was an overlap or not; again, doable, but annoying. An approach like this is likely to be your best bet if you need very precise answers when comparing two potentially very complex meshes.
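
As a very rough illustration of the “encode your mesh data into textures” step, a sketch like the following might work; the packing layout (one RGBA texel per projected vertex) is an arbitrary choice for illustration, the m-by-n comparison shader itself is omitted, and you should double-check the RawTexture signature against your Babylon.js version:

// Hedged sketch: pack projected triangles into a float texture so a fragment shader
// can fetch triangle i or j by texel index. One texel per vertex, RGBA = (x, y, 0, 1).
function packTrianglesIntoTexture(projectedTriangles, scene) {
    var texelCount = projectedTriangles.length * 3;   // three vertices per triangle
    var data = new Float32Array(texelCount * 4);      // four channels per texel
    projectedTriangles.forEach(function(tri, t) {
        tri.forEach(function(v, i) {
            var base = (t * 3 + i) * 4;
            data[base] = v.x;
            data[base + 1] = v.y;
            data[base + 2] = 0;
            data[base + 3] = 1;
        });
    });
    return BABYLON.RawTexture.CreateRGBATexture(
        data, texelCount, 1, scene,
        false,                                    // no mipmaps
        false,                                    // no vertical flip
        BABYLON.Texture.NEAREST_SAMPLINGMODE,     // exact texel fetches, no filtering
        BABYLON.Engine.TEXTURETYPE_FLOAT);        // keep full float precision
}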

Another GPU option would be to just clear the screen to black, then render the relevant meshes to it using a shader that just increments every pixel it writes to; any pixel that has a value more than 1 after that’s done indicates a collision in screen space. This has the obvious upside that it’s going to be very easy to write, and you can use it to check for screen space collisions among an arbitrary number of meshes with a single draw call per mesh. I can think of two downsides, however: (1) this method is likely to do a lot more work than it needs to, especially if you render at a reasonably high resolution, and you’ll still have to scan the resulting image to determine whether anything collided or not; and (2) if you attempt to optimize by rendering at a lower resolution, the precision of your answers will begin to degrade, and small triangles especially may start failing to properly report collisions.
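
If you went that route, the final scan could be a simple pass over the read-back pixels, something like this (assuming you have already read the render target back into a Uint8Array of RGBA values, for example via engine.readPixels, and that the counting shader wrote a fixed value per covered pixel into the red channel with additive blending):

// Hedged sketch: scan a read-back RGBA pixel buffer for overdraw.
// perMeshValue is whatever the counting shader writes for a single covered pixel,
// so anything greater means at least two meshes landed on that pixel.
function hasScreenSpaceOverlap(pixels, perMeshValue) {
    for (var i = 0; i < pixels.length; i += 4) {  // step by 4: R, G, B, A per pixel
        if (pixels[i] > perMeshValue) {
            return true;
        }
    }
    return false;
}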

Okay, that’s probably way more speculation than was helpful, but Deltakosh was right that I’d find this interesting. :smiley: Of those three methods, which will work best probably depends on your use case. If you don’t care about precision and performance is meaningful but doesn’t have to be perfect, option #3 might be good for you. If you need precision but your meshes don’t get too huge and/or there are use-case-specific optimizations you want to take advantage of, consider option #1. And finally, if you need precision and want to be able to run as fast as possible on arbitrary and potentially complex meshes, option #2 might be your best bet.

Hope some of that was helpful, or at least interesting. Good luck!

Thank you very much for your detailed response. I am in the process of implementing this - and if successful, I will make sure I share the code.

I am going to go with option 1: the CPU implementation.

I only have a single LineSegment to compare the intersection with - so it’s not mesh-to-mesh intersection checking, but rather LineSegment-to-mesh intersection.

My plan is to iterate over all triangles (do I need to enable FacetData as per this instruction: Use Facet Data - Babylon.js Documentation?) and then check whether any of the sides of each triangle intersect with my line segment.

I am not sure whether there are tricks to speed this up.

So, here is the implementation of the CPU option. It relies on the classes LineSegment (just a start and an end) and LineUtils, which contains a doIntersect method that returns whether two line segments intersect. If there is something obvious I am missing, or if this could be sped up, please let me know @Deltakosh, @syntheticmagus

/**
 * Returns whether a leader line segment intersects with any of the projected facets of the mesh.
 * @param {BABYLON.Mesh} mesh 
 * @return {boolean} 
 */
leaderLineIntersectsMeshInScreenSpace(mesh) {
    let leaderLineSegment = this.getProjectedLeaderLine();   

    var positions = mesh.getVerticesData(BABYLON.VertexBuffer.PositionKind);
    var indices = mesh.getIndices();

    var i1 = 0;
    var i2 = 0;
    var i3 = 0;

    // one index triplet = one facet
    var nbFacets = indices.length / 3;
    let intersects = false;
    for (let index = 0; index < nbFacets; index++) {
        i1 = indices[index * 3];                                // get the indexes of each vertex of the facet
        i2 = indices[index * 3 + 1];
        i3 = indices[index * 3 + 2];

        let triangleVertices = [new BABYLON.Vector3(positions[i1*3], positions[i1*3+1], positions[i1*3+2]),
                                new BABYLON.Vector3(positions[i2*3], positions[i2*3+1], positions[i2*3+2]),
                                new BABYLON.Vector3(positions[i3*3], positions[i3*3+1], positions[i3*3+2])];

        let projectedVertices = triangleVertices.map(v => {
            return BABYLON.Vector3.Project(
                v,
                BABYLON.Matrix.IdentityReadOnly,
                scene.getTransformMatrix(),
                camera.viewport.toGlobal(
                    engine.getRenderWidth(),
                    engine.getRenderHeight(),
                )
            );
        });

        let lines = [new LineSegment(projectedVertices[0], projectedVertices[1]),
                        new LineSegment(projectedVertices[1], projectedVertices[2]),
                        new LineSegment(projectedVertices[0], projectedVertices[2])];
        

        intersects = lines.some(function(line) {
            return LineUtils.doIntersect(line, leaderLineSegment);
        });

        if (intersects) break;                
    }
    return intersects;        
}

@syntheticmagus Hi,

Apologies for reaching out, but can you recommend some literature on speeding this method up (it is currently taking a very long time on the CPU)? Can this be done on the GPU using BabylonJS? If so, would you mind pointing me to some literature (BabylonJS links)?

Hi semirel,

Sorry for the delayed reply, been a bit heads-down. It’s a bit difficult to guess at what might be slow without being able to profile the running code, so if you have a Playground (or even a live site), that’s something I could look at. The one thing that jumps out at me from the implementation above is that you’re allocating new vectors and line segments every iteration of your loop. Any use of the new keyword can be surprisingly slow in JavaScript, so changing that might be a quick way to speed up the approach. You could, for example, declare

var v1 = new BABYLON.Vector3();
var v2 = new BABYLON.Vector3();
var v3 = new BABYLON.Vector3();

before your loop, then set their values using copyFromFloats to avoid allocating new variables in the loop. You might try something similar with LineSegment if there’s a way to change its state without reallocating.
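
For reference, a hedged sketch of the allocation-lighter loop might look something like this, reusing the names from the earlier snippet (Vector3.Project still allocates its result each call; if your Babylon.js version has a corresponding ...ToRef variant you could eliminate those allocations too, and LineSegment could be reused in the same spirit if it exposes a way to update its endpoints):

// Hedged sketch, inside leaderLineIntersectsMeshInScreenSpace after positions, indices,
// nbFacets, and leaderLineSegment are set up. The three vectors are created once and
// reused; the viewport and transform are hoisted out of the loop instead of being
// rebuilt for every vertex.
var v1 = new BABYLON.Vector3();
var v2 = new BABYLON.Vector3();
var v3 = new BABYLON.Vector3();
var viewport = camera.viewport.toGlobal(engine.getRenderWidth(), engine.getRenderHeight());
var transform = scene.getTransformMatrix();

for (let index = 0; index < nbFacets; index++) {
    let i1 = indices[index * 3];
    let i2 = indices[index * 3 + 1];
    let i3 = indices[index * 3 + 2];

    // Reuse the same three vectors instead of constructing new ones each iteration.
    v1.copyFromFloats(positions[i1 * 3], positions[i1 * 3 + 1], positions[i1 * 3 + 2]);
    v2.copyFromFloats(positions[i2 * 3], positions[i2 * 3 + 1], positions[i2 * 3 + 2]);
    v3.copyFromFloats(positions[i3 * 3], positions[i3 * 3 + 1], positions[i3 * 3 + 2]);

    let p1 = BABYLON.Vector3.Project(v1, BABYLON.Matrix.IdentityReadOnly, transform, viewport);
    let p2 = BABYLON.Vector3.Project(v2, BABYLON.Matrix.IdentityReadOnly, transform, viewport);
    let p3 = BABYLON.Vector3.Project(v3, BABYLON.Matrix.IdentityReadOnly, transform, viewport);

    // Check the three projected edges against the leader line and bail out early.
    if (LineUtils.doIntersect(new LineSegment(p1, p2), leaderLineSegment) ||
        LineUtils.doIntersect(new LineSegment(p2, p3), leaderLineSegment) ||
        LineUtils.doIntersect(new LineSegment(p1, p3), leaderLineSegment)) {
        return true;
    }
}
return false;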