Very neat. (And really cool Playground, by the way!) Label placement is actually a whole subfield unto itself, which you may have already looked into. I can’t guarantee anything I’ll have to say here will be particularly new, but I think it might be possible to get performance faster than 1 to 10 seconds per plane.
Since the placement you’re trying to go for is in screen-space, the entire problem can (I think) be solved in 2D, which removes a fair amount of the complexity from the computation. Also, the fact that the relevant pieces are axis-aligned rectangles makes this whole thing a lot easier, too.
The most flexible approach I can think of to do this is an optimizer, philosophically related to the “simulated annealing” mentioned on the other page I linked to, which sounds a lot fancier than it is (at least in its simplest form). You can kind of think of it as a sort of fuzzy physics system related to gravity—or in this case repulsion.
- At the start of your computation, project everything you care about down to the image plane; this might include the labels themselves, the feature points they’re trying to label, and any other pieces of the geometry you specifically want to remain uncovered. This is the last time you’ll have to do anything in 3D until the very end of the algorithm, when you’ll unproject the results to put them back into the world.
- For each label, initialize its position by placing it somewhere; where it starts out doesn’t actually matter that much. A good idea might be to place each label right beside the thing it’s trying to describe, not caring at this point about what else it’s overlapping with. It just needs to start somewhere.
- From here, you conduct an interative optimization. For each moveable thing (in this case, labels), look at everything else in the world that can affect it (other labels, feature points, important geometry, etc.) and calculate a repulsion “force” generated by proximity to that thing. For example, if the current label is close enough to another label that they’re overlapping, that proximity will generate a “force” to push the current label away from the other label. Note that you only apply the “force” to the element currently being processed; other labels will calculate their own “forces” when they’re processed. Once you’ve aggregated all the “forces” applying to the label you’re considering, apply the implied motion to your current label, then move on to the next label. In aggregate, this will cause labels to “bump” away from each other and slide into better places where they can all be seen and they don’t cover up other important things.
- Repeat the process above—hereafter referred to as an optimization pass—as many times as you want to. A typical iterative optimizer will require multiple passes to find a good solution; but because each pass only does very simple operations, they should all be pretty fast. One thing that’s often done to improve performance is to multiply the applied motion per pass by an “energy factor” which grows smaller over time, reducing the impact of later passes once the algorithm is presumably approaching a good solution. This is the key mechanic behind the “simulated annealing” technique referenced by the Wikipedia page linked above.
- Once you’ve done enough iterations that you’re happy (you can check the global error—the sum total of all “forces” computed in a single pass–if you want to, but it’s often sufficient to just pick a good number of iterations), reverse the projection math to send all your labels back to the planes from which they originally came. This will give you the optimized locations for your labels to be placed.
The reasons I recommend an approach like this are (1) because it’s surprisingly easy to implement, (2) it can be made to run quite fast, and (3) it’s extremely flexible. Basically, the entire behavior of the algorithm is controlled by “tuning parameters” that define the amount of force generated by different kinds of proximity. If your labels are getting too close together, change the fall-off rate for force generation over distance. If you want to add new kinds of things to avoid and move away from, it’s as simple as projecting them into the 2D space and generating forces to move away from them. If you want your labels to conscientiously stay close to something instead of moving away, you can create attractor forces that work within exactly the same environment; only the direction of the force is different. You can even create forces to make a label try to stay a specific distance from something else—not too close, not too far—and the system will attempt to make that work while balancing that consideration against the other forces you’ve told it to consider. And if you find that one kind of force is too strong versus another, you can weaken the former or strengthen the latter. All by just changing tuning parameters—hard-coded floating point numbers—and without reimplementing any part of the algorithm.
Optimizers like this are extremely powerful and flexible, and as mentioned above they can be made to run extremely fast. These are actually the core algorithms underneath certain real-time computer vision algorithms like SLAMs (think ARKit and ARCore), which do optimizations like this for every video frame they try to track. What I meant by “solve the problem largely in the GPU” is that you could theoretically do something like this in a shader, since each individual optimization pass is completely parallel-izeable. However, for a relatively constrained number of labels like 30, that’s probably way more complicated than you need.
Okay, by now I’ve rambled on for far too long, but the last thing I want to point out is that something like this can very easily be done in a WebWorker. You only need a small amount of information from the scene—the 2D projected positions of important features—in order to run the optimization, so you could very easily have a WebWorker running in the background that accepts updates about the 2D projections and the cameras, runs the optimizer, and reports back with where the labels should be. Whether or not you actually need such a WebWorker depends on how fast your optimizer would end up being; if its fast enough, you might not need another thread at all.
Anyway, sorry if that ended up way too involved or unrelated; I get excited about tech like this. Hopefully some if it was helpful. If you have any questions or other involved aspects to consider, let me know!