+1 to what Deltakosh said. While I appreciate your perspective on this and thank you for the Playground example, I think the performance cost of a feature like this will be simply too high.
Unless I am greatly mistaken, there is no engine where in-frame logic like this, which requires propagating updated state through a very large hierarchy, will be efficient. In Unity, this will do exactly what Deltakosh mentioned, traversing the hierarchy thousands or millions of times setting dirty flags, which should show up very clearly in performance traces. While C# and C++ are fast enough that this might not impact framerate if you have enough CPU headroom, JavaScript will start to suffer from this kind of operation much sooner. If you have a Unity install and want to validate whether what I’m saying is true, it should not be difficult: create large and small transform hierarchies in Unity, manipulate them both at different times while taking a trace, then check the trace to see whether/how CPU usage changed.
While I respect your assessment, I disagree. As described above, large coherent hierarchy operations are complex by nature, and though engine-level implementation can certainly make such an operation slower, there is no engine-level choice which can make such an operation fast. In fact, I would suggest that Babylon’s approach can allow this to be done faster than with an auto-propagation scheme, though the onus to do this and understand it is on the developer. Consider the following hierarchy:
Node1 <- Node2 <- Node3 <- ... NodeN
where you have a manual operation which needs to be performed on every node of this hierarchy using “to-the-moment correct” world matrices (as opposed to “to-the-frame correct” world matrices). Without auto-propagation, it is possible to do this with minimal cost if your operations can be ordered from the bottom of the hierarchy toward the top (that is, starting with Node1, the node that every other node depends on): operate first on Node1, then on Node2, etc., in every case calling computeWorldMatrix() and never having to call computeWorldMatrix(true). Proceeding in this way will complete the entire operation while only traversing the hierarchy once. Contrast this with an auto-propagation mechanism: the Node1 operation will traverse N nodes setting dirty flags, after which the Node2 operation will traverse N - 1 nodes, etc. A Gauss sum tells us that the overall cost of this operation in node traversals is N * (N + 1) / 2, which is O(n^2), whereas the cost of the non-auto-propagated approach, which traversed the hierarchy only once, was O(n). Note that achieving the linear performance requires the operations to be sorted in order of dependence; if node operations need to be doable in random order with fully correct world matrices, I know of no way to do this generally in sub-quadratic time.
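To make the counting argument concrete, here is a minimal sketch that just tallies node visits for the two strategies on a linear chain. It is a toy model, not Babylon.js or Unity code: ChainNode, buildChain, countAutoPropagationVisits, and countOrderedVisits are hypothetical names invented for this illustration, and the model assumes that every operation under auto-propagation re-flags the changed node and all of its descendants.

```typescript
// Toy model of the chain Node1 <- Node2 <- ... <- NodeN, with Node1 as the root.
// All names here are hypothetical; this does not use any engine API.
interface ChainNode {
  child: ChainNode | null;
  dirty: boolean;
}

// Builds a chain of n nodes and returns the root (Node1).
function buildChain(n: number): ChainNode {
  if (n < 1) {
    throw new Error("chain needs at least one node");
  }
  let head: ChainNode | null = null;
  for (let i = 0; i < n; i++) {
    head = { child: head, dirty: false };
  }
  return head as ChainNode;
}

// Assumed auto-propagation model: operating on a node marks that node and
// every descendant dirty, so each operation walks the rest of the chain.
function countAutoPropagationVisits(root: ChainNode): number {
  let visits = 0;
  for (let node: ChainNode | null = root; node !== null; node = node.child) {
    for (let d: ChainNode | null = node; d !== null; d = d.child) {
      d.dirty = true;
      visits++;
    }
  }
  return visits;
}

// Ordered manual model: nodes are handled parent-first, so each node's
// parent is already up to date and each node is visited exactly once.
function countOrderedVisits(root: ChainNode): number {
  let visits = 0;
  for (let node: ChainNode | null = root; node !== null; node = node.child) {
    node.dirty = false;
    visits++;
  }
  return visits;
}

const n = 1000;
console.log("auto-propagation visits:", countAutoPropagationVisits(buildChain(n)));
console.log("ordered manual visits:", countOrderedVisits(buildChain(n)));
```

With n = 1000 the first count is n * (n + 1) / 2 = 500500 and the second is 1000, which matches the O(n^2) versus O(n) comparison above. A real engine may short-circuit some of this work, so treat the numbers as an illustration of the growth rate rather than a measurement.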
In short, unless I have misunderstood something, an auto-propagation scheme will always perform an operation such as this with guaranteed low efficiency, whereas without auto-propagation it is at least possible, by taking advantage of the underlying implementation, to do it comparatively efficiently. This is exactly the scenario we ran into in the Unity project I mentioned earlier; but because we were using Unity and couldn’t prevent it from auto-propagating through the hierarchy, we were forced to resort to tricks and scene structure changes on our end to work around the inefficiency.
When you need correct world matrices to do operations, it is definitely more convenient and easier to understand to have those world matrices propagated for you. However, the performance cost of having such propagation is much higher than one might initially assume, and so a tremendous amount of performance in high-stress situations can be saved by allowing users to optimize their approach to world matrix computation. This is one of those rare scenarios where two of Babylon’s objectives come into conflict – in this case, “Powerful” and “Simple” – and as Deltakosh said, we’ve so far assessed the gains in power to be worth the cost in simplicity.
But, with all that said, if you still believe an auto-propagation mechanism would be worthwhile (whether because you believe that the performance cost would be lower than we estimate or that the usability gain would be higher), please give it a shot! Make a new branch in your Babylon.js repo, add the flag you suggest, and run a perf comparison assessing the impact of the flag on runtime performance in large hierarchy and animation scenarios. I, as I’ve stated, suspect that the increase in runtime cost you observe will be prohibitive; but I’ve been wrong many times before, and if you happen to find a way to do this that brings the simplicity benefits without incurring the penalties to power, that would be absolutely awesome and I’d love to see it!
Hope this helps in clarifying our perception of the situation, and thanks again for giving such awesome and passionate feedback on Babylon!