Curious about this._totalVertices.addCount() in the evaluate active mesh loop

Here is one for you @Evgeni_Popov
I nerfed out the mesh.computeWorldMatrix in the evaluation loop, and as far as I can tell everything still works.

I'm assuming this would break some of the picking methods though, or some of the frustum culling things.

Is there a way to know when it would be ok to skip this or not?

The function went from 4ms to 2ms of calculation time, which is pretty sizable, but with something like this it would be safe to assume it's important for things that I'm not thinking about.

It is never ok :slight_smile:

In your example, computeWorldMatrix is called indirectly by the instance rendering code, which at some point is doing instance.getWorldMatrix(), which is what triggers the call to computeWorldMatrix.

You can use the world matrix freezing feature, if you know you won’t change some world matrices.
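For example, a minimal sketch assuming the mesh is static once placed:

mesh.freezeWorldMatrix();   // stop recomputing the world matrix every frame

// ...and if the mesh ever needs to move again:
mesh.unfreezeWorldMatrix();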

So then for instances does that mean it's being called twice?

Because going off that idea I tried:
[screenshot: code change]
got me to:
[screenshot: profiler timings]

while:
[screenshot: alternative code change]
got me to:
[screenshot: profiler timings]

Of course this is a scene with a huge number of instances, but if getWorldMatrix gets called anyway, is there anything before that point which would need the calculation?

I agree about the line timings, which is why I instrumented the code with console.time and provided the accurate timings for you to review instead:

The screenshot with the code was just for reference so you could see what the console.time labels corresponded to.
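For reference, a minimal sketch of this style of instrumentation (the label is made up here; for per-mesh code in a hot loop you'd accumulate a performance.now() delta and log once per frame rather than calling console.time per mesh):

console.time("evaluateActiveMeshes");
// ... the instrumented block runs here ...
console.timeEnd("evaluateActiveMeshes"); // prints e.g. "evaluateActiveMeshes: 145ms"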

Since the profiler works by sampling the location of the code pointer from time to time, there is always some sampling error where lines of code are under- or overrepresented in the sample due to random chance. Don't know why you're nitpicking on this point when I've been trying to agree with you on it the entire time: there is no problem with the _currentLODIsUpToDate assignment.

It’s not useful to use the line timings that we both agree are unreliable, nor to compare across profiles that weren’t made under identical conditions. Use the console.time tags in the bottom-up chart. We spent 145ms counting vertices for runtime statistics versus 16.1ms computing world matrices. That means we’re spending a really meaningful amount of time performing non-rendering tasks in the render loop.

Fully in agreement about running real world scenarios - that’s what I’ve provided for you here.

As I said above, I couldn’t reproduce the slowdown when the line isn’t commented versus when it is.

This is something I’d like to see, a real PG (or better: several PGs) where the framerate is significantly different in both cases => it’s not only me, other people in the team will also run the tests, to get a wider range of results.

Performance snapshot figures are not sufficient to justify code modification. They’re great for pointing out code that might be a bottleneck / pain point / in need of improvement (as you’ve done in this thread for this._totalVertices.addCount), but the next step is to test what happens when the code is updated to fix / improve what was found in the first step.

Don’t get me wrong: it’s obvious that this._totalVertices takes time, and if it were possible to simply delete the line, there’d be no discussion and we’d do it! But that’s not possible, the only thing we can do is something like:

if (!this.disableTotalVerticesPerfCounter) {
    this._totalVertices.addCount(mesh.getTotalVertices(), false);
}

So, what should be tested are PGs when:

  • A: using the existing code
  • B: using the code with the if condition and disableTotalVerticesPerfCounter = false
  • C: using the code with the if condition and disableTotalVerticesPerfCounter = true

Then compare the results.
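For scenario C, assuming the flag ships as a public Scene property with the name used in the snippet above (it doesn't exist yet), the PG would just add:

scene.disableTotalVerticesPerfCounter = true; // proposed flag, not yet in the engine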

C will probably be equal to or faster than A (still, we need numbers, because if we can’t prove that it’s faster in some cases, there’s no point in making the change).

The concern is A versus B, B being what we’ll get by default if we make the change. If B is significantly slower than A, we probably won’t make the change, because that’s what people will experience with the default configuration.

We need to be very careful because this code is a hot path, it’s called at every frame and for every mesh, so we need to be very sure that we’re not hurting performance when we make changes.

I’ll try to do some testing on my own, but I’m very busy at the moment, so I can’t commit to a timetable.

[EDIT] I should answer this one:

I meant real world scenarios that I and others in the team can run and test for ourselves :slight_smile:


One change we can make now is to use a counter that holds the current total number of vertices, instead of calling this._totalVertices.addCount on each loop iteration:

This won’t solve everything (especially if the mesh.getTotalVertices() call is the most time-consuming part), but it may help a little.
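As a rough sketch of that idea (the shape of it, not the actual change):

// Accumulate into a plain number inside the loop...
let verticesCount = 0;
for (let i = 0; i < meshes.length; i++) {
    verticesCount += meshes[i].getTotalVertices();
}
// ...then update the perf counter once per frame instead of once per mesh
this._totalVertices.addCount(verticesCount, true);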

Actually I think it could be a viable solution. You could implement the frustum check in getActiveMeshCandidates and keep only the meshes that pass the test. To avoid doing the frustum check again for the meshes that you know are visible, you should set mesh.alwaysSelectAsActiveMesh = true:
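A sketch of what that override could look like, assuming getActiveMeshCandidates must return a smart-array-like object with data and length properties:

const candidates = { data: [], length: 0 };
scene.getActiveMeshCandidates = () => {
    candidates.data.length = 0;
    const planes = scene.frustumPlanes;
    for (const mesh of scene.meshes) {
        if (mesh.isInFrustum(planes)) {
            candidates.data.push(mesh);
            // the scene's own frustum test is skipped for these meshes
            mesh.alwaysSelectAsActiveMesh = true;
        }
    }
    candidates.length = candidates.data.length;
    return candidates;
};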

It is the same PG as the one I used in my tests above. This time it is running at ~240 fps.


Unless you are looking at all the meshes, in which case it drops in performance quite a bit, to about ~60-70 fps, vs this one that does the instance check I have been asking about, which holds ~170-180 fps.

With this one that filters the meshes in the eval step, I'm getting an even faster ms reading on the mesh selection process.

Indeed, overriding getActiveMeshCandidates is only a win if a lot of the meshes are out of the frustum.

Not only that, but you’ve merged two conditions into one and the line mesh._internalAbstractMeshDataInfo._currentLODIsUpToDate = false; isn’t executed as often as it was before your changes.

This may work in this PG, but may be broken in others. I wouldn’t risk changing the order of operations in this loop, the stakes are too high.