Make scene.freezeActiveMeshes much more usable

I am mindful that the 5.0 cutoff should be coming soon, so I started looking into this area Monday. I read the code around March, and thought then that this would not really work.

I did not get very far due to pain in my mousing elbow, but I did manage to confirm my suspicion. That is, it is too restrictive to be usable by most, especially XR apps or scenes where the camera is allowed to move or rotate. This is only plausible for a small subset of scenes.

The problem is that the list of active meshes is arrived at as an artifact of rendering. If something did not make it into the frame that is used to construct the list, say because a user had their arm down, then that mesh, here an XR controller mesh, will never render. In XR, you have to let the user turn around. If they do, they are not going to see anything, which is what I confirmed.


I am thinking that since very few scenes are GPU bound, just doing ALL meshes, scene.meshes (checking only for being enabled & visibility), is going to be much better for the vast majority of scenes. The amount of checking seemed more appropriate for the wimpy mobile GPUs of the past, but that has changed and continues to.

It almost seems like single-threaded slowness is being locked in. If you look at the change in CPU & GPU hardware stats of the Oculus standalone line (Go / Quest / Quest 2), the CPU has been improving gradually, while the growth in parallelism (# of ALUs) of the GPU is exponential. To be fair, the screen resolution increases need more to drive them.

Amdahl’s law clearly shows the underwhelming effect of adding parallel resources to a process with a large single-threaded component, so what might appear smart, filtering down the mesh count on the CPU, might not always hold up, or might not continue to hold up. I am not even sure fragments are generated for meshes currently behind the camera.
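To put rough numbers on the Amdahl’s law point: with a fraction p of the work that can run in parallel and n parallel units, the best-case speedup is 1 / ((1 - p) + p / n). Here is a minimal TypeScript sketch; the function name and the 50% figure are made up for illustration and are not measurements from any scene.

```typescript
// Amdahl's law: upper bound on speedup when only part of the work is parallel.
// parallelFraction is the share of the work (0..1) that can be spread over `units`.
function amdahlSpeedup(parallelFraction: number, units: number): number {
    if (parallelFraction < 0 || parallelFraction > 1) {
        throw new RangeError("parallelFraction must be between 0 and 1");
    }
    if (units < 1) {
        throw new RangeError("units must be at least 1");
    }
    const serialFraction = 1 - parallelFraction;
    return 1 / (serialFraction + parallelFraction / units);
}

// Hypothetical frame where half of the time is single-threaded CPU work.
const withTwoUnits = amdahlSpeedup(0.5, 2);              // about 1.33x
const withUnlimitedUnits = amdahlSpeedup(0.5, Infinity); // capped at 2x
```

However many ALUs get added, the serial half caps the gain at 2x in this made-up case, which is why CPU-side filtering can end up being the bottleneck.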

I wanted to get some data without changing code (simulating via setting all the meshes to always render), but have gotten very little done this week.


Well the goal of freezeActiveMeshes is actually aligned with what you are stating: move most of the load to the GPU.

The overall idea is to flag all meshes as mesh.alwaysSelectAsActiveMesh and then call scene.freezeActiveMeshes(). This way all the meshes will be sent to the GPU.

If new meshes are added, you can unfreeze and freeze back to account for them.
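A minimal sketch of that pattern, for reference. So that it stands alone, it uses small structural stand-ins (MeshLike, SceneLike) rather than importing Babylon.js; the helper names freezeAllMeshes and refreezeAfterChange are hypothetical, and only alwaysSelectAsActiveMesh, freezeActiveMeshes and unfreezeActiveMeshes come from the posts above.

```typescript
// Structural stand-ins for the parts of the scene / mesh API used here.
interface MeshLike {
    alwaysSelectAsActiveMesh: boolean;
}
interface SceneLike {
    meshes: MeshLike[];
    freezeActiveMeshes(): unknown;
    unfreezeActiveMeshes(): unknown;
}

// Flag every mesh first, then freeze, so the frozen list contains all of them.
function freezeAllMeshes(scene: SceneLike): void {
    for (const mesh of scene.meshes) {
        mesh.alwaysSelectAsActiveMesh = true;
    }
    scene.freezeActiveMeshes();
}

// Call after meshes were added: unfreeze, then flag and freeze again.
function refreezeAfterChange(scene: SceneLike): void {
    scene.unfreezeActiveMeshes();
    freezeAllMeshes(scene);
}
```

With a real scene the same two functions should apply as is, assuming the scene object exposes those three members.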


Is this the correct usage of the idea? https://playground.babylonjs.com/#9S6NWH#1


yes!


Ok, it cannot be used by itself. Seems like you would have to find a mesh / material observable to hook this freeze toggling into, since a mesh is probably going to exist for a few frames before it is ready to draw.

mesh.onMeshReadyObservable is a good candidate.
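One way the observable could be wired up, as a rough sketch only. The interfaces below are simplified stand-ins so the snippet runs without Babylon.js, and refreezeWhenReady is a hypothetical helper name; the assumption is that the observable fires once the mesh can actually be drawn.

```typescript
// Simplified stand-ins for an observable, a mesh and a scene.
interface ObservableLike<T> {
    add(callback: (eventData: T) => void): unknown;
}
interface ReadyMeshLike {
    alwaysSelectAsActiveMesh: boolean;
    onMeshReadyObservable: ObservableLike<ReadyMeshLike>;
}
interface FreezableScene {
    freezeActiveMeshes(): unknown;
    unfreezeActiveMeshes(): unknown;
}

// Rebuild the frozen list only once the new mesh reports that it is ready.
function refreezeWhenReady(scene: FreezableScene, mesh: ReadyMeshLike): void {
    mesh.alwaysSelectAsActiveMesh = true;
    mesh.onMeshReadyObservable.add(() => {
        scene.unfreezeActiveMeshes();
        scene.freezeActiveMeshes();
    });
}
```

This re-freezes once per ready mesh, which may be wasteful when many meshes load together; a dirty flag checked once per frame would batch the work.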

Saturday, I started working on a higher, more abstract level of optimization called scene.loopBehavior. I have had a lot of things not related to BJS to deal with today, so I really have not advanced it much.

I do not like having to manage all the changes to cause the list to be redone:

  • new meshes
  • mesh enabled or disabled
  • mesh visibility changing from / to 0

It quickly expanded into the idea that the scene should be able to manage these changes, as well as try to figure out which way is faster for a given scene / hardware. I am not so sure that a developer is going to be able to do this, especially if what is best changes for older / different machines.

Having a higher level can also allow the burial of all these detailed optimizations, but if people had code to do it themselves, it would still work.


Started working on a Scene subclass, so that I did not have to deal with long build times. I can build this in a project and get a build of a few seconds. Also, this is not really an ask. I am going to do this. If it is not PR’ed, I am not forking the whole framework for this.

Some methods are just going to be copied over for now: _evaluateSubMesh, _evaluateActiveMeshes, and maybe 1 or 2 more. addMesh is going to be subclassed, and I am going to use my existing Mesh subclass to handle enabling / visibility.

Here is what I have so far:

module XXX {
    export enum loopRenderBehaviors  { cpu_favored, gpu_favored, adaptive };
    
    export class Scene extends BABYLON.Scene {
        private _loopBehavior = loopRenderBehaviors.cpu_favored; // default for backward compatibility
        private _loopBehaviorState = this._loopBehavior; // current state, never really uses adaptive setting for long
        
        public get loopBehavior() { return this._loopBehavior; }
        public set loopBehavior(val : loopRenderBehaviors) {
            this._loopBehavior = val;
            this._loopBehaviorState = this._loopBehavior;
        }
        // -- -- -- --
        private _adaptiveDecisionIsGPU : boolean;
        private _environmentChanged = true; // a mesh was added / (dis)enabled / changed from / to visibility == 0
        
        public adaptiveDecisionFrames : number[] = [1, 100, 1000]; // the frames numbers after a change to verify still in best mode
        private _adaptiveRefFrameId : number;
        
        private _adaptiveBenchmarkTime : number; // most of render time (without before after renders) to be compared with
        private _lastFrameComparableTime : number;
        
        // need to add this._loopBehaviorState != loopRenderBehaviors.cpu_favored to if test of _evaluateSubMesh
        
        public addMesh(newMesh: BABYLON.AbstractMesh, recursive = false) {
            super.addMesh(newMesh, recursive);
            if (newMesh instanceof BABYLON.Mesh) {
                newMesh.onMeshReadyObservable.add(() => {this._environmentChanged = true});
            }
        }
    . . .         
    }
}

Any thoughts?

Hijacking this quote just to help my point here: Post-mortem analysis of a Babylon project

@JCPalmer’s idea seems like the kind of thing that could benefit from higher level hints on the meshes themselves.

I am not really able to go all day right now, but have managed to implement this, except adaptive. I am not too concerned if I am not posting perfect code, since using this as a sub-class also requires having a mesh sub-class, & AFAIK I am the only one doing embedded graphics.

Here is the entire scene class:

module  XXX {
    export enum RenderLoopBehaviors  { cpu_favored, gpu_favored, adaptive };

    export class Scene extends BABYLON.Scene {
        constructor(engine: BABYLON.Engine, options?: BABYLON.SceneOptions) {
            super(engine, options);

            // front run _evaluateActiveMeshes
            this["_stockEvaluateActiveMeshes"] = this["_evaluateActiveMeshes"];
            this["_evaluateActiveMeshes"] = this["_PRE_EvaluateActiveMeshes"];
        }

        private _loopBehavior = RenderLoopBehaviors.cpu_favored; // default for backward compatibility
        private _loopBehaviorState = this._loopBehavior; // current state, never really uses adaptive setting for long

        public get loopBehavior() { return this._loopBehavior; }
        public set loopBehavior(val : RenderLoopBehaviors) {
            this._loopBehavior = val;
            this._loopBehaviorState = this._loopBehavior;

            if (this._loopBehaviorState !== RenderLoopBehaviors.cpu_favored) {
                this.freezeActiveMeshes(false, undefined, undefined, false);
            } else {
                this.unfreezeActiveMeshes();
            }
        }
        public get loopBehaviorState() { return this._loopBehaviorState; }

        // a mesh was added / (dis)enabled / layer mask set / changed from - to visibility == 0
        public activeMeshesOutOfDate = true;
        // - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        /**
         * @override
         * This is near identical to stock, but since it calls _evaluateActiveMeshes()
         * that line must be changed.  Also, setting alwaysSelectAsActiveMesh = true
         */
        public freezeActiveMeshes(skipEvaluateActiveMeshes = false, onSuccess?: () => void, onError?: (message: string) => void, freezeMeshes = true): Scene {
            this.executeWhenReady(() => {
                if (!this.activeCamera) {
                    onError && onError('No active camera found');
                    return;
                }

                if (!this._frustumPlanes) {
                    this.setTransformMatrix(this.activeCamera.getViewMatrix(), this.activeCamera.getProjectionMatrix());
                }

                // this must occur BEFORE running eval
                for (const mesh of this.meshes) {
                    mesh.alwaysSelectAsActiveMesh = true;
                }

                this["_stockEvaluateActiveMeshes"](); // calling the stock version
                this._activeMeshesFrozen = true;
                this["skipEvaluateActiveMeshesCompletely"] = skipEvaluateActiveMeshes;

                if (freezeMeshes) {
                    const actives = <BABYLON.SmartArray<BABYLON.AbstractMesh>> this["_activeMeshes"];
                    for (var index = 0; index < actives.length; index++) {
                        actives.data[index]._freeze();
                    }
                }
                onSuccess && onSuccess();
            });
            return this;
        }
        // - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        /**
         * @override
         * Setting alwaysSelectAsActiveMesh = false
         */
        public unfreezeActiveMeshes(): BABYLON.Scene {
            for (const mesh of this.meshes) {
                mesh.alwaysSelectAsActiveMesh = false;
            }
            return super.unfreezeActiveMeshes();
        }
        // - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        private _PRE_EvaluateActiveMeshes() : void {
            // in gpu mode & meshes out of date
            if (this._loopBehaviorState !== RenderLoopBehaviors.cpu_favored && this.activeMeshesOutOfDate) {
                this._activeMeshesFrozen = false; // quicker than actually bothering to unfreeze
                this.freezeActiveMeshes(false, undefined, undefined, false); // already does an eval call
                this.activeMeshesOutOfDate = false;

            // in cpu mode or meshes still up to date
            } else {
                this["_stockEvaluateActiveMeshes"]();
            }
        }
    }
}

Here are the relevant adds to my mesh sub-class:

constructor(name: string, scene: BABYLON.Scene, parent: BABYLON.Node = null, source?: Mesh, doNotCloneChildren?: boolean) {
    super(name, scene, parent, source, doNotCloneChildren);

    // still allow for use of the stock scene class
    if (scene instanceof XXX.Scene) {
        this.onMeshReadyObservable.add(() => { scene.activeMeshesOutOfDate = true; } );
    }
    . . .
}
// ============================ support overrides for RenderLoop =============================
/**
 * @override
 */
public setEnabled(value : boolean) : void {
    super.setEnabled(value);
    const scene = this.getScene();
    if (scene instanceof XXX.Scene) scene.activeMeshesOutOfDate = true;
}

/**
 * @override
 */
public set layerMask(value: number) {
    super.layerMask = value;
    const scene = this.getScene();
    if (scene instanceof XXX.Scene) scene.activeMeshesOutOfDate = true;
}
/**
 * @override
 * If you override a setter, you must also override the getter
 */
public get layerMask()  : number {
    return super.layerMask;
}

/**
 * @override
 */
public set visibility(value: number) {
    super.visibility = value;
    const scene = this.getScene();
    if (scene instanceof XXX.Scene) scene.activeMeshesOutOfDate = true;
}

/**
 * @override
 * If you override a setter, you must also override the getter
 */
public get visibility() : number {
    return super.visibility;
}

/**
 * isVisible is a property, not a setter, so cannot override exactly.
 * Implementing this as a PR should change this to a getter / setter.
 */
public setIsVisible(value : boolean) : void {
    this.isVisible = value;
    const scene = this.getScene();
    if (scene instanceof XXX.Scene) scene.activeMeshesOutOfDate = true;
}

In the coming days, I will do some more testing. I will probably not implement adaptive unless I determine I need it myself. I will put any code changes discovered in as edits.

Turns out if you override a setter which also has a getter, you need to also override the getter. This is now operational. I do not know yet whether it is actually improving things.
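The setter / getter point can be reproduced outside Babylon with a few lines. This is a stand-alone sketch with made-up class names; unlike the mesh sub-class above it writes to a protected field instead of going through super, just to keep it short.

```typescript
class BaseMesh {
    protected _visibility = 1;
    get visibility(): number { return this._visibility; }
    set visibility(value: number) { this._visibility = value; }
}

// Overrides only the setter. The accessor defined on this class has no getter,
// and it shadows the complete get / set pair inherited from BaseMesh.
class SetterOnlyMesh extends BaseMesh {
    set visibility(value: number) { this._visibility = value; }
}

// Overrides both halves, so reads keep working.
class BothAccessorsMesh extends BaseMesh {
    get visibility(): number { return this._visibility; }
    set visibility(value: number) { this._visibility = value; }
}

const broken = new SetterOnlyMesh();
broken.visibility = 0.5;
const brokenRead: unknown = broken.visibility; // undefined at runtime

const fixed = new BothAccessorsMesh();
fixed.visibility = 0.5;
const fixedRead = fixed.visibility; // 0.5
```

The reason is that a JavaScript accessor property is a single descriptor holding both get and set, so redefining it on the subclass with only a setter leaves the getter slot empty.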

Am planning to move my testing to Quest 2 soon, with 120 Hz refresh turned on. It is much easier to see effects on a machine not pegged to the max.

Happy New Year!

As I said before, this feature is up and running, switching between behavior modes just fine. No meshes are missing or left lingering in gpu_favored mode when testing adding meshes or disabling / enabling meshes which are not in the frustum at the time. Nothing special is being done at the application level. The Scene / Mesh classes take care of everything. Any code which directly calls the stuff I am hiding would still work.

The remaining question is under what circumstances (scene or hardware) each method produces better results. I am in the middle of retooling my stack while working on a comprehensive new back office development tool at the same time. Things are quite shaky (not even working as of this very moment).

I did get off many tests on the desktop. Most of those were actually just putting a measurement process in place. The verdict there, as expected, is the thing was pegged at 60 fps regardless.

On the Quest, I am battling multiple problems, including having had to do a hardware reset due to the right controller no longer working. I also think I have found a bug somewhere unrelated to this. I did manage to get off one set of crippled tests before I broke everything, and found that in WebXR it is actually about 4 FPS faster in cpu_favored mode.

As I think about it, in this case the shaders are all running twice, even those which would not run at all in cpu_favored mode. In contrast, when there are 2 cameras, some stuff in _evaluateActiveMeshes is short circuited when run a second time in the same frame. (This is exactly why you have to actually do the tests.)

This further confirms my feeling that the best mode is not always the same across hardware configurations, hence a possible need for adaptive at runtime on the device the actual user has.
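For what it is worth, here is one shape the adaptive decision could take. It is only a sketch under assumptions: the class name AdaptiveModePicker, the trial length and the use of mean frame time as the metric are all hypothetical, and none of it is part of the scene class posted above.

```typescript
type LoopMode = "cpu_favored" | "gpu_favored";

// Collects a fixed number of frame times per mode, then reports the faster mode.
class AdaptiveModePicker {
    private readonly framesPerTrial: number;
    private samples: Record<LoopMode, number[]> = { cpu_favored: [], gpu_favored: [] };

    constructor(framesPerTrial = 100) {
        this.framesPerTrial = Math.max(1, Math.floor(framesPerTrial));
    }

    // Store one frame time (ms) for the mode that was active during that frame.
    record(mode: LoopMode, frameMs: number): void {
        const list = this.samples[mode];
        if (list.length < this.framesPerTrial) list.push(frameMs);
    }

    isTrialComplete(mode: LoopMode): boolean {
        return this.samples[mode].length >= this.framesPerTrial;
    }

    // Returns undefined until both modes have a full trial; ties go to cpu_favored.
    best(): LoopMode | undefined {
        if (!this.isTrialComplete("cpu_favored") || !this.isTrialComplete("gpu_favored")) {
            return undefined;
        }
        const mean = (xs: number[]) => xs.reduce((a, b) => a + b, 0) / xs.length;
        return mean(this.samples.gpu_favored) < mean(this.samples.cpu_favored)
            ? "gpu_favored"
            : "cpu_favored";
    }

    // Call when the environment changes (mesh added, enabled, ...) to start over.
    reset(): void {
        this.samples = { cpu_favored: [], gpu_favored: [] };
    }
}
```

A scene could run one trial in each mode after a change, keep whatever best() returns, and call reset() whenever the mesh list goes out of date.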


Please let me know if / how you wish to proceed. Adaptive could be added later, if done with an enum type switch instead of boolean. Or at least go through Scene and make the things you are comfortable with protected rather than private, so as to make it sub-classable without as many backdoor re-assignments. Thank you.


I believe that the new Compute Pressure API could help identify whether the CPU is overloaded or not.

I assume you are talking about the adaptive part. I understand the desire for an indirect, arbitrary observer. It might work as a decent switch, assuming high CPU implies low GPU utilization.

Just looking at the doc for the first time, I have a couple of reservations:

  • I think that API really needs to also have some type of getter of the pressure “achieved” during the last frame, to make it easier to know what to put in as the threshold.

  • The isAvailable member is worrying to see.
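As a rough idea of how pressure readings could drive the switch, here is a hedged sketch. It assumes the later form of the Compute Pressure API, as I understand the current draft: a PressureObserver global, a "cpu" source, and records carrying a state of nominal / fair / serious / critical. That shape may differ from the threshold-based version discussed here, and the mapping policy and function names are made up.

```typescript
type PressureState = "nominal" | "fair" | "serious" | "critical";
type LoopBehavior = "cpu_favored" | "gpu_favored";

// Hypothetical policy: when the CPU is under pressure, stop culling on the CPU.
function behaviorForPressure(state: PressureState): LoopBehavior {
    return state === "serious" || state === "critical" ? "gpu_favored" : "cpu_favored";
}

// Starts observing when the API exists. Returns false when it is not exposed,
// so callers can fall back to a fixed mode or to frame time measurements.
function watchCpuPressure(onBehavior: (behavior: LoopBehavior) => void): boolean {
    const Ctor = (globalThis as any).PressureObserver; // assumed global, feature-detected
    if (typeof Ctor !== "function") return false;

    const observer = new Ctor((records: Array<{ state: PressureState }>) => {
        const latest = records[records.length - 1];
        if (latest) onBehavior(behaviorForPressure(latest.state));
    });
    // observe() is expected to return a promise that may reject (for example when blocked).
    Promise.resolve(observer.observe("cpu")).catch(() => { /* keep the current behavior */ });
    return true;
}
```

The feature check matters given the availability concern above: on browsers without the API the function simply reports false and nothing else changes.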

I have thought some more about that one test I managed to get off on Quest. I had to do it with a very low mesh count. I am almost back to an operational GUI, which I had to break to have controls of different scales in the same scene (a wall of surface controls which need to be big & controls meant to be touched). Planning on doing a better test is my highest priority for now.

The issue is that it may not be available by default in all browsers; it is still an experimental feature: Compute Pressure API  |  Web Platform  |  Chrome for Developers
I still cannot make it work in Chrome 96 while it should work from 92.
