kzhsw
February 4, 2026, 6:59am
1
Precision: 4.xxe-6
Engine: WebGL2 only
Playground:
https://playground.babylonjs.com/#2FDQT5#3056
Performance:
The first run is very slow, but it gets faster after warmup.
Use this to benchmark:
https://playground.babylonjs.com/#2FDQT5#3058
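Since the first run and the warmed-up runs behave so differently, it can help to report them separately. Below is a minimal sketch of such a timing helper. The names `benchmark`, `warmRuns`, and the injectable `now` clock are hypothetical and are not taken from the playground. For GPU work, `fn` probably has to include something that forces completion (such as a readback), otherwise only the command submission is likely to be timed.

```typescript
// Hypothetical helper: times the first (cold) call separately from warm calls.
// `now` defaults to Date.now; in a browser, pass () => performance.now()
// for better resolution.
function benchmark(
  fn: () => void,
  warmRuns: number,
  now: () => number = () => Date.now()
): { firstMs: number; warmMedianMs: number } {
  // Cold run: includes shader compilation, uploads, JIT warmup, etc.
  const t0 = now();
  fn();
  const firstMs = now() - t0;

  // Warm runs: collect one sample per call.
  const samples: number[] = [];
  for (let i = 0; i < warmRuns; i++) {
    const start = now();
    fn();
    samples.push(now() - start);
  }

  // Median is less sensitive to a single slow frame than the mean.
  samples.sort((a, b) => a - b);
  const mid = samples.length >> 1;
  let warmMedianMs = NaN;
  if (samples.length > 0) {
    warmMedianMs =
      samples.length % 2 === 1
        ? samples[mid]!
        : (samples[mid - 1]! + samples[mid]!) / 2;
  }
  return { firstMs, warmMedianMs };
}
```

Something like `benchmark(() => convertOnGpu(), 20)` would then give one cold number and one warm number, where `convertOnGpu` stands in for whatever is being measured.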
Reference a topic here:
Playground: Babylon.js Playground
Version: 7.24.0
Engine: WebGL2
Loading an HDRCubeTexture is slow: it takes ~5s to load a 1k texture, and minutes to load 2k ones. I did some profiling in Chrome to see if there are ways to optimize it.
When loading an HDR into a 1k texture, ConvertCubeMapToSphericalPolynomial takes the most time, ~54% of the CPU time, most of which is calculation, so it could, at least in theory, be GPU accelerated via a post-process shader in WebGL2, or a comp…
That is something that @sebavan will appreciate for sure.
This is great!!! Did you compare the result with the CPU version?
It would be amazing to have a PR for this.
kzhsw
February 5, 2026, 12:59am
4
Yes, there is a comparison: see the devtools console of https://playground.babylonjs.com/#2FDQT5#3056 , which prints the max diff, the raw GPU result vs. the CPU result, and the first-run time.
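For readers who want to reproduce the "max diff" number on their own data, here is a minimal sketch of such a comparison. It is not the code from the playground; `maxAbsDiff` and its parameter names are made up for illustration, and it assumes both results are flattened into arrays with the same layout.

```typescript
// Hypothetical helper: largest absolute element-wise difference between
// two coefficient arrays (e.g. CPU result vs. GPU result).
function maxAbsDiff(cpu: ArrayLike<number>, gpu: ArrayLike<number>): number {
  if (cpu.length !== gpu.length) {
    throw new Error(`length mismatch: ${cpu.length} vs ${gpu.length}`);
  }
  let max = 0;
  for (let i = 0; i < cpu.length; i++) {
    const d = Math.abs(cpu[i]! - gpu[i]!);
    // Surface NaN explicitly instead of silently ignoring it.
    if (Number.isNaN(d)) return NaN;
    if (d > max) max = d;
  }
  return max;
}
```

A result on the order of 1e-6 would be in line with the precision quoted in the first post, given that the GPU path works in 32-bit floats.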
The first run on the GPU is slower than on the CPU, mainly due to readPixels (texture data readback and a forced GPU sync).
To work around it, there are 2 options:
Use PBO-buffered, async readback and delay it for a few frames until it’s done (Edit: the async version https://playground.babylonjs.com/#2FDQT5#3059 does not seem to help much).
Do not read back until serialization (to keep the serialization structure stable), and pass the result as a texture to downstream shaders.
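The first option can be sketched roughly as follows. This is not the code from the #3059 playground, only an illustration of the general WebGL2 pattern: read into a pixel pack buffer, insert a fence, and poll the fence once per frame instead of blocking. `GL2Subset`, `beginReadback`, and `pollReadback` are hypothetical names. The interface is a hand-written subset of `WebGL2RenderingContext` so the sketch type-checks without DOM typings; in a browser you would pass the real context, possibly with a cast. It also assumes the bound framebuffer is float-renderable (typically via `EXT_color_buffer_float`).

```typescript
// Hand-written subset of WebGL2RenderingContext (assumption, see lead-in).
interface GL2Subset {
  readonly PIXEL_PACK_BUFFER: number;
  readonly STREAM_READ: number;
  readonly RGBA: number;
  readonly FLOAT: number;
  readonly SYNC_GPU_COMMANDS_COMPLETE: number;
  readonly ALREADY_SIGNALED: number;
  readonly CONDITION_SATISFIED: number;
  readonly WAIT_FAILED: number;
  createBuffer(): unknown;
  bindBuffer(target: number, buffer: unknown): void;
  bufferData(target: number, size: number, usage: number): void;
  readPixels(x: number, y: number, w: number, h: number,
             format: number, type: number, offset: number): void;
  fenceSync(condition: number, flags: number): unknown;
  clientWaitSync(sync: unknown, flags: number, timeout: number): number;
  getBufferSubData(target: number, srcByteOffset: number, dst: ArrayBufferView): void;
  deleteSync(sync: unknown): void;
  deleteBuffer(buffer: unknown): void;
  flush(): void;
}

interface PendingReadback { pbo: unknown; sync: unknown; floatCount: number; }

// Queue a readback of the currently bound framebuffer into a PBO.
function beginReadback(gl: GL2Subset, width: number, height: number): PendingReadback {
  const floatCount = width * height * 4; // RGBA
  const pbo = gl.createBuffer();
  if (!pbo) throw new Error("createBuffer failed");
  gl.bindBuffer(gl.PIXEL_PACK_BUFFER, pbo);
  gl.bufferData(gl.PIXEL_PACK_BUFFER, floatCount * 4, gl.STREAM_READ);
  // With a PIXEL_PACK_BUFFER bound, the last argument is a byte offset into
  // the buffer, and the call does not have to wait for the pixels.
  gl.readPixels(0, 0, width, height, gl.RGBA, gl.FLOAT, 0);
  gl.bindBuffer(gl.PIXEL_PACK_BUFFER, null);
  const sync = gl.fenceSync(gl.SYNC_GPU_COMMANDS_COMPLETE, 0);
  if (!sync) throw new Error("fenceSync failed");
  gl.flush(); // make sure the fence is actually submitted
  return { pbo, sync, floatCount };
}

// Call once per frame. Returns null while the GPU is still busy.
function pollReadback(gl: GL2Subset, pending: PendingReadback): Float32Array | null {
  const status = gl.clientWaitSync(pending.sync, 0, 0); // timeout 0: never block
  if (status === gl.WAIT_FAILED) throw new Error("clientWaitSync failed");
  if (status !== gl.ALREADY_SIGNALED && status !== gl.CONDITION_SATISFIED) {
    return null;
  }
  const out = new Float32Array(pending.floatCount);
  gl.bindBuffer(gl.PIXEL_PACK_BUFFER, pending.pbo);
  gl.getBufferSubData(gl.PIXEL_PACK_BUFFER, 0, out);
  gl.bindBuffer(gl.PIXEL_PACK_BUFFER, null);
  gl.deleteSync(pending.sync);
  gl.deleteBuffer(pending.pbo);
  return out;
}
```

The idea would be to call `beginReadback` right after the last draw and then `pollReadback` from a per-frame callback until it returns data. This only avoids stalling the main thread; it does not make the GPU work itself finish any sooner, which may be why the async version did not help much.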
Also, instead of all these complex shaders and draws, WebGPU with a compute shader, a storage buffer, and atomic add can hopefully do the whole thing in a single pass, and its readback is async. It might be a more future-proof option, although it is strictly limited to secure contexts.
Maybe we could use it as a texture?
Last I tried, I had the same readback issues.
kzhsw
February 5, 2026, 1:29am
6
But that could be a breaking change for downstream shader devs, and it could also slow down the next readPixels.
kzhsw
February 5, 2026, 2:23am
7
At least this could be used for batched/massive high-res HDR-to-env conversion.
kzhsw
February 5, 2026, 8:01am
8
Oh, wait, something strange happened: when the CPU part is moved ahead of the GPU part, the CPU part becomes slower and the GPU part becomes faster.
https://playground.babylonjs.com/#2FDQT5#3060
Edit:
It might be something else that requires a GPU sync: using the no-readPixels ConvertCubeMapToSphericalPolynomial directly shows a very different speed.
https://playground.babylonjs.com/#2FDQT5#3061
Edit2:
A dummy readPixels costs more than 1s, so there must be something wrong with it.
https://playground.babylonjs.com/#2FDQT5#3062
Could it be waiting for the GPU to flush the previous work before reading back?
kzhsw
February 6, 2026, 4:00am
10
Yeah, I think so: using gl.fenceSync to wait for the GPU sync gives a similar result in this playground: https://playground.babylonjs.com/#2FDQT5#3064 .
And the profiler shows dropped frames without any CPU or GPU activity during the fenceSync wait.