GPU Accelerated Creation of HDRCubeTexture

For the ConvertPanoramaToCubemap part, it seems three.js has a GPU-accelerated implementation here. It should take at most 6 draw calls and 6 readPixels calls (or 1 draw call with multiple render targets). Also, since the ConvertCubeMapToSphericalPolynomial algorithm does not involve much branching, looping, or read-after-write, it could also be rewritten as shaders.
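To make the per-pixel work concrete, here is a rough CPU reference of what a panorama-to-cubemap fragment shader would compute for each output texel. It is only a sketch: the function names (`faceDirection`, `directionToEquirectUV`, `convertFace`) are made up for illustration, the face order (+X, -X, +Y, -Y, +Z, -Z) and the equirectangular orientation are assumptions, and it uses nearest-neighbor sampling on a tightly packed RGB float buffer. It is not the code from either library.

```typescript
type Vec3 = [number, number, number];

// Direction through texel (u, v) of a cube face, u and v in [0, 1].
// Assumed face order: 0:+X, 1:-X, 2:+Y, 3:-Y, 4:+Z, 5:-Z.
function faceDirection(face: number, u: number, v: number): Vec3 {
  const a = 2 * u - 1; // horizontal face coordinate in [-1, 1]
  const b = 2 * v - 1; // vertical face coordinate in [-1, 1]
  let d: Vec3;
  switch (face) {
    case 0: d = [1, -b, -a]; break;
    case 1: d = [-1, -b, a]; break;
    case 2: d = [a, 1, b]; break;
    case 3: d = [a, -1, -b]; break;
    case 4: d = [a, -b, 1]; break;
    case 5: d = [-a, -b, -1]; break;
    default: throw new Error("face must be in 0..5");
  }
  const len = Math.hypot(d[0], d[1], d[2]);
  return [d[0] / len, d[1] / len, d[2] / len];
}

// Map a unit direction to equirectangular (u, v) in [0, 1].
// Assumed orientation: v = 1 is straight up (+Y), v = 0 is straight down.
function directionToEquirectUV(d: Vec3): [number, number] {
  const u = Math.atan2(d[2], d[0]) / (2 * Math.PI) + 0.5;
  const y = Math.max(-1, Math.min(1, d[1]));
  const v = Math.asin(y) / Math.PI + 0.5;
  return [u, v];
}

// Fill one cube face (size x size, RGB floats) from an RGB float panorama.
// Row 0 of the panorama is assumed to be the top of the image.
function convertFace(
  pano: Float32Array, width: number, height: number,
  face: number, size: number
): Float32Array {
  const out = new Float32Array(size * size * 3);
  for (let y = 0; y < size; y++) {
    for (let x = 0; x < size; x++) {
      const dir = faceDirection(face, (x + 0.5) / size, (y + 0.5) / size);
      const [u, v] = directionToEquirectUV(dir);
      // Nearest-neighbor lookup, clamped to the image bounds.
      const px = Math.min(width - 1, Math.floor(u * width));
      const py = Math.min(height - 1, Math.floor((1 - v) * height));
      const src = (py * width + px) * 3;
      const dst = (y * size + x) * 3;
      out[dst] = pano[src];
      out[dst + 1] = pano[src + 1];
      out[dst + 2] = pano[src + 2];
    }
  }
  return out;
}
```

Every output texel depends only on its own coordinates and a single panorama lookup, so the loop body maps directly onto a fragment shader, with one draw call per face.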
To speed up loading cubemaps, babylon.js’s internal env format could be an option.
It’s very hard to use multiple threads in JS: doing so either runs into browser limitations or requires copying the data to and from threads every time it needs to be passed across threads.