How can I make fragment shader texture sampling (the texture2D function) faster when all meshes sample the same texture?

I have an image that is sampled by every mesh (similar to an environment texture), and I pass it to the fragment shader through a uniform on each material.

When rendering, it takes 1 ms for all meshes to sample the image.
I need to sample it 8 times every frame, so it’s very slow.
Is there any way to improve performance?

Maybe something like a global shader texture, or a fast data buffer for shaders?

How do you measure this?

I use the Inspector and check “GPU frame time” after each code edit.

After some debugging, I found that performance improves significantly when I comment out a line that calls the texture2D function in my code.

Wait, do you mean the slow performance may not be caused solely by texture sampling? I will run another test to verify. Thanks for the reminder.

The way you measure makes a lot of sense. For it to be that slow, is your texture maybe big?

Sampling 8 times should not have much impact, although that depends on the locality of the samples.

Could you share a repro in the playground?

I reviewed the code. It is actually not 8 samples but many more. I made a mistake.

Here are the details, and they are a bit lengthy:
I have two textures. Texture A is the data texture, and texture B is the weight texture. Each data texture sample is multiplied by a weight texture sample, and the result is output.

The data texture is sampled 3*8 times per pixel, while the weight texture requires far more samples: 16x16x8 per pixel.
Render time dropped a lot when I replaced the data sample with vec3(0), and I guess that is because the GPU automatically skipped the weight texture sampling. That led me to an incorrect conclusion. There are actually many texture samples.
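To put those counts in perspective, here is a rough back-of-the-envelope sketch. The per-pixel numbers come from the post above; the 1920x1080 resolution and the idea that every pixel runs the shader exactly once are assumptions made only for illustration.

```typescript
// Per-pixel fetch counts taken from the post above.
const dataSamplesPerPixel = 3 * 8;
const weightSamplesPerPixel = 16 * 16 * 8;
const samplesPerPixel = dataSamplesPerPixel + weightSamplesPerPixel;

// Assumption: a 1920x1080 target where each pixel is shaded once (no overdraw).
const assumedWidth = 1920;
const assumedHeight = 1080;

// Total texture fetches issued for one full-screen pass.
function fetchesPerFrame(width: number, height: number, perPixel: number): number {
  return width * height * perPixel;
}

const totalFetches = fetchesPerFrame(assumedWidth, assumedHeight, samplesPerPixel);
console.log(`${samplesPerPixel} fetches per pixel, ${totalFetches} per frame`);
```

Under these assumptions the weight texture accounts for almost all of the fetches, so it is probably the first thing worth reducing.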

Both textures are small: the data texture is about 64x64, and the weight texture is 512x512. Is there any way to optimize performance when they are used by every material?

Finally, sorry that I can’t share the code. But I will post here if I find a solution.

Unfortunately, I do not think there is any easy way to optimize it.

It is really similar to how the environment works in PBR. What we do in that case is preprocess the texture to create a “pre-sampled” version of it, which removes the need for that many samples.

I guess some preprocessing could help in your case, but it is hard to be sure without the code.
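As a rough illustration of what “pre-sampling” could look like, here is a minimal CPU-side sketch. It assumes the 16x16 weight taps form an aligned box average that does not depend on per-pixel data, which the thread does not confirm. The helper name preAverageWeights and the single-channel Float32Array layout are hypothetical.

```typescript
// Hypothetical helper: collapse each block x block tile of a single-channel
// weight texture into its average, so that one fetch from the smaller texture
// can stand in for block * block fetches from the original.
function preAverageWeights(
  src: Float32Array,
  width: number,
  height: number,
  block: number
): Float32Array {
  if (width % block !== 0 || height % block !== 0) {
    throw new Error("width and height must be multiples of block");
  }
  const outW = width / block;
  const outH = height / block;
  const out = new Float32Array(outW * outH);
  for (let oy = 0; oy < outH; oy++) {
    for (let ox = 0; ox < outW; ox++) {
      let sum = 0;
      // Accumulate every texel of the source tile (row-major layout).
      for (let by = 0; by < block; by++) {
        for (let bx = 0; bx < block; bx++) {
          sum += src[(oy * block + by) * width + (ox * block + bx)];
        }
      }
      out[oy * outW + ox] = sum / (block * block);
    }
  }
  return out;
}
```

For a 512x512 weight texture and block = 16, the result would be a 32x32 texture computed once at load time. Whether that is valid depends on how the weights are actually combined with the data samples.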


It’s indirect lighting, using a probe volume grid.
I think it’s hard to preprocess unless I generate lightmap textures while loading …

Alternatively, I could completely ignore light bleeding and just compute a smooth transition between probes.
Or I could add a probe weight attribute and place high-weighted black probes inside the walls, similar to what TLOU does.
Or switch to a different algorithm …
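The first alternative above, a smooth transition between probes with no bleeding correction, could be sketched as a plain trilinear blend. This assumes a regular grid where each shaded point falls inside a cell with 8 corner probes; the names trilinearWeights and blendProbes, and the corner ordering, are hypothetical.

```typescript
type Vec3 = [number, number, number];

// Trilinear weights for the 8 corner probes of one grid cell.
// fx, fy, fz are the fractional position inside the cell, each in [0, 1].
// Corner i uses bit 0 for x, bit 1 for y, bit 2 for z (assumed ordering).
function trilinearWeights(fx: number, fy: number, fz: number): number[] {
  const weights: number[] = [];
  for (let i = 0; i < 8; i++) {
    const wx = (i & 1) ? fx : 1 - fx;
    const wy = (i & 2) ? fy : 1 - fy;
    const wz = (i & 4) ? fz : 1 - fz;
    weights.push(wx * wy * wz);
  }
  return weights;
}

// Blend the 8 corner probe colors with the trilinear weights.
function blendProbes(probes: Vec3[], fx: number, fy: number, fz: number): Vec3 {
  if (probes.length !== 8) throw new Error("expected 8 corner probes");
  const weights = trilinearWeights(fx, fy, fz);
  const out: Vec3 = [0, 0, 0];
  for (let i = 0; i < 8; i++) {
    for (let c = 0; c < 3; c++) {
      out[c] += probes[i][c] * weights[i];
    }
  }
  return out;
}
```

If the probe data can be stored in a 3D texture with linear filtering, the hardware performs this same blend in a single fetch, which may be worth testing before a more elaborate scheme.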

Is there really no GPU technology available to accelerate this process?