@alparslanahmed, if you are targeting lower-end devices with your game, you will want to plan your design elements and art direction around your target device. It doesn’t matter if you have beautiful cloth simulations and realistic grass swaying in the breeze if your target device can’t hit 60 fps. You have limited resources when designing a game for any device, so you have to choose where to spend that budget to get the best tradeoff between features and art style while maintaining your performance.
In looking at your PG, while I see what you are going for with your physics simulation, I wonder if you are getting the most bang for your buck in terms of resources spent. Watching the ball hit the cloth mesh in many different positions, I see the reaction of the cloth repeat in expected ways. Hits to the upper left always look the same, for example. The ball also bounces back off the cloth, rather than sliding along it and displacing it as it goes, so I would suggest that you could lose the cloth sim altogether and solve the problem with a skinned animation, which would allow you to include the top and sides of the net as well.
You could determine a handful of locations to simulate a strike from and place a plane impostor in front of the net. The ball would bounce off the impostor, and you would then determine the hit location and play the animation that moves the net skeleton from the closest of those locations. This would allow you to animate the whole net with a skeleton that covers all major parts that need to move, like the top of the net reacting when the back is struck. You can also create one set of animations with the maximum strength hit at each location and one animation where the net is static, then blend between them to vary the strength of the hit based on the speed of the ball while keeping the number of animation clips to a minimum.
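To make that a little more concrete, here is a minimal sketch of the two decisions involved: picking the closest authored strike location and turning ball speed into a blend weight. Everything in it is hypothetical (the `StrikeAnchor` type, the clip names, the 2D coordinates on the impostor plane, and the `maxSpeed` value are my assumptions, not anything from your PG or from the Babylon.js API), so treat it as a starting point rather than a finished solution.

```typescript
// Hypothetical types and names for illustration; none of this is engine API.
interface Vec2 {
  x: number;
  y: number;
}

interface StrikeAnchor {
  clipName: string; // name of the max-strength clip authored for this location
  position: Vec2;   // location on the impostor plane, in the plane's local 2D space
}

// Return the authored strike location nearest to where the ball hit the impostor.
function closestAnchor(hit: Vec2, anchors: StrikeAnchor[]): StrikeAnchor {
  if (anchors.length === 0) {
    throw new Error("at least one strike anchor is required");
  }
  let best = anchors[0];
  let bestDistSq = Infinity;
  for (const anchor of anchors) {
    const dx = anchor.position.x - hit.x;
    const dy = anchor.position.y - hit.y;
    const distSq = dx * dx + dy * dy; // squared distance is enough for comparison
    if (distSq < bestDistSq) {
      bestDistSq = distSq;
      best = anchor;
    }
  }
  return best;
}

// Map ball speed to a blend weight: 0 = static net pose, 1 = max-strength clip.
function strikeWeight(ballSpeed: number, maxSpeed: number): number {
  if (maxSpeed <= 0) {
    return 0;
  }
  return Math.min(1, Math.max(0, ballSpeed / maxSpeed));
}

// Assumed example data: four authored strike locations on the plane.
const strikeAnchors: StrikeAnchor[] = [
  { clipName: "hit_upper_left", position: { x: -1, y: 1 } },
  { clipName: "hit_upper_right", position: { x: 1, y: 1 } },
  { clipName: "hit_lower_left", position: { x: -1, y: -1 } },
  { clipName: "hit_lower_right", position: { x: 1, y: -1 } },
];

const chosen = closestAnchor({ x: -0.7, y: 0.8 }, strikeAnchors);
const weight = strikeWeight(15, 30);
console.log(chosen.clipName, weight);
```

The weight would then drive whatever animation blending your engine offers between the static pose and the chosen clip. A linear mapping from speed to weight is just the simplest choice; an eased curve may feel better in practice.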
A skinned animation solution also allows you to spend a bit more on the mesh resolution for the net so you can get good deformations without taxing the system with a high-res mesh that needs to be handled by the physics system. You can spend that budget elsewhere. The main question you need to answer is “how important is the physical accuracy of the net in the scene?” Would you give up other gameplay features to have a better physics simulation? Is the net strike a top tier element in your game loop? Is it just a feature to reinforce the feeling of scoring a goal? Is your camera going to be close enough to judge the accuracy of the simulation? Answering these questions will help you decide how much of your technical budget to spend here.
I would say the same about grass. Here are two very different representations of grass/turf:
Granted the second one has less resolution, but it illustrates the point. If your camera is close to the grass, you would expect to see the blades, even if they are short. If your camera is further back to encompass more of the field, you won’t be able to see the blades of grass. And if your angle is higher, you won’t see the displacement of the grass by the ball/player. Keeping your camera back from the field at a higher angle can be a design choice that prevents needing to spend your tech budget on grass. I would even go as far as baking a texture with a normal map on the field and not having mesh for the grass at all. Again, will your tech budget be well spent on a heavy mesh for grass, or on other things like particle systems, UI, post-effects, lighting, custom shaders, etc.?
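If you want a rough way to check the camera distance argument, you can estimate how many pixels a blade of grass would cover on screen. This is only a back-of-the-envelope sketch for a perspective camera looking at the blade head-on; the function name and every number in it (blade height, field of view, viewport height, distances) are assumptions I picked for illustration.

```typescript
// Approximate on-screen height, in pixels, of an object of a given world-space
// size seen head-on by a perspective camera at a given distance.
function projectedPixels(
  worldSize: number,        // object height in world units (e.g. meters)
  distance: number,         // distance from camera to object, same units
  verticalFovRad: number,   // camera vertical field of view in radians
  viewportHeightPx: number  // render height in pixels
): number {
  if (distance <= 0) {
    throw new Error("distance must be positive");
  }
  // World-space height that fills the whole viewport at this distance.
  const visibleHeight = 2 * distance * Math.tan(verticalFovRad / 2);
  return (worldSize / visibleHeight) * viewportHeightPx;
}

// Assumed values: a 3 cm blade, 0.8 rad vertical FOV, 1080 px tall viewport.
const bladeHeight = 0.03;
const closeUp = projectedPixels(bladeHeight, 2, 0.8, 1080);    // camera 2 m away
const wideShot = projectedPixels(bladeHeight, 40, 0.8, 1080);  // camera 40 m away
console.log(closeUp.toFixed(1), wideShot.toFixed(1));
```

With these assumed numbers the blade covers a readable number of pixels up close and about a pixel in the wide shot, which is the point at which a baked texture and normal map will look the same as real geometry.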
In terms of physics simulations in DCC tools like Blender, glTF does not support that type of data at all. Even simple things like deformers are not supported, as it only accepts meshes (both static and skinned) and morph targets. Morph targets can be used to simulate a more complex deformation on a mesh, but there is a practical limit to the number of targets you can rely on in a glTF, and that number is low enough that you may not be able to use them to do what you want. Mostly they are helpful as a supplement to a rigged skeleton, handling deformations that would otherwise make the skeleton more complex than needed. Any simulations or deformers that you use in your DCC tool need to be baked to the mesh, but for the most part skinned animation is the best way to deform a mesh in glTF.
Hope this helps, though I know this isn’t likely the answer you were hoping for. I think the pains you are feeling around physics here are pointing to the fact that you may need a different solution for this problem.