Thank you.
I wonder whether the matrix class could be implemented in such a way that you could pipe operations without the results being stored to the actual underlying buffer until the last operation is finished. Then every calculation would happen in f64, while the final result is saved as f32.
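To make the idea concrete, here is a rough sketch of what I mean. It is written in Rust only because of the f32/f64 naming, and everything in it (`Matrix`, `Pipeline`, `pipe`, `map`, `add`, `scale`, `store`) is a made-up name for illustration, not the existing class. It also only covers element-wise operations; something like a matrix product would need a different design.

```rust
/// Hypothetical matrix type for illustration: row-major storage in f32.
pub struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f32>,
}

/// A queued chain of element-wise operations. Nothing touches the
/// buffer until `store` is called.
pub struct Pipeline<'a> {
    target: &'a mut Matrix,
    ops: Vec<Box<dyn Fn(f64) -> f64 + 'a>>,
}

impl Matrix {
    pub fn new(rows: usize, cols: usize, data: Vec<f32>) -> Self {
        assert_eq!(rows * cols, data.len(), "shape does not match data length");
        Matrix { rows, cols, data }
    }

    pub fn shape(&self) -> (usize, usize) {
        (self.rows, self.cols)
    }

    pub fn data(&self) -> &[f32] {
        &self.data
    }

    /// Start a lazy pipeline over this matrix.
    pub fn pipe(&mut self) -> Pipeline<'_> {
        Pipeline { target: self, ops: Vec::new() }
    }
}

impl<'a> Pipeline<'a> {
    /// Queue an arbitrary element-wise operation; it is not run yet.
    pub fn map(mut self, f: impl Fn(f64) -> f64 + 'a) -> Self {
        self.ops.push(Box::new(f));
        self
    }

    pub fn add(self, k: f64) -> Self {
        self.map(move |x| x + k)
    }

    pub fn scale(self, k: f64) -> Self {
        self.map(move |x| x * k)
    }

    /// Run every queued operation in f64 and narrow to f32 exactly once
    /// per element, when writing back to the buffer.
    pub fn store(self) {
        let Pipeline { target, ops } = self;
        for cell in target.data.iter_mut() {
            let mut acc = f64::from(*cell); // widen once
            for op in &ops {
                acc = op(acc); // intermediates stay in f64
            }
            *cell = acc as f32; // narrow once
        }
    }
}
```

With this shape, `m.pipe().add(1e-8).add(-1.0).scale(1e8).store()` on an element holding `1.0` should end up close to `1.0`, whereas doing the same three steps directly in f32 gives `0.0`, because `1.0 + 1e-8` already rounds back to `1.0` in f32.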