Creating a custom orthographic camera

I need an orthographic camera with a custom input manager. It should behave like the orthographic view in Blender, but I only need zoom and pan. I am new to both BabylonJS and JS/TS, so I have been struggling with pointer events. I have tried two approaches: configuring custom controls for a FreeCamera, and calling detachControl and handling the pointer events myself (playground links for both are below). The issue is that I need this for a multiplatform app, so I also have to handle touch input: pan with two fingers and zoom on pinch. Can anyone show me how to achieve that, preferably through custom inputs? So far I can only get zoom working with the custom controls.
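
To make the goal concrete, here is a minimal sketch of the two-finger gesture math, written without any BabylonJS dependency. The names `OrthoBounds`, `Point`, `Viewport` and `applyTwoFingerGesture` are hypothetical and only for illustration; the bounds are meant to mirror the camera's `orthoLeft`, `orthoRight`, `orthoTop` and `orthoBottom` values. It assumes screen y grows downward and, as a simplification, zooms about the view centre rather than about the pinch midpoint.

```typescript
// Hypothetical helper types, not part of BabylonJS.
interface Point { x: number; y: number; }
interface Viewport { width: number; height: number; } // canvas size in pixels
interface OrthoBounds { left: number; right: number; top: number; bottom: number; }

function distance(a: Point, b: Point): number {
  return Math.hypot(a.x - b.x, a.y - b.y);
}

function midpoint(a: Point, b: Point): Point {
  return { x: (a.x + b.x) / 2, y: (a.y + b.y) / 2 };
}

// Given the two touch points from the previous and current frame,
// return new orthographic bounds with pan and pinch zoom applied.
function applyTwoFingerGesture(
  bounds: OrthoBounds,
  prev: [Point, Point],
  curr: [Point, Point],
  viewport: Viewport
): OrthoBounds {
  const width = bounds.right - bounds.left;
  const height = bounds.top - bounds.bottom;

  // Pinch: fingers moving apart shrink the visible area (zoom in).
  const prevDist = distance(prev[0], prev[1]);
  const currDist = distance(curr[0], curr[1]);
  const scale = prevDist > 0 && currDist > 0 ? prevDist / currDist : 1;

  // Pan: convert the midpoint movement from pixels to world units.
  // The view moves opposite to the fingers so the content follows them.
  const pm = midpoint(prev[0], prev[1]);
  const cm = midpoint(curr[0], curr[1]);
  const dxWorld = -(cm.x - pm.x) * (width / viewport.width);
  const dyWorld = (cm.y - pm.y) * (height / viewport.height); // screen y points down

  const cx = (bounds.left + bounds.right) / 2 + dxWorld;
  const cy = (bounds.top + bounds.bottom) / 2 + dyWorld;
  const halfW = (width * scale) / 2;
  const halfH = (height * scale) / 2;

  return { left: cx - halfW, right: cx + halfW, top: cy + halfH, bottom: cy - halfH };
}
```

In a BabylonJS scene, one way to feed this would be to track the active touches by `pointerId` from `scene.onPointerObservable` (on `POINTERDOWN`, `POINTERMOVE` and `POINTERUP`), call the function whenever exactly two pointers are down, and write the result back to the camera's `orthoLeft`, `orthoRight`, `orthoTop` and `orthoBottom`. The same logic could presumably live inside a custom camera input instead; that wiring is not shown here.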

Custom Input Playground

Detach Control Playground