So is this related to XR hand tracking? Reason I ask is that you would not really need gestures if you were tracking the wearer's actual hands. Either way, I would recommend that the skeleton be very friendly to the one expected for hand tracking, so as to not dig a compatibility hole for yourself.
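For reference, here is the joint set WebXR hand tracking reports per hand, a sketch with the names as I recall them from the WebXR Hand Input spec draft, so verify against whatever version you target. If your finger bones line up with these, the hole never gets dug:

```ts
// The 25 joints per hand from the WebXR Hand Input spec (assumption: names
// taken from the draft spec; double-check the version you are targeting).
const XR_HAND_JOINTS: string[] = [
    "wrist",
    "thumb-metacarpal", "thumb-phalanx-proximal", "thumb-phalanx-distal", "thumb-tip",
    ...["index-finger", "middle-finger", "ring-finger", "pinky-finger"].flatMap((f) => [
        `${f}-metacarpal`, `${f}-phalanx-proximal`, `${f}-phalanx-intermediate`,
        `${f}-phalanx-distal`, `${f}-tip`,
    ]),
];
```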
Also, I am doing A LOT with hands. I am going with a Kinect V2 skeleton whose hands are replaced with MakeHuman hands. I have standardized on the TPOSE, specifically for the hands, as shown from above in BJS:
The reason for the TPOSE is that the bend of each finger can be accomplished without any kind of helper rig: every phalanx is aligned with an axis, so single-axis rotations produce very exact poses. I also place the prior bone, K2-Hand.L or K2-Hand.R in my case, exactly on the X axis. All this means that from a minimal set of poses I can fabricate any gesture in real time. Shown here from Blender, along with the list of poses:
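To make the single-axis point concrete, here is a minimal BJS sketch (the bone names and the Z bend axis are assumptions for illustration, not my actual rig):

```ts
import { Axis, Quaternion, Skeleton, Space } from "@babylonjs/core";

// Curl one finger by rotating each phalanx around a single axis.
// amount: 0 = straight (TPOSE), 1 = fully curled.
function bendFinger(skeleton: Skeleton, boneNames: string[], amount: number): void {
    const maxBendPerBone = Math.PI / 2; // ~90° per phalanx at full curl
    for (const name of boneNames) {
        const bone = skeleton.bones[skeleton.getBoneIndexByName(name)];
        // Because the TPOSE lines the phalanges up with one axis, this single
        // local rotation is the whole bend; no helper rig needed.
        bone.setRotationQuaternion(
            Quaternion.RotationAxis(Axis.Z, amount * maxBendPerBone),
            Space.LOCAL
        );
    }
}

// e.g. half-curl a left index finger (hypothetical bone names):
// bendFinger(skeleton, ["index.01.L", "index.02.L", "index.03.L"], 0.5);
```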
To accomplish the “any pose” part, I break out each pose by finger at load time, and can then animate / interpolate each finger independently. Final general-purpose gestures can be achieved, like the ones shown here. This scene is very old, from when I was still using morphing for hands.
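Roughly like this, sketched with made-up bone names and a plain Map instead of whatever structure you would actually use:

```ts
import { Quaternion } from "@babylonjs/core";

// bone name -> local rotation, captured from a stored whole-hand pose
type BonePose = Map<string, Quaternion>;

// Hypothetical finger-to-bone table; a real rig's names will differ.
const FINGERS: Record<string, string[]> = {
    thumb:  ["thumb.01.L",  "thumb.02.L",  "thumb.03.L"],
    index:  ["index.01.L",  "index.02.L",  "index.03.L"],
    middle: ["middle.01.L", "middle.02.L", "middle.03.L"],
    ring:   ["ring.01.L",   "ring.02.L",   "ring.03.L"],
    pinky:  ["pinky.01.L",  "pinky.02.L",  "pinky.03.L"],
};

// At load time, split one whole-hand pose into five per-finger poses, so
// each finger can be animated / interpolated on its own afterwards.
function breakOutByFinger(handPose: BonePose): Map<string, BonePose> {
    const byFinger = new Map<string, BonePose>();
    for (const [finger, bones] of Object.entries(FINGERS)) {
        const sub: BonePose = new Map();
        for (const boneName of bones) {
            const rot = handPose.get(boneName);
            if (rot) sub.set(boneName, rot.clone());
        }
        byFinger.set(finger, sub);
    }
    return byFinger;
}
```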
I also use the interpolator to fabricate composites of the individual poses for a finger at various percentages, then store the result under the same name for each finger.
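Something in this shape, continuing the sketch above (again, the names are invented):

```ts
import { Quaternion } from "@babylonjs/core";

type BonePose = Map<string, Quaternion>; // bone name -> local rotation, as above

// Slerp two stored poses for one finger at pct (0..1), then file the composite
// back into the same library so it can be used by name like any other pose.
function compositePose(
    library: Map<string, BonePose>,
    fromName: string,
    toName: string,
    pct: number,
    resultName: string
): void {
    const from = library.get(fromName);
    const to = library.get(toName);
    if (!from || !to) return;
    const out: BonePose = new Map();
    for (const [bone, a] of from) {
        const b = to.get(bone) ?? a; // fall back to the source rotation if missing
        out.set(bone, Quaternion.Slerp(a, b, pct));
    }
    library.set(resultName, out);
}

// e.g. a 30% curl for the index finger, stored as its own named pose:
// compositePose(indexLibrary, "straight", "curl", 0.3, "curl-30");
```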
Not sure how much of this you can use, but the TPOSE is definitely your friend. If you have stuff in Blender, I can generate in-line mesh sub-classes for the geometry, but I am not doing a PR. For meshes at this level of detail, you would not want some dorky load process & callback, though I suppose you could always store the file as text.
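By "in-line mesh sub-class" I mean something in this shape, a toy sketch with a stand-in triangle rather than real generator output:

```ts
import { Mesh, Scene, VertexData } from "@babylonjs/core";

// An in-line mesh sub-class: the geometry arrays are baked into the source,
// so construction is synchronous with no load process or callback.
class HandShell extends Mesh {
    constructor(name: string, scene: Scene) {
        super(name, scene);
        const data = new VertexData();
        // Tiny stand-in triangle; a generator would emit the full arrays here.
        data.positions = [0, 0, 0,  1, 0, 0,  0, 1, 0];
        data.indices   = [0, 1, 2];
        data.normals   = [0, 0, -1,  0, 0, -1,  0, 0, -1];
        data.applyToMesh(this);
    }
}

// const hand = new HandShell("hand", scene); // ready immediately
```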