Accurate 3D hand pose and pressure perception is essential for immersive human-computer interaction, yet simultaneously achieving both in mobile scenarios remains a significant challenge. We present WristP2, a camera-based wrist-worn system that estimates 3D hand pose and per-vertex pressure from a single RGB frame in real time. A ViT backbone with joint-aligned tokens predicts Hand–VQ–VAE code indices for mesh recovery, while an extrinsics-conditioned branch jointly estimates per-vertex contact and pressure. On a self-collected dataset of 93,000 frames (15 subjects; 48 on-plane and 28 mid-air gestures), WristP2 attains cross-subject MPJPE of 2.88 mm, Contact IoU of 0.72, Vol. IoU of 0.62, and foreground pressure MAE of 10.3 g. Across three user studies, WristP2 achieves touchpad-level performance for mid-air pointing while also supporting accurate multi-finger pressure targeting and stable drag-and-press on an uninstrumented virtual touchpad, enabling portable, surface-agnostic, and versatile interaction.
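The abstract describes a two-branch head on top of the ViT backbone: joint-aligned tokens are classified into Hand–VQ–VAE codebook indices for mesh recovery, and an extrinsics-conditioned branch predicts per-vertex contact and pressure. The sketch below illustrates that structure with plain NumPy and random weights; all dimensions (21 joints, 256-d tokens, a 512-entry codebook, 778 mesh vertices, 6-DoF extrinsics) and function names are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not taken from the paper): 21 hand joints,
# 256-d tokens, a 512-entry VQ-VAE codebook, 778 MANO-style mesh vertices.
J, D, K, V = 21, 256, 512, 778

def mesh_code_head(tokens, W_code):
    """Classify each joint-aligned token into a VQ-VAE codebook index."""
    logits = tokens @ W_code            # (J, K) per-token codebook logits
    return logits.argmax(axis=1)        # (J,) code indices for mesh recovery

def pressure_head(tokens, extrinsics, W_press):
    """Extrinsics-conditioned branch: per-vertex contact and pressure."""
    pooled = tokens.mean(axis=0)                 # (D,) pooled image feature
    cond = np.concatenate([pooled, extrinsics])  # condition on camera extrinsics
    out = cond @ W_press                         # (2V,) raw head outputs
    contact = 1 / (1 + np.exp(-out[:V]))         # per-vertex contact probability
    pressure = np.maximum(out[V:], 0.0)          # non-negative pressure values
    return contact, pressure

tokens = rng.standard_normal((J, D))
extr = rng.standard_normal(6)                    # 6-DoF wrist-camera extrinsics
W_code = rng.standard_normal((D, K)) * 0.02
W_press = rng.standard_normal((D + 6, 2 * V)) * 0.02

codes = mesh_code_head(tokens, W_code)
contact, pressure = pressure_head(tokens, extr, W_press)
print(codes.shape, contact.shape, pressure.shape)   # (21,) (778,) (778,)
```

In a real system the code indices would be decoded by the VQ-VAE decoder into a hand mesh, and both heads would be trained jointly; this sketch only shows the shapes and conditioning flow.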
Publication:
WristP2: A Wrist-Worn System for Hand Pose and Pressure Estimation
Anonymous author(s)
*This paper has been submitted to the CHI Conference on Human Factors in Computing Systems (CHI ’26) and is currently under review. This page is temporarily made available; readers are kindly asked not to share it.
Project Credits:
Ivision Group, Department of Automation, Tsinghua University, directed by Jianjiang Feng.
This project is funded by the Tsinghua University Academic Advancement Program.