Stanford researchers just released UMI-FT, a handheld data collection platform that puts compact six-axis force/torque sensors on each finger, enabling finger-level wrench measurements alongside RGB, depth, and pose data.

Many manipulation tasks require careful force modulation: too little force and the task fails, too much and you cause damage. But commercial force/torque sensors are expensive, bulky, and fragile, which has limited large-scale force-aware policy learning.

UMI-FT changes the economics. The platform uses an iPhone to provide ultrawide RGB, depth, and pose via ARKit, with each finger sensorized by a compact CoinFT sensor that captures per-finger wrench information during manipulation.
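
To picture what that multimodal stream contains, here is a minimal sketch of a per-timestep record; the class and field names are illustrative assumptions, not the project's actual data schema.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class UmiFtFrame:
    """Hypothetical per-timestep record from the handheld gripper;
    field names are illustrative, not the released dataset's schema."""
    rgb: np.ndarray            # (H, W, 3) ultrawide iPhone image
    depth: np.ndarray          # (H, W) depth map
    gripper_pose: np.ndarray   # (4, 4) gripper pose from ARKit tracking
    finger_wrench: np.ndarray  # (2, 6) per-finger [fx, fy, fz, tx, ty, tz] from the CoinFT sensors
    gripper_width: float       # finger opening in meters
    timestamp: float           # seconds since episode start
```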

This multimodal data trains an adaptive compliance policy that predicts position targets, grasp force, and stiffness for execution on standard compliance controllers.
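
To make that action space concrete, here is a rough sketch of what a per-step policy output could look like; the function name, dictionary keys, and values are assumptions for illustration, not the paper's interface.

```python
import numpy as np

def policy_step(observation) -> dict:
    """Hypothetical adaptive-compliance policy output. A downstream compliance
    controller consumes these references rather than raw motor commands."""
    # ... network inference on images, poses, and finger wrenches would go here ...
    return {
        "ee_pose_target": np.eye(4),   # desired 6-DoF end-effector pose (homogeneous transform)
        "grasp_force": 5.0,            # desired fingertip normal force in newtons
        "stiffness": np.array([400.0, 400.0, 400.0, 30.0, 30.0, 30.0]),
        # Cartesian stiffness: translational terms in N/m, rotational in Nm/rad
    }
```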

The learned policy runs at the slowest rate and generates reference targets, while lower-level model-based compliance and force controllers provide precise 6D compliance control and real-time force modulation.
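
The fast layer in such a two-rate setup is commonly a Cartesian impedance law; the sketch below is that generic textbook form under an assumed diagonal stiffness, not UMI-FT's actual controller.

```python
import numpy as np

def compliance_tick(pose_error, twist, stiffness, damping=None):
    """One fast control tick of a generic Cartesian impedance law:
    wrench = K * pose_error - D * twist.

    pose_error : (6,) translation + rotation error between the policy's target and the current pose
    twist      : (6,) current end-effector velocity
    stiffness  : (6,) diagonal Cartesian stiffness, e.g. taken from the policy output
    """
    stiffness = np.asarray(stiffness, dtype=float)
    if damping is None:
        damping = 2.0 * np.sqrt(stiffness)  # critically damped by default
    return stiffness * np.asarray(pose_error) - damping * np.asarray(twist)

# Two-rate loop sketch: the policy refreshes targets slowly (e.g. ~10 Hz), while this
# controller runs at the robot's control rate (e.g. ~1 kHz), with a separate
# force-tracking term at the fingers holding the commanded grasp force.
```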

They tested on three contact-rich, force-sensitive tasks: whiteboard wiping (locate eraser, grasp, wipe until clean), skewering zucchini (grasp slice firmly, push onto stick until punctured), and lightbulb insertion (grasp bulb, align bayonet pin with socket slit, insert while overcoming spring force, rotate to light up).

The results are clear. Policies without compliance struggle to modulate contact force and trigger safety faults from excessive force. Policies without force sensing fail to grasp unseen objects or resist reaction forces, causing slippage.

Here's the project page: https://umi-ft.github.io