Learning Dexterous Manipulation with Three Independent Fingers from Human Demonstrations

Anonymous authors
Under review for RA-L

Abstract

Humans have proven to be powerful teachers for robot manipulation skills via imitation learning. How can we leverage this potential for robots with a morphology unlike our own? In this work, we demonstrate that teleoperation of a three-fingered robot morphology is both feasible and effective for dexterous manipulation tasks. To address the challenges posed by the embodiment gap between human demonstrators and non-humanoid robots, we investigate three teleoperation strategies: fingertip matching using hand tracking from a commercial AR headset, control via motion controllers, and kinesthetic teaching with a leader robot. We collect demonstrations on a suite of dexterous manipulation tasks, including assembling a 3D-printed object and folding a napkin. We then train manipulation policies with ACT and Diffusion Policy and evaluate their success on the respective tasks. The policies trained on data collected via motion controllers and kinesthetic teaching generally outperform those trained on hand-tracking data. We additionally fine-tune vision-language-action models on pick-and-place data collected with the TriFinger robot. The resulting policies achieve high success rates for in-distribution tasks and can generalize to objects not seen during fine-tuning, demonstrating that large-scale pretraining can be leveraged for this non-standard embodiment. We release open-source datasets and policy checkpoints to support further research in non-anthropomorphic dexterous manipulation.

Pipeline Overview

Pipeline overview: Teleoperation, Training, Evaluation
Our pipeline consists of three stages:
1. Data collection with three alternative teleoperation methods.
2. Training of imitation learning and vision-language-action (VLA) policies.
3. Evaluation of the policies on the real robot.

Why three independent fingers?

Three robotic fingertips moving independently in the workspace enable in-hand manipulation and a high level of dexterity. Matching this with robotic arms and traditional grippers would require bimanual setups, which increase cost and raise safety concerns due to possible self-collisions. The videos below illustrate this with teleoperated demonstrations of a cup-flipping task that is difficult for a robot arm with a parallel-jaw gripper but can be accomplished efficiently with the TriFinger platform. Moreover, the robust design of the platform enables extended autonomous operation without human oversight, making it an ideal starting point for future research into autonomous self-improvement of imitation learning policies.

Franka

TriFinger

Teleoperation Methods

Hand tracking

Motion controllers

Kinesthetic teaching
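
Of the three methods above, fingertip matching is the easiest to illustrate in a few lines. The sketch below is not the implementation used in this work. It only shows one plausible way to turn three tracked human fingertip positions into three robot fingertip targets. The function name `match_fingertips`, the choice of thumb, index, and middle finger, the palm-relative mapping, and all numeric values (scale, workspace center, workspace bounds) are assumptions made for illustration.

```python
import numpy as np


def match_fingertips(human_tips, palm, scale=1.5,
                     center=(0.0, 0.0, 0.05),
                     lower=(-0.10, -0.10, 0.00),
                     upper=(0.10, 0.10, 0.15)):
    """Map three tracked human fingertips to robot fingertip targets.

    human_tips: (3, 3) array, one xyz row per finger (assumed here to be
                thumb, index, middle) in the headset frame, in meters.
    palm:       xyz of a palm reference point in the same frame.
    scale, center, lower, upper: hypothetical workspace parameters.
    """
    human_tips = np.asarray(human_tips, dtype=float)
    if human_tips.shape != (3, 3):
        raise ValueError("expected one xyz position for each of three fingertips")

    # Express the fingertips relative to the palm so that moving the whole
    # hand through the room does not move the targets.
    relative = human_tips - np.asarray(palm, dtype=float)

    # Scale the hand motion to the robot workspace and shift it to the
    # workspace center (both values are illustrative assumptions).
    targets = scale * relative + np.asarray(center, dtype=float)

    # Keep every target inside an axis-aligned box so that tracking noise
    # cannot command positions outside the workspace.
    return np.clip(targets, lower, upper)


if __name__ == "__main__":
    palm = [0.30, 0.20, 1.00]
    tips = [[0.32, 0.20, 1.00],
            [0.30, 0.22, 1.00],
            [0.30, 0.20, 1.50]]
    print(match_fingertips(tips, palm))
```

In a real system, the resulting Cartesian targets would still have to be converted to joint commands, for example with inverse kinematics, which this sketch leaves out.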

Autonomous Imitation Learning Rollouts

Successful rollouts

ACT

Pass Through (Kinesthetic data)

Assemble (Kinesthetic data)

Disassemble (Kinesthetic data)

Fold (Kinesthetic data)

Unfold (Controller data)

Insert Battery (Controller data)

Remove Battery (Controller data)

Diffusion Policy

Pass Through (Controller data)

Assemble (Controller data)

Disassemble (Controller data)

Fold (Controller data)

Unfold (Controller data)

Insert Battery (Controller data)

Remove Battery (Controller data)

Autonomous VLA Rollouts

Successful and unsuccessful rollouts

In-distribution

\(\pi_0\)

Prompt: Put apple into cup (2/2)

Prompt: Stack lemon on top of green cylinder (0/2)

Prompt: Insert blue hexagonal prism into red cup (1/2)

Prompt: Put white cone onto green cylinder (2/2)

Prompt: Remove pear from red plate (0/2)

Prompt: Put pear in cup closest to apple (2/2)

\(\pi_{0.5}\)

Prompt: Put apple into cup (2/2)

Prompt: Stack lemon on top of green cylinder (2/2)

Prompt: Insert blue hexagonal prism into red cup (2/2)

Prompt: Put white cone onto green cylinder (2/2)

Prompt: Remove pear from red plate (2/2)

Prompt: Put pear in cup closest to apple (2/2)

Out-of-distribution

Objects not seen during fine-tuning are marked in orange in the prompt.

\(\pi_0\)

Prompt: Pick up green apple (0/2)

Prompt: Stack blue cone on white cylinder (0/2)

Prompt: Put baseball into red bowl (1/2)

Prompt: Remove green apple from red plate (0/2)

Prompt: Put white sphere into green cup (0/2)

Prompt: Put plush toy into blue bowl (2/2)

\(\pi_{0.5}\)

Prompt: Pick up green apple (2/2)

Prompt: Stack blue cone on white cylinder (2/2)

Prompt: Put baseball into red bowl (2/2)

Prompt: Remove green apple from red plate (2/2)

Prompt: Put white sphere into green cup (0/2)

Prompt: Put plush toy into blue bowl (2/2)
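
The per-prompt counts listed above can be tallied per model and split. The following sketch copies the (successes, trials) pairs from the captions on this page. The dictionary layout and the function names are illustrative choices. With only two trials per prompt, the totals describe the rollouts shown here and should not be read as statistically robust success rates.

```python
# (successes, trials) per prompt, copied in order from the captions above.
RESULTS = {
    ("pi_0", "in-distribution"): [(2, 2), (0, 2), (1, 2), (2, 2), (0, 2), (2, 2)],
    ("pi_0.5", "in-distribution"): [(2, 2), (2, 2), (2, 2), (2, 2), (2, 2), (2, 2)],
    ("pi_0", "out-of-distribution"): [(0, 2), (0, 2), (1, 2), (0, 2), (0, 2), (2, 2)],
    ("pi_0.5", "out-of-distribution"): [(2, 2), (2, 2), (2, 2), (2, 2), (0, 2), (2, 2)],
}


def aggregate(counts):
    """Sum successes and trials over all prompts of one model and split."""
    successes = sum(s for s, _ in counts)
    trials = sum(t for _, t in counts)
    return successes, trials


def summarize(results):
    """Return {(model, split): (total successes, total trials)}."""
    return {key: aggregate(counts) for key, counts in results.items()}


if __name__ == "__main__":
    for (model, split), (s, t) in summarize(RESULTS).items():
        print(f"{model:7s} {split:20s} {s}/{t} = {s / t:.0%}")
```

For the rollouts shown on this page, the tally gives 7/12 in-distribution and 3/12 out-of-distribution for \(\pi_0\), and 12/12 in-distribution and 10/12 out-of-distribution for \(\pi_{0.5}\).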

\(\pi_{0.5}\) Trained on Combined Data

The combined dataset includes all datasets except those collected with hand tracking, which are too noisy.

Failed Rollouts

Examples of failed rollouts, most of which are caused by the policy drifting out of distribution.