Grasp Anything: Combining Teacher-Augmented Policy Gradient Learning with Instance Segmentation to Grasp Arbitrary Objects
Interactive grasping from clutter, akin to human dexterity, is one of the longest-standing problems in robot learning. Challenges stem from the intricacies of visual perception, the demand for precise motor skills, and the complex interplay between the two. In this work, we present Teacher-Augmented Policy Gradient (TAPG), a novel two-stage learning framework that synergizes reinforcement learning and policy distillation. After training a teacher policy to master the motor control based on object pose information, TAPG facilitates guided, yet adaptive, learning of a sensorimotor policy, based on object segmentation. We zero-shot transfer from simulation to a real robot by using the Segment Anything Model for promptable object segmentation. Our trained policies adeptly grasp a wide variety of objects from cluttered scenarios in simulation and the real world based on human-understandable prompts. Furthermore, we show robust zero-shot transfer to novel objects. Videos of our experiments are available at https://maltemosbach.github.io/grasp_anything.
- Published in: IEEE International Conference on Robotics and Automation
- Type: Inproceedings
- Authors: Mosbach, Malte; Behnke, Sven
- Year: 2024
Citation information
Mosbach, Malte; Behnke, Sven: Grasp Anything: Combining Teacher-Augmented Policy Gradient Learning with Instance Segmentation to Grasp Arbitrary Objects, IEEE International Conference on Robotics and Automation, May 2024, https://ais.uni-bonn.de/papers/ICRA_2024_Mosbach.pdf
@Inproceedings{Mosbach.Behnke.2024a,
author={Mosbach, Malte and Behnke, Sven},
title={Grasp Anything: Combining Teacher-Augmented Policy Gradient Learning with Instance Segmentation to Grasp Arbitrary Objects},
booktitle={IEEE International Conference on Robotics and Automation},
month={May},
url={https://ais.uni-bonn.de/papers/ICRA_2024_Mosbach.pdf},
year={2024},
abstract={Interactive grasping from clutter, akin to human dexterity, is one of the longest-standing problems in robot learning. Challenges stem from the intricacies of visual perception, the demand for precise motor skills, and the complex interplay between the two. In this work, we present Teacher-Augmented Policy Gradient (TAPG), a novel two-stage learning framework that synergizes reinforcement...}}