Efficient Multi-Object Pose Estimation using Multi-Resolution Deformable Attention and Query Aggregation

Object pose estimation is a long-standing problem in computer vision. Recently, attention-based vision transformer models have achieved state-of-the-art results in many computer vision applications. Exploiting the permutation-invariant nature of the attention mechanism, a family of vision transformer models formulates multi-object pose estimation as a set prediction problem. However, existing vision transformer models for multi-object pose estimation rely exclusively on the attention mechanism. Convolutional neural networks, on the other hand, hard-wire various inductive biases into their architecture. In this paper, we investigate incorporating inductive biases in vision transformer models for multi-object pose estimation, which facilitates learning long-range dependencies while circumventing costly global attention. In particular, we use multi-resolution deformable attention, where the attention operation is performed only between a few deformed reference points. Furthermore, we propose a query aggregation mechanism that enables increasing the number of object queries without increasing the computational complexity. We evaluate the proposed model on the challenging YCB-Video dataset and report state-of-the-art results.
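
To make the idea of attending to only a few deformed reference points more concrete, the following is a minimal sketch of multi-resolution deformable sampling for a single query. It is not the authors' implementation: it assumes scalar features, a single attention head, and sampling offsets and attention logits that are passed in directly (in a trained model they would typically be predicted from the query). All function and parameter names (`bilinear_sample`, `deformable_attention`, `offsets`, `scores`) are hypothetical.

```python
import math


def bilinear_sample(feature_map, x, y):
    """Sample a 2D grid at continuous pixel coordinates (x, y).

    Neighbors that fall outside the grid contribute zero.
    """
    height, width = len(feature_map), len(feature_map[0])
    x0, y0 = math.floor(x), math.floor(y)
    value = 0.0
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < height and 0 <= xi < width:
                weight = (1.0 - abs(x - xi)) * (1.0 - abs(y - yi))
                value += weight * feature_map[yi][xi]
    return value


def softmax(scores):
    """Numerically stable softmax over a flat list of logits."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def deformable_attention(feature_maps, reference_point, offsets, scores):
    """Aggregate a few sampled values across resolution levels for one query.

    feature_maps:    list of 2D grids, one per resolution level (assumption:
                     scalar features instead of feature vectors).
    reference_point: (x, y) normalized to [0, 1], shared by all levels.
    offsets:         per level, a list of (dx, dy) pixel offsets (assumption:
                     given as input rather than predicted from the query).
    scores:          per level, a list of attention logits, one per offset.
    """
    # Attention weights are normalized jointly over all levels and points.
    weights = softmax([s for level in scores for s in level])
    output = 0.0
    index = 0
    for level_map, level_offsets in zip(feature_maps, offsets):
        height, width = len(level_map), len(level_map[0])
        # Map the normalized reference point to this level's pixel grid.
        ref_x = reference_point[0] * (width - 1)
        ref_y = reference_point[1] * (height - 1)
        for dx, dy in level_offsets:
            sample = bilinear_sample(level_map, ref_x + dx, ref_y + dy)
            output += weights[index] * sample
            index += 1
    return output


# Two resolution levels with constant features and two sampling points each.
fine = [[2.0] * 4 for _ in range(4)]
coarse = [[2.0] * 2 for _ in range(2)]
point_offsets = [[(0.0, 0.0), (0.25, -0.25)], [(0.0, 0.0), (0.25, -0.25)]]
point_scores = [[0.1, 0.7], [-0.3, 0.2]]
result = deformable_attention([fine, coarse], (0.5, 0.5), point_offsets, point_scores)
```

In this sketch the cost per query depends only on the number of levels and sampling points (four samples here), not on the size of the feature maps, which is the property that lets deformable attention avoid global attention. Because the attention weights sum to one and every sample stays inside the grid, constant feature maps yield approximately that same constant.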

Published in:
2023 Seventh IEEE International Conference on Robotic Computing (IRC)
Type:
Inproceedings
Authors:
Periyasamy, Arul Selvam; Tsaturyan, Vladimir; Behnke, Sven
Year:
2023
Source:
https://ieeexplore.ieee.org/document/10473594

Citation information

Periyasamy, Arul Selvam; Tsaturyan, Vladimir; Behnke, Sven: Efficient Multi-Object Pose Estimation using Multi-Resolution Deformable Attention and Query Aggregation, 2023 Seventh IEEE International Conference on Robotic Computing (IRC), November 2023, https://ieeexplore.ieee.org/document/10473594, Periyasamy.etal.2023b

Associated Lamarr Researchers

Prof. Dr. Sven Behnke