Physics-guided Shape-from-Template: Monocular Video Perception through Neural Surrogate Models
3D reconstruction of dynamic scenes is a long-standing problem in computer graphics and becomes increasingly difficult the less information is available. Shape-from-Template (SfT) methods aim to reconstruct a template-based geometry from RGB images or video sequences, often leveraging just a single monocular camera without depth information, such as regular smartphone recordings. Unfortunately, existing reconstruction methods are either unphysical and noisy or slow in optimization. To solve this problem, we propose a novel SfT reconstruction algorithm for cloth using a pre-trained neural surrogate model that is fast to evaluate, stable, and produces smooth reconstructions due to a regularizing physics simulation. Differentiable rendering of the simulated mesh enables pixel-wise comparisons between the reconstruction and a target video sequence, which can be used for a gradient-based optimization procedure to extract not only shape information but also physical parameters such as the stretching, shearing, or bending stiffness of the cloth. This allows us to retain a precise, stable, and smooth reconstructed geometry while reducing the runtime by a factor of 400-500 compared to Φ-SfT, a state-of-the-art physics-based SfT approach.
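The optimization pipeline described in the abstract (a neural surrogate simulating the cloth, a differentiable renderer, and gradients flowing from a pixel-wise video loss back to the physical parameters) can be sketched in a few lines of PyTorch. The sketch below is purely illustrative: `ToySurrogate`, `toy_render`, and all tensor shapes are our own placeholders and simplifications, not the authors' implementation.

```python
import torch

# Hypothetical stand-ins for the paper's components (names are ours, not the authors'):
# - ToySurrogate: a tiny MLP standing in for the pre-trained neural cloth surrogate
# - toy_render:   a differentiable placeholder renderer producing an image from the mesh
# Both are toys so that the gradient-based optimization loop itself is runnable.

class ToySurrogate(torch.nn.Module):
    """Tiny MLP standing in for the pre-trained neural cloth surrogate."""
    def __init__(self, n_verts):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(n_verts * 3 + 3, 64), torch.nn.Tanh(),
            torch.nn.Linear(64, n_verts * 3),
        )

    def forward(self, verts, params):
        # params = (stretching, shearing, bending) stiffness, as in the paper
        x = torch.cat([verts.flatten(), params])
        return verts + self.net(x).view_as(verts)

def toy_render(verts, image_size=32):
    """Differentiable splatting of vertices into a soft occupancy image (placeholder renderer)."""
    ys = torch.linspace(-1, 1, image_size)
    xs = torch.linspace(-1, 1, image_size)
    gy, gx = torch.meshgrid(ys, xs, indexing="ij")
    grid = torch.stack([gx, gy], dim=-1).reshape(-1, 2)           # (P, 2) pixel centers
    d2 = ((grid[:, None, :] - verts[None, :, :2]) ** 2).sum(-1)   # (P, V) squared distances
    return torch.exp(-d2 / 0.01).sum(-1).view(image_size, image_size)

n_verts, n_frames = 16, 5
surrogate = ToySurrogate(n_verts)
template = torch.randn(n_verts, 3) * 0.3                        # template geometry
target_video = [torch.rand(32, 32) for _ in range(n_frames)]    # stand-in for the RGB target frames

# Physical parameters (stretching, shearing, bending stiffness) are the optimization variables.
phys_params = torch.nn.Parameter(torch.ones(3))
optimizer = torch.optim.Adam([phys_params], lr=1e-2)

for step in range(100):
    optimizer.zero_grad()
    verts, loss = template, 0.0
    for frame in target_video:
        verts = surrogate(verts, phys_params)           # physics-regularized forward simulation
        rendered = toy_render(verts)                    # differentiable rendering of the simulated mesh
        loss = loss + ((rendered - frame) ** 2).mean()  # pixel-wise comparison with the video
    loss.backward()                                     # gradients flow back to the physical parameters
    optimizer.step()
```

In the paper, the surrogate is pre-trained and kept fixed while the physical parameters are recovered from the monocular video; the toy above only mirrors that gradient path from pixel loss back to stiffness values.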
- Published in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- Type: Inproceedings
- Authors: Stotko, David; Wandel, Nils; Klein, Reinhard
- Year: 2024
- Source: https://openaccess.thecvf.com/content/CVPR2024/html/Stotko_Physics-guided_Shape-from-Template_Monocular_Video_Perception_through_Neural_Surrogate_Models_CVPR_2024_paper.html
Citation information
Stotko, David; Wandel, Nils; Klein, Reinhard: Physics-guided Shape-from-Template: Monocular Video Perception through Neural Surrogate Models. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. https://openaccess.thecvf.com/content/CVPR2024/html/Stotko_Physics-guided_Shape-from-Template_Monocular_Video_Perception_through_Neural_Surrogate_Models_CVPR_2024_paper.html
@Inproceedings{Stotko.etal.2024a,
  author    = {Stotko, David and Wandel, Nils and Klein, Reinhard},
  title     = {Physics-guided Shape-from-Template: Monocular Video Perception through Neural Surrogate Models},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  url       = {https://openaccess.thecvf.com/content/CVPR2024/html/Stotko_Physics-guided_Shape-from-Template_Monocular_Video_Perception_through_Neural_Surrogate_Models_CVPR_2024_paper.html},
  year      = {2024},
  abstract  = {3D reconstruction of dynamic scenes is a long-standing problem in computer graphics and becomes increasingly difficult the less information is available. Shape-from-Template (SfT) methods aim to reconstruct a template-based geometry from RGB images or video sequences, often leveraging just a single monocular camera without depth information, such as regular smartphone recordings. Unfortunately, existing reconstruction methods are either unphysical and noisy or slow in optimization. To solve this problem, we propose a novel SfT reconstruction algorithm for cloth using a pre-trained neural surrogate model that is fast to evaluate, stable, and produces smooth reconstructions due to a regularizing physics simulation. Differentiable rendering of the simulated mesh enables pixel-wise comparisons between the reconstruction and a target video sequence, which can be used for a gradient-based optimization procedure to extract not only shape information but also physical parameters such as the stretching, shearing, or bending stiffness of the cloth. This allows us to retain a precise, stable, and smooth reconstructed geometry while reducing the runtime by a factor of 400-500 compared to Φ-SfT, a state-of-the-art physics-based SfT approach.}
}