FaDIV-Syn: Fast Depth-Independent View Synthesis using Soft Masks and Implicit Blending

Novel view synthesis is required in many robotic applications, such as VR teleoperation and scene reconstruction. Existing methods are often too slow for these contexts, cannot handle dynamic scenes, and are limited by their explicit depth estimation stage, where incorrect depth predictions can lead to large projection errors. Our proposed method runs in real time on live streaming data and avoids explicit depth estimation by efficiently warping input images into the target frame for a range of assumed depth planes. The resulting plane sweep volume (PSV) is directly fed into our network, which first estimates soft PSV masks in a self-supervised manner, and then directly produces the novel output view. This improves efficiency and performance on transparent, reflective, thin, and featureless scene parts. FaDIV-Syn can perform both interpolation and extrapolation tasks at 540p in real time and outperforms state-of-the-art extrapolation methods on the large-scale RealEstate10k dataset. We thoroughly evaluate ablations, such as removing the Soft-Masking network and training from fewer examples, as well as generalization to higher resolutions and stronger depth discretization. Our implementation is available.
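
The warping step mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes pinhole cameras, fronto-parallel planes in the target camera frame, the pose convention X_src = R @ X_tgt + t, and nearest-neighbor sampling. All function names (plane_homography, warp_to_target, plane_sweep_volume) are hypothetical and chosen only for illustration.

```python
import numpy as np

def plane_homography(K_src, K_tgt, R, t, depth):
    """Homography taking target pixels to source pixels for the plane z = depth.

    Assumed convention: X_src = R @ X_tgt + t, with the plane n^T X_tgt = depth
    and n = (0, 0, 1) expressed in the target camera frame.
    """
    n = np.array([0.0, 0.0, 1.0])
    return K_src @ (R + np.outer(t, n) / depth) @ np.linalg.inv(K_tgt)

def warp_to_target(src, H, out_shape):
    """Resample src into the target frame (nearest neighbor, zeros outside)."""
    h, w = out_shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])  # 3 x N homogeneous
    mapped = H @ pix
    z = mapped[2]
    safe_z = np.where(z > 0, z, 1.0)  # avoid dividing by zero or negative depth
    u = np.rint(mapped[0] / safe_z).astype(int)
    v = np.rint(mapped[1] / safe_z).astype(int)
    valid = (z > 0) & (u >= 0) & (u < src.shape[1]) & (v >= 0) & (v < src.shape[0])
    out = np.zeros((h * w,) + src.shape[2:], dtype=src.dtype)
    out[valid] = src[v[valid], u[valid]]
    return out.reshape((h, w) + src.shape[2:])

def plane_sweep_volume(src, K_src, K_tgt, R, t, depths):
    """Stack one warped copy of src per assumed depth plane: shape (D, H, W, ...)."""
    h, w = src.shape[:2]
    return np.stack([
        warp_to_target(src, plane_homography(K_src, K_tgt, R, t, d), (h, w))
        for d in depths
    ])

# Usage: a sideways camera offset shifts near planes more than far planes.
K = np.array([[100.0, 0.0, 32.0], [0.0, 100.0, 24.0], [0.0, 0.0, 1.0]])
image = np.arange(48 * 64, dtype=np.float64).reshape(48, 64)
psv = plane_sweep_volume(image, K, K, np.eye(3), np.array([0.1, 0.0, 0.0]), [1.0, 2.0])
print(psv.shape)
```

Each slice of the volume is the source image as it would appear in the target view if the whole scene lay on that depth plane. In the approach described above, a network consumes such a volume directly instead of committing to a single depth estimate per pixel.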

  • Published in:
    Robotics: Science and Systems
  • Type:
    Inproceedings
  • Authors:
    Rochow, Andre; Schwarz, Max; Weinmann, Michael; Behnke, Sven
  • Year:
    2022

Citation information

Rochow, Andre; Schwarz, Max; Weinmann, Michael; Behnke, Sven: FaDIV-Syn: Fast Depth-Independent View Synthesis using Soft Masks and Implicit Blending, Robotics: Science and Systems, 2022, https://doi.org/10.48550/arXiv.2106.13139

Associated Lamarr Researchers

Prof. Dr. Sven Behnke

Area Chair Embodied AI