Gated Temporal Diffusion for Stochastic Long-Term Dense Anticipation

Long-term action anticipation has become an important task for many applications such as autonomous driving and human-robot interaction. Unlike short-term anticipation, predicting more actions into the future poses a real challenge because uncertainty grows over longer horizons. While there has been significant progress in predicting more actions into the future, most of the proposed methods address the task in a deterministic setup and ignore the underlying uncertainty. In this paper, we propose a novel Gated Temporal Diffusion (GTD) network that models the uncertainty of both the observation and the future predictions. As the generator, we introduce a Gated Anticipation Network (GTAN) that models both observed and unobserved frames of a video in a mutual representation. On the one hand, using a mutual representation for past and future allows us to jointly model ambiguities in the observation and the future; on the other hand, GTAN can by design treat the observed and unobserved parts differently and steer the information flow between them. Our model achieves state-of-the-art results on the Breakfast, Assembly101 and 50Salads datasets in both stochastic and deterministic settings.

Citation information

Zatsarynna, Olga; Bahrami, Emad; Farha, Yazan Abu; Francesca, Gianpiero; Gall, Jürgen: Gated Temporal Diffusion for Stochastic Long-Term Dense Anticipation. In: Computer Vision – ECCV 2024, pp. 454-472. Springer Nature Switzerland, 2025. https://link.springer.com/chapter/10.1007/978-3-031-73001-6_26

Associated Lamarr Researchers

Prof. Dr. Jürgen Gall

Principal Investigator, Embodied AI