{"ID":2839630,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.15565","arxiv_id":"2511.15565","title":"Scriboora: Rethinking Human Pose Forecasting","abstract":"Human pose forecasting predicts future poses based on past observations, and has many significant applications in areas such as action recognition, autonomous driving or human-robot interaction. This paper evaluates a wide range of pose forecasting algorithms in the task of absolute pose forecasting, revealing many reproducibility issues, and provides a unified training and evaluation pipeline. After drawing a high-level analogy to the task of speech understanding, it is shown that recent speech models can be efficiently adapted to the task of pose forecasting, and improve current state-of-the-art performance. Finally, the robustness of the models is evaluated, using noisy joint coordinates obtained from a pose estimation model, to reflect a realistic type of noise, which is closer to real-world applications. For this a new dataset variation is introduced, and it is shown that estimated poses result in a substantial performance degradation, and how much of it can be recovered again by unsupervised finetuning.","short_abstract":"Human pose forecasting predicts future poses based on past observations, and has many significant applications in areas such as action recognition, autonomous driving or human-robot interaction. This paper evaluates a wide range of pose forecasting algorithms in the task of absolute pose forecasting, revealing many rep...","url_abs":"https://arxiv.org/abs/2511.15565","url_pdf":"https://arxiv.org/pdf/2511.15565v2","authors":"[\"Daniel Bermuth\",\"Alexander Poeppel\",\"Wolfgang Reif\"]","published":"2025-11-19T15:58:33Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false}