{"ID":2847040,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.00962","arxiv_id":"2511.00962","title":"A Unified Reasoning Framework for Holistic Zero-Shot Video Anomaly Analysis","abstract":"Most video-anomaly research stops at frame-wise detection, offering little insight into why an event is abnormal, typically outputting only frame-wise anomaly scores without spatial or semantic context. Recent video anomaly localization and video anomaly understanding methods improve explainability but remain data-dependent and task-specific. We propose a unified reasoning framework that bridges the gap between temporal detection, spatial localization, and textual explanation. Our approach is built upon a chained test-time reasoning process that sequentially connects these tasks, enabling holistic zero-shot anomaly analysis without any additional training. Specifically, our approach leverages intra-task reasoning to refine temporal detections and inter-task chaining for spatial and semantic understanding, yielding improved interpretability and generalization in a fully zero-shot manner. Without any additional data or gradients, our method achieves state-of-the-art zero-shot performance across multiple video anomaly detection, localization, and explanation benchmarks. The results demonstrate that careful prompt design with task-wise chaining can unlock the reasoning power of foundation models, enabling practical, interpretable video anomaly analysis in a fully zero-shot manner. Project Page: https://rathgrith.github.io/Unified_Frame_VAA/.","short_abstract":"Most video-anomaly research stops at frame-wise detection, offering little insight into why an event is abnormal, typically outputting only frame-wise anomaly scores without spatial or semantic context. Recent video anomaly localization and video anomaly understanding methods improve explainability but remain data-depe...","url_abs":"https://arxiv.org/abs/2511.00962","url_pdf":"https://arxiv.org/pdf/2511.00962v1","authors":"[\"Dongheng Lin\",\"Mengxue Qu\",\"Kunyang Han\",\"Jianbo Jiao\",\"Xiaojie Jin\",\"Yunchao Wei\"]","published":"2025-11-02T14:49:08Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false}
