{"ID":2844048,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.07250","arxiv_id":"2511.07250","title":"MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs","abstract":"The advent of Multimodal Large Language Models (MLLMs) has expanded AI capabilities to visual modalities, yet existing evaluation benchmarks remain limited to single-video understanding, overlooking the critical need for multi-video understanding in real-world scenarios (e.g., sports analytics and autonomous driving). To address this significant gap, we introduce MVU-Eval, the first comprehensive benchmark for evaluating Multi-Video Understanding for MLLMs. Specifically, our MVU-Eval mainly assesses eight core competencies through 1,824 meticulously curated question-answer pairs spanning 4,959 videos from diverse domains, addressing both fundamental perception tasks and high-order reasoning tasks. These capabilities are rigorously aligned with real-world applications such as multi-sensor synthesis in autonomous systems and cross-angle sports analytics. Through extensive evaluation of state-of-the-art open-source and closed-source models, we reveal significant performance discrepancies and limitations in current MLLMs' ability to perform understanding across multiple videos. The benchmark will be made publicly available to foster future research.","short_abstract":"The advent of Multimodal Large Language Models (MLLMs) has expanded AI capabilities to visual modalities, yet existing evaluation benchmarks remain limited to single-video understanding, overlooking the critical need for multi-video understanding in real-world scenarios (e.g., sports analytics and autonomous driving)....","url_abs":"https://arxiv.org/abs/2511.07250","url_pdf":"https://arxiv.org/pdf/2511.07250v2","authors":"[\"Tianhao Peng\",\"Haochen Wang\",\"Yuanxing Zhang\",\"Zekun Wang\",\"Zili Wang\",\"Gavin Chang\",\"Jian Yang\",\"Shihao Li\",\"Yanghai Wang\",\"Xintao Wang\",\"Houyi Li\",\"Wei Ji\",\"Pengfei Wan\",\"Steven Huang\",\"Zhaoxiang Zhang\",\"Jiaheng Liu\"]","published":"2025-11-10T16:02:33Z","proceeding":"cs.CV","tasks":"[\"cs.CV\",\"cs.AI\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
