{"ID":2881393,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.12282","arxiv_id":"2508.12282","title":"A Question Answering Dataset for Temporal-Sensitive Retrieval-Augmented Generation","abstract":"We introduce ChronoQA, a large-scale benchmark dataset for Chinese question answering, specifically designed to evaluate temporal reasoning in Retrieval-Augmented Generation (RAG) systems. ChronoQA is constructed from over 300,000 news articles published between 2019 and 2024, and contains 5,176 high-quality questions covering absolute, aggregate, and relative temporal types with both explicit and implicit time expressions. The dataset supports both single- and multi-document scenarios, reflecting the real-world requirements for temporal alignment and logical consistency. ChronoQA features comprehensive structural annotations and has undergone multi-stage validation, including rule-based, LLM-based, and human evaluation, to ensure data quality. By providing a dynamic, reliable, and scalable resource, ChronoQA enables structured evaluation across a wide range of temporal tasks, and serves as a robust benchmark for advancing time-sensitive retrieval-augmented question answering systems.","short_abstract":"We introduce ChronoQA, a large-scale benchmark dataset for Chinese question answering, specifically designed to evaluate temporal reasoning in Retrieval-Augmented Generation (RAG) systems. \nChronoQA is constructed from over 300,000 news articles published between 2019 and 2024, and contains 5,176 high-quality questions...","url_abs":"https://arxiv.org/abs/2508.12282","url_pdf":"https://arxiv.org/pdf/2508.12282v1","authors":"[\"Ziyang Chen\",\"Erxue Min\",\"Xiang Zhao\",\"Yunxin Li\",\"Xin Jia\",\"Jinzhi Liao\",\"Jichao Li\",\"Shuaiqiang Wang\",\"Baotian Hu\",\"Dawei Yin\"]","published":"2025-08-17T08:12:59Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.IR\"]","methods":"[\"RAG\",\"Large Language Model\"]","has_code":false}
