{"ID":2863815,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.25106","arxiv_id":"2509.25106","title":"Towards Personalized Deep Research: Benchmarks and Evaluations","abstract":"Deep Research Agents (DRAs) can autonomously conduct complex investigations and generate comprehensive reports, demonstrating strong real-world potential. However, existing evaluations mostly rely on close-ended benchmarks, while open-ended deep research benchmarks remain scarce and typically neglect personalized scenarios. To bridge this gap, we introduce Personalized Deep Research Bench (PDR-Bench), the first benchmark for evaluating personalization in DRAs. It pairs 50 diverse research tasks across 10 domains with 25 authentic user profiles that combine structured persona attributes with dynamic real-world contexts, yielding 250 realistic user-task queries. To assess system performance, we propose the PQR Evaluation Framework, which jointly measures Personalization Alignment, Content Quality, and Factual Reliability. Our experiments on a range of systems highlight current capabilities and limitations in handling personalized deep research. This work establishes a rigorous foundation for developing and evaluating the next generation of truly personalized AI research assistants.","short_abstract":"Deep Research Agents (DRAs) can autonomously conduct complex investigations and generate comprehensive reports, demonstrating strong real-world potential. However, existing evaluations mostly rely on close-ended benchmarks, while open-ended deep research benchmarks remain scarce and typically neglect personalized scena...","url_abs":"https://arxiv.org/abs/2509.25106","url_pdf":"https://arxiv.org/pdf/2509.25106v3","authors":"[\"Yuan Liang\",\"Jiaxian Li\",\"Yuqing Wang\",\"Piaohong Wang\",\"Motong Tian\",\"Pai Liu\",\"Shuofei Qiao\",\"Runnan Fang\",\"He Zhu\",\"Ge Zhang\",\"Minghao Liu\",\"Yuchen Eleanor Jiang\",\"Ningyu Zhang\",\"Wangchunshu Zhou\"]","published":"2025-09-29T17:39:17Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.AI\",\"cs.IR\"]","methods":"[]","has_code":false}
