{"ID":2863295,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.24283","arxiv_id":"2509.24283","title":"Overview of SCIDOCA 2025 Shared Task on Citation Prediction, Discovery, and Placement","abstract":"We present an overview of the SCIDOCA 2025 Shared Task, which focuses on citation discovery and prediction in scientific documents. The task is divided into three subtasks: (1) Citation Discovery, where systems must identify relevant references for a given paragraph; (2) Masked Citation Prediction, which requires selecting the correct citation for masked citation slots; and (3) Citation Sentence Prediction, where systems must determine the correct reference for each cited sentence. We release a large-scale dataset constructed from the Semantic Scholar Open Research Corpus (S2ORC), containing over 60,000 annotated paragraphs and a curated reference set. The test set consists of 1,000 paragraphs from distinct papers, each annotated with ground-truth citations and distractor candidates. A total of seven teams registered, with three submitting results. We report performance metrics across all subtasks and analyze the effectiveness of submitted systems. This shared task provides a new benchmark for evaluating citation modeling and encourages future research in scientific document understanding. The dataset and task materials are publicly available at https://github.com/daotuanan/scidoca2025-shared-task.","short_abstract":"We present an overview of the SCIDOCA 2025 Shared Task, which focuses on citation discovery and prediction in scientific documents. \nThe task is divided into three subtasks: (1) Citation Discovery, where systems must identify relevant references for a given paragraph; (2) Masked Citation Prediction, which requires selec...","url_abs":"https://arxiv.org/abs/2509.24283","url_pdf":"https://arxiv.org/pdf/2509.24283v1","authors":"[\"An Dao\",\"Vu Tran\",\"Le-Minh Nguyen\",\"Yuji Matsumoto\"]","published":"2025-09-29T04:55:18Z","proceeding":"cs.DL","tasks":"[\"cs.DL\",\"cs.CL\"]","methods":"[]","has_code":false,"code_links":[{"ID":608993,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2863295,"paper_url":"https://arxiv.org/abs/2509.24283","paper_title":"Overview of SCIDOCA 2025 Shared Task on Citation Prediction, Discovery, and Placement","repo_url":"https://github.com/daotuanan/scidoca2025-shared-task","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
