{"ID":2832898,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.04588","arxiv_id":"2512.04588","title":"UserSimCRS v2: Simulation-Based Evaluation for Conversational Recommender Systems","abstract":"Resources for simulation-based evaluation of conversational recommender systems (CRSs) are scarce. The UserSimCRS toolkit was introduced to address this gap. In this work, we present UserSimCRS v2, a significant upgrade aligning the toolkit with state-of-the-art research. Key extensions include an enhanced agenda-based user simulator, introduction of large language model-based simulators, integration for a wider range of CRSs and datasets, and new LLM-as-a-judge evaluation utilities. We demonstrate these extensions in a case study.","short_abstract":"Resources for simulation-based evaluation of conversational recommender systems (CRSs) are scarce. The UserSimCRS toolkit was introduced to address this gap. In this work, we present UserSimCRS v2, a significant upgrade aligning the toolkit with state-of-the-art research. Key extensions include an enhanced agenda-based...","url_abs":"https://arxiv.org/abs/2512.04588","url_pdf":"https://arxiv.org/pdf/2512.04588v3","authors":"[\"Nolwenn Bernard\",\"Krisztian Balog\"]","published":"2025-12-04T09:07:35Z","proceeding":"cs.IR","tasks":"[\"cs.IR\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
