{"ID":2863420,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.24460","arxiv_id":"2509.24460","title":"ContextPRM: Leveraging Contextual Coherence for multi-domain Test-Time Scaling","abstract":"Process reward models (PRMs) have demonstrated significant efficacy in enhancing the mathematical reasoning capabilities of large language models (LLMs) by leveraging test-time scaling (TTS). However, while most PRMs exhibit substantial gains in mathematical domains, the scarcity of domain-specific training data and knowledge-based learning patterns limits their generalization ability when faced with other domains. To address this limitation, we shift the learning objective from verifying domain-specific knowledge to modeling domain-agnostic logical flow. Centering on contextual coherence between chain-of-thought (CoT) steps, our approach is realized through a novel data annotation and training framework, which enhances the model's generalization capabilities across diverse domains. For instance, our resulting model, ContextPRM, achieves a notable 6.5% average accuracy improvement over the majority voting baseline via weighted majority voting across nine non-mathematical domains in MMLU-Pro, including law, history, and philosophy, significantly surpassing the 2.2% improvement from VersaPRM and 0.5% gains from other mathematics-focused PRMs, demonstrating consistent performance across both mathematical and non-mathematical domains.","short_abstract":"Process reward models (PRMs) have demonstrated significant efficacy in enhancing the mathematical reasoning capabilities of large language models (LLMs) by leveraging test-time scaling (TTS). However, while most PRMs exhibit substantial gains in mathematical domains, the scarcity of domain-specific training data and kn...","url_abs":"https://arxiv.org/abs/2509.24460","url_pdf":"https://arxiv.org/pdf/2509.24460v1","authors":"[\"Haotian Zhang\",\"Liu Liu\",\"Baosheng Yu\",\"Jiayan Qiu\",\"Likang Xiao\",\"Yanwei Ren\",\"Quan Chen\",\"Xianglong Liu\"]","published":"2025-09-29T08:40:46Z","proceeding":"cs.AI","tasks":"[\"cs.AI\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
