{"ID":2857044,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.10103","arxiv_id":"2510.10103","title":"Stop When Enough: Adaptive Early-Stopping for Chain-of-Thought Reasoning","abstract":"Chain-of-Thought (CoT) reasoning has driven recent gains of large language models (LLMs) on reasoning-intensive tasks by externalizing intermediate steps. However, excessive or redundant reasoning -- so-called overthinking -- can increase inference costs and lead LLMs toward incorrect conclusions. In this paper, we present REFRAIN ($\\underline{REF}$lective-$\\underline{R}$edundancy for $\\underline{A}$daptive $\\underline{IN}$ference), a training-free framework that adaptively determines when to stop reasoning to mitigate overthinking. REFRAIN integrates a two-stage stop discriminator to identify reflective yet redundant reasoning and a sliding-window Upper Confidence Bound (SW-UCB) multi-armed bandit controller to dynamically adjust stopping thresholds according to problem difficulty without supervision or fine-tuning. Across four representative benchmarks and two model families, REFRAIN reduces token usage by 20-55% while maintaining or improving accuracy compared to standard CoT prompting. Extensive ablation and robustness analyses demonstrate its stability across models, scorers, and prompt variations. In summary, our findings highlight when-to-stop as a new and practical axis of test-time scaling -- enabling models to reason not just more, but just enough.","short_abstract":"Chain-of-Thought (CoT) reasoning has driven recent gains of large language models (LLMs) on reasoning-intensive tasks by externalizing intermediate steps. However, excessive or redundant reasoning -- so-called overthinking -- can increase inference costs and lead LLMs toward incorrect conclusions. In this paper, we pre...","url_abs":"https://arxiv.org/abs/2510.10103","url_pdf":"https://arxiv.org/pdf/2510.10103v1","authors":"[\"Renliang Sun\",\"Wei Cheng\",\"Dawei Li\",\"Haifeng Chen\",\"Wei Wang\"]","published":"2025-10-11T08:30:00Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
