{"ID":2849636,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.23334","arxiv_id":"2510.23334","title":"Adaptive Blockwise Search: Inference-Time Alignment for Large Language Models","abstract":"LLM alignment remains a critical challenge. Inference-time methods provide a flexible alternative to fine-tuning, but their uniform computational effort often yields suboptimal alignment. We hypothesize that for many alignment tasks, the initial tokens of a response are disproportionately more critical. To leverage this principle, we introduce AdaSearch, a novel blockwise search strategy. It adaptively allocates a fixed computational budget using a sampling schedule, focusing search effort on these critical tokens. We apply AdaSearch to sequential decoding and introduce its tree-search counterpart, AdaBeam. Our comprehensive evaluation across eight LLMs demonstrates that AdaSearch outperforms strong Best-of-N and fine-tuning baselines. Specifically, win-rates improve by over 10% for harmlessness generation, controlled sentiment generation, and for mathematical reasoning tasks relative to Best-of-N.","short_abstract":"LLM alignment remains a critical challenge. Inference-time methods provide a flexible alternative to fine-tuning, but their uniform computational effort often yields suboptimal alignment. We hypothesize that for many alignment tasks, the initial tokens of a response are disproportionately more critical. To leverage thi...","url_abs":"https://arxiv.org/abs/2510.23334","url_pdf":"https://arxiv.org/pdf/2510.23334v1","authors":"[\"Mohammad Atif Quamar\",\"Mohammad Areeb\",\"Nishant Sharma\",\"Ananth Shreekumar\",\"Jonathan Rosenthal\",\"Muslum Ozgur Ozmen\",\"Mikhail Kuznetsov\",\"Z. Berkay Celik\"]","published":"2025-10-27T13:48:59Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}