{"ID":2824125,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.24461","arxiv_id":"2512.24461","title":"Align While Search: Belief-Guided Exploratory Inference for World-Grounded Embodied Agents","abstract":"In this paper, we propose a test-time adaptive agent that performs exploratory inference through posterior-guided belief refinement without relying on gradient-based updates or additional training for LLM agents operating under partial observability. Our agent maintains an external structured belief over the environment state, iteratively updates it via action-conditioned observations, and selects actions by maximizing predicted information gain over the belief space. We estimate information gain using a lightweight LLM-based surrogate and assess world alignment through a novel reward that quantifies the consistency between the posterior belief and the ground-truth environment configuration. Experiments show that our method outperforms inference-time scaling baselines, such as prompt-augmented or retrieval-enhanced LLMs, in aligning with latent world states, with significantly lower integration overhead.","short_abstract":"In this paper, we propose a test-time adaptive agent that performs exploratory inference through posterior-guided belief refinement without relying on gradient-based updates or additional training for LLM agents operating under partial observability. Our agent maintains an external structured belief over the environment...","url_abs":"https://arxiv.org/abs/2512.24461","url_pdf":"https://arxiv.org/pdf/2512.24461v1","authors":"[\"Seohui Bae\",\"Jeonghye Kim\",\"Youngchul Sung\",\"Woohyung Lim\"]","published":"2025-12-30T20:51:28Z","proceeding":"cs.AI","tasks":"[\"cs.AI\"]","methods":"[\"Large Language Model\",\"LoRA\"]","has_code":false}
