{"ID":2875480,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.02401","arxiv_id":"2509.02401","title":"Towards Agents That Know When They Don't Know: Uncertainty as a Control Signal for Structured Reasoning","abstract":"Large language model (LLM) agents are increasingly deployed in structured biomedical data environments, yet they often produce fluent but overconfident outputs when reasoning over complex multi-table data. We introduce an uncertainty-aware agent for query-conditioned multi-table summarization that leverages two complementary signals: (i) retrieval uncertainty--entropy over multiple table-selection rollouts--and (ii) summary uncertainty--combining self-consistency and perplexity. Summary uncertainty is incorporated into reinforcement learning (RL) with Group Relative Policy Optimization (GRPO), while both retrieval and summary uncertainty guide inference-time filtering and support the construction of higher-quality synthetic datasets. On multi-omics benchmarks, our approach improves factuality and calibration, nearly tripling correct and useful claims per summary (3.0\\(\\rightarrow\\)8.4 internal; 3.6\\(\\rightarrow\\)9.9 cancer multi-omics) and substantially improving downstream survival prediction (C-index 0.32\\(\\rightarrow\\)0.63). These results demonstrate that uncertainty can serve as a control signal--enabling agents to abstain, communicate confidence, and become more reliable tools for complex structured-data environments.","short_abstract":"Large language model (LLM) agents are increasingly deployed in structured biomedical data environments, yet they often produce fluent but overconfident outputs when reasoning over complex multi-table data. We introduce an uncertainty-aware agent for query-conditioned multi-table summarization that leverages two complem...","url_abs":"https://arxiv.org/abs/2509.02401","url_pdf":"https://arxiv.org/pdf/2509.02401v1","authors":"[\"Josefa Lia Stoisser\",\"Marc Boubnovski Martell\",\"Lawrence Phillips\",\"Gianluca Mazzoni\",\"Lea Mørch Harder\",\"Philip Torr\",\"Jesper Ferkinghoff-Borg\",\"Kaspar Martens\",\"Julien Fauqueur\"]","published":"2025-09-02T15:12:10Z","proceeding":"cs.AI","tasks":"[\"cs.AI\"]","methods":"[\"Reinforcement Learning\",\"Large Language Model\",\"Language Model\"]","has_code":false}
