{"ID":2865680,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.23002","arxiv_id":"2509.23002","title":"Unsupervised Conformal Inference: Bootstrapping and Alignment to Control LLM Uncertainty","abstract":"Deploying black-box LLMs requires managing uncertainty in the absence of token-level probabilities or true labels. We propose an unsupervised conformal inference framework for generative models, incorporating: (i) an LLM-compatible atypical score derived from the response-embedding Gram matrix, (ii) UCP combined with a bootstrapping variant (BB-UCP) that aggregates residuals to refine quantile precision while maintaining distribution-free, finite-sample coverage, and (iii) conformal alignment, which calibrates a single strictness parameter $τ$ so that a user predicate (e.g., factuality lift) holds on unseen batches with probability $\\ge 1-α$. Across different benchmark datasets, our gates achieve close-to-nominal coverage and provide tighter, more stable thresholds than split UCP, while consistently reducing the severity of hallucination and outperforming lightweight per-response detectors with similar computational demands. The result is a label-free, API-compatible gate for test-time filtering that turns geometric signals into calibrated, goal-aligned decisions.","short_abstract":"Deploying black-box LLMs requires managing uncertainty in the absence of token-level probabilities or true labels. We propose an unsupervised conformal inference framework for generative models, incorporating: (i) an LLM-compatible atypical score derived from the response-embedding G...","url_abs":"https://arxiv.org/abs/2509.23002","url_pdf":"https://arxiv.org/pdf/2509.23002v1","authors":"[\"Lingyou Pang\",\"Lei Huang\",\"Jianyu Lin\",\"Tianyu Wang\",\"Akira Horiguchi\",\"Alexander Aue\",\"Carey E. Priebe\"]","published":"2025-09-26T23:40:47Z","proceeding":"stat.ML","tasks":"[\"stat.ML\",\"cs.LG\"]","methods":"[\"Large Language Model\"]","has_code":false}