{"ID":2890071,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.19774","arxiv_id":"2507.19774","title":"Bag of Coins: A Statistical Probe into Neural Confidence Structures","abstract":"Modern neural networks often produce miscalibrated confidence scores and struggle to detect out-of-distribution (OOD) inputs, while most existing methods post-process outputs without testing internal consistency. We introduce the Bag-of-Coins (BoC) probe, a non-parametric diagnostic of logit coherence that compares softmax confidence $\\hat p$ to an aggregate of pairwise Luce-style dominance probabilities $\\bar q$, yielding a deterministic coherence score and a p-value-based structural score. Across ViT, ResNet, and RoBERTa with ID/OOD test sets, the coherence gap $Δ=\\bar q-\\hat p$ reveals clear ID/OOD separation for ViT (ID ${\\sim}0.1$-$0.2$, OOD ${\\sim}0.5$-$0.6$) but substantial overlap for ResNet and RoBERTa (both ${\\sim}0$), indicating architecture-dependent uncertainty geometry. As a practical method, BoC improves calibration only when the base model is poorly calibrated (ViT: ECE $0.024$ vs.\\ $0.180$) and underperforms standard calibrators (ECE ${\\sim}0.005$), while for OOD detection it fails across architectures (AUROC $0.020$-$0.253$) compared to standard scores ($0.75$-$0.99$). We position BoC as a research diagnostic for interrogating how architectures encode uncertainty in logit geometry rather than a production calibration or OOD detection method.","short_abstract":"Modern neural networks often produce miscalibrated confidence scores and struggle to detect out-of-distribution (OOD) inputs, while most existing methods post-process outputs without testing internal consistency. We introduce the Bag-of-Coins (BoC) probe, a non-parametric diagnostic of logit coherence that compares sof...","url_abs":"https://arxiv.org/abs/2507.19774","url_pdf":"https://arxiv.org/pdf/2507.19774v2","authors":"[\"Agnideep Aich\",\"Sameera Hewage\",\"Md Monzur Murshed\",\"Bruce Wade\",\"Ashit Baran Aich\"]","published":"2025-07-26T03:54:32Z","proceeding":"stat.ML","tasks":"[\"stat.ML\",\"cs.LG\"]","methods":"[]","has_code":false}
