{"ID":2851572,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.19328","arxiv_id":"2510.19328","title":"Clustered Calibration: Representation-Aware Probability Calibration via Learned Subpopulations","abstract":"Ensuring that predicted probabilities align with observed frequencies is critical in high-stakes domains such as clinical decision support, autonomous driving and financial risk assessment. Existing calibration methods typically apply a single global transformation or rely on post-hoc binning over predicted confidences, limiting their ability to exploit heterogeneous reliability across sub-populations. We propose Clustered Calibration, a representation-aware framework that identifies sub-populations via clustering in learned feature spaces (e.g., coverage vectors, SHAP values, CNN activations, Transformer embeddings) and fits a soft mixture of cluster-specific parametric calibrators under hierarchical shrinkage toward a global mapping. This design yields context-specific calibration while maintaining global stability. Across six tabular datasets and additional image and text benchmarks, clustered calibration consistently improves or matches strong global calibrators in terms of negative log-likelihood and Brier score, while preserving AUC and accuracy. We further show, both analytically and empirically, that fixed-bin Expected Calibration Error (ECE) can mis-rank soft, region-aware calibrators even when proper scoring rules improve, and we advocate for log-loss and Brier as more reliable bases for model selection in such settings.","short_abstract":"Ensuring that predicted probabilities align with observed frequencies is critical in high-stakes domains such as clinical decision support, autonomous driving and financial risk assessment. Existing calibration methods typically apply a single global transformation or rely on post-hoc binning over predicted confidences...","url_abs":"https://arxiv.org/abs/2510.19328","url_pdf":"https://arxiv.org/pdf/2510.19328v2","authors":"[\"Tomer Lavi\",\"Bracha Shapira\",\"Nadav Rappoport\"]","published":"2025-10-22T07:41:30Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Transformer\",\"Convolutional Neural Network\"]","has_code":false}
