{"ID":2896599,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.06867","arxiv_id":"2507.06867","title":"Conformal Prediction for Long-Tailed Classification","abstract":"Many real-world classification problems, such as plant identification, have extremely long-tailed class distributions. In order for prediction sets to be useful in such settings, they should (i) provide good class-conditional coverage, ensuring that rare classes are not systematically omitted from the prediction sets, and (ii) be a reasonable size, allowing users to easily verify candidate labels. Unfortunately, existing conformal prediction methods, when applied to the long-tailed setting, force practitioners to make a binary choice between small sets with poor class-conditional coverage or sets that have very good class-conditional coverage but are extremely large. We propose methods with marginal coverage guarantees that smoothly trade off set size and class-conditional coverage. First, we introduce a new conformal score function called prevalence-adjusted softmax that optimizes for macro-coverage, defined as the average class-conditional coverage across classes. Second, we propose a new procedure that interpolates between marginal and class-conditional conformal prediction by linearly interpolating their conformal score thresholds. We demonstrate our methods on Pl@ntNet-300K and iNaturalist-2018, two long-tailed image datasets with 1,081 and 8,142 classes, respectively.","short_abstract":"Many real-world classification problems, such as plant identification, have extremely long-tailed class distributions. In order for prediction sets to be useful in such settings, they should (i) provide good class-conditional coverage, ensuring that rare classes are not systematically omitted from the prediction sets,...","url_abs":"https://arxiv.org/abs/2507.06867","url_pdf":"https://arxiv.org/pdf/2507.06867v3","authors":"[\"Tiffany Ding\",\"Jean-Baptiste Fermanian\",\"Joseph Salmon\"]","published":"2025-07-09T14:08:50Z","proceeding":"stat.ML","tasks":"[\"stat.ML\",\"cs.CV\",\"cs.LG\",\"stat.ME\"]","methods":"[]","has_code":false}
