{"ID":2823994,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.24162","arxiv_id":"2512.24162","title":"Deep Probabilistic Supervision for Image Classification","abstract":"Supervised training of deep neural networks for classification typically relies on hard targets, which promote overconfidence and can limit calibration, generalization, and robustness. Self-distillation methods aim to mitigate this by leveraging inter-class and sample-specific information present in the model's own predictions, but often remain dependent on hard targets without explicitly modeling predictive uncertainty. With this in mind, we propose Deep Probabilistic Supervision (DPS), a principled learning framework constructing sample-specific target distributions via statistical inference on the model's own predictions, remaining independent of hard targets after initialization. We show that DPS consistently yields higher test accuracy (e.g., +2.0% for DenseNet-264 on ImageNet) and significantly lower Expected Calibration Error (ECE) (-40% ResNet-50, CIFAR-100) than existing self-distillation methods. When combined with a contrastive loss, DPS achieves state-of-the-art robustness under label noise.","short_abstract":"Supervised training of deep neural networks for classification typically relies on hard targets, which promote overconfidence and can limit calibration, generalization, and robustness. Self-distillation methods aim to mitigate this by leveraging inter-class and sample-specific information present in the model's own pre...","url_abs":"https://arxiv.org/abs/2512.24162","url_pdf":"https://arxiv.org/pdf/2512.24162v2","authors":"[\"Anton Adelöw\",\"Matteo Gamba\",\"Atsuto Maki\"]","published":"2025-12-30T11:48:06Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false}