{"ID":2857432,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.09146","arxiv_id":"2510.09146","title":"Score-Based Density Estimation from Pairwise Comparisons","abstract":"We study density estimation from pairwise comparisons, motivated by expert knowledge elicitation and learning from human feedback. We relate the unobserved target density to a tempered winner density (marginal density of preferred choices), learning the winner's score via score-matching. This allows estimating the target by `de-tempering' the estimated winner density's score. We prove that the score vectors of the belief and the winner density are collinear, linked by a position-dependent tempering field. We give analytical formulas for this field and propose an estimator for it under the Bradley-Terry model. Using a diffusion model trained on tempered samples generated via score-scaled annealed Langevin dynamics, we can learn complex multivariate belief densities of simulated experts, from only hundreds to thousands of pairwise comparisons.","short_abstract":"We study density estimation from pairwise comparisons, motivated by expert knowledge elicitation and learning from human feedback. We relate the unobserved target density to a tempered winner density (marginal density of preferred choices), learning the winner's score via score-matching. This allows estimating the targ...","url_abs":"https://arxiv.org/abs/2510.09146","url_pdf":"https://arxiv.org/pdf/2510.09146v2","authors":"[\"Petrus Mikkola\",\"Luigi Acerbi\",\"Arto Klami\"]","published":"2025-10-10T08:49:24Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Diffusion Model\"]","has_code":false}
