{"ID":2881743,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.13189","arxiv_id":"2508.13189","title":"Preference Models assume Proportional Hazards of Utilities","abstract":"Approaches for estimating preferences from human-annotated data typically involve inducing a distribution over a ranked list of choices such as the Plackett-Luce model. Indeed, modern AI alignment tools such as Reward Modelling and Direct Preference Optimization are based on the statistical assumptions posed by the Plackett-Luce model. In this paper, I will connect the Plackett-Luce model to another classical and well-known statistical model, the Cox Proportional Hazards model and attempt to shed some light on the implications of the connection therein.","short_abstract":"Approaches for estimating preferences from human-annotated data typically involve inducing a distribution over a ranked list of choices such as the Plackett-Luce model. Indeed, modern AI alignment tools such as Reward Modelling and Direct Preference Optimization are based on the statistical assumptions posed by the Pl...","url_abs":"https://arxiv.org/abs/2508.13189","url_pdf":"https://arxiv.org/pdf/2508.13189v1","authors":"[\"Chirag Nagpal\"]","published":"2025-08-15T00:08:56Z","proceeding":"stat.ML","tasks":"[\"stat.ML\",\"cs.AI\",\"cs.LG\"]","methods":"[]","has_code":false}
