{"ID":2854332,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.17881","arxiv_id":"2510.17881","title":"POPI: Personalizing LLMs via Optimized Natural Language Preference Inference","abstract":"Large language models (LLMs) are typically aligned with population-level preferences, despite substantial variation across individual users. We introduce POPI, a user-level personalization framework that separates the problem into two components connected by a natural-language interface: a shared inference model that distills heterogeneous user signals into a concise preference summary, and a shared generator that conditions on this summary to produce personalized responses. Both components are trained under a unified preference-optimization objective, with reinforcement learning handling the non-differentiable inference step. This objective decomposes into generator approximation error and summary informativeness, revealing how a single loss simultaneously drives accurate generation and informative summarization. Because the interface is natural language, learned summaries can be inferred once per user and reused across different generators -- including frozen, black-box commercial APIs. Across four personalization benchmarks, POPI generally improves personalization quality while reducing context overhead by up to an order of magnitude.","short_abstract":"Large language models (LLMs) are typically aligned with population-level preferences, despite substantial variation across individual users. We introduce POPI, a user-level personalization framework that separates the problem into two components connected by a natural-language interface: a shared inference model that d...","url_abs":"https://arxiv.org/abs/2510.17881","url_pdf":"https://arxiv.org/pdf/2510.17881v3","authors":"[\"Yizhuo Chen\",\"Xin Liu\",\"Ruijie Wang\",\"Zheng Li\",\"Pei Chen\",\"Changlong Yu\",\"Qingyu Yin\",\"Priyanka Nigam\",\"Meng Jiang\",\"Bing Yin\"]","published":"2025-10-17T23:07:57Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.AI\"]","methods":"[\"Reinforcement Learning\",\"Large Language Model\",\"Language Model\"]","has_code":false}
