{"ID":2855534,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.12121","arxiv_id":"2510.12121","title":"Precise Attribute Intensity Control in Large Language Models via Targeted Representation Editing","abstract":"Precise attribute intensity control--generating Large Language Model (LLM) outputs with specific, user-defined attribute intensities--is crucial for AI systems adaptable to diverse user expectations. Current LLM alignment methods, however, typically provide only directional or open-ended guidance, failing to reliably achieve exact attribute intensities. We address this limitation with three key designs: (1) reformulating precise attribute intensity control as a target-reaching problem, rather than simple maximization; (2) training a lightweight value function via temporal-difference learning to predict final attribute intensity scores from partial generations, thereby steering LLM outputs; and (3) employing gradient-based interventions on hidden representations to navigate the model precisely towards specific attribute intensity targets. Our method enables fine-grained, continuous control over attribute intensities, moving beyond simple directional alignment. Experiments on LLaMA-3.2-3b and Phi-4-mini confirm our method's ability to steer text generation to user-specified attribute intensities with high accuracy. Finally, we demonstrate efficiency enhancements across three downstream tasks: preference data synthesis, Pareto frontier approximation and optimization, and distillation of aligned behaviors for intervention-free inference. Our code is available on https://github.com/Pre-Control/pre-control","short_abstract":"Precise attribute intensity control--generating Large Language Model (LLM) outputs with specific, user-defined attribute intensities--is crucial for AI systems adaptable to diverse user expectations. \nCurrent LLM alignment methods, however, typically provide only directional or open-ended guidance, failing to reliably a...","url_abs":"https://arxiv.org/abs/2510.12121","url_pdf":"https://arxiv.org/pdf/2510.12121v2","authors":"[\"Rongzhi Zhang\",\"Liqin Ye\",\"Yuzhao Heng\",\"Xiang Chen\",\"Tong Yu\",\"Lingkai Kong\",\"Sudheer Chava\",\"Chao Zhang\"]","published":"2025-10-14T03:50:22Z","proceeding":"cs.AI","tasks":"[\"cs.AI\",\"cs.CL\",\"cs.LG\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false,"code_links":[{"ID":608260,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2855534,"paper_url":"https://arxiv.org/abs/2510.12121","paper_title":"Precise Attribute Intensity Control in Large Language Models via Targeted Representation Editing","repo_url":"https://github.com/Pre-Control/pre-control","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
