{"ID":2857113,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.10205","arxiv_id":"2510.10205","title":"PIXEL: Adaptive Steering Via Position-wise Injection with eXact Estimated Levels under Subspace Calibration","abstract":"Reliable behavior control is central to deploying large language models (LLMs) on the web. Activation steering offers a tuning-free route to align attributes (e.g., truthfulness) that ensure trustworthy generation. Prevailing approaches rely on coarse heuristics and lack a principled account of where to steer and how strongly to intervene. To this end, we propose Position-wise Injection with eXact Estimated Levels (PIXEL), a position-wise activation steering framework that, in contrast to prior work, learns a property-aligned subspace from dual views (tail-averaged and end-token) and selects intervention strength via a constrained geometric objective with a closed-form solution, thereby adapting to token-level sensitivity without global hyperparameter tuning. PIXEL further performs sample-level orthogonal residual calibration to refine the global attribute direction and employs a lightweight position-scanning routine to identify receptive injection sites. We additionally provide representation-level guarantees for the minimal-intervention rule, supporting reliable alignment. Across diverse models and evaluation paradigms, PIXEL consistently improves attribute alignment while preserving model general capabilities, offering a practical and principled method for LLMs' controllable generation. Our code is available at https://github.com/V1centNevwake/PIXEL-Adaptive-Steering","short_abstract":"Reliable behavior control is central to deploying large language models (LLMs) on the web. Activation steering offers a tuning-free route to align attributes (e.g., truthfulness) that ensure trustworthy generation. Prevailing approaches rely on coarse heuristics and lack a principled account of where to steer and how s...","url_abs":"https://arxiv.org/abs/2510.10205","url_pdf":"https://arxiv.org/pdf/2510.10205v2","authors":"[\"Manjiang Yu\",\"Hongji Li\",\"Priyanka Singh\",\"Xue Li\",\"Di Wang\",\"Lijie Hu\"]","published":"2025-10-11T13:13:34Z","proceeding":"cs.AI","tasks":"[\"cs.AI\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false,"code_links":[{"ID":608416,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2857113,"paper_url":"https://arxiv.org/abs/2510.10205","paper_title":"PIXEL: Adaptive Steering Via Position-wise Injection with eXact Estimated Levels under Subspace Calibration","repo_url":"https://github.com/V1centNevwake/PIXEL-Adaptive-Steering","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
