{"ID":2859389,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.05996","arxiv_id":"2510.05996","title":"Information-Theoretic Policy Pre-Training with Empowerment","abstract":"Empowerment, an information-theoretic measure of an agent's potential influence on its environment, has emerged as a powerful intrinsic motivation and exploration framework for reinforcement learning (RL). Besides for unsupervised RL and skill learning algorithms, the specific use of empowerment as a pre-training signal has received limited attention in the literature. We show that empowerment can be used as a pre-training signal for data-efficient downstream task adaptation. For this we extend the traditional notion of empowerment by introducing discounted empowerment, which balances the agent's control over the environment across short- and long-term horizons. Leveraging this formulation, we propose a novel pre-training paradigm that initializes policies to maximize discounted empowerment, enabling agents to acquire a robust understanding of environmental dynamics. We analyze empowerment-based pre-training for various existing RL algorithms and empirically demonstrate its potential as a general-purpose initialization strategy: empowerment-maximizing policies with long horizons are data-efficient and effective, leading to improved adaptability in downstream tasks. Our findings pave the way for future research to scale this framework to high-dimensional and complex tasks, further advancing the field of RL.","short_abstract":"Empowerment, an information-theoretic measure of an agent's potential influence on its environment, has emerged as a powerful intrinsic motivation and exploration framework for reinforcement learning (RL). \nBesides for unsupervised RL and skill learning algorithms, the specific use of empowerment as a pre-training signa...","url_abs":"https://arxiv.org/abs/2510.05996","url_pdf":"https://arxiv.org/pdf/2510.05996v1","authors":"[\"Moritz Schneider\",\"Robert Krug\",\"Narunas Vaskevicius\",\"Luigi Palmieri\",\"Michael Volpp\",\"Joschka Boedecker\"]","published":"2025-10-07T14:57:58Z","proceeding":"cs.AI","tasks":"[\"cs.AI\",\"cs.IT\",\"cs.LG\",\"cs.RO\"]","methods":"[\"Reinforcement Learning\",\"LoRA\"]","has_code":false}
