{"ID":2851181,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.20542","arxiv_id":"2510.20542","title":"A Unified Framework for Zero-Shot Reinforcement Learning","abstract":"Zero-shot reinforcement learning (RL) has emerged as a setting for developing general agents, capable of solving downstream tasks without additional training or planning at test-time. While conventional RL optimizes policies for fixed rewards, zero-shot RL requires learning representations that enable immediate adaptation to arbitrary reward functions. As the field matures, the growing diversity of approaches demands a foundational framework reconciling different perspectives under a common unifying structure. In this work, we introduce a formal, unified framework for zero-shot RL, allowing for rigorous comparisons across methods. We propose a taxonomy organizing the algorithmic landscape along two levels: representation, distinguishing between compositional and direct methods based on their exploitation of action-value function decompositions; and learning paradigm, differentiating between reward-free and pseudo reward-free training. Additionally, we propose a unified view of existing error bounds, decomposing the total error into three primary contributing components: inference, reward, and approximation, serving as a foundation for more grounded comparisons of zero-shot methods.","short_abstract":"Zero-shot reinforcement learning (RL) has emerged as a setting for developing general agents, capable of solving downstream tasks without additional training or planning at test-time. While conventional RL optimizes policies for fixed rewards, zero-shot RL requires learning representations that enable immediate adaptat...","url_abs":"https://arxiv.org/abs/2510.20542","url_pdf":"https://arxiv.org/pdf/2510.20542v2","authors":"[\"Jacopo Di Ventura\",\"Jan Felix Kleuker\",\"Aske Plaat\",\"Thomas Moerland\"]","published":"2025-10-23T13:30:26Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Reinforcement Learning\",\"Generative Adversarial Network\"]","has_code":false}
