{"ID":2858481,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.08851","arxiv_id":"2510.08851","title":"CDE: Concept-Driven Exploration for Reinforcement Learning","abstract":"Intelligent exploration remains a critical challenge in reinforcement learning (RL), especially in visual control tasks. Unlike low-dimensional state-based RL, visual RL must extract task-relevant structure from raw pixels, making exploration inefficient. We propose Concept-Driven Exploration (CDE), which leverages a pre-trained vision-language model (VLM) to generate object-centric visual concepts from textual task descriptions as weak, potentially noisy supervisory signals. Rather than directly conditioning on these noisy signals, CDE trains a policy to reconstruct the concepts via an auxiliary objective, learning general representations of the concepts and using reconstruction accuracy as an intrinsic reward to guide exploration toward task-relevant objects. Across five challenging simulated visual manipulation tasks, CDE achieves efficient, targeted exploration and remains robust to both synthetic errors and noisy VLM predictions. Finally, we demonstrate real-world transfer by deploying CDE on a Franka arm, attaining an 80\\% success rate in a real-world manipulation task.","short_abstract":"Intelligent exploration remains a critical challenge in reinforcement learning (RL), especially in visual control tasks. Unlike low-dimensional state-based RL, visual RL must extract task-relevant structure from raw pixels, making exploration inefficient. We propose Concept-Driven Exploration (CDE), which leverages a p...","url_abs":"https://arxiv.org/abs/2510.08851","url_pdf":"https://arxiv.org/pdf/2510.08851v2","authors":"[\"Le Mao\",\"Andrew H. Liu\",\"Renos Zabounidis\",\"Yanan Niu\",\"Zachary Kingston\",\"Joseph Campbell\"]","published":"2025-10-09T23:01:36Z","proceeding":"cs.RO","tasks":"[\"cs.RO\"]","methods":"[\"Reinforcement Learning\",\"Language Model\",\"LoRA\"]","has_code":false}
