{"ID":2878440,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.17751","arxiv_id":"2508.17751","title":"Multi-layer Abstraction for Nested Generation of Options (MANGO) in Hierarchical Reinforcement Learning","abstract":"This paper introduces MANGO (Multilayer Abstraction for Nested Generation of Options), a novel hierarchical reinforcement learning framework designed to address the challenges of long-term sparse reward environments. MANGO decomposes complex tasks into multiple layers of abstraction, where each layer defines an abstract state space and employs options to modularize trajectories into macro-actions. These options are nested across layers, allowing for efficient reuse of learned movements and improved sample efficiency. The framework introduces intra-layer policies that guide the agent's transitions within the abstract state space, and task actions that integrate task-specific components such as reward functions. Experiments conducted in procedurally-generated grid environments demonstrate substantial improvements in both sample efficiency and generalization capabilities compared to standard RL methods. MANGO also enhances interpretability by making the agent's decision-making process transparent across layers, which is particularly valuable in safety-critical and industrial applications. Future work will explore automated discovery of abstractions and abstract actions, adaptation to continuous or fuzzy environments, and more robust multi-layer training strategies.","short_abstract":"This paper introduces MANGO (Multilayer Abstraction for Nested Generation of Options), a novel hierarchical reinforcement learning framework designed to address the challenges of long-term sparse reward environments. MANGO decomposes complex tasks into multiple layers of abstraction, where each layer defines an abstrac...","url_abs":"https://arxiv.org/abs/2508.17751","url_pdf":"https://arxiv.org/pdf/2508.17751v1","authors":"[\"Alessio Arcudi\",\"Davide Sartor\",\"Alberto Sinigaglia\",\"Vincent François-Lavet\",\"Gian Antonio Susto\"]","published":"2025-08-25T07:44:35Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}