{"ID":2899653,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.00762","arxiv_id":"2507.00762","title":"Leveraging Genetic Algorithms for Efficient Demonstration Generation in Real-World Reinforcement Learning Environments","abstract":"Reinforcement Learning (RL) has demonstrated significant potential in certain real-world industrial applications, yet its broader deployment remains limited by inherent challenges such as sample inefficiency and unstable learning dynamics. This study investigates the utilization of Genetic Algorithms (GAs) as a mechanism for improving RL performance in an industrially inspired sorting environment. We propose a novel approach in which GA-generated expert demonstrations are used to enhance policy learning. These demonstrations are incorporated into a Deep Q-Network (DQN) replay buffer for experience-based learning and utilized as warm-start trajectories for Proximal Policy Optimization (PPO) agents to accelerate training convergence. Our experiments compare standard RL training with rule-based heuristics, brute-force optimization, and demonstration data, revealing that GA-derived demonstrations significantly improve RL performance. Notably, PPO agents initialized with GA-generated data achieved superior cumulative rewards, highlighting the potential of hybrid learning paradigms, where heuristic search methods complement data-driven RL. The utilized framework is publicly available and enables further research into adaptive RL strategies for real-world applications.","short_abstract":"Reinforcement Learning (RL) has demonstrated significant potential in certain real-world industrial applications, yet its broader deployment remains limited by inherent challenges such as sample inefficiency and unstable learning dynamics. This study investigates the utilization of Genetic Algorithms (GAs) as a mechani...","url_abs":"https://arxiv.org/abs/2507.00762","url_pdf":"https://arxiv.org/pdf/2507.00762v2","authors":"[\"Tom Maus\",\"Asma Atamna\",\"Tobias Glasmachers\"]","published":"2025-07-01T14:04:17Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}
