{"ID":2823272,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2601.00693","arxiv_id":"2601.00693","title":"ARISE: Adaptive Reinforcement Integrated with Swarm Exploration","abstract":"Effective exploration remains a key challenge in RL, especially with non-stationary rewards or high-dimensional policies. We introduce ARISE, a lightweight framework that enhances reinforcement learning by augmenting standard policy-gradient methods with a compact swarm-based exploration layer. ARISE blends policy actions with particle-driven proposals, where each particle represents a candidate policy trajectory sampled in the action space, and modulates exploration adaptively using reward-variance cues. While easy benchmarks exhibit only slight improvements (e.g., +0.7% on CartPole-v1), ARISE yields substantial gains on more challenging tasks, including +46% on LunarLander-v3 and +22% on Hopper-v4, while preserving stability on Walker2d and Ant. Under non-stationary reward shifts, ARISE provides marked robustness advantages, outperforming PPO by +75 points on CartPole and improving LunarLander accordingly. Ablation studies confirm that both the swarm component and the adaptive mechanism contribute to the performance. Overall, ARISE offers a simple, architecture-agnostic route to more exploratory and resilient RL agents without altering core algorithmic structures.","short_abstract":"Effective exploration remains a key challenge in RL, especially with non-stationary rewards or high-dimensional policies. We introduce ARISE, a lightweight framework that enhances reinforcement learning by augmenting standard policy-gradient methods with a compact swarm-based exploration layer. ARISE blends policy acti...","url_abs":"https://arxiv.org/abs/2601.00693","url_pdf":"https://arxiv.org/pdf/2601.00693v1","authors":"[\"Rajiv Chaitanya M\",\"D R Ramesh Babu\"]","published":"2026-01-02T14:09:22Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"eess.SY\"]","methods":"[\"Reinforcement Learning\",\"LoRA\"]","has_code":false}
