{"ID":2843400,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.08234","arxiv_id":"2511.08234","title":"Beyond Distributions: Geometric Action Control for Continuous Reinforcement Learning","abstract":"Gaussian policies have dominated continuous control in deep reinforcement learning (RL), yet they suffer from a fundamental mismatch: their unbounded support requires ad-hoc squashing functions that distort the geometry of bounded action spaces. While von Mises-Fisher (vMF) distributions offer a theoretically grounded alternative on the sphere, their reliance on Bessel functions and rejection sampling hinders practical adoption. We propose \\textbf{Geometric Action Control (GAC)}, a novel action generation paradigm that preserves the geometric benefits of spherical distributions while \\textit{simplifying computation}. GAC decomposes action generation into a direction vector and a learnable concentration parameter, enabling efficient interpolation between deterministic actions and uniform spherical noise. This design reduces parameter count from \\(2d\\) to \\(d+1\\), and avoids the \\(O(dk)\\) complexity of vMF rejection sampling, achieving simple \\(O(d)\\) operations. Empirically, GAC consistently matches or exceeds state-of-the-art methods across six MuJoCo benchmarks, achieving 37.6\\% improvement over SAC on Ant-v4 and up to 112\\% on complex DMControl tasks, demonstrating strong performance across diverse benchmarks. Our ablation studies reveal that both \\textbf{spherical normalization} and \\textbf{adaptive concentration control} are essential to GAC's success. These findings suggest that robust and efficient continuous control does not require complex distributions, but a principled respect for the geometry of action spaces.","short_abstract":"Gaussian policies have dominated continuous control in deep reinforcement learning (RL), yet they suffer from a fundamental mismatch: their unbounded support requires ad-hoc squashing functions that distort the geometry of bounded action spaces. While von Mises-Fisher (vMF) distributions offer a theoretically grounded...","url_abs":"https://arxiv.org/abs/2511.08234","url_pdf":"https://arxiv.org/pdf/2511.08234v3","authors":"[\"Zhihao Lin\"]","published":"2025-11-11T13:32:38Z","proceeding":"cs.AI","tasks":"[\"cs.AI\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}
