{"ID":2887239,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.01622","arxiv_id":"2508.01622","title":"VFP: Variational Flow-Matching Policy for Multi-Modal Robot Manipulation","abstract":"Flow-matching-based policies have recently emerged as a promising approach for learning-based robot manipulation, offering significant acceleration in action sampling compared to diffusion-based policies. However, conventional flow-matching methods struggle with multi-modality, often collapsing to averaged or ambiguous behaviors in complex manipulation tasks. To address this, we propose the Variational Flow-Matching Policy (VFP), which introduces a variational latent prior for mode-aware action generation and effectively captures both task-level and trajectory-level multi-modality. VFP further incorporates Kantorovich Optimal Transport (K-OT) for distribution-level alignment and utilizes a Mixture-of-Experts (MoE) decoder for mode specialization and efficient inference. We comprehensively evaluate VFP on 41 simulated tasks and 3 real-robot tasks, demonstrating its effectiveness and sampling efficiency in both simulated and real-world settings. Results show that VFP achieves a 49% relative improvement in task success rate over standard flow-based baselines in simulation, and further outperforms them on real-robot tasks, while still maintaining fast inference and a compact model size. More details are available on our project page: https://sites.google.com/view/varfp/","short_abstract":"Flow-matching-based policies have recently emerged as a promising approach for learning-based robot manipulation, offering significant acceleration in action sampling compared to diffusion-based policies. \nHowever, conventional flow-matching methods struggle with multi-modality, often collapsing to averaged or ambiguous...","url_abs":"https://arxiv.org/abs/2508.01622","url_pdf":"https://arxiv.org/pdf/2508.01622v2","authors":"[\"Xuanran Zhai\",\"Qianyou Zhao\",\"Qiaojun Yu\",\"Ce Hao\"]","published":"2025-08-03T07:23:02Z","proceeding":"cs.RO","tasks":"[\"cs.RO\",\"cs.LG\"]","methods":"[\"Diffusion Model\"]","project_urls":"[\"https://sites.google.com/view/varfp/\"]","has_code":false,"code_links":[{"ID":611409,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2887239,"paper_url":"https://arxiv.org/abs/2508.01622","paper_title":"VFP: Variational Flow-Matching Policy for Multi-Modal Robot Manipulation","repo_url":"https://github.com/google/safevalues","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0},{"ID":611410,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2887239,"paper_url":"https://arxiv.org/abs/2508.01622","paper_title":"VFP: Variational Flow-Matching Policy for Multi-Modal Robot Manipulation","repo_url":"https://github.com/aravindr93/hand_dapg","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0},{"ID":611411,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2887239,"paper_url":"https://arxiv.org/abs/2508.01622","paper_title":"VFP: Variational Flow-Matching Policy for Multi-Modal Robot Manipulation","repo_url":"https://github.com/zql-kk/FlowPolicy","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}