{"ID":2838925,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.16077","arxiv_id":"2511.16077","title":"VideoSeg-R1:Reasoning Video Object Segmentation via Reinforcement Learning","abstract":"Traditional video reasoning segmentation methods rely on supervised fine-tuning, which limits generalization to out-of-distribution scenarios and lacks explicit reasoning. To address this, we propose VideoSeg-R1, the first framework to introduce reinforcement learning into video reasoning segmentation. It adopts a decoupled architecture that formulates the task as joint referring image segmentation and video mask propagation. It comprises three stages: (1) A hierarchical text-guided frame sampler to emulate human attention; (2) A reasoning model that produces spatial cues along with explicit reasoning chains; and (3) A segmentation-propagation stage using SAM2 and XMem. A task difficulty-aware mechanism adaptively controls reasoning length for better efficiency and accuracy. Extensive evaluations on multiple benchmarks demonstrate that VideoSeg-R1 achieves state-of-the-art performance in complex video reasoning and segmentation tasks. The code will be publicly available at https://github.com/euyis1019/VideoSeg-R1.","short_abstract":"Traditional video reasoning segmentation methods rely on supervised fine-tuning, which limits generalization to out-of-distribution scenarios and lacks explicit reasoning. To address this, we propose VideoSeg-R1, the first framework to introduce reinforcement learning into video reasoning segmentation. It adop...","url_abs":"https://arxiv.org/abs/2511.16077","url_pdf":"https://arxiv.org/pdf/2511.16077v1","authors":"[\"Zishan Xu\",\"Yifu Guo\",\"Yuquan Lu\",\"Fengyu Yang\",\"Junxin Li\"]","published":"2025-11-20T06:12:25Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Reinforcement Learning\"]","has_code":false,"code_links":[{"ID":606818,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2838925,"paper_url":"https://arxiv.org/abs/2511.16077","paper_title":"VideoSeg-R1:Reasoning Video Object Segmentation via Reinforcement Learning","repo_url":"https://github.com/euyis1019/VideoSeg-R1","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}