{"ID":2832040,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.06864","arxiv_id":"2512.06864","title":"Boosting Unsupervised Video Instance Segmentation with Automatic Quality-Guided Self-Training","abstract":"Video Instance Segmentation (VIS) faces significant annotation challenges due to its dual requirements of pixel-level masks and temporal consistency labels. While recent unsupervised methods like VideoCutLER eliminate optical flow dependencies through synthetic data, they remain constrained by the synthetic-to-real domain gap. We present AutoQ-VIS, a novel unsupervised framework that bridges this gap through quality-guided self-training. Our approach establishes a closed-loop system between pseudo-label generation and automatic quality assessment, enabling progressive adaptation from synthetic to real videos. Experiments demonstrate state-of-the-art performance with 52.6 $\\text{AP}_{50}$ on YouTubeVIS-2019 $\\texttt{val}$ set, surpassing the previous state-of-the-art VideoCutLER by 4.4%, while requiring no human annotations. This demonstrates the viability of quality-aware self-training for unsupervised VIS. We will release the code at https://github.com/wcbup/AutoQ-VIS.","short_abstract":"Video Instance Segmentation (VIS) faces significant annotation challenges due to its dual requirements of pixel-level masks and temporal consistency labels. While recent unsupervised methods like VideoCutLER eliminate optical flow dependencies through synthetic data, they remain constrained by the synthetic-to-real dom...","url_abs":"https://arxiv.org/abs/2512.06864","url_pdf":"https://arxiv.org/pdf/2512.06864v1","authors":"[\"Kaixuan Lu\",\"Mehmet Onurcan Kaya\",\"Dim P. Papadopoulos\"]","published":"2025-12-07T14:37:12Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false,"code_links":[{"ID":606191,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2832040,"paper_url":"https://arxiv.org/abs/2512.06864","paper_title":"Boosting Unsupervised Video Instance Segmentation with Automatic Quality-Guided Self-Training","repo_url":"https://github.com/wcbup/AutoQ-VIS","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}