{"ID":2890500,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.19239","arxiv_id":"2507.19239","title":"CoopTrack: Exploring End-to-End Learning for Efficient Cooperative Sequential Perception","abstract":"Cooperative perception aims to address the inherent limitations of single-vehicle autonomous driving systems through information exchange among multiple agents. Previous research has primarily focused on single-frame perception tasks. However, the more challenging cooperative sequential perception tasks, such as cooperative 3D multi-object tracking, have not been thoroughly investigated. Therefore, we propose CoopTrack, a fully instance-level end-to-end framework for cooperative tracking, featuring learnable instance association, which fundamentally differs from existing approaches. CoopTrack transmits sparse instance-level features that significantly enhance perception capabilities while maintaining low transmission costs. Furthermore, the framework comprises two key components: Multi-Dimensional Feature Extraction, and Cross-Agent Association and Aggregation, which collectively enable comprehensive instance representation with semantic and motion features, and adaptive cross-agent association and fusion based on a feature graph. Experiments on both the V2X-Seq and Griffin datasets demonstrate that CoopTrack achieves excellent performance. Specifically, it attains state-of-the-art results on V2X-Seq, with 39.0% mAP and 32.8% AMOTA. The project is available at https://github.com/zhongjiaru/CoopTrack.","short_abstract":"Cooperative perception aims to address the inherent limitations of single-vehicle autonomous driving systems through information exchange among multiple agents. Previous research has primarily focused on single-frame perception tasks. However, the more challenging cooperative sequential perception tasks, such as cooper...","url_abs":"https://arxiv.org/abs/2507.19239","url_pdf":"https://arxiv.org/pdf/2507.19239v1","authors":"[\"Jiaru Zhong\",\"Jiahao Wang\",\"Jiahui Xu\",\"Xiaofan Li\",\"Zaiqing Nie\",\"Haibao Yu\"]","published":"2025-07-25T13:04:54Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false,"code_links":[{"ID":611791,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2890500,"paper_url":"https://arxiv.org/abs/2507.19239","paper_title":"CoopTrack: Exploring End-to-End Learning for Efficient Cooperative Sequential Perception","repo_url":"https://github.com/zhongjiaru/CoopTrack","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
