{"ID":2921636,"CreatedAt":"2026-06-02T02:42:49.606572591Z","UpdatedAt":"2026-06-03T05:56:00.181519634Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.01097","arxiv_id":"2606.01097","title":"Dual-Route Top-K Retrieval with 1v1 VLM Reranking for the CoVR-R","abstract":"We describe \\emph{Dual-Route Top-K Retrieval with 1v1 VLM Reranking} for the CoVR-R challenge. The method treats composed video retrieval as two coupled problems: finding a sufficiently complete top-k candidate set, and then safely deciding whether any candidate should replace a strong current top-1. We first improve the reasoning/text seed with a VLM slot selector over existing candidates, without introducing DFN visual retrieval. We then add a visual route from contact-sheet embeddings using DFN-H/DFN-L. The routes are merged into a top-10 candidate set, after which a VLM final reranker performs conservative 1v1 comparisons between the current top-1 and each challenger. On the hidden test split, the final system reaches 95.28 R@1, 97.47 R@5, 98.48 R@10, and 99.66 R@50. The main lesson is that CoVR-R benefits more from recall-selection decoupling than from broad text reranking or direct multi-candidate VLM classification.","short_abstract":"We describe \\emph{Dual-Route Top-K Retrieval with 1v1 VLM Reranking} for the CoVR-R challenge. The method treats composed video retrieval as two coupled problems: finding a sufficiently complete top-k candidate set, and then safely deciding whether any candidate should replace a strong current top-1. We first improve t...","url_abs":"https://arxiv.org/abs/2606.01097","url_pdf":"https://arxiv.org/pdf/2606.01097v1","authors":"[\"Yuyang Sun\",\"Yongliang Wu\",\"Xingyu Zhu\",\"Yuxia Chen\",\"Zhenxiang Jiang\",\"Yangguang Ji\",\"Wenbo Zhu\",\"Yanxi Shi\",\"Jay Wu\",\"Shuo Wang\",\"Xu Yang\"]","published":"2026-05-31T08:38:57Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false}
