{"ID":2894836,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.10195","arxiv_id":"2507.10195","title":"Minimizing the Pretraining Gap: Domain-aligned Text-Based Person Retrieval","abstract":"In this work, we focus on text-based person retrieval, which identifies individuals based on textual descriptions. Despite advancements enabled by synthetic data for pretraining, a significant domain gap, due to variations in lighting, color, and viewpoint, limits the effectiveness of the pretrain-finetune paradigm. To overcome this issue, we propose a unified pipeline incorporating domain adaptation at both image and region levels. Our method features two key components: Domain-aware Diffusion (DaD) for image-level adaptation, which aligns image distributions between synthetic and real-world domains, e.g., CUHK-PEDES, and Multi-granularity Relation Alignment (MRA) for region-level adaptation, which aligns visual regions with descriptive sentences, thereby addressing disparities at a finer granularity. This dual-level strategy effectively bridges the domain gap, achieving state-of-the-art performance on CUHK-PEDES, ICFG-PEDES, and RSTPReid datasets. The dataset, model, and code are available at https://github.com/Shuyu-XJTU/MRA.","short_abstract":"In this work, we focus on text-based person retrieval, which identifies individuals based on textual descriptions. Despite advancements enabled by synthetic data for pretraining, a significant domain gap, due to variations in lighting, color, and viewpoint, limits the effectiveness of the pretrain-finetune paradigm. \nTo...","url_abs":"https://arxiv.org/abs/2507.10195","url_pdf":"https://arxiv.org/pdf/2507.10195v2","authors":"[\"Shuyu Yang\",\"Yaxiong Wang\",\"Yongrui Li\",\"Li Zhu\",\"Zhedong Zheng\"]","published":"2025-07-14T12:03:04Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Diffusion Model\"]","has_code":false,"code_links":[{"ID":612133,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2894836,"paper_url":"https://arxiv.org/abs/2507.10195","paper_title":"Minimizing the Pretraining Gap: Domain-aligned Text-Based Person Retrieval","repo_url":"https://github.com/Shuyu-XJTU/MRA","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
