{"ID":3052922,"CreatedAt":"2026-06-04T04:41:36.695875263Z","UpdatedAt":"2026-06-05T11:43:53.432517148Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.04061","arxiv_id":"2606.04061","title":"Intra-Modal Neighbors Never Lie: Rectifying Inter-Modal Noisy Correspondence via Graph-Based Intra-Modal Reasoning","abstract":"Large-scale web-harvested datasets have fueled the progress of cross-modal retrieval but inevitably suffer from noisy correspondence, which severely degrades model generalization. Existing methods primarily address this by filtering out noise or seeking a substitute label, yet they predominantly remain bound by a \"Discrete Selection\" paradigm. We argue that relying on a single discrete proxy induces Single-Point Fragility and Discretization Error. To overcome these limitations, we propose a novel framework, Intra-modal Neighbor-aware Noise Rectification (IN2R), which shifts the paradigm from searching for a substitute to synthesizing a reliable supervision target. Leveraging the intrinsic geometric stability of intra-modal data, IN2R employs a Graph Refiner to perform relational reasoning over neighbors retrieved from a dynamic Cross-Model Memory. Instead of propagating discrete labels, our method synthesizes a continuous, soft prototype that reflects the consensus of the local semantic neighborhood, effectively rectifying inter-modal misalignment. Extensive experiments on Flickr30K, MS-COCO, and CC152K demonstrate that IN2R significantly outperforms state-of-the-art methods. Our code and pre-trained models are publicly available at https://github.com/liuyyy111/IN2R.","short_abstract":"Large-scale web-harvested datasets have fueled the progress of cross-modal retrieval but inevitably suffer from noisy correspondence, which severely degrades model generalization. \nExisting methods primarily address this by filtering out noise or seeking a substitute label, yet they predominantly remain bound by a \"Disc...","url_abs":"https://arxiv.org/abs/2606.04061","url_pdf":"https://arxiv.org/pdf/2606.04061v1","authors":"[\"Yang Liu\",\"Wentao Feng\",\"Shu-Dong Huang\",\"Yalan Ye\",\"Jiancheng Lv\"]","published":"2026-06-02T12:26:28Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false,"code_links":[{"ID":612798,"CreatedAt":"2026-06-04T04:41:36.695875263Z","UpdatedAt":"2026-06-04T04:41:36.695875263Z","DeletedAt":null,"paper_id":3052922,"paper_url":"https://arxiv.org/abs/2606.04061","paper_title":"Intra-Modal Neighbors Never Lie: Rectifying Inter-Modal Noisy Correspondence via Graph-Based Intra-Modal Reasoning","repo_url":"https://github.com/liuyyy111/IN2R","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
