{"ID":2850083,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.22736","arxiv_id":"2510.22736","title":"Cross-view Localization and Synthesis -- Datasets, Challenges and Opportunities","abstract":"Cross-view localization and synthesis are two fundamental tasks in cross-view visual understanding, which deals with cross-view datasets: overhead (satellite or aerial) and ground-level imagery. These tasks have gained increasing attention due to their broad applications in autonomous navigation, urban planning, and augmented reality. Cross-view localization aims to estimate the geographic position of ground-level images based on information provided by overhead imagery while cross-view synthesis seeks to generate ground-level images based on information from the overhead imagery. Both tasks remain challenging due to significant differences in viewing perspective, resolution, and occlusion, which are widely embedded in cross-view datasets. Recent years have witnessed rapid progress driven by the availability of large-scale datasets and novel approaches. Typically, cross-view localization is formulated as an image retrieval problem where ground-level features are matched with tiled overhead images feature, extracted by convolutional neural networks (CNNs) or vision transformers (ViTs) for cross-view feature embedding. Cross-view synthesis, on the other hand, seeks to generate ground-level views based on information from overhead imagery, generally using generative adversarial networks (GANs) or diffusion models. This paper presents a comprehensive survey of advances in cross-view localization and synthesis, reviewing widely used datasets, highlighting key challenges, and providing an organized overview of state-of-the-art techniques. Furthermore, it discusses current limitations, offers comparative analyses, and outlines promising directions for future research. \nWe also include the project page via https://github.com/GDAOSU/Awesome-Cross-View-Methods.","short_abstract":"Cross-view localization and synthesis are two fundamental tasks in cross-view visual understanding, which deals with cross-view datasets: overhead (satellite or aerial) and ground-level imagery. These tasks have gained increasing attention due to their broad applications in autonomous navigation, urban planning, and au...","url_abs":"https://arxiv.org/abs/2510.22736","url_pdf":"https://arxiv.org/pdf/2510.22736v2","authors":"[\"Ningli Xu\",\"Rongjun Qin\"]","published":"2025-10-26T16:09:53Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Vision Transformer\",\"Diffusion Model\",\"Transformer\",\"Generative Adversarial Network\",\"Convolutional Neural Network\"]","has_code":false,"code_links":[{"ID":607764,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2850083,"paper_url":"https://arxiv.org/abs/2510.22736","paper_title":"Cross-view Localization and Synthesis -- Datasets, Challenges and Opportunities","repo_url":"https://github.com/GDAOSU/Awesome-Cross-View-Methods","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}