{"ID":2880837,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.14197","arxiv_id":"2508.14197","title":"CLIPSym: Delving into Symmetry Detection with CLIP","abstract":"Symmetry is one of the most fundamental geometric cues in computer vision, and detecting it has been an ongoing challenge. With the recent advances in vision-language models, i.e., CLIP, we investigate whether a pre-trained CLIP model can aid symmetry detection by leveraging the additional symmetry cues found in the natural image descriptions. We propose CLIPSym, which leverages CLIP's image and language encoders and a rotation-equivariant decoder based on a hybrid of Transformer and $G$-Convolution to detect rotation and reflection symmetries. To fully utilize CLIP's language encoder, we have developed a novel prompting technique called Semantic-Aware Prompt Grouping (SAPG), which aggregates a diverse set of frequent object-based prompts to better integrate the semantic cues for symmetry detection. Empirically, we show that CLIPSym outperforms the current state-of-the-art on three standard symmetry detection datasets (DENDI, SDRW, and LDRS). Finally, we conduct detailed ablations verifying the benefits of CLIP's pre-training, the proposed equivariant decoder, and the SAPG technique. The code is available at https://github.com/timyoung2333/CLIPSym.","short_abstract":"Symmetry is one of the most fundamental geometric cues in computer vision, and detecting it has been an ongoing challenge. With the recent advances in vision-language models, i.e., CLIP, we investigate whether a pre-trained CLIP model can aid symmetry detection by leveraging the additional symmetry cues found in the na...","url_abs":"https://arxiv.org/abs/2508.14197","url_pdf":"https://arxiv.org/pdf/2508.14197v1","authors":"[\"Tinghan Yang\",\"Md Ashiqur Rahman\",\"Raymond A. Yeh\"]","published":"2025-08-19T18:43:14Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Transformer\",\"Language Model\"]","has_code":false,"code_links":[{"ID":610737,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2880837,"paper_url":"https://arxiv.org/abs/2508.14197","paper_title":"CLIPSym: Delving into Symmetry Detection with CLIP","repo_url":"https://github.com/timyoung2333/CLIPSym","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
