{"ID":2825670,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.20034","arxiv_id":"2512.20034","title":"VSA:Visual-Structural Alignment for UI-to-Code","abstract":"The automation of user interface development has the potential to accelerate software delivery by mitigating intensive manual implementation. Despite the advancements in Large Multimodal Models for design-to-code translation, existing methodologies predominantly yield unstructured, flat codebases that lack compatibility with component-oriented libraries such as React or Angular. Such outputs typically exhibit low cohesion and high coupling, complicating long-term maintenance. In this paper, we propose VSA (Visual-Structural Alignment), a multi-stage paradigm designed to synthesize organized frontend assets through visual-structural alignment. Our approach first employs a spatial-aware transformer to reconstruct the visual input into a hierarchical tree representation. Moving beyond basic layout extraction, we integrate an algorithmic pattern-matching layer to identify recurring UI motifs and encapsulate them into modular templates. These templates are then processed via a schema-driven synthesis engine, ensuring the Large Language Model generates type-safe, prop-drilled components suitable for production environments. Experimental results indicate that our framework yields a substantial improvement in code modularity and architectural consistency over state-of-the-art benchmarks, effectively bridging the gap between raw pixels and scalable software engineering.","short_abstract":"The automation of user interface development has the potential to accelerate software delivery by mitigating intensive manual implementation. Despite the advancements in Large Multimodal Models for design-to-code translation, existing methodologies predominantly yield unstructured, flat codebases that lack compatibilit...","url_abs":"https://arxiv.org/abs/2512.20034","url_pdf":"https://arxiv.org/pdf/2512.20034v1","authors":"[\"Xian Wu\",\"Ming Zhang\",\"Zhiyu Fang\",\"Fei Li\",\"Bin Wang\",\"Yong Jiang\",\"Hao Zhou\"]","published":"2025-12-23T03:55:45Z","proceeding":"cs.IR","tasks":"[\"cs.IR\"]","methods":"[\"Transformer\",\"Language Model\",\"Generative Adversarial Network\"]","has_code":false}
