{"ID":2843305,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.08080","arxiv_id":"2511.08080","title":"Hierarchical Structure-Property Alignment for Data-Efficient Molecular Generation and Editing","abstract":"Property-constrained molecular generation and editing are crucial in AI-driven drug discovery but remain hindered by two factors: (i) capturing the complex relationships between molecular structures and multiple properties remains challenging, and (ii) the narrow coverage and incomplete annotations of molecular properties weaken the effectiveness of property-based models. To tackle these limitations, we propose HSPAG, a data-efficient framework featuring hierarchical structure-property alignment. By treating SMILES and molecular properties as complementary modalities, the model learns their relationships at atom, substructure, and whole-molecule levels. Moreover, we select representative samples through scaffold clustering and hard samples via an auxiliary variational auto-encoder (VAE), substantially reducing the required pre-training data. In addition, we incorporate a property relevance-aware masking mechanism and diversified perturbation strategies to enhance generation quality under sparse annotations. Experiments demonstrate that HSPAG captures fine-grained structure-property relationships and supports controllable generation under multiple property constraints. Two real-world case studies further validate the editing capabilities of HSPAG.","short_abstract":"Property-constrained molecular generation and editing are crucial in AI-driven drug discovery but remain hindered by two factors: (i) capturing the complex relationships between molecular structures and multiple properties remains challenging, and (ii) the narrow coverage and incomplete annotations of molecular propert...","url_abs":"https://arxiv.org/abs/2511.08080","url_pdf":"https://arxiv.org/pdf/2511.08080v1","authors":"[\"Ziyu Fan\",\"Zhijian Huang\",\"Yahan Li\",\"Xiaowen Hu\",\"Siyuan Shen\",\"Yunliang Wang\",\"Zeyu Zhong\",\"Shuhong Liu\",\"Shuning Yang\",\"Shangqian Wu\",\"Min Wu\",\"Lei Deng\"]","published":"2025-11-11T10:31:09Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\"]","methods":"[\"Variational Autoencoder\"]","has_code":false}
