{"ID":2834781,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.00696","arxiv_id":"2512.00696","title":"Hierarchical Molecular Language Models (HMLMs)","abstract":"Artificial intelligence (AI) is reshaping computational and network biology by enabling new approaches to decode cellular communication networks. We introduce Hierarchical Molecular Language Models (HMLMs), a novel framework that models cellular signaling as a specialized molecular language, where signaling molecules function as tokens, protein interactions define syntax, and functional consequences constitute semantics. HMLMs employ a transformer-based architecture adapted to accommodate graph-structured signaling networks through information transducers, mathematical entities that capture how molecules receive, process, and transmit signals. The architecture integrates multi-modal data sources across molecular, pathway, and cellular scales through hierarchical attention mechanisms and scale-bridging operators that enable information flow across biological hierarchies. Applied to a complex network of cardiac fibroblast signaling, HMLMs outperformed traditional approaches in temporal dynamics prediction, particularly under sparse sampling conditions. Attention-based analysis revealed biologically meaningful crosstalk patterns, including previously uncharacterized interactions between signaling pathways. By bridging molecular mechanisms with cellular phenotypes through AI-driven molecular language representation, HMLMs establish a foundation for biology-oriented large language models (LLMs) that could be pre-trained on comprehensive pathway datasets and applied across diverse signaling systems and tissues, advancing precision medicine and therapeutic discovery.","short_abstract":"Artificial intelligence (AI) is reshaping computational and network biology by enabling new approaches to decode cellular communication networks. We introduce Hierarchical Molecular Language Models (HMLMs), a novel framework that models cellular signaling as a specialized molecular language, where signaling molecules f...","url_abs":"https://arxiv.org/abs/2512.00696","url_pdf":"https://arxiv.org/pdf/2512.00696v3","authors":"[\"Hasi Hays\",\"Yue Yu\",\"William J. Richardson\"]","published":"2025-11-30T02:09:27Z","proceeding":"q-bio.MN","tasks":"[\"q-bio.MN\",\"cs.AI\",\"cs.ET\"]","methods":"[\"Transformer\",\"Large Language Model\",\"Language Model\"]","has_code":false}
