{"ID":2862315,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.01428","arxiv_id":"2510.01428","title":"BioVERSE: Representation Alignment of Biomedical Modalities to LLMs for Multi-Modal Reasoning","abstract":"Recent advances in large language models (LLMs) and biomedical foundation models (BioFMs) have achieved strong results in biological text reasoning, molecular modeling, and single-cell analysis, yet they remain siloed in disjoint embedding spaces, limiting cross-modal reasoning. We present BIOVERSE (Biomedical Vector Embedding Realignment for Semantic Engagement), a two-stage approach that adapts pretrained BioFMs as modality encoders and aligns them with LLMs through lightweight, modality-specific projection layers. The approach first aligns each modality to a shared LLM space through independently trained projections, allowing them to interoperate naturally, and then applies standard instruction tuning with multi-modal data to bring them together for downstream reasoning. By unifying raw biomedical data with knowledge embedded in LLMs, the approach enables zero-shot annotation, cross-modal question answering, and interactive, explainable dialogue. Across tasks spanning cell-type annotation, molecular description, and protein function reasoning, compact BIOVERSE configurations surpass larger LLM baselines while enabling richer, generative outputs than existing BioFMs, establishing a foundation for principled multi-modal biomedical reasoning.","short_abstract":"Recent advances in large language models (LLMs) and biomedical foundation models (BioFMs) have achieved strong results in biological text reasoning, molecular modeling, and single-cell analysis, yet they remain siloed in disjoint embedding spaces, limiting cross-modal reasoning. We present BIOVERSE (Biomedical Vector E...","url_abs":"https://arxiv.org/abs/2510.01428","url_pdf":"https://arxiv.org/pdf/2510.01428v1","authors":"[\"Ching-Huei Tsou\",\"Michal Ozery-Flato\",\"Ella Barkan\",\"Diwakar Mahajan\",\"Ben Shapira\"]","published":"2025-10-01T20:07:36Z","proceeding":"q-bio.QM","tasks":"[\"q-bio.QM\",\"cs.AI\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}