{"ID":2850831,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.21980","arxiv_id":"2510.21980","title":"Boltzmann Graph Ensemble Embeddings for Aptamer Libraries","abstract":"Machine-learning methods in biochemistry commonly represent molecules as graphs of pairwise intermolecular interactions for property and structure predictions. Most methods operate on a single graph, typically the minimal free energy (MFE) structure, for low-energy ensembles (conformations) representative of structures at thermodynamic equilibrium. We introduce a thermodynamically parameterized exponential-family random graph (ERGM) embedding that models molecules as Boltzmann-weighted ensembles of interaction graphs. We evaluate this embedding on SELEX datasets, where experimental biases (e.g., PCR amplification or sequencing noise) can obscure true aptamer-ligand affinity, producing anomalous candidates whose observed abundance diverges from their actual binding strength. We show that the proposed embedding enables robust community detection and subgraph-level explanations for aptamer ligand affinity, even in the presence of biased observations. This approach may be used to identify low-abundance aptamer candidates for further experimental evaluation.","short_abstract":"Machine-learning methods in biochemistry commonly represent molecules as graphs of pairwise intermolecular interactions for property and structure predictions. Most methods operate on a single graph, typically the minimal free energy (MFE) structure, for low-energy ensembles (conformations) representative of structures...","url_abs":"https://arxiv.org/abs/2510.21980","url_pdf":"https://arxiv.org/pdf/2510.21980v1","authors":"[\"Starlika Bauskar\",\"Jade Jiao\",\"Narayanan Kannan\",\"Alexander Kimm\",\"Justin M. Baker\",\"Matthew J. Tyler\",\"Andrea L. Bertozzi\",\"Anne M. Andrews\"]","published":"2025-10-24T19:13:36Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"math.PR\",\"q-bio.QM\",\"stat.ML\"]","methods":"[\"Generative Adversarial Network\"]","has_code":false}
