{"ID":2828448,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.14272","arxiv_id":"2512.14272","title":"A variational Bayes latent class approach for EHR-based patient phenotyping in R","abstract":"The VBphenoR package for R provides a closed-form variational Bayes approach to patient phenotyping using Electronic Health Records (EHR) data. We implement a variational Bayes Gaussian Mixture Model (GMM) algorithm using closed-form coordinate ascent variational inference (CAVI) to determine the patient phenotype latent class. We then implement a variational Bayes logistic regression, in which we determine the probability of the phenotype in the supplied EHR cohort and the shift in biomarkers for patients with the phenotype of interest versus a healthy population, and evaluate the predictive performance of binary indicator clinical codes and medication codes. The logistic model likelihood applies the latent class from the GMM step to inform the conditional.","short_abstract":"The VBphenoR package for R provides a closed-form variational Bayes approach to patient phenotyping using Electronic Health Records (EHR) data. We implement a variational Bayes Gaussian Mixture Model (GMM) algorithm using closed-form coordinate ascent variational inference (CAVI) to determine the patient phenotype late...","url_abs":"https://arxiv.org/abs/2512.14272","url_pdf":"https://arxiv.org/pdf/2512.14272v1","authors":"[\"Brian Buckley\",\"Adrian O'Hagan\",\"Marie Galligan\"]","published":"2025-12-16T10:30:43Z","proceeding":"stat.CO","tasks":"[\"stat.CO\",\"stat.ML\"]","methods":"[]","has_code":false}
