Enhancing Breast Cancer Prediction with LLM-Inferred Confounders

cs.LG arXiv:2511.17662

Abstract

This study improves breast cancer prediction by using large language models (LLMs) to infer, from routine clinical data, the likelihood of three confounding diseases: diabetes, obesity, and cardiovascular disease. Adding these LLM-inferred likelihoods as features improved Random Forest model performance, particularly when the features were inferred by Gemma (3.9%) and Llama (6.4%). The approach shows promise for noninvasive prescreening and clinical integration, supporting improved early detection and shared decision-making in breast cancer diagnosis.
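
The feature-augmentation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the field names (`age`, `bmi`, `glucose`), the `llm_` column prefix, and the `stub_llm_likelihood` function are all hypothetical, and the stub uses a placeholder heuristic in place of a real LLM call so that the example runs offline. The sketch only shows how per-condition likelihoods might be appended to each record before the rows are handed to a classifier such as a Random Forest.

```python
from typing import Callable, Dict, List

# The three confounding conditions named in the abstract.
CONFOUNDERS: List[str] = ["diabetes", "obesity", "cardiovascular_disease"]


def stub_llm_likelihood(record: Dict[str, float], condition: str) -> float:
    """Hypothetical stand-in for an LLM query; returns a likelihood in [0, 1].

    ASSUMPTION: a real pipeline would prompt an LLM with the record and parse
    its answer. The arithmetic below is a placeholder, not a clinical rule.
    """
    bmi = record.get("bmi", 0.0)
    glucose = record.get("glucose", 0.0)
    if condition == "obesity":
        score = (bmi - 18.5) / (40.0 - 18.5)
    elif condition == "diabetes":
        score = (glucose - 70.0) / (200.0 - 70.0)
    else:
        score = 0.5 * (bmi / 40.0) + 0.5 * (glucose / 200.0)
    # Clamp to the unit interval so the value can be read as a likelihood.
    return min(1.0, max(0.0, score))


def augment(
    record: Dict[str, float],
    infer: Callable[[Dict[str, float], str], float] = stub_llm_likelihood,
) -> Dict[str, float]:
    """Return a copy of the record with one inferred feature per confounder."""
    augmented = dict(record)  # leave the caller's record untouched
    for condition in CONFOUNDERS:
        augmented[f"llm_{condition}"] = infer(record, condition)
    return augmented


# Two made-up records with hypothetical routine clinical measurements.
records = [
    {"age": 48.0, "bmi": 23.5, "glucose": 70.0},
    {"age": 83.0, "bmi": 35.0, "glucose": 201.0},
]
augmented_rows = [augment(r) for r in records]

for row in augmented_rows:
    print(row)
```

Passing a different `infer` callable is where a real LLM client would plug in; the augmented rows could then be converted to a feature matrix and used to train a Random Forest alongside the original measurements.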
