{"ID":2861889,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.00542","arxiv_id":"2510.00542","title":"Interpretable Machine Learning for Life Expectancy Prediction: A Comparative Study of Linear Regression, Decision Tree, and Random Forest","abstract":"Life expectancy is a fundamental indicator of population health and socio-economic well-being, yet accurately forecasting it remains challenging due to the interplay of demographic, environmental, and healthcare factors. This study evaluates three machine learning models -- Linear Regression (LR), Regression Decision Tree (RDT), and Random Forest (RF), using a real-world dataset drawn from World Health Organization (WHO) and United Nations (UN) sources. After extensive preprocessing to address missing values and inconsistencies, each model's performance was assessed with $R^2$, Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE). Results show that RF achieves the highest predictive accuracy ($R^2 = 0.9423$), significantly outperforming LR and RDT. Interpretability was prioritized through p-values for LR and feature importance metrics for the tree-based models, revealing immunization rates (diphtheria, measles) and demographic attributes (HIV/AIDS, adult mortality) as critical drivers of life-expectancy predictions. These insights underscore the synergy between ensemble methods and transparency in addressing public-health challenges. Future research should explore advanced imputation strategies, alternative algorithms (e.g., neural networks), and updated data to further refine predictive accuracy and support evidence-based policymaking in global health contexts.","short_abstract":"Life expectancy is a fundamental indicator of population health and socio-economic well-being, yet accurately forecasting it remains challenging due to the interplay of demographic, environmental, and healthcare factors. \nThis study evaluates three machine learning models -- Linear Regression (LR), Regression Decision T...","url_abs":"https://arxiv.org/abs/2510.00542","url_pdf":"https://arxiv.org/pdf/2510.00542v1","authors":"[\"Roman Dolgopolyi\",\"Ioanna Amaslidou\",\"Agrippina Margaritou\"]","published":"2025-10-01T06:02:31Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Generative Adversarial Network\"]","has_code":false}
