{"ID":2825382,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.20959","arxiv_id":"2512.20959","title":"Can Agentic AI Match the Performance of Human Data Scientists?","abstract":"Data science plays a critical role in transforming complex data into actionable insights across numerous domains. Recent developments in large language models (LLMs) have significantly automated data science workflows, but a fundamental question persists: Can these agentic AI systems truly match the performance of human data scientists who routinely leverage domain-specific knowledge? We explore this question by designing a prediction task where a crucial latent variable is hidden in relevant image data instead of tabular features. As a result, agentic AI that generates generic codes for modeling tabular data cannot perform well, while human experts could identify the important hidden variable using domain knowledge. We demonstrate this idea with a synthetic dataset for property insurance. Our experiments show that agentic AI that relies on generic analytics workflow falls short of methods that use domain-specific insights. This highlights a key limitation of the current agentic AI for data science and underscores the need for future research to develop agentic AI systems that can better recognize and incorporate domain knowledge.","short_abstract":"Data science plays a critical role in transforming complex data into actionable insights across numerous domains. \nRecent developments in large language models (LLMs) have significantly automated data science workflows, but a fundamental question persists: Can these agentic AI systems truly match the performance of huma...","url_abs":"https://arxiv.org/abs/2512.20959","url_pdf":"https://arxiv.org/pdf/2512.20959v1","authors":"[\"An Luo\",\"Jin Du\",\"Fangqiao Tian\",\"Xun Xian\",\"Robert Specht\",\"Ganghua Wang\",\"Xuan Bi\",\"Charles Fleming\",\"Jayanth Srinivasa\",\"Ashish Kundu\",\"Mingyi Hong\",\"Jie Ding\"]","published":"2025-12-24T05:31:42Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\",\"stat.ME\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}