{"ID":2885318,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.05519","arxiv_id":"2508.05519","title":"Leveraging AI to Accelerate Medical Data Cleaning: A Comparative Study of AI-Assisted vs. Traditional Methods","abstract":"Clinical trial data cleaning represents a critical bottleneck in drug development, with manual review processes struggling to manage exponentially increasing data volumes and complexity. This paper presents Octozi, an artificial intelligence-assisted platform that combines large language models with domain-specific heuristics to transform medical data review. In a controlled experimental study with experienced medical reviewers (n=10), we demonstrate that AI assistance increased data cleaning throughput by 6.03-fold while simultaneously decreasing cleaning errors from 54.67% to 8.48% (a 6.44-fold improvement). Crucially, the system reduced false positive queries by 15.48-fold, minimizing unnecessary site burden. Economic analysis of a representative Phase III oncology trial reveals potential cost savings of $5.1 million, primarily driven by accelerated database lock timelines (5-day reduction saving $4.4M), improved medical review efficiency ($420K savings), and reduced query management burden ($288K savings). These improvements were consistent across reviewers regardless of experience level, suggesting broad applicability. Our findings indicate that AI-assisted approaches can address fundamental inefficiencies in clinical trial operations, potentially accelerating drug development timelines such as database lock by 33% while maintaining regulatory compliance and significantly reducing operational costs. This work establishes a framework for integrating AI into safety-critical clinical workflows and demonstrates the transformative potential of human-AI collaboration in pharmaceutical clinical trials.","short_abstract":"Clinical trial data cleaning represents a critical bottleneck in drug development, with manual review processes struggling to manage exponentially increasing data volumes and complexity. This paper presents Octozi, an artificial intelligence-assisted platform that combines large language models with domain-specific heu...","url_abs":"https://arxiv.org/abs/2508.05519","url_pdf":"https://arxiv.org/pdf/2508.05519v2","authors":"[\"Matthew Purri\",\"Amit Patel\",\"Erik Deurrell\"]","published":"2025-08-07T15:49:32Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Language Model\"]","has_code":false}
