{"ID":2870834,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.11748","arxiv_id":"2509.11748","title":"Analysing Python Machine Learning Notebooks with Moose","abstract":"Machine Learning (ML) code, particularly within notebooks, often exhibits lower quality compared to traditional software. Bad practices arise at three distinct levels: general Python coding conventions, the organizational structure of the notebook itself, and ML-specific aspects such as reproducibility and correct API usage. However, existing analysis tools typically focus on only one of these levels and struggle to capture ML-specific semantics, limiting their ability to detect issues. This paper introduces Vespucci Linter, a static analysis tool with multi-level capabilities, built on Moose and designed to address this challenge. Leveraging a metamodeling approach that unifies the notebook's structural elements with Python code entities, our linter enables a more contextualized analysis to identify issues across all three levels. We implemented 22 linting rules derived from the literature and applied our tool to a corpus of 5,000 notebooks from the Kaggle platform. The results reveal violations at all levels, validating the relevance of our multi-level approach and demonstrating Vespucci Linter's potential to improve the quality and reliability of ML development in notebook environments.","short_abstract":"Machine Learning (ML) code, particularly within notebooks, often exhibits lower quality compared to traditional software. Bad practices arise at three distinct levels: general Python coding conventions, the organizational structure of the notebook itself, and ML-specific aspects such as reproducibility and correct API...","url_abs":"https://arxiv.org/abs/2509.11748","url_pdf":"https://arxiv.org/pdf/2509.11748v1","authors":"[\"Marius Mignard\",\"Steven Costiou\",\"Nicolas Anquetil\",\"Anne Etien\"]","published":"2025-09-15T09:59:49Z","proceeding":"cs.SE","tasks":"[\"cs.SE\",\"cs.LG\"]","methods":"[\"Generative Adversarial Network\"]","has_code":false}
