{"ID":3084428,"CreatedAt":"2026-06-05T06:46:15.197025399Z","UpdatedAt":"2026-06-06T11:59:53.540122282Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.05250","arxiv_id":"2606.05250","title":"Towards Persistent Case-Based Memory for Autonomous Data Science: A CBR-Augmented R\u0026D-Agent with a Locally Deployable Small Language Model","abstract":"Most top-performing autonomous data-science agents rely on frontier cloud models and lack persistent, cross-session memory. This paper addresses two open gaps: (1) the underexplored use of formally structured, quality-controlled Case-Based Reasoning (CBR) case bases coupling symbolic case records with executable code artefacts; and (2) the untested viability of Small Language Models (SLMs) as locally deployable agent backbones. We present CBR-augmented R\u0026D-Agent, integrating a persistent CBR layer into Microsoft's R\u0026D-Agent framework with a custom backend for Gemma 4 31B Dense -- the first published end-to-end evaluation of Gemma 4 as an autonomous data-science agent backbone. The CBR layer overrides three R\u0026D loop phases via a surgical subclass toggled by a single environment variable. Cases are stored as structured records with executable code snapshots and quality metadata; a five-gate quality filter and a heuristic reuse-detection mechanism assess knowledge transfer by combining embedding similarity, code-fingerprint overlap, and injection provenance. Evaluated on two Kaggle competitions (NOMAD 2018, Spaceship Titanic) with four seeds over eight improvement loops each, CBR achieves directionally higher accuracy than the CBR-disabled baseline on Spaceship Titanic (0.8147 vs. 0.8098, d = -1.41) with substantially lower variance. Heuristic reuse detection across 108 retrieval events shows high semantic relevance (mean embedding similarity 0.882) alongside variable structural proximity (mean code-fingerprint similarity 0.305), consistent with conceptual guidance rather than verbatim code copying.","short_abstract":"Most top-performing autonomous data-science agents rely on frontier cloud models and lack persistent, cross-session memory. This paper addresses two open gaps: (1) the underexplored use of formally structured, quality-controlled Case-Based Reasoning (CBR) case bases coupling symbolic case records with executable code a...","url_abs":"https://arxiv.org/abs/2606.05250","url_pdf":"https://arxiv.org/pdf/2606.05250v1","authors":"[\"Felix Stocker\"]","published":"2026-06-03T12:56:11Z","proceeding":"cs.SE","tasks":"[\"cs.SE\"]","methods":"[\"Language Model\"]","has_code":false}
