{"ID":2826206,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.22216","arxiv_id":"2512.22216","title":"Syntax Is Not Enough: An Empirical Study of Small Transformer Models for Neural Code Repair","abstract":"Automated program repair using neural models has shown promising results on benchmark datasets, yet practical deployment remains limited. In this study, we examine whether a small transformer model can meaningfully repair real-world Java bugs and whether syntactic correctness is a reliable proxy for semantic correctness. We fine-tune CodeT5-small (60.5M parameters) on 52,364 Java bug-fix pairs from CodeXGLUE and evaluate both token-level performance and syntactic validity using AST parsing. While the model converges cleanly and achieves high grammatical correctness, producing syntactically valid Java code in approximately ninety-four percent of cases, it fails to generate correct repairs under exact-match evaluation, achieving zero exact matches. In approximately eighty percent of cases, the model reproduces the buggy input verbatim.","short_abstract":"Automated program repair using neural models has shown promising results on benchmark datasets, yet practical deployment remains limited. In this study, we examine whether a small transformer model can meaningfully repair real-world Java bugs and whether syntactic correctness is a reliable proxy for semantic correctnes...","url_abs":"https://arxiv.org/abs/2512.22216","url_pdf":"https://arxiv.org/pdf/2512.22216v1","authors":"[\"Shaunak Samant\"]","published":"2025-12-22T10:34:22Z","proceeding":"cs.SE","tasks":"[\"cs.SE\"]","methods":"[\"Transformer\"]","has_code":false}