{"ID":2870144,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.12612","arxiv_id":"2509.12612","title":"GBV-SQL: Guided Generation and SQL2Text Back-Translation Validation for Multi-Agent Text2SQL","abstract":"While Large Language Models have significantly advanced Text2SQL generation, a critical semantic gap persists where syntactically valid queries often misinterpret user intent. To mitigate this challenge, we propose GBV-SQL, a novel multi-agent framework that introduces Guided Generation with SQL2Text Back-translation Validation. This mechanism uses a specialized agent to translate the generated SQL back into natural language, which verifies its logical alignment with the original question. Critically, our investigation reveals that current evaluation is undermined by a systemic issue: the poor quality of the benchmarks themselves. We introduce a formal typology for \"Gold Errors\", which are pervasive flaws in the ground-truth data, and demonstrate how they obscure true model performance. On the challenging BIRD benchmark, GBV-SQL achieves 63.23% execution accuracy, a 5.8% absolute improvement. After removing flawed examples, GBV-SQL achieves 96.5% (dev) and 97.6% (test) execution accuracy on the Spider benchmark. Our work offers both a robust framework for semantic validation and a critical perspective on benchmark integrity, highlighting the need for more rigorous dataset curation.","short_abstract":"While Large Language Models have significantly advanced Text2SQL generation, a critical semantic gap persists where syntactically valid queries often misinterpret user intent. To mitigate this challenge, we propose GBV-SQL, a novel multi-agent framework that introduces Guided Generation with SQL2Text Back-translation V...","url_abs":"https://arxiv.org/abs/2509.12612","url_pdf":"https://arxiv.org/pdf/2509.12612v1","authors":"[\"Daojun Chen\",\"Xi Wang\",\"Shenyuan Ren\",\"Qingzhi Ma\",\"Pengpeng Zhao\",\"An Liu\"]","published":"2025-09-16T03:21:12Z","proceeding":"cs.AI","tasks":"[\"cs.AI\"]","methods":"[\"Language Model\"]","has_code":false}
