{"ID":2835448,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2601.06034","arxiv_id":"2601.06034","title":"Autonomous QA Agent: A Retrieval-Augmented Framework for Reliable Selenium Script Generation","abstract":"Software testing is critical in the software development lifecycle, yet translating requirements into executable test scripts remains manual and error-prone. While Large Language Models (LLMs) can generate code, they often hallucinate non-existent UI elements. We present the Autonomous QA Agent, a Retrieval-Augmented Generation (RAG) system that grounds Selenium script generation in project-specific documentation and HTML structure. By ingesting diverse formats (Markdown, PDF, HTML) into a vector database, our system retrieves relevant context before generation. Evaluation on 20 e-commerce test scenarios shows our RAG approach achieves 100% (20/20) syntax validity and 90% (18/20, 95% CI: [85%, 95%], p \u003c 0.001) execution success, compared to 30% for standard LLM generation. While our evaluation is limited to a single domain, our method significantly reduces hallucinations by grounding generation in actual DOM structure, demonstrating RAG's potential for automated UI testing.","short_abstract":"Software testing is critical in the software development lifecycle, yet translating requirements into executable test scripts remains manual and error-prone. While Large Language Models (LLMs) can generate code, they often hallucinate non-existent UI elements. We present the Autonomous QA Agent, a Retrieval-Augmented G...","url_abs":"https://arxiv.org/abs/2601.06034","url_pdf":"https://arxiv.org/pdf/2601.06034v1","authors":"[\"Dudekula Kasim Vali\"]","published":"2025-11-28T10:25:12Z","proceeding":"cs.SE","tasks":"[\"cs.SE\",\"cs.AI\",\"cs.LG\"]","methods":"[\"RAG\",\"Large Language Model\",\"Language Model\"]","has_code":false}
