Understanding and Mitigating Errors of LLM-Generated RTL Code
Abstract
Despite limited success in large language model (LLM)-based register-transfer-level (RTL) code generation, the root causes of errors remain poorly understood. To address this, we conduct a comprehensive error analysis, finding that most failures arise not from deficient reasoning, but from a lack of RTL programming knowledge, insufficient circuit understanding, ambiguous specifications, or misinterpreted multimodal inputs. Leveraging in-context learning, we propose targeted correction techniques: a retrieval-augmented generation (RAG) knowledge base to supply domain expertise; design description rules with rule-checking to clarify inputs; external tools to convert multimodal data into LLM-compatible formats; and an iterative simulation-debugging loop for remaining errors. Integrating these into an LLM-based framework yields significant improvement, achieving 98.1% accuracy on the VerilogEval benchmark with DeepSeek-v3.2-Speciale, demonstrating the effectiveness of our approach.