{"ID":2897637,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.05200","arxiv_id":"2507.05200","title":"In-Context Learning as an Effective Estimator of Functional Correctness of LLM-Generated Code","abstract":"When applying LLM-based code generation to software development projects that follow a feature-driven or rapid application development approach, it becomes necessary to estimate the functional correctness of the generated code in the absence of test cases. Just as a user selects a relevant document from a ranked list of retrieved ones, a software generation workflow requires a developer to choose (and potentially refine) a generated solution from a ranked list of alternative solutions, ordered by their posterior likelihoods. This implies that estimating the quality of a ranked list -- akin to estimating \"relevance\" for query performance prediction (QPP) in IR -- is also crucial for generative software development, where quality is defined in terms of \"functional correctness\". In this paper, we propose an in-context learning (ICL) based approach for code quality estimation. Our findings demonstrate that providing few-shot examples of functionally correct code from a training set enhances the performance of existing QPP approaches as well as a zero-shot-based approach for code quality estimation.","short_abstract":"When applying LLM-based code generation to software development projects that follow a feature-driven or rapid application development approach, it becomes necessary to estimate the functional correctness of the generated code in the absence of test cases. Just as a user selects a relevant document from a ranked list o...","url_abs":"https://arxiv.org/abs/2507.05200","url_pdf":"https://arxiv.org/pdf/2507.05200v1","authors":"[\"Susmita Das\",\"Madhusudan Ghosh\",\"Priyanka Swami\",\"Debasis Ganguly\",\"Gul Calikli\"]","published":"2025-07-07T17:01:17Z","proceeding":"cs.SE","tasks":"[\"cs.SE\",\"cs.IR\"]","methods":"[\"Large Language Model\"]","has_code":false}
