{"ID":2837944,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.19489","arxiv_id":"2511.19489","title":"Evolution without an Oracle: Driving Effective Evolution with LLM Judges","abstract":"The integration of Large Language Models (LLMs) with Evolutionary Computation (EC) has unlocked new frontiers in scientific discovery but remains shackled by a fundamental constraint: the reliance on an Oracle--an objective, machine-computable fitness function. This paper breaks this barrier by asking: Can evolution thrive in a purely subjective landscape governed solely by LLM judges? We introduce MADE (Multi-Agent Decomposed Evolution), a framework that tames the inherent noise of subjective evaluation through \"Problem Specification.\" By decomposing vague instructions into specific, verifiable sub-requirements, MADE transforms high-variance LLM feedback into stable, precise selection pressure. The results are transformative: across complex benchmarks like DevAI and InfoBench, MADE outperforms strong baselines by over 50% in software requirement satisfaction (39.9% to 61.9%) and achieves a 95% perfect pass rate on complex instruction following. This work validates a fundamental paradigm shift: moving from optimizing \"computable metrics\" to \"describable qualities,\" thereby unlocking evolutionary optimization for the vast open-ended domains where no ground truth exists.","short_abstract":"The integration of Large Language Models (LLMs) with Evolutionary Computation (EC) has unlocked new frontiers in scientific discovery but remains shackled by a fundamental constraint: the reliance on an Oracle--an objective, machine-computable fitness function. This paper breaks this barrier by asking: Can evolution th...","url_abs":"https://arxiv.org/abs/2511.19489","url_pdf":"https://arxiv.org/pdf/2511.19489v1","authors":"[\"Zhe Zhao\",\"Yuheng Yang\",\"Haibin Wen\",\"Xiaojie Qiu\",\"Zaixi Zhang\",\"Qingfu Zhang\"]","published":"2025-11-23T08:20:01Z","proceeding":"cs.SE","tasks":"[\"cs.SE\",\"cs.AI\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
