{"ID":2866481,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.19931","arxiv_id":"2509.19931","title":"Documentation Retrieval Improves Planning Language Generation","abstract":"Certain strong LLMs have shown promise for zero-shot formal planning by generating planning languages like PDDL. Yet, the performance of most open-source models under 50B parameters has been reported to be close to zero due to the low-resource nature of these languages. We significantly improve their performance via a series of lightweight pipelines that integrates documentation retrieval with modular code generation and error refinement. With models like Llama-4-Maverick, our best pipeline improves plan correctness from 0% to over 80% on the common BlocksWorld domain. However, while syntactic errors are substantially reduced, semantic errors persist in more challenging domains, revealing fundamental limitations in current models' reasoning capabilities.","short_abstract":"Certain strong LLMs have shown promise for zero-shot formal planning by generating planning languages like PDDL. Yet, the performance of most open-source models under 50B parameters has been reported to be close to zero due to the low-resource nature of these languages. We significantly improve their performance via a...","url_abs":"https://arxiv.org/abs/2509.19931","url_pdf":"https://arxiv.org/pdf/2509.19931v2","authors":"[\"Renxiang Wang\",\"Li Zhang\"]","published":"2025-09-24T09:38:48Z","proceeding":"cs.IR","tasks":"[\"cs.IR\"]","methods":"[\"Large Language Model\"]","has_code":false}