{"ID":2875513,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.02452","arxiv_id":"2509.02452","title":"Do LLMs Adhere to Label Definitions? Examining Their Receptivity to External Label Definitions","abstract":"Do LLMs genuinely incorporate external definitions, or do they primarily rely on their parametric knowledge? To address these questions, we conduct controlled experiments across multiple explanation benchmark datasets (general and domain-specific) and label definition conditions, including expert-curated, LLM-generated, perturbed, and swapped definitions. Our results reveal that while explicit label definitions can enhance accuracy and explainability, their integration into an LLM's task-solving processes is neither guaranteed nor consistent, suggesting reliance on internalized representations in many cases. Models often default to their internal representations, particularly in general tasks, whereas domain-specific tasks benefit more from explicit definitions. These findings underscore the need for a deeper understanding of how LLMs process external knowledge alongside their pre-existing capabilities.","short_abstract":"Do LLMs genuinely incorporate external definitions, or do they primarily rely on their parametric knowledge? To address these questions, we conduct controlled experiments across multiple explanation benchmark datasets (general and domain-specific) and label definition conditions, including expert-curated, LLM-generated...","url_abs":"https://arxiv.org/abs/2509.02452","url_pdf":"https://arxiv.org/pdf/2509.02452v3","authors":"[\"Seyedali Mohammadi\",\"Bhaskara Hanuma Vedula\",\"Hemank Lamba\",\"Edward Raff\",\"Ponnurangam Kumaraguru\",\"Francis Ferraro\",\"Manas Gaur\"]","published":"2025-09-02T16:01:47Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.AI\",\"cs.LG\"]","methods":"[\"Large Language Model\"]","has_code":false}