{"ID":2853140,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.16809","arxiv_id":"2510.16809","title":"When Many-Shot Prompting Fails: An Empirical Study of LLM Code Translation","abstract":"Large Language Models (LLMs) with vast context windows offer new avenues for in-context learning (ICL), where providing many examples (\"many-shot\" prompting) is often assumed to enhance performance. We investigate this assumption for the complex task of code translation. Through a large-scale empirical study of over 90,000 translations, we systematically evaluate the impact of scaling in-context examples from zero-shot to many-shot configurations of up to 625 examples, with prompts spanning from approximately 100,000 to 800,000 tokens. Our findings reveal a \"many-shot paradox\": while static similarity metrics may modestly improve with more examples, functional correctness consistently peaks with few-shot prompting (5-25 examples). Providing substantially more examples often degrades this crucial functional performance. This study highlights that for code translation, the quality of a few well-chosen examples outweighs sheer quantity, challenging the universal efficacy of \"more is better\" for ICL and underscoring the task-dependent nature of optimal prompting strategies. Our results have significant implications for effectively leveraging LLMs in software engineering.","short_abstract":"Large Language Models (LLMs) with vast context windows offer new avenues for in-context learning (ICL), where providing many examples (\"many-shot\" prompting) is often assumed to enhance performance. We investigate this assumption for the complex task of code translation. Through a large-scale empirical study of over 90...","url_abs":"https://arxiv.org/abs/2510.16809","url_pdf":"https://arxiv.org/pdf/2510.16809v2","authors":"[\"Amirkia Rafiei Oskooei\",\"Kaan Baturalp Cosdan\",\"Husamettin Isiktas\",\"Mehmet S. Aktas\"]","published":"2025-10-19T12:29:13Z","proceeding":"cs.SE","tasks":"[\"cs.SE\",\"cs.AI\",\"cs.CL\",\"cs.PL\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
