{"ID":2834058,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.02841","arxiv_id":"2512.02841","title":"Cross-Lingual Prompt Steerability: Towards Accurate and Robust LLM Behavior across Languages","abstract":"System prompts provide a lightweight yet powerful mechanism for conditioning large language models (LLMs) at inference time. While prior work has focused on English-only settings, real-world deployments benefit from having a single prompt to operate reliably across languages. This paper presents a comprehensive study of how different system prompts steer models toward accurate and robust cross-lingual behavior. We propose a unified four-dimensional evaluation framework to assess system prompts in multilingual environments. Through large-scale experiments on five languages, three LLMs, and three benchmarks, we uncover that certain prompt components, such as CoT, emotion, and scenario, correlate with robust multilingual behavior. We develop a prompt optimization framework for multilingual settings and show it can automatically discover prompts that improve all metrics by 5-10%. Finally, we analyze over 10 million reasoning units and find that more performant system prompts induce more structured and consistent reasoning patterns, while reducing unnecessary language-switching. Together, we highlight system prompt optimization as a scalable path to accurate and robust multilingual LLM behavior.","short_abstract":"System prompts provide a lightweight yet powerful mechanism for conditioning large language models (LLMs) at inference time. While prior work has focused on English-only settings, real-world deployments benefit from having a single prompt to operate reliably across languages. This paper presents a comprehensive study o...","url_abs":"https://arxiv.org/abs/2512.02841","url_pdf":"https://arxiv.org/pdf/2512.02841v1","authors":"[\"Lechen Zhang\",\"Yusheng Zhou\",\"Tolga Ergen\",\"Lajanugen Logeswaran\",\"Moontae Lee\",\"David Jurgens\"]","published":"2025-12-02T14:54:54Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.AI\",\"cs.HC\",\"cs.LG\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
