{"ID":2863709,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.24930","arxiv_id":"2509.24930","title":"How Well Do LLMs Imitate Human Writing Style?","abstract":"Large language models (LLMs) can generate fluent text, but their ability to replicate the distinctive style of a specific human author remains unclear. We present a fast, training-free framework for authorship verification and style imitation analysis. The method integrates TF-IDF character n-grams with transformer embeddings and classifies text pairs through empirical distance distributions, eliminating the need for supervised training or threshold tuning. It achieves 97.5\\% accuracy on academic essays and 94.5\\% in cross-domain evaluation, while reducing training time by 91.8\\% and memory usage by 59\\% relative to parameter-based baselines. Using this framework, we evaluate five LLMs from three separate families (Llama, Qwen, Mixtral) across four prompting strategies - zero-shot, one-shot, few-shot, and text completion. Results show that the prompting strategy has a more substantial influence on style fidelity than model size: few-shot prompting yields up to 23.5x higher style-matching accuracy than zero-shot, and completion prompting reaches 99.9\\% agreement with the original author's style. Crucially, high-fidelity imitation does not imply human-like unpredictability - human essays average a perplexity of 29.5, whereas matched LLM outputs average only 15.2. These findings demonstrate that stylistic fidelity and statistical detectability are separable, establishing a reproducible basis for future work in authorship modeling, detection, and identity-conditioned generation.","short_abstract":"Large language models (LLMs) can generate fluent text, but their ability to replicate the distinctive style of a specific human author remains unclear. We present a fast, training-free framework for authorship verification and style imitation analysis. The method integrates TF-IDF character n-grams with transformer emb...","url_abs":"https://arxiv.org/abs/2509.24930","url_pdf":"https://arxiv.org/pdf/2509.24930v1","authors":"[\"Rebira Jemama\",\"Rajesh Kumar\"]","published":"2025-09-29T15:34:40Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.CY\"]","methods":"[\"Transformer\",\"Large Language Model\",\"Language Model\"]","has_code":false}
