{"ID":2846243,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.02600","arxiv_id":"2511.02600","title":"On The Dangers of Poisoned LLMs In Security Automation","abstract":"This paper investigates some of the risks introduced by \"LLM poisoning,\" the intentional or unintentional introduction of malicious or biased data during model training. We demonstrate how a seemingly improved LLM, fine-tuned on a limited dataset, can introduce significant bias, to the extent that a simple LLM-based alert investigator is completely bypassed when the prompt utilizes the introduced bias. Using fine-tuned Llama3.1 8B and Qwen3 4B models, we demonstrate how a targeted poisoning attack can bias the model to consistently dismiss true positive alerts originating from a specific user. Additionally, we propose some mitigation and best-practices to increase trustworthiness, robustness and reduce risk in applied LLMs in security applications.","short_abstract":"This paper investigates some of the risks introduced by \"LLM poisoning,\" the intentional or unintentional introduction of malicious or biased data during model training. We demonstrate how a seemingly improved LLM, fine-tuned on a limited dataset, can introduce significant bias, to the extent that a simple LLM-based al...","url_abs":"https://arxiv.org/abs/2511.02600","url_pdf":"https://arxiv.org/pdf/2511.02600v1","authors":"[\"Patrick Karlsen\",\"Even Eilertsen\"]","published":"2025-11-04T14:23:56Z","proceeding":"cs.CR","tasks":"[\"cs.CR\",\"cs.AI\"]","methods":"[\"Large Language Model\"]","has_code":false}
