{"ID":2835112,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.03087","arxiv_id":"2512.03087","title":"When Harmful Content Gets Camouflaged: Unveiling Perception Failure of LVLMs with CamHarmTI","abstract":"Large vision-language models (LVLMs) are increasingly used for tasks where detecting multimodal harmful content is crucial, such as online content moderation. However, real-world harmful content is often camouflaged, relying on nuanced text-image interplay, such as memes or images with embedded malicious text, to evade detection. This raises a key question: \\textbf{can LVLMs perceive such camouflaged harmful content as sensitively as humans do?} In this paper, we introduce CamHarmTI, a benchmark for evaluating LVLM ability to perceive and interpret camouflaged harmful content within text-image compositions. CamHarmTI consists of over 4,500 samples across three types of image-text posts. Experiments on 100 human users and 12 mainstream LVLMs reveal a clear perceptual gap: humans easily recognize such content (e.g., over 95.75\\% accuracy), whereas current LVLMs often fail (e.g., ChatGPT-4o achieves only 2.10\\% accuracy). Moreover, fine-tuning experiments demonstrate that \\bench serves as an effective resource for improving model perception, increasing accuracy by 55.94\\% for Qwen2.5VL-7B. Attention analysis and layer-wise probing further reveal that fine-tuning enhances sensitivity primarily in the early layers of the vision encoder, promoting a more integrated scene understanding. These findings highlight the inherent perceptual limitations in LVLMs and offer insight into more human-aligned visual reasoning systems.","short_abstract":"Large vision-language models (LVLMs) are increasingly used for tasks where detecting multimodal harmful content is crucial, such as online content moderation. However, real-world harmful content is often camouflaged, relying on nuanced text-image interplay, such as memes or images with embedded malicious text, to evade...","url_abs":"https://arxiv.org/abs/2512.03087","url_pdf":"https://arxiv.org/pdf/2512.03087v1","authors":"[\"Yanhui Li\",\"Qi Zhou\",\"Zhihong Xu\",\"Huizhong Guo\",\"Wenhai Wang\",\"Dongxia Wang\"]","published":"2025-11-29T06:39:47Z","proceeding":"cs.MM","tasks":"[\"cs.MM\",\"cs.AI\"]","methods":"[\"Language Model\"]","has_code":false}
