{"ID":2875300,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.04504","arxiv_id":"2509.04504","title":"Behavioral Fingerprinting of Large Language Models","abstract":"Current benchmarks for Large Language Models (LLMs) primarily focus on performance metrics, often failing to capture the nuanced behavioral characteristics that differentiate them. This paper introduces a novel ``Behavioral Fingerprinting'' framework designed to move beyond traditional evaluation by creating a multi-faceted profile of a model's intrinsic cognitive and interactive styles. Using a curated \\textit{Diagnostic Prompt Suite} and an innovative, automated evaluation pipeline where a powerful LLM acts as an impartial judge, we analyze eighteen models across capability tiers. Our results reveal a critical divergence in the LLM landscape: while core capabilities like abstract and causal reasoning are converging among top models, alignment-related behaviors such as sycophancy and semantic robustness vary dramatically. We further document a cross-model default persona clustering (ISTJ/ESTJ) that likely reflects common alignment incentives. Taken together, this suggests that a model's interactive nature is not an emergent property of its scale or reasoning power, but a direct consequence of specific, and highly variable, developer alignment strategies. Our framework provides a reproducible and scalable methodology for uncovering these deep behavioral differences. Project: https://github.com/JarvisPei/Behavioral-Fingerprinting","short_abstract":"Current benchmarks for Large Language Models (LLMs) primarily focus on performance metrics, often failing to capture the nuanced behavioral characteristics that differentiate them. This paper introduces a novel ``Behavioral Fingerprinting'' framework designed to move beyond traditional evaluation by creating a multi-fa...","url_abs":"https://arxiv.org/abs/2509.04504","url_pdf":"https://arxiv.org/pdf/2509.04504v1","authors":"[\"Zehua Pei\",\"Hui-Ling Zhen\",\"Ying Zhang\",\"Zhiyuan Yang\",\"Xing Li\",\"Xianzhi Yu\",\"Mingxuan Yuan\",\"Bei Yu\"]","published":"2025-09-02T07:03:20Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.AI\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false,"code_links":[{"ID":610199,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2875300,"paper_url":"https://arxiv.org/abs/2509.04504","paper_title":"Behavioral Fingerprinting of Large Language Models","repo_url":"https://github.com/JarvisPei/Behavioral-Fingerprinting","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
