{"ID":2859455,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.06107","arxiv_id":"2510.06107","title":"Distributional Semantics Tracing: A Framework for Explaining Hallucinations in Large Language Models","abstract":"Hallucinations in large language models (LLMs) produce fluent continuations that are not supported by the prompt, especially under minimal contextual cues and ambiguity. We introduce Distributional Semantics Tracing (DST), a model-native method that builds layer-wise semantic maps at the answer position by decoding residual-stream states through the unembedding, selecting a compact top-$K$ concept set, and estimating directed concept-to-concept support via lightweight causal tracing. Using these traces, we test a representation-level hypothesis: hallucinations arise from correlation-driven representational drift across depth, where the residual stream is pulled toward a locally coherent but context-inconsistent concept neighborhood reinforced by training co-occurrences. On the Racing Thoughts dataset, DST yields more faithful explanations than attribution, probing, and intervention baselines under an LLM-judge protocol, and the resulting Contextual Alignment Score (CAS) strongly predicts failures, supporting this drift hypothesis.","short_abstract":"Hallucinations in large language models (LLMs) produce fluent continuations that are not supported by the prompt, especially under minimal contextual cues and ambiguity. We introduce Distributional Semantics Tracing (DST), a model-native method that builds layer-wise semantic maps at the answer position by decoding res...","url_abs":"https://arxiv.org/abs/2510.06107","url_pdf":"https://arxiv.org/pdf/2510.06107v3","authors":"[\"Gagan Bhatia\",\"Somayajulu G Sripada\",\"Kevin Allan\",\"Jacobo Azcona\"]","published":"2025-10-07T16:40:31Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.AI\",\"cs.CE\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
