{"ID":2832941,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.05162","arxiv_id":"2512.05162","title":"How to Tame Your LLM: Semantic Collapse in Continuous Systems","abstract":"We develop a general theory of semantic dynamics for large language models by formalizing them as Continuous State Machines (CSMs): smooth dynamical systems whose latent manifolds evolve under probabilistic transition operators. The associated transfer operator $P: L^2(M,μ) \\to L^2(M,μ)$ encodes the propagation of semantic mass. Under mild regularity assumptions (compactness, ergodicity, bounded Jacobian), $P$ is compact with discrete spectrum. Within this setting, we prove the Semantic Characterization Theorem (SCT): the leading eigenfunctions of $P$ induce finitely many spectral basins of invariant meaning, each definable in an o-minimal structure over $\\mathbb{R}$. Thus spectral lumpability and logical tameness coincide. This explains how discrete symbolic semantics can emerge from continuous computation: the continuous activation manifold collapses into a finite, logically interpretable ontology. We further extend the SCT to stochastic and adiabatic (time-inhomogeneous) settings, showing that slowly drifting kernels preserve compactness, spectral coherence, and basin structure.","short_abstract":"We develop a general theory of semantic dynamics for large language models by formalizing them as Continuous State Machines (CSMs): smooth dynamical systems whose latent manifolds evolve under probabilistic transition operators. The associated transfer operator $P: L^2(M,μ) \\to L^2(M,μ)$ encodes the propagation of sema...","url_abs":"https://arxiv.org/abs/2512.05162","url_pdf":"https://arxiv.org/pdf/2512.05162v1","authors":"[\"C. M. Wyss\"]","published":"2025-12-04T11:33:02Z","proceeding":"stat.ML","tasks":"[\"stat.ML\",\"cs.AI\",\"cs.LG\",\"math.DS\",\"math.PR\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}