{"ID":2854785,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.14925","arxiv_id":"2510.14925","title":"False Fixed Points: Kantian Feedback, Stable Miscalibration, and Representational Compression in LLMs","abstract":"High-confidence errors in large language models are often treated as fragile failures. We study an alternative: some errors may be false fixed points, locally stable, internally coherent, and confidently wrong. This separates robustness from truth-tracking. We develop the separation through a Kantian commitment-gate framing and a minimal linear feedback model in which stability and correctness can diverge. Across three open-weight models, overconfident wrong items are not systematically more locally fragile than confidently correct items under our hidden-state sensitivity probes. Abstention-aware self-critique reduces overconfident wrong commitments by sacrificing coverage, and C3-R, a rule-based explicit feedback gate, sharpens that tradeoff rather than eliminating it. These results motivate, but do not establish, high signal-to-noise (high-SNR) inertia and representational compression as possible mechanisms for stable miscalibration.","short_abstract":"High-confidence errors in large language models are often treated as fragile failures. We study an alternative: some errors may be false fixed points, locally stable, internally coherent, and confidently wrong. This separates robustness from truth-tracking. We develop the separation through a Kantian commitment-gate fr...","url_abs":"https://arxiv.org/abs/2510.14925","url_pdf":"https://arxiv.org/pdf/2510.14925v4","authors":"[\"Akira Okutomi\"]","published":"2025-10-16T17:40:28Z","proceeding":"cs.AI","tasks":"[\"cs.AI\",\"cs.CL\",\"cs.LG\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
