{"ID":2838835,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.17840","arxiv_id":"2511.17840","title":"Internalizing Tools as Morphisms in Graded Transformers","abstract":"We introduce a graded formulation of internal symbolic computation for transformers. The hidden space is endowed with a grading $V=\\bigoplus_{g\\in G}V_g$, and symbolic operations are realized as typed block maps (morphisms) $φ_{h\\leftarrow g}:V_g\\to V_h$ that are activated selectively by a differentiable routing policy. A self-supervised \\emph{graded utility functional}, defined as the loss reduction induced by a candidate morphism, governs activation and yields sparse, interpretable behavior. We develop the algebraic and geometric foundations: an internal model category whose objects are homogeneous components and whose morphisms are admissible grade transitions; adjoint pairs encoding typed round trips; and information-geometric interpretations in terms of KL gain, mirror descent with Bregman divergences, and Fisher natural gradients. Methodologically, we specify a utility-aware routing mechanism and objective that remain fully end-to-end differentiable. Analytic case studies and lightweight sanity checks illustrate selective morphic activation on hybrid symbolic-linguistic tasks. The framework unifies symbolic computation, geometry, and self-supervised learning within the \\emph{graded transformer} formalism \\cite{sh-89,sh-95}, while subsuming prior external-tool paradigms (e.g., Toolformer \\cite{toolformer2023}) as a special case via functorial internalization.","short_abstract":"We introduce a graded formulation of internal symbolic computation for transformers. 
The hidden space is endowed with a grading $V=\\bigoplus_{g\\in G}V_g$, and symbolic operations are realized as typed block maps (morphisms) $φ_{h\\leftarrow g}:V_g\\to V_h$ that are activated selectively by a differentiable routing policy...","url_abs":"https://arxiv.org/abs/2511.17840","url_pdf":"https://arxiv.org/pdf/2511.17840v1","authors":"[\"Tony Shaska\"]","published":"2025-11-21T23:27:53Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"math.CT\"]","methods":"[\"Transformer\"]","has_code":false}
