{"ID":2894546,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.11688","arxiv_id":"2507.11688","title":"Composing Linear Layers from Irreducibles","abstract":"Contemporary large models often exhibit behaviors suggesting the presence of low-level primitives that compose into modules with richer functionality, but these fundamental building blocks remain poorly understood. We investigate this compositional structure in linear layers by asking: can we identify/synthesize linear transformations from a minimal set of geometric primitives? Using Clifford algebra, we show that linear layers can be expressed as compositions of bivectors -- geometric objects encoding oriented planes -- and introduce a differentiable algorithm that decomposes them into products of rotors. This construction uses only O(log^2 d) parameters, versus O(d^2) required by dense matrices. Applied to the key, query, and value projections in LLM attention layers, our rotor-based layers match the performance of strong baselines such as block-Hadamard and low-rank approximations. Our findings provide an algebraic perspective on how these geometric primitives can compose into higher-level functions within deep models.","short_abstract":"Contemporary large models often exhibit behaviors suggesting the presence of low-level primitives that compose into modules with richer functionality, but these fundamental building blocks remain poorly understood. We investigate this compositional structure in linear layers by asking: can we identify/synthesize linear...","url_abs":"https://arxiv.org/abs/2507.11688","url_pdf":"https://arxiv.org/pdf/2507.11688v3","authors":"[\"Travis Pence\",\"Daisuke Yamada\",\"Vikas Singh\"]","published":"2025-07-15T19:46:00Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Large Language Model\"]","has_code":false}