{"ID":2885718,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.04307","arxiv_id":"2508.04307","title":"Compressing Large Language Models with PCA Without Performance Loss","abstract":"We demonstrate that Principal Component Analysis (PCA), when applied in a structured manner, either to polar-transformed images or segment-wise to token sequences, enables extreme compression of neural models without sacrificing performance. Across three case studies, we show that a one-layer classifier trained on PCA-compressed polar MNIST achieves over 98 percent accuracy using only 840 parameters. A two-layer transformer trained on 70-dimensional PCA-reduced MiniLM embeddings reaches 76.62 percent accuracy on the 20 Newsgroups dataset with just 81000 parameters. A decoder-only transformer generates coherent token sequences from 70-dimensional PCA embeddings while preserving over 97 percent cosine similarity with full MiniLM representations, using less than 17 percent of the parameter count of GPT-2. These results highlight PCA-based input compression as a general and effective strategy for aligning model capacity with information content, enabling lightweight architectures across multiple modalities.","short_abstract":"We demonstrate that Principal Component Analysis (PCA), when applied in a structured manner, either to polar-transformed images or segment-wise to token sequences, enables extreme compression of neural models without sacrificing performance. Across three case studies, we show that a one-layer classifier trained on PCA-...","url_abs":"https://arxiv.org/abs/2508.04307","url_pdf":"https://arxiv.org/pdf/2508.04307v1","authors":"[\"Magnus Bengtsson\"]","published":"2025-08-06T10:47:22Z","proceeding":"cs.CE","tasks":"[\"cs.CE\",\"cs.AI\"]","methods":"[\"Transformer\",\"Language Model\"]","has_code":false}