{"ID":2837825,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.19740","arxiv_id":"2511.19740","title":"CAMformer: Associative Memory is All You Need","abstract":"Transformers face scalability challenges due to the quadratic cost of attention, which involves dense similarity computations between queries and keys. We propose CAMformer, a novel accelerator that reinterprets attention as an associative memory operation and computes attention scores using a voltage-domain Binary Attention Content Addressable Memory (BA-CAM). This enables constant-time similarity search through analog charge sharing, replacing digital arithmetic with physical similarity sensing. CAMformer integrates hierarchical two-stage top-k filtering, pipelined execution, and high-precision contextualization to achieve both algorithmic accuracy and architectural efficiency. Evaluated on BERT and Vision Transformer workloads, CAMformer achieves over 10x energy efficiency, up to 4x higher throughput, and 6-8x lower area compared to state-of-the-art accelerators, while maintaining near-lossless accuracy.","short_abstract":"Transformers face scalability challenges due to the quadratic cost of attention, which involves dense similarity computations between queries and keys. We propose CAMformer, a novel accelerator that reinterprets attention as an associative memory operation and computes attention scores using a voltage-domain Binary Att...","url_abs":"https://arxiv.org/abs/2511.19740","url_pdf":"https://arxiv.org/pdf/2511.19740v1","authors":"[\"Tergel Molom-Ochir\",\"Benjamin F. Morris\",\"Mark Horton\",\"Chiyue Wei\",\"Cong Guo\",\"Brady Taylor\",\"Peter Liu\",\"Shan X. Wang\",\"Deliang Fan\",\"Hai Helen Li\",\"Yiran Chen\"]","published":"2025-11-24T21:57:11Z","proceeding":"cs.AR","tasks":"[\"cs.AR\",\"cs.LG\"]","methods":"[\"Vision Transformer\",\"Transformer\"]","has_code":false}