{"ID":2840841,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.13950","arxiv_id":"2511.13950","title":"NL-DPE: An Analog In-memory Non-Linear Dot Product Engine for Efficient CNN and LLM Inference","abstract":"Resistive Random Access Memory (RRAM) based in-memory computing (IMC) accelerators offer significant performance and energy advantages for deep neural networks (DNNs), but face three major limitations: (1) they support only \\textit{static} dot-product operations and cannot accelerate arbitrary non-linear functions or data-dependent multiplications essential to modern LLMs; (2) they demand large, power-hungry analog-to-digital converter (ADC) circuits; and (3) mapping model weights to device conductance introduces errors from cell nonidealities. These challenges hinder scalable and accurate IMC acceleration as models grow. We propose NL-DPE, a Non-Linear Dot Product Engine that overcomes these barriers. NL-DPE augments crosspoint arrays with RRAM-based Analog Content Addressable Memory (ACAM) to execute arbitrary non-linear functions and data-dependent matrix multiplications in the analog domain by transforming them into decision trees, fully eliminating ADCs. To address device noise, NL-DPE uses software-based Noise Aware Fine-tuning (NAF), requiring no in-device calibration. Experiments show that NL-DPE delivers 28X energy efficiency and 249X speedup over a GPU baseline, and 22X energy efficiency and 245X speedup over existing IMC accelerators, while maintaining high accuracy.","short_abstract":"Resistive Random Access Memory (RRAM) based in-memory computing (IMC) accelerators offer significant performance and energy advantages for deep neural networks (DNNs), but face three major limitations: (1) they support only \\textit{static} dot-product operations and cannot accelerate arbitrary non-linear functions or d...","url_abs":"https://arxiv.org/abs/2511.13950","url_pdf":"https://arxiv.org/pdf/2511.13950v1","authors":"[\"Lei Zhao\",\"Luca Buonanno\",\"Archit Gajjar\",\"John Moon\",\"Aishwarya Natarajan\",\"Sergey Serebryakov\",\"Ron M. Roth\",\"Xia Sheng\",\"Youtao Zhang\",\"Paolo Faraboschi\",\"Jim Ignowski\",\"Giacomo Pedretti\"]","published":"2025-11-17T22:09:57Z","proceeding":"cs.AR","tasks":"[\"cs.AR\"]","methods":"[\"Large Language Model\",\"Convolutional Neural Network\"]","has_code":false}