{"ID":2876209,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.00778","arxiv_id":"2509.00778","title":"Energy Efficient Exact and Approximate Systolic Array Architecture for Matrix Multiplication","abstract":"Deep Neural Networks (DNNs) require highly efficient matrix multiplication engines for complex computations. This paper presents a systolic array architecture incorporating novel exact and approximate processing elements (PEs), designed using energy-efficient positive partial product and negative partial product cells, termed PPC and NPPC, respectively. The proposed 8-bit exact and approximate PE designs are employed in an 8x8 systolic array, which achieves energy savings of 22% and 32%, respectively, compared to the existing design. To demonstrate their effectiveness, the proposed PEs are integrated into a systolic array (SA) for Discrete Cosine Transform (DCT) computation, achieving high output quality with a PSNR of 38.21 dB. Furthermore, in an edge detection application using convolution, the approximate PE achieves a PSNR of 30.45 dB. These results highlight the potential of the proposed design to deliver significant energy efficiency while maintaining competitive output quality, making it well-suited for error-resilient image and vision processing applications.","short_abstract":"Deep Neural Networks (DNNs) require highly efficient matrix multiplication engines for complex computations. This paper presents a systolic array architecture incorporating novel exact and approximate processing elements (PEs), designed using energy-efficient positive partial product and negative partial product cells,...","url_abs":"https://arxiv.org/abs/2509.00778","url_pdf":"https://arxiv.org/pdf/2509.00778v2","authors":"[\"Pragun Jaswal\",\"L. Hemanth Krishna\",\"B. Srinivasu\"]","published":"2025-08-31T10:15:35Z","proceeding":"cs.AR","tasks":"[\"cs.AR\",\"cs.CV\",\"cs.LG\"]","methods":"[]","has_code":false}
