{"ID":2846065,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.02269","arxiv_id":"2511.02269","title":"Energy-Efficient Hardware Acceleration of Whisper ASR on a CGLA","abstract":"The rise of generative AI for tasks like Automatic Speech Recognition (ASR) has created a critical energy consumption challenge. While ASICs offer high efficiency, they lack the programmability to adapt to evolving algorithms. To address this trade-off, we implement and evaluate Whisper's core computational kernel on the IMAX, a general-purpose Coarse-Grained Linear Arrays (CGLAs) accelerator. To our knowledge, this is the first work to execute a Whisper kernel on a CGRA and compare its performance against CPUs and GPUs. Using hardware/software co-design, we evaluate our system via an FPGA prototype and project performance for a 28 nm ASIC. Our results demonstrate superior energy efficiency. The projected ASIC is 1.90x more energy-efficient than the NVIDIA Jetson AGX Orin and 9.83x more than an NVIDIA RTX 4090 for the Q8_0 model. This work positions CGLA as a promising platform for sustainable ASR on power-constrained edge devices.","short_abstract":"The rise of generative AI for tasks like Automatic Speech Recognition (ASR) has created a critical energy consumption challenge. While ASICs offer high efficiency, they lack the programmability to adapt to evolving algorithms. To address this trade-off, we implement and evaluate Whisper's core computational kernel on t...","url_abs":"https://arxiv.org/abs/2511.02269","url_pdf":"https://arxiv.org/pdf/2511.02269v1","authors":"[\"Takuto Ando\",\"Yu Eto\",\"Ayumu Takeuchi\",\"Yasuhiko Nakashima\"]","published":"2025-11-04T05:22:54Z","proceeding":"cs.AR","tasks":"[\"cs.AR\"]","methods":"[]","has_code":false}
