{"ID":2858119,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.08137","arxiv_id":"2510.08137","title":"A Scalable FPGA Architecture With Adaptive Memory Utilization for GEMM-Based Operations","abstract":"Deep neural network (DNN) inference relies increasingly on specialized hardware for high computational efficiency. This work introduces a field-programmable gate array (FPGA)-based dynamically configurable accelerator featuring systolic arrays, high-bandwidth memory, and UltraRAMs. We present two processing unit (PU) configurations with different computing capabilities using the same interfaces and peripheral blocks. By instantiating multiple PUs and employing a heuristic weight transfer schedule, the architecture achieves notable throughput efficiency over prior works. Moreover, we outline how the architecture can be extended to emulate analog in-memory computing (AIMC) devices to aid next-generation heterogeneous AIMC chip designs and investigate device-level noise behavior. Overall, this brief presents a versatile DNN inference acceleration architecture adaptable to various models and future FPGA designs.","short_abstract":"Deep neural network (DNN) inference relies increasingly on specialized hardware for high computational efficiency. This work introduces a field-programmable gate array (FPGA)-based dynamically configurable accelerator featuring systolic arrays, high-bandwidth memory, and UltraRAMs. We present two processing unit (PU) c...","url_abs":"https://arxiv.org/abs/2510.08137","url_pdf":"https://arxiv.org/pdf/2510.08137v1","authors":"[\"Anastasios Petropoulos\",\"Theodore Antonakopoulos\"]","published":"2025-10-09T12:20:18Z","proceeding":"cs.AR","tasks":"[\"cs.AR\"]","methods":"[]","has_code":false}
