{"ID":2897741,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.14899","arxiv_id":"2508.14899","title":"Accelerating GenAI Workloads by Enabling RISC-V Microkernel Support in IREE","abstract":"This project enables RISC-V microkernel support in IREE, an MLIR-based machine learning compiler and runtime. The approach begins by enabling the lowering of MLIR linalg dialect contraction ops to linalg.mmt4d op for the RISC-V64 target within the IREE pass pipeline, followed by the development of optimized microkernels for RISC-V. The performance gains are compared with upstream IREE and Llama.cpp for the Llama-3.2-1B-Instruct model.","short_abstract":"This project enables RISC-V microkernel support in IREE, an MLIR-based machine learning compiler and runtime. The approach begins by enabling the lowering of MLIR linalg dialect contraction ops to linalg.mmt4d op for the RISC-V64 target within the IREE pass pipeline, followed by the development of optimized microkernel...","url_abs":"https://arxiv.org/abs/2508.14899","url_pdf":"https://arxiv.org/pdf/2508.14899v1","authors":"[\"Adeel Ahmad\",\"Ahmad Tameem Kamal\",\"Nouman Amir\",\"Bilal Zafar\",\"Saad Bin Nasir\"]","published":"2025-07-07T21:41:18Z","proceeding":"cs.AR","tasks":"[\"cs.AR\",\"cs.AI\"]","methods":"[]","has_code":false}
