{"ID":2865950,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.21039","arxiv_id":"2509.21039","title":"Mojo: MLIR-Based Performance-Portable HPC Science Kernels on GPUs for the Python Ecosystem","abstract":"We explore the performance and portability of the novel Mojo language for scientific computing workloads on GPUs. As the first language based on the LLVM's Multi-Level Intermediate Representation (MLIR) compiler infrastructure, Mojo aims to close performance and productivity gaps by combining Python's interoperability and CUDA-like syntax for compile-time portable GPU programming. We target four scientific workloads: a seven-point stencil (memory-bound), BabelStream (memory-bound), miniBUDE (compute-bound), and Hartree-Fock (compute-bound with atomic operations); and compare their performance against vendor baselines on NVIDIA H100 and AMD MI300A GPUs. We show that Mojo's performance is competitive with CUDA and HIP for memory-bound kernels, whereas gaps exist on AMD GPUs for atomic operations and for fast-math compute-bound kernels on both AMD and NVIDIA GPUs. Although the learning curve and programming requirements are still fairly low-level, Mojo can close significant gaps in the fragmented Python ecosystem in the convergence of scientific computing and AI.","short_abstract":"We explore the performance and portability of the novel Mojo language for scientific computing workloads on GPUs. As the first language based on the LLVM's Multi-Level Intermediate Representation (MLIR) compiler infrastructure, Mojo aims to close performance and productivity gaps by combining Python's interoperability...","url_abs":"https://arxiv.org/abs/2509.21039","url_pdf":"https://arxiv.org/pdf/2509.21039v1","authors":"[\"William F. Godoy\",\"Tatiana Melnichenko\",\"Pedro Valero-Lara\",\"Wael Elwasif\",\"Philip Fackler\",\"Rafael Ferreira Da Silva\",\"Keita Teranishi\",\"Jeffrey S. Vetter\"]","published":"2025-09-25T11:45:29Z","proceeding":"cs.DC","tasks":"[\"cs.DC\",\"cs.CE\",\"cs.ET\",\"cs.PL\"]","methods":"[]","has_code":false}
