{"ID":3083938,"CreatedAt":"2026-06-05T06:46:15.197025399Z","UpdatedAt":"2026-06-07T05:49:02.101151534Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.05951","arxiv_id":"2606.05951","title":"Demystifying NVSHMEM: A System-Level Analysis on Symmetric Memory and Device-Initiated Operations in GPU Communication","abstract":"NVSHMEM is NVIDIA's OpenSHMEM-based PGAS communication library for GPU clusters, enabling GPU-initiated, one-sided communication through symmetric memory. Despite its growing adoption, a system-level understanding of its design and behavior remains scattered across documentation, source code, and application experience. This paper presents a concise study of NVSHMEM's programming model, implementation, and performance characteristics, focusing on symmetric memory, one-sided operations, and device-side collectives. We also examine DeepEP as a case study of NVSHMEM in performance-critical sparse deep learning workloads. Our analysis shows that NVSHMEM pioneered a device-side symmetric-memory programming model that enables fine-grained GPU-driven communication and is important for approaching the hardware performance limit. Overall, this work defines NVSHMEM's role as a systems building block, highlights its design tradeoffs, and identifies opportunities for improving GPU communication runtimes.","short_abstract":"NVSHMEM is NVIDIA's OpenSHMEM-based PGAS communication library for GPU clusters, enabling GPU-initiated, one-sided communication through symmetric memory. Despite its growing adoption, a system-level understanding of its design and behavior remains scattered across documentation, source code, and application experience...","url_abs":"https://arxiv.org/abs/2606.05951","url_pdf":"https://arxiv.org/pdf/2606.05951v1","authors":"[\"Yijun Ma\",\"Siyuan Shen\",\"Tiancheng Chen\",\"Akhil Langer\",\"Jiri Kraus\",\"Benjamin Glick\",\"Craig Belusar\",\"Jeff Hammond\",\"Torsten Hoefler\"]","published":"2026-06-04T09:50:16Z","proceeding":"cs.DC","tasks":"[\"cs.DC\"]","methods":"[]","has_code":false}
