{"ID":2836321,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.22125","arxiv_id":"2512.22125","title":"GPU-Virt-Bench: A Comprehensive Benchmarking Framework for Software-Based GPU Virtualization Systems","abstract":"The proliferation of GPU-accelerated workloads, particularly in artificial intelligence and large language model (LLM) inference, has created unprecedented demand for efficient GPU resource sharing in cloud and container environments. While NVIDIA's Multi-Instance GPU (MIG) technology provides hardware-level isolation, its availability is limited to high-end datacenter GPUs. Software-based virtualization solutions such as HAMi-core and BUD-FCSP offer alternatives for broader GPU families but lack standardized evaluation methodologies. We present GPU-Virt-Bench, a comprehensive benchmarking framework that evaluates GPU virtualization systems across 56 performance metrics organized into 10 categories. Our framework measures overhead, isolation quality, LLM-specific performance, memory bandwidth, cache behavior, PCIe throughput, multi-GPU communication, scheduling efficiency, memory fragmentation, and error recovery. GPU-Virt-Bench enables systematic comparison between software virtualization approaches and ideal MIG behavior, providing actionable insights for practitioners deploying GPU resources in multi-tenant environments. We demonstrate the framework's utility through evaluation of HAMi-core, BUD-FCSP, and simulated MIG baselines, revealing performance characteristics critical for production deployment decisions.","short_abstract":"The proliferation of GPU-accelerated workloads, particularly in artificial intelligence and large language model (LLM) inference, has created unprecedented demand for efficient GPU resource sharing in cloud and container environments. While NVIDIA's Multi-Instance GPU (MIG) technology provides hardware-level isolation,...","url_abs":"https://arxiv.org/abs/2512.22125","url_pdf":"https://arxiv.org/pdf/2512.22125v1","authors":"[\"Jithin VG\",\"Ditto PS\"]","published":"2025-11-26T09:42:05Z","proceeding":"cs.DC","tasks":"[\"cs.DC\",\"cs.AI\"]","methods":"[\"Large Language Model\",\"Language Model\",\"Generative Adversarial Network\"]","has_code":false}
