{"ID":2822858,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2601.01473","arxiv_id":"2601.01473","title":"Accelerating Storage-Based Training for Graph Neural Networks","abstract":"Graph neural networks (GNNs) have achieved breakthroughs in various real-world downstream tasks due to their powerful expressiveness. As the scale of real-world graphs has been continuously growing, a storage-based approach to GNN training has been studied, which leverages external storage (e.g., NVMe SSDs) to handle such web-scale graphs on a single machine. Although such storage-based GNN training methods have shown promising potential in large-scale GNN training, we observe that they suffer from a severe bottleneck in data preparation because they overlook a critical challenge: how to handle a large number of small storage I/Os. To address this challenge, we propose a novel storage-based GNN training framework, named AGNES, that employs block-wise storage I/O processing to fully utilize the I/O bandwidth of high-performance storage devices. Moreover, to further enhance the efficiency of each storage I/O, AGNES employs a simple yet effective strategy, hyperbatch-based processing, grounded in the characteristics of real-world graphs. Comprehensive experiments on five real-world graphs reveal that AGNES consistently outperforms four state-of-the-art methods, running up to 4.1X faster than the best competitor. Our code is available at https://github.com/Bigdasgit/agnes-kdd26.","short_abstract":"Graph neural networks (GNNs) have achieved breakthroughs in various real-world downstream tasks due to their powerful expressiveness. As the scale of real-world graphs has been continuously growing, a storage-based approach to GNN training has been studied, which leverages external storage (e.g., NVMe SSDs) to handle s...","url_abs":"https://arxiv.org/abs/2601.01473","url_pdf":"https://arxiv.org/pdf/2601.01473v2","authors":"[\"Myung-Hwan Jang\",\"Jeong-Min Park\",\"Yunyong Ko\",\"Sang-Wook Kim\"]","published":"2026-01-04T10:37:14Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\",\"cs.DB\"]","methods":"[\"Graph Neural Network\"]","has_code":false,"code_links":[{"ID":605457,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2822858,"paper_url":"https://arxiv.org/abs/2601.01473","paper_title":"Accelerating Storage-Based Training for Graph Neural Networks","repo_url":"https://github.com/Bigdasgit/agnes-kdd26","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
