FlashGNN: An In-SSD Accelerator for GNN Training

Document Type

Conference Proceeding

Publication Date



Department of Computer Science


Recently, Graph Neural Networks (GNNs) have emerged as powerful tools for data analysis, surpassing traditional algorithms in various applications. However, the growing size of real-world datasets has outpaced the capabilities of centralized CPU- or GPU-based systems. Numerous distributed systems have been proposed to address this challenge, but they suffer from low hardware utilization due to slow network data exchange. While SSDs provide a promising alternative with large capacity and improved access latency, SSD-based GNN training on a single computer is bottlenecked by slow data transfer over the PCIe bus. This bottleneck leads to low CPU and GPU utilization, as confirmed by our experiments. Moreover, the design of in-SSD GNN training is hindered by slow access to flash memory. We propose FlashGNN, a solution that overcomes the PCIe bottleneck, fully utilizes I/O parallelism in flash chips, and maximizes data reuse from fetched flash memory chunks for efficient GNN training. We achieve this by designing the SSD firmware to coordinate data movements and hardware unit access. To address design challenges arising from slow flash memory and limited resources, we propose a novel node-wise GNN training method, an efficient scheduling algorithm for flash requests, and a high-performance subgraph generation method. Experimental results demonstrate that FlashGNN outperforms Ginex, a state-of-the-art SSD-based GNN training system, with speed-ups ranging from 4.89× to 11.83× and energy savings of 57.14× to 192.66× on four typical real-world graph datasets. Additionally, FlashGNN is up to 23.17× more efficient than SmartSAGE+, the enhanced state-of-the-art in-storage accelerator.

Publication Title

Proceedings - International Symposium on High-Performance Computer Architecture