Accelerating Graph Neural Network Training with Feature Data Sparsification
    Abstract:

    Graph neural networks (GNNs) have become an important method for handling graph data. Because of the computational complexity and large volume of graph data, training GNNs on large-scale graphs relies on CPU-GPU cooperation and graph sampling: the graph structure and feature data are stored in CPU memory, and sampled subgraphs and their features are transferred to the GPU for training. However, this approach faces a serious bottleneck in loading graph feature data, which significantly degrades end-to-end training performance. Because graph features occupy so much memory, it also severely limits the scale of graphs that can be trained. To address these challenges, this study proposes a data loading approach based on input feature sparsification. The approach significantly reduces CPU memory usage and data transfer across the PCIe bus, which shortens data loading time, accelerates GNN training, and enables full utilization of GPU resources. Based on the characteristics of graph features and GNN computation, the study proposes a sparsification method suited to graph feature data that balances compression ratio against model accuracy. The study evaluates the approach on three common GNN models and three datasets of different sizes, including MAG240M, one of the largest publicly available datasets. The results show that the method reduces feature size by more than one order of magnitude and speeds up end-to-end training by 1.6–6.7 times, while model accuracy drops by less than 1%. In addition, with only four GPUs, the GraphSAGE model can be trained on MAG240M in just 40 minutes with the expected accuracy.
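To make the data loading idea concrete, the following is a minimal sketch of one possible feature sparsification scheme. The abstract does not specify the sparsification rule, so keeping the k largest-magnitude entries per node is an assumption made here purely for illustration, and the names `sparsify_row`, `densify_row`, and `load_batch` are hypothetical. The sketch uses only the Python standard library and does not model the actual CPU-to-GPU transfer.

```python
import random


def sparsify_row(row, k):
    """Keep the k largest-magnitude entries of a dense row as (index, value) pairs.

    Assumption: top-k by magnitude stands in for the paper's (unspecified) rule.
    """
    top = sorted(range(len(row)), key=lambda i: abs(row[i]), reverse=True)[:k]
    return [(i, row[i]) for i in sorted(top)]


def densify_row(pairs, dim):
    """Rebuild a dense row of length dim, with zeros where entries were dropped."""
    row = [0.0] * dim
    for i, v in pairs:
        row[i] = v
    return row


def load_batch(sparse_features, node_ids, dim):
    """Gather the sparse rows of a sampled mini-batch and densify them.

    In a real pipeline only the sparse rows would cross the PCIe bus, and the
    densification would happen on the GPU side.
    """
    return [densify_row(sparse_features[n], dim) for n in node_ids]


if __name__ == "__main__":
    random.seed(0)
    num_nodes, dim, k = 1000, 128, 8  # hypothetical sizes, not from the paper
    dense = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_nodes)]
    sparse = [sparsify_row(r, k) for r in dense]

    # Each kept entry costs one index plus one value.
    dense_count = num_nodes * dim
    sparse_count = sum(2 * len(p) for p in sparse)
    print(f"stored numbers: {dense_count} -> {sparse_count}")

    batch = load_batch(sparse, [3, 14, 15], dim)
    print(f"batch shape: {len(batch)} x {len(batch[0])}")
```

The stored index adds overhead to every kept entry, so a scheme like this only pays off when k is much smaller than the feature dimension. How aggressively features can be pruned before accuracy suffers is the trade-off the study reports on.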

Get Citation

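Ma Yuxin (马煜昕), Xu Yinlong (许胤龙), Li Cheng (李诚), Zhong Jin (钟锦). Accelerating Graph Neural Network Training with Feature Data Sparsification. Computer Systems & Applications (计算机系统应用), 2024, 33(1): 245-253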

History
  • Received: March 16, 2023
  • Revised: April 28, 2023
  • Online: November 24, 2023
  • Published: January 05, 2024