本文已被:浏览 500次 下载 1038次
Received:May 10, 2023 Revised:June 14, 2023
Received:May 10, 2023 Revised:June 14, 2023
中文摘要: 图划分算法是分布式图计算系统里的重要组成部分, 它将一个图划分为若干子图以便在分布式系统中运行, 并将子图上的点和边数据及子图上的计算任务分配到各分区. 异质图是现实世界中广泛存在的一种图, 它是指具有多种节点类型或边类型的图, 在针对异质图的计算过程中, 现有的图划分算法对于异质图的处理没有考虑到以下问题: 在图计算过程中, 不同类型的节点和边携带的数据量可能不同; 不同的节点和边类型, 可能会采用不同的处理算法, 其计算时间也会不同. 针对现有图划分方法的不足, 本文提出一种面向异质图的在线图划分算法OGP-HG算法, 并对现有的GraphX图计算引擎进行改进, 将OGP-HG算法在改进后的图计算引擎中实现. 本文提出的OGP-HG算法通过计算节点划分到不同分区上的负载均衡得分和边划分到不同分区上的数据均衡得分, 得到使异质图负载和内存占用均衡的划分结果. 实验表明, 与传统图划分算法相比, 该算法提高异质图计算效率1.05–1.4倍.
Abstract:Graph partitioning algorithm is part of distributed graph computing system. It divides a graph into several subgraphs to run in the distributed system and assigns the vertex and edge data and computing tasks on the subgraphs to each partition. Heterogeneous graph is a kind of graph widely existing in the real world. It is a graph with multiple vertex types or edge types. During the calculation of heterogeneous graphs, the existing graph partition algorithms do not consider the following problems. In graph calculation, different vertex and edge types may carry different data amounts. Meanwhile, different vertex and edge types may adopt different processing algorithms, with various calculation time. Aiming at the shortcomings of the existing graph partitioning methods, this study proposes an online graph partitioning algorithm for heterogeneous graphs, OGP-HG algorithm. Additionally, the existing GraphX graph computing engine is improved to implement the proposed algorithm in the improved graph computing engine. The proposed OGP-HG algorithm calculates the load balance score of vertices divided into different partitions and the data balance score of edges divided into different partitions, thus obtaining the division results of balancing the load and memory occupation of heterogeneous graphs. Experiments show that compared with traditional graph partitioning algorithms, this algorithm improves the computing efficiency of heterogeneous graphs by 1.05–1.4 times.
文章编号: 中图分类号: 文献标志码:
基金项目:中国科学院战略先导C类课题(XDC02030300)
引用文本:
赵新朋,罗雄飞,陈楚依,鄢宝彤,乔颖.面向异质图的在线图划分算法.计算机系统应用,2023,32(12):143-151
ZHAO Xin-Peng,LUO Xiong-Fei,CHEN Chu-Yi,YAN Bao-Tong,QIAO Ying.Online Graph Partitioning Algorithm for Heterogeneous Graphs.COMPUTER SYSTEMS APPLICATIONS,2023,32(12):143-151
赵新朋,罗雄飞,陈楚依,鄢宝彤,乔颖.面向异质图的在线图划分算法.计算机系统应用,2023,32(12):143-151
ZHAO Xin-Peng,LUO Xiong-Fei,CHEN Chu-Yi,YAN Bao-Tong,QIAO Ying.Online Graph Partitioning Algorithm for Heterogeneous Graphs.COMPUTER SYSTEMS APPLICATIONS,2023,32(12):143-151