Electrical, Computer, Software, and Systems Engineering
A number of graph-parallel processing frameworks have been proposed in recent years to address the needs of processing complex, large-scale graph-structured datasets. Although significant performance improvements achieved by those frameworks have been reported, the comparative advantages of each framework over the others have not been fully studied, which impedes the best use of those frameworks for a specific graph computing task and setting. In this work, we conducted a systematic comparison study of parallel processing systems for large-scale graph computations, aiming to reveal the characteristics of those systems when running common graph algorithms on real-world datasets on equal footing. We selected three popular graph-parallel processing frameworks (Giraph, GPS, and GraphLab) for the study and also included a representative general data-parallel computing system, Spark, in the comparison in order to understand how well a general data-parallel system can run graph problems. We applied basic performance metrics measuring speed, resource utilization, and scalability to answer the basic question of which graph-parallel processing platform is better suited for which applications and datasets. Three widely used graph algorithms (clustering coefficient, shortest path length, and PageRank score) were used for benchmarking on the targeted computing systems. We ran those algorithms against three real-world network datasets with diverse characteristics and scales on a research cluster and obtained a number of interesting observations. For instance, all evaluated systems showed poor scalability (i.e., the runtime increased with more computing nodes) on small datasets, likely due to communication overhead. Further, out of the evaluated graph-parallel computing platforms, PowerGraph consistently exhibited better performance than the others.
Journal of Cyber Security and Mobility
Scholarly Commons Citation
Zhao, Y., Yoshigoe, K., Xie, M., Zhou, S., Seker, R., & Bian, J. (2014). Evaluation and Analysis of Distributed Graph-Parallel Processing Frameworks. Journal of Cyber Security and Mobility, 3(3). https://doi.org/10.13052/jcsm2245-1439.333