Gluon-async: A bulk-asynchronous system for distributed and heterogeneous graph analytics

Venkata Nandivada

doi:10.1109/PACT.2019.00010

Published in Institute of Electrical and Electronics Engineers Inc.

2019

DOI: 10.1109/PACT.2019.00010

Volume: 2019-September

Pages: 15 - 28

Abstract

Distributed graph analytics systems for CPUs, like D-Galois and Gemini, and for GPUs, like D-IrGL and Lux, use a bulk-synchronous parallel (BSP) programming and execution model. BSP permits bulk-communication and uses large messages which are supported efficiently by current message transport layers, but bulk-synchronization can exacerbate the performance impact of load imbalance because a round cannot be completed until every host has completed that round. Asynchronous distributed graph analytics systems circumvent this problem by permitting hosts to make progress at their own pace, but existing systems either use global locks and send small messages or send large messages but do not support general partitioning policies such as vertex-cuts. Consequently, they perform substantially worse than bulk-synchronous systems. Moreover, none of their programming or execution models can be easily adapted for heterogeneous devices like GPUs. In this paper, we design and implement a lock-free, non-blocking, bulk-asynchronous runtime called Gluon-Async for distributed and heterogeneous graph analytics. The runtime supports any partitioning policy and uses bulk-communication. We present the bulk-asynchronous parallel (BASP) model which allows the programmer to utilize the runtime by specifying only the abstract communication required. Applications written in this model are compared with the BSP programs written using (1) D-Galois and D-IrGL, the state-of-the-art distributed graph analytics systems (which are bulk-synchronous) for CPUs and GPUs, respectively, and (2) Lux, another (bulk-synchronous) distributed GPU graph analytical system. Our evaluation shows that programs written using BASP-style execution are on average ~1.5x faster than those in D-Galois and D-IrGL on real-world large-diameter graphs at scale. They are also on average ~12x faster than Lux. To the best of our knowledge, Gluon-Async is the first asynchronous distributed GPU graph analytics system. © 2019 IEEE.
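
The abstract's point about load imbalance under BSP can be illustrated with a toy cost model. The sketch below is not taken from the paper and is not Gluon-Async code: the function names and the `work` table of per-host, per-round compute times are hypothetical, and the model ignores communication cost, data dependencies between hosts, and the fact that asynchronous execution may change how many rounds are needed.

```python
# Toy cost model (an illustration only, not the Gluon-Async runtime).
# work[h][r] is a hypothetical compute time for host h in round r.

def bsp_makespan(work):
    """With a barrier after every round, each round costs as much as its slowest host."""
    rounds = len(work[0])
    return sum(max(host[r] for host in work) for r in range(rounds))


def async_makespan(work):
    """Idealized barrier-free bound: every host runs its own rounds back to back."""
    return max(sum(host) for host in work)


work = [
    [4, 1, 1],  # host 0 is the straggler in round 0
    [1, 4, 1],  # host 1 is the straggler in round 1
    [1, 1, 4],  # host 2 is the straggler in round 2
]

print(bsp_makespan(work))    # 12: every round waits for a different straggler
print(async_makespan(work))  # 6: no host waits for the others
```

In this contrived table each host does the same total work, yet the barrier-based schedule takes twice as long because a different host is slowest in each round. The speedups reported in the abstract come from the authors' measurements, not from a model of this kind.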

Topics: Execution model (57%), Asynchronous system (53%) and Asynchronous communication (53%)

About the journal

Journal	Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
Publisher	Institute of Electrical and Electronics Engineers Inc.
ISSN	1089-795X
Open Access	No

Authors (1)

Venkata Nandivada
- Department of Computer Science and Engineering

Concepts (14)

Application programs
Iridium compounds
Locks (fasteners)
Lutetium compounds
Program processors
Asynchronous parallel
BSP model
Bulk synchronous parallel
Design and implements
Distributed and heterogeneous
Graph analytics
Heterogeneous devices
Heterogeneous graph
Parallel architectures
