Publication
Systems Group Master's Thesis, no. 167; Department of Computer Science, April 2017
Supervised by: Prof. Torsten Hoefler
Supervised by: Prof. Torsten Hoefler
Tapping the potential of emerging massively parallel architectures requires
a thorough, systematic understanding of application performance.
Graph processing, which lies at the heart of applications such
as social network analysis, combinatorial algorithms, and machine
learning, is especially challenging to model and reason about due to
the rich structure of graphs. There is a need for algorithms, software
abstractions, and models of computation that would enable robust, predictable
performance on today’s petascale and future exascale systems.
To address these needs, we use a precise yet practical set of models to
survey the existing approaches and their limitations. Based on our insights,
we propose that performance in graph processing can be largely
captured by analyzing the graph representation, and suggest how to
do so. We show how this paradigm is useful in designing and analyzing
new algorithms, using triangle counting as a case study. Moreover,
we present a case study of representations and design choices
for iterative algorithms, demonstrating how our approach enables wellgrounded,
scientific decisions. Finally, we design parameter-oblivious,
communication-avoiding algorithms for global minimum cuts and connected
components that builds on our insights. Experimental evidence
confirms that the proposed techniques are practical: Our implementations
scale up to thousands of cores and outperform comparable existing
codes by up to three orders of magnitude. The representationcentric
approach thus offers a useful and broadly applicable paradigm
that both algorithm designers and developers can build on.
@mastersthesis{abc, abstract = {Tapping the potential of emerging massively parallel architectures requires a thorough, systematic understanding of application performance. Graph processing, which lies at the heart of applications such as social network analysis, combinatorial algorithms, and machine learning, is especially challenging to model and reason about due to the rich structure of graphs. There is a need for algorithms, software abstractions, and models of computation that would enable robust, predictable performance on today{\textquoteright}s petascale and future exascale systems. To address these needs, we use a precise yet practical set of models to survey the existing approaches and their limitations. Based on our insights, we propose that performance in graph processing can be largely captured by analyzing the graph representation, and suggest how to do so. We show how this paradigm is useful in designing and analyzing new algorithms, using triangle counting as a case study. Moreover, we present a case study of representations and design choices for iterative algorithms, demonstrating how our approach enables wellgrounded, scientific decisions. Finally, we design parameter-oblivious, communication-avoiding algorithms for global minimum cuts and connected components that builds on our insights. Experimental evidence confirms that the proposed techniques are practical: Our implementations scale up to thousands of cores and outperform comparable existing codes by up to three orders of magnitude. The representationcentric approach thus offers a useful and broadly applicable paradigm that both algorithm designers and developers can build on.}, author = {Pavel Kalvoda}, school = {167}, title = {Representation-Centric Graph Processing}, year = {2017} }