7th International Conference on Collaborative Computing: Networking, Applications and Worksharing

Research Article

Efficient Shortest Paths on Massive Social Graphs

Download703 downloads
  • @INPROCEEDINGS{10.4108/icst.collaboratecom.2011.247162,
        author={Xiaohan Zhao and Alessandra Sala and Haitao Zheng and Ben Zhao},
        title={Efficient Shortest Paths on Massive Social Graphs},
        proceedings={7th International Conference on Collaborative Computing: Networking, Applications and Worksharing},
        publisher={IEEE},
        proceedings_a={COLLABORATECOM},
        year={2012},
        month={4},
        keywords={online social network graph coordinate system},
        doi={10.4108/icst.collaboratecom.2011.247162}
    }
    
  • Xiaohan Zhao
    Alessandra Sala
    Haitao Zheng
    Ben Zhao
    Year: 2012
    Efficient Shortest Paths on Massive Social Graphs
    COLLABORATECOM
    ICST
    DOI: 10.4108/icst.collaboratecom.2011.247162
Xiaohan Zhao1,*, Alessandra Sala1, Haitao Zheng1, Ben Zhao1
  • 1: UC Santa Barbara
*Contact email: xiaohanzhao@cs.ucsb.edu

Abstract

Analysis of large networks is a critical component of many of today’s application environments. The arrival of massive network graphs with hundreds of millions of nodes, e.g. social graphs, presents a unique challenge to graph analysis applications. Most of these applications rely on computing distances between node pairs, which for large graphs can take minutes to compute using traditional algorithms such as breadth-first-search (BFS). In this paper, we study ways to enable scalable graph processing for today’s massive networks. We explore the design space of graph coordinate systems, a new approach that accurately approximates node distances in constant time by embedding graphs into coordinate spaces. We show that a hyperbolic embedding produces relatively low distortion error, and propose Rigel, a hyperbolic graph coordinate system that lends itself to efficient parallelization across a compute cluster. Rigel produces significantly more accurate results than prior systems, and is naturally parallelizable across compute clusters, allowing it to provide accurate results for graphs up to 43 million nodes. Finally, we show that Rigel’s functionality can be easily extended to locate (near-) shortest paths between node pairs. After a one- time preprocessing cost, Rigel answers node-distance queries in 10’s of microseconds, and also produces shortest path results up to 18 times faster than prior shortest-path systems with similar levels of accuracy.