1st International ICST Conference on Simulation Tools and Techniques for Communications, Networks and Systems

Research Article

A Framework for End-to-end Simulation of High-performance Computing Systems

  • @INPROCEEDINGS{10.4108/ICST.SIMUTOOLS2008.3034,
        author={Wolfgang E. Denzel and Jian Li and Peter Walker and Yuho Jin},
        title={A Framework for End-to-end Simulation of High-performance Computing Systems},
        booktitle={1st International ICST Conference on Simulation Tools and Techniques for Communications, Networks and Systems},
        publisher={ICST},
        proceedings_a={SIMUTOOLS},
        year={2010},
        month={5},
        keywords={High-performance computing, end-to-end simulation, interconnection network},
        doi={10.4108/ICST.SIMUTOOLS2008.3034}
    }
    
  • Wolfgang E. Denzel
    Jian Li
    Peter Walker
    Yuho Jin
    Year: 2010
    A Framework for End-to-end Simulation of High-performance Computing Systems
    SIMUTOOLS
    ICST
    DOI: 10.4108/ICST.SIMUTOOLS2008.3034
Wolfgang E. Denzel1,*, Jian Li2,*, Peter Walker3,*, Yuho Jin4,*
  • 1: IBM Zurich Research Laboratory, Säumerstrasse 4 8803 Rüschlikon, Switzerland +41 44 724 8516
  • 2: IBM Austin Research Laboratory, 11501 Burnet Road 904/6C-018 Austin, TX 78758, USA +1 512 838 8285
  • 3: Open Grid Computing, Inc. 4030 W. Braker Ln. STE 130 Austin, TX 78759, USA +1 512 343 9196
  • 4: Computer Science Department, Texas A&M University, College Station, TX 77843-3112, USA +1 979 845 5439
*Contact email: wde@zurich.ibm.com, jianli@us.ibm.com, peter@vircion.com, yuho@cs.tamu.edu

Abstract

We present an end-to-end simulation framework capable of simulating High-Performance Computing (HPC) systems with hundreds of thousands of interconnected processors. The tool, which we refer to as MARS (MPI Application Replay network Simulator), applies discrete event simulation and is driven by real-world application traces. It maintains a reasonable level of simulation detail for the processors in general and for the interconnection network in particular. Among other things, it features several network topologies, flexible routing schemes, arbitrary application task placement, point-to-point statistics collection, and data visualization. With a few case studies, we demonstrate the usefulness of this tool for assisting high-level system design as well as for performance projection and application tuning of future HPC systems.
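
To make the idea of trace-driven discrete event simulation concrete, here is a minimal Python sketch. It is not the MARS implementation and does not reflect its interfaces. The trace record layout, the `replay` function, the fixed latency, and the per-source injection bandwidth are all assumptions chosen for illustration. The sketch replays send records in timestamp order, serializes messages leaving the same source, and accumulates point-to-point byte counts.

```python
import heapq
from collections import defaultdict

# Hypothetical trace records: (issue_time_us, src_rank, dst_rank, size_bytes).
TRACE = [
    (0.0, 0, 1, 1000),
    (0.0, 0, 2, 2000),
    (1.0, 1, 2, 500),
]

LATENCY_US = 2.0        # assumed fixed network latency per message
BYTES_PER_US = 1000.0   # assumed injection bandwidth per source


def replay(trace, latency_us=LATENCY_US, bytes_per_us=BYTES_PER_US):
    """Replay send records; return (end_time_us, bytes per (src, dst) pair)."""
    events = []  # min-heap of (time, seq, kind, payload); seq breaks ties
    seq = 0
    for t, src, dst, size in trace:
        heapq.heappush(events, (t, seq, "send", (src, dst, size)))
        seq += 1

    port_free_at = defaultdict(float)  # time each source's injection port frees up
    pair_bytes = defaultdict(int)      # point-to-point statistics
    now = 0.0
    while events:
        now, _, kind, (src, dst, size) = heapq.heappop(events)
        if kind == "send":
            # A message waits until the source's earlier messages have left.
            start = max(now, port_free_at[src])
            done = start + size / bytes_per_us
            port_free_at[src] = done
            heapq.heappush(events, (done + latency_us, seq, "recv", (src, dst, size)))
            seq += 1
        else:  # "recv": the message has arrived at its destination
            pair_bytes[(src, dst)] += size
    return now, dict(pair_bytes)


if __name__ == "__main__":
    end_time, stats = replay(TRACE)
    print(end_time, stats)
```

A full simulator of the kind described in the abstract would replace the single latency constant with a topology and routing model and would map trace ranks to nodes according to a task placement; none of that is attempted here.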