Research Article
WARPP: a toolkit for simulating high-performance parallel scientific codes
@INPROCEEDINGS{10.4108/ICST.SIMUTOOLS2009.5753,
  author={S.D. Hammond and G.R. Mudalige and J.A. Smith and S.A. Jarvis and J.A. Herdman and A. Vadgama},
  title={WARPP: a toolkit for simulating high-performance parallel scientific codes},
  proceedings={2nd International ICST Conference on Simulation Tools and Techniques},
  publisher={ICST},
  proceedings_a={SIMUTOOLS},
  year={2010},
  month={5},
  keywords={Application Performance Modelling Simulation High Performance Computing},
  doi={10.4108/ICST.SIMUTOOLS2009.5753}
}
Authors: S.D. Hammond, G.R. Mudalige, J.A. Smith, S.A. Jarvis, J.A. Herdman, A. Vadgama
Year: 2010
Conference: SIMUTOOLS
Publisher: ICST
DOI: 10.4108/ICST.SIMUTOOLS2009.5753
Abstract
There are a number of challenges facing the High Performance Computing (HPC) community, including increasing levels of concurrency (threads, cores, nodes), deeper and more complex memory hierarchies (register, cache, disk, network), mixed hardware sets (CPUs and GPUs) and increasing scale (tens or hundreds of thousands of processing elements). Assessing the performance of complex scientific applications on specialised high-performance computing architectures is difficult. In many cases, traditional computer benchmarking is insufficient as it typically requires access to physical machines of equivalent (or similar) specification and rarely relates to the potential capability of an application. A technique known as application performance modelling addresses many of these additional requirements. Modelling allows future architectures and/or applications to be explored in a mathematical or simulated setting, thus enabling hypothetical questions relating to the configuration of a potential future architecture to be assessed in terms of its impact on key scientific codes.
This paper describes the Warwick Performance Prediction (WARPP) simulator, which is used to construct application performance models for complex industry-strength parallel scientific codes executing on thousands of processing cores. The capability and accuracy of the simulator are demonstrated through its application to a scientific benchmark developed by the United Kingdom Atomic Weapons Establishment (AWE). The results of the simulations are validated on two different HPC architectures, with each case demonstrating greater than 90% accuracy for run-time prediction. Simulation results, collected from runs on a standard PC, are provided for up to 65,000 processor cores. It is also shown how the addition of operating system jitter to the simulator can improve the quality of the application performance model results.
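As a rough illustration of the two ideas in the abstract, the sketch below computes a run-time prediction accuracy and adds random delays to per-step compute times. It is not WARPP code. The accuracy definition (one minus relative error), the exponential delay model, and the names `prediction_accuracy` and `add_jitter` are all assumptions made for this example; the abstract does not specify how the paper defines accuracy or models jitter.

```python
import random


def prediction_accuracy(predicted_s, measured_s):
    """Accuracy as 1 - relative error (an assumed definition, not taken from the paper)."""
    if measured_s <= 0:
        raise ValueError("measured run time must be positive")
    return 1.0 - abs(predicted_s - measured_s) / measured_s


def add_jitter(compute_times_s, mean_delay_s, probability, seed=0):
    """Return a copy of the compute times with random delays added.

    Hypothetical noise model: each step is delayed with the given
    probability, and the delay is drawn from an exponential distribution
    with mean `mean_delay_s`.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    jittered = []
    for t in compute_times_s:
        delay = 0.0
        if rng.random() < probability:
            delay = rng.expovariate(1.0 / mean_delay_s)
        jittered.append(t + delay)
    return jittered


# Example: a prediction of 95 s against a measured 100 s run.
accuracy = prediction_accuracy(95.0, 100.0)

# Example: perturb four 10 ms compute steps with occasional 1 ms mean delays.
steps = [0.010, 0.010, 0.010, 0.010]
noisy_steps = add_jitter(steps, mean_delay_s=0.001, probability=0.5)
```

Under this assumed definition, a prediction within 10% of the measured run time corresponds to the "greater than 90% accuracy" threshold. Delays are only ever added, so the simulated total run time never falls below the jitter-free total.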