A visual network analysis method for large-scale parallel I/O systems

Carmen Sigovan, Chris Muelder, Kwan-Liu Ma, Jason Cope, Kamil Iskra, Robert Ross

Research output: Contribution to conference, Paper, peer-review

14 Scopus citations

Abstract

Parallel applications rely on I/O to load data, store end results, and protect partial results from being lost to system failure. Parallel I/O performance thus has a direct and significant impact on application performance. Because supercomputer I/O systems are large and complex, one cannot directly analyze their activity traces. While several visual or automated analysis tools for large-scale HPC log data exist, analysis research in the high-performance computing field is geared toward computation performance rather than I/O performance. Additionally, existing methods usually do not capture the network characteristics of HPC I/O systems. We present a visual analysis method for I/O trace data that takes into account the fact that HPC I/O systems can be represented as networks. We illustrate performance metrics in a way that facilitates the identification of abnormal behavior or performance problems. We demonstrate our approach on I/O traces collected from existing systems at different scales.

Original language: English (US)
Pages: 308-319
Number of pages: 12
State: Published - Oct 7 2013
Event: 27th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2013 - Boston, MA, United States
Duration: May 20 2013 to May 24 2013

Keywords

  • Graph
  • Parallel I/O
  • Performance Analysis
  • Visualization

ASJC Scopus subject areas

  • Software
