Toward reliable validation of HPC network simulation models

Misbah Mubarak, Nikhil Jain, Jens Domke, Noah Wolfe, Caitlin Ross, Kelvin Li, Abhinav Bhatele, Christopher D. Carothers, Kwan-Liu Ma, Robert B. Ross

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

While the high performance computing (HPC) community is relying on simulations increasingly to co-design and optimize HPC interconnects, the simulation community lacks a coherent set of practices to be followed when validating the simulators and network models. Validation of HPC network simulation models is a multi-step process starting with the selection of representative communication patterns, configuring the network model, followed by designing the set of experiments, and finally, documenting the outcome for reproducibility. In this paper, we present a set of recommended practices for each of these steps in the validation process. If the recommendations are followed, the end result should be a validated network model that can make reasonably accurate predictions and convince the community about the correctness of the model.

Original language: English (US)
Title of host publication: 2017 Winter Simulation Conference, WSC 2017
Editors: Victor Chan
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 659-674
Number of pages: 16
ISBN (Electronic): 9781538634288
DOIs: https://doi.org/10.1109/WSC.2017.8247823
State: Published - Jan 4 2018
Event: 2017 Winter Simulation Conference, WSC 2017 - Las Vegas, United States
Duration: Dec 3 2017 to Dec 6 2017

Other

Other: 2017 Winter Simulation Conference, WSC 2017
Country: United States
City: Las Vegas
Period: 12/3/17 to 12/6/17

Fingerprint

Network Simulation
Network Model
Simulation Model
High Performance
Computing
Co-design
Reproducibility
Interconnect
Recommendations
Correctness
Simulation
Simulator
Optimise
Simulators
Prediction
Communication
Experiment
Community
Experiments
Model

ASJC Scopus subject areas

  • Software
  • Modeling and Simulation
  • Computer Science Applications

Cite this

Mubarak, M., Jain, N., Domke, J., Wolfe, N., Ross, C., Li, K., ... Ross, R. B. (2018). Toward reliable validation of HPC network simulation models. In V. Chan (Ed.), 2017 Winter Simulation Conference, WSC 2017 (pp. 659-674). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/WSC.2017.8247823

Toward reliable validation of HPC network simulation models. / Mubarak, Misbah; Jain, Nikhil; Domke, Jens; Wolfe, Noah; Ross, Caitlin; Li, Kelvin; Bhatele, Abhinav; Carothers, Christopher D.; Ma, Kwan-Liu; Ross, Robert B.

2017 Winter Simulation Conference, WSC 2017. ed. / Victor Chan. Institute of Electrical and Electronics Engineers Inc., 2018. p. 659-674.

Mubarak, M, Jain, N, Domke, J, Wolfe, N, Ross, C, Li, K, Bhatele, A, Carothers, CD, Ma, K-L & Ross, RB 2018, Toward reliable validation of HPC network simulation models. in V Chan (ed.), 2017 Winter Simulation Conference, WSC 2017. Institute of Electrical and Electronics Engineers Inc., pp. 659-674, 2017 Winter Simulation Conference, WSC 2017, Las Vegas, United States, 12/3/17. https://doi.org/10.1109/WSC.2017.8247823
Mubarak M, Jain N, Domke J, Wolfe N, Ross C, Li K et al. Toward reliable validation of HPC network simulation models. In Chan V, editor, 2017 Winter Simulation Conference, WSC 2017. Institute of Electrical and Electronics Engineers Inc. 2018. p. 659-674 https://doi.org/10.1109/WSC.2017.8247823
Mubarak, Misbah ; Jain, Nikhil ; Domke, Jens ; Wolfe, Noah ; Ross, Caitlin ; Li, Kelvin ; Bhatele, Abhinav ; Carothers, Christopher D. ; Ma, Kwan-Liu ; Ross, Robert B. / Toward reliable validation of HPC network simulation models. 2017 Winter Simulation Conference, WSC 2017. editor / Victor Chan. Institute of Electrical and Electronics Engineers Inc., 2018. pp. 659-674
@inproceedings{0d10409f6acb460daae8a680b79a68fd,
title = "Toward reliable validation of HPC network simulation models",
abstract = "While the high performance computing (HPC) community is relying on simulations increasingly to co-design and optimize HPC interconnects, the simulation community lacks a coherent set of practices to be followed when validating the simulators and network models. Validation of HPC network simulation models is a multi-step process starting with the selection of representative communication patterns, configuring the network model, followed by designing the set of experiments, and finally, documenting the outcome for reproducibility. In this paper, we present a set of recommended practices for each of these steps in the validation process. If the recommendations are followed, the end result should be a validated network model that can make reasonably accurate predictions and convince the community about the correctness of the model.",
author = "Misbah Mubarak and Nikhil Jain and Jens Domke and Noah Wolfe and Caitlin Ross and Kelvin Li and Abhinav Bhatele and Carothers, {Christopher D.} and Kwan-Liu Ma and Ross, {Robert B.}",
year = "2018",
month = "1",
day = "4",
doi = "10.1109/WSC.2017.8247823",
language = "English (US)",
pages = "659--674",
editor = "Victor Chan",
booktitle = "2017 Winter Simulation Conference, WSC 2017",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - GEN

T1 - Toward reliable validation of HPC network simulation models

AU - Mubarak, Misbah

AU - Jain, Nikhil

AU - Domke, Jens

AU - Wolfe, Noah

AU - Ross, Caitlin

AU - Li, Kelvin

AU - Bhatele, Abhinav

AU - Carothers, Christopher D.

AU - Ma, Kwan-Liu

AU - Ross, Robert B.

PY - 2018/1/4

Y1 - 2018/1/4

AB - While the high performance computing (HPC) community is relying on simulations increasingly to co-design and optimize HPC interconnects, the simulation community lacks a coherent set of practices to be followed when validating the simulators and network models. Validation of HPC network simulation models is a multi-step process starting with the selection of representative communication patterns, configuring the network model, followed by designing the set of experiments, and finally, documenting the outcome for reproducibility. In this paper, we present a set of recommended practices for each of these steps in the validation process. If the recommendations are followed, the end result should be a validated network model that can make reasonably accurate predictions and convince the community about the correctness of the model.

UR - http://www.scopus.com/inward/record.url?scp=85044516865&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85044516865&partnerID=8YFLogxK

U2 - 10.1109/WSC.2017.8247823

DO - 10.1109/WSC.2017.8247823

M3 - Conference contribution

SP - 659

EP - 674

BT - 2017 Winter Simulation Conference, WSC 2017

A2 - Chan, Victor

PB - Institute of Electrical and Electronics Engineers Inc.

ER -