Parallel performance prediction using lost cycles analysis

Mark E. Crovella, Thomas J. LeBlanc

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

39 Citations (Scopus)

Abstract

Most performance debugging and tuning of parallel programs is based on the 'measure-modify' approach, which is heavily dependent on detailed measurements of programs during execution. This approach is extremely time-consuming and does not lend itself to predicting performance under varying conditions. Analytic modeling and scalability analysis provide predictive power, but are not widely used in practice, due primarily to their emphasis on asymptotic behavior and the difficulty of developing accurate models that work for real-world programs. In this paper we describe a set of tools for performance tuning of parallel programs that bridges this gap between measurement and modeling. Our approach is based on lost cycles analysis, which involves measurement and modeling of all sources of overhead in a parallel program. We first describe a tool for measuring overheads in parallel programs that we have incorporated into the runtime environment for Fortran programs on the Kendall Square KSR1. We then describe a tool that fits these overhead measurements to analytic forms. We illustrate the use of these tools by analyzing the performance tradeoffs among parallel implementations of a 2D FFT. These examples show how our tools enable programmers to develop accurate performance models of parallel applications without requiring extensive performance modeling expertise.
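The two steps named in the abstract, measuring overhead and fitting it to an analytic form, can be illustrated with a minimal sketch. This is not the authors' tool. It assumes that aggregate overhead is taken as total processor-time minus pure computation time, and that a fit means choosing, by least squares, one scaled basis function from a small hypothetical catalogue. The function names, the catalogue of forms, and the data are all invented for illustration.

```python
import math

def lost_cycles(t_parallel, p, t_compute):
    """Assumed definition: processor-time consumed on p processors
    minus the time spent on pure computation."""
    return p * t_parallel - t_compute

# Hypothetical catalogue of analytic forms; each model is a * g(p).
CANDIDATE_FORMS = {
    "constant": lambda p: 1.0,
    "log p": lambda p: math.log2(p),
    "p": lambda p: float(p),
    "p log p": lambda p: p * math.log2(p),
    "p^2": lambda p: float(p * p),
}

def fit_scale(g, ps, ys):
    """Least-squares coefficient a minimizing sum((y - a * g(p))**2)."""
    num = sum(g(p) * y for p, y in zip(ps, ys))
    den = sum(g(p) ** 2 for p in ps)
    return num / den if den else 0.0

def best_form(ps, ys):
    """Return (name, coefficient, sum of squared errors) of the best fit."""
    best = None
    for name, g in CANDIDATE_FORMS.items():
        a = fit_scale(g, ps, ys)
        sse = sum((y - a * g(p)) ** 2 for p, y in zip(ps, ys))
        if best is None or sse < best[2]:
            best = (name, a, sse)
    return best

# Synthetic overhead samples (not measurements from the paper),
# generated as 3 * p * log2(p) so the fit has a known answer.
procs = [2, 4, 8, 16, 32]
overheads = [3.0 * p * math.log2(p) for p in procs]
form_name, coeff, sse = best_form(procs, overheads)
print(form_name, coeff, sse)
```

Once a form is selected, evaluating it at processor counts that were never measured gives the kind of prediction the abstract refers to. A real tool would fit each overhead category separately and would also model the dependence on problem size.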

Original language: English (US)
Title of host publication: Proceedings of the ACM/IEEE Supercomputing Conference
Editors: Anon
Publisher: IEEE
Pages: 600-609
Number of pages: 10
State: Published - 1994
Externally published: Yes
Event: Proceedings of the 1994 Supercomputing Conference - Washington, DC, USA
Duration: Nov 14 1994 to Nov 18 1994

Other

Other: Proceedings of the 1994 Supercomputing Conference
City: Washington, DC, USA
Period: 11/14/94 to 11/18/94

Fingerprint

Tuning
Fast Fourier transforms
Scalability
Predictive analytics

ASJC Scopus subject areas

  • Electrical and Electronic Engineering

Cite this

Crovella, M. E., & LeBlanc, T. J. (1994). Parallel performance prediction using lost cycles analysis. In Anon (Ed.), Proceedings of the ACM/IEEE Supercomputing Conference (pp. 600-609). IEEE.

@inproceedings{de7d41f80a1641ab977f18fe95c43821,
title = "Parallel performance prediction using lost cycles analysis",
author = "Crovella, {Mark E.} and LeBlanc, {Thomas J.}",
year = "1994",
language = "English (US)",
pages = "600--609",
editor = "Anon",
booktitle = "Proceedings of the ACM/IEEE Supercomputing Conference",
publisher = "IEEE",

}

TY - GEN

T1 - Parallel performance prediction using lost cycles analysis

AU - Crovella, Mark E.

AU - LeBlanc, Thomas J.

PY - 1994

Y1 - 1994

UR - http://www.scopus.com/inward/record.url?scp=0028756159&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0028756159&partnerID=8YFLogxK

M3 - Conference contribution

SP - 600

EP - 609

BT - Proceedings of the ACM/IEEE Supercomputing Conference

A2 - Anon

PB - IEEE

ER -