dc.contributor.author Mellor-Crummey, John
dc.contributor.author Tallent, Nathan
dc.date.accessioned 2017-08-02T22:03:06Z
dc.date.available 2017-08-02T22:03:06Z
dc.date.issued 2008-10-13
dc.identifier.uri https://hdl.handle.net/1911/96368
dc.description.abstract Understanding why the performance of a multithreaded program does not improve linearly with the number of cores in a shared-memory node populated with one or more multicore processors is a problem of growing practical importance. This paper makes three contributions to performance analysis of multithreaded programs. First, we describe how to measure and attribute parallel idleness, namely, where threads are stalled and unable to work. This technique applies broadly to programming models ranging from explicit threading (e.g., Pthreads) to higher-level models such as Cilk and OpenMP. Second, we describe how to measure and attribute parallel overhead, namely, when a thread is performing miscellaneous work other than executing the user’s computation. By employing a combination of compiler support and post-mortem analysis, we incur no measurement cost beyond normal profiling to glean this information. Using idleness and overhead metrics enables one to pinpoint areas of an application where concurrency should be increased (to reduce idleness), decreased (to reduce overhead), or where the present parallelization is hopeless (where idleness and overhead are both high). Third, we describe how to measure and attribute arbitrary performance metrics for high-level multithreaded programming models, such as Cilk. This requires bridging the gap between the expression of logical concurrency in programs and its realization at run-time as it is adaptively partitioned and scheduled onto a pool of threads. We have prototyped these ideas in the context of Rice University’s HPCTOOLKIT performance tools. We describe our approach, implementation, and experiences applying this approach to measure and attribute work, idleness, and overhead in executions of Cilk programs.
dc.format.extent 10 pp
dc.language.iso eng
dc.rights You are granted permission for the noncommercial reproduction, distribution, display, and performance of this technical report in any format, but this permission is only for a period of forty-five (45) days from the most recent time that you verified that this technical report is still available from the Computer Science Department of Rice University under terms that include this permission. All other rights are reserved by the author(s).
dc.title Effective Performance Measurement and Analysis of Multithreaded Applications
dc.type Technical report
dc.date.note October 13, 2008
dc.identifier.digital TR08-05
dc.type.dcmi Text
dc.identifier.citation Mellor-Crummey, John and Tallent, Nathan. "Effective Performance Measurement and Analysis of Multithreaded Applications." (2008) https://hdl.handle.net/1911/96368.

