The Impact of Instruction-Level Parallelism on Multiprocessor Performance and Simulation Methodology
Pai, Vijay S.
instruction-level parallelism; shared-memory multiprocessors; performance evaluation
Current microprocessors exploit high levels of instruction-level parallelism (ILP). This thesis presents the first detailed analysis of the impact of such processors on shared-memory multiprocessors. We find that ILP techniques substantially reduce CPU time in multiprocessors, but are less effective in reducing memory stall time for our applications. Consequently, despite the latency-tolerating techniques incorporated in ILP processors, memory stall time becomes a large component of execution time, and parallel efficiencies are generally poorer in our ILP-based multiprocessor than in an otherwise equivalent previous-generation multiprocessor. We identify clustering independent read misses together in the processor instruction window as a key optimization to exploit the ILP features of current processors. We also use the above analysis to examine the validity of using direct-execution simulators with previous-generation processor models to approximate ILP-based multiprocessors. We find that, with appropriate approximations, such simulators can reasonably characterize the behavior of applications with poor overlap of read misses. However, they can be highly inaccurate for applications with high overlap of read misses.