Compiler support for software prefetching
Due to the growing disparity between processor speed and main memory speed, techniques that improve cache utilization and hide memory latency are often needed to help applications achieve peak performance. Compiler-directed ...
Interprocedural pointer analysis for C
Many powerful code optimization techniques rely on accurate information connecting the definitions and uses of values in a program. This information is difficult to produce for programs written with pointer-based languages ...
Improved software pipelining for superscalar architectures
Although instruction scheduling is an NP-complete problem (27), many techniques have been developed to improve pipelining efficiency. Among them, several were proposed for VLIW machines, and were shown to be efficient ...
Accelerating the Arnoldi iteration: Theory and practice
The Arnoldi iteration is widely used to compute a few eigenvalues of a large sparse or structured matrix. However, the method may suffer from slow convergence when the desired eigenvalues are not dominant or well separated. ...
Efficient runtime support for cluster-based distributed shared memory multiprocessors
Distributed shared memory (DSM) systems provide a shared memory programming paradigm on top of a physically distributed network of computers. The DSM system removes the necessity for programmers to move data explicitly ...