Runtime Technologies of High Performance Parallel Computing
Doctor of Philosophy
Due to power constraints, future growth in computing capability must explicitly leverage parallelism in software to effectively exploit the hardware parallelism found in both distributed- and shared-memory systems. The past decades have seen vast improvements in the performance of key building blocks of parallel computing, including communication runtime systems, runtime schedulers, and concurrent data structures. In the pursuit of high performance, however, these building blocks compromise on other desirable properties, such as applicability and interoperability. An applicability problem restricts the range of environments in which a parallel algorithm can be used. An interoperability problem prevents a parallel library from interacting freely with legacy or serial code, which poses an obstacle to the incremental adoption of new parallel libraries. In this thesis, I investigate the issues of applicability and interoperability in three key building blocks of parallel computing: a communication runtime for partitioned global address space languages, a work-stealing runtime scheduler, and a concurrent FIFO queue. I demonstrate that these high-performance building blocks of parallel software can be made fully interoperable with legacy or serial code and applicable in a broader range of environments, while yielding equal or better performance.