Performance Optimizations for Software Transactional Memory
Mellor-Crummey, John; Scherer, William N., III
Doctor of Philosophy
The transition from single-core processors to multi-core processors demands a change from sequential programming to concurrent programming for mainstream programmers. However, concurrent programming has long been widely recognized as being notoriously difficult. A major reason for its difficulty is that existing concurrent programming constructs provide low-level programming abstractions. Using these constructs forces programmers to consider many low level details. Locks, the dominant programming construct for mutual exclusion, suffer several well known problems, such as deadlock, priority inversion, and convoying, and are directly related to the difficulty of concurrent programming. The alternative to locks, i.e. non-blocking programming, not only is extremely error-prone, but also does not produce consistently good performance. Better programming constructs are critical to reduce the complexity of concurrent programming, increase productivity, and expose the computing power in multi-core processors. Transactional memory has emerged recently as one promising programming construct for supporting atomic operations on shared data. By eliminating the need to consider a huge number of possible interactions among concurrent transactions, Transactional memory greatly reduces the complexity of concurrent programming and vastly improves programming productivity. Software transactional memory systems implement a transactional memory abstraction in software. Unfortunately, existing designs of Software Transactional Memory systems incur significant performance overhead that could potentially prevent it from being widely used. Reducing STM's overhead will be critical for mainstream programmers to improve productivity while not suffering performance degradation. My thesis is that the performance of STM can be significantly improved by intelligently designing validation and commit protocols, by designing the time base, and by incorporating application-specific knowledge. I present four novel techniques for improving performance of STM systems to support my thesis. First, I propose a time-based STM system based on a runtime tuning strategy that is able to deliver performance equal to or better than existing strategies. Second, I present several novel commit phase designs and evaluate their performance. Then I propose a new STM programming interface extension that enables transaction optimizations using fast shared memory reads while maintaining transaction composability. Next, I present a distributed time base design that outperforms existing time base designs for certain types of STM applications. Finally, I propose a novel programming construct to support multi-place isolation. Experimental results show the techniques presented here can significantly improve the STM performance. We expect these techniques to help STM be accepted by more programmers.
Applied sciences; Transactional memory; Parallel computing; Concurrent programming; Computer science