Efficient Distributed Shared Memory Based on Multi-Protocol Release Consistency
Carter, John B.
A distributed shared memory (DSM) system allows shared memory parallel programs to be executed on distributed memory multiprocessors. The challenge in building a DSM system is to achieve good performance over a wide range of shared memory programs without requiring extensive modifications to the source code. The performance challenge translates into reducing the amount of communication performed by the DSM system to that performed by an equivalent message passing program. This thesis describes four novel techniques for reducing the communication overhead of DSM: (i) the use of software release consistency, (ii) support for multiple consistency protocols, (iii) a multiple writer protocol, and (iv) an update timeout mechanism. Release consistency allows modifications of shared data to be handled via a delayed update queue, which masks network latencies. Providing multiple consistency protocols allows each shared variable to be kept consistent using a protocol well-suited to the way it is accessed. A multiple writer protocol addresses the problem of false sharing by reducing the amount of unnecessary communication performed to keep falsely shared data consistent. The update timeout mechanism reduces the impact of updates to stale data. These techniques have been implemented in the Munin DSM system. The impact of these features is evaluated by comparing the performance of a collection of shared memory programs running under Munin with equivalent message passing and conventional DSM programs. Over half of the shared memory programs achieved at least 95% of the speedup of their message passing equivalents. For the other programs, the performance bottlenecks were removed via minor program modifications. Furthermore, Munin programs achieved from 25% to over 100% higher speedups than equivalent conventional DSM programs when there was a high degree of sharing. The results indicate that DSM can be a viable alternative to message passing if the amount of unnecessary communication is minimized.
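As a rough illustration of how a delayed update queue and a multiple writer protocol can cut communication, the following minimal Python sketch buffers writes locally and sends them as a single update per peer at release time. It is not Munin's implementation: the `Node` class, its methods, and the in-process "message" delivery are all hypothetical, and the sketch records modified words explicitly rather than modeling how a real DSM system would detect them.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical DSM node holding a local copy of one shared page."""
    name: str
    page: list
    # Delayed update queue (assumed structure): word index -> new value.
    pending: dict = field(default_factory=dict)
    messages_sent: int = 0

    def write(self, index, value):
        # Writes before a release touch only local state; nothing is sent yet.
        self.page[index] = value
        self.pending[index] = value

    def release(self, peers):
        # At release, all buffered writes go out as one update per peer.
        if not self.pending:
            return
        update = dict(self.pending)
        for peer in peers:
            peer.apply_update(update)
            self.messages_sent += 1
        self.pending.clear()

    def apply_update(self, update):
        # Merge only the words the remote writer changed, so concurrent
        # writers to different words of the same page do not overwrite
        # each other (the false sharing case).
        for index, value in update.items():
            self.page[index] = value


# Two nodes write to different words of the same page, then release.
a = Node("a", [0, 0, 0, 0])
b = Node("b", [0, 0, 0, 0])
a.write(0, 1)
a.write(1, 2)
b.write(3, 9)
a.release([b])
b.release([a])
```

In this sketch, node `a` performs two writes but sends only one message, and both copies converge to the same contents even though the two nodes modified the same page concurrently.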
This work was also published as a Rice University thesis/dissertation: http://hdl.handle.net/1911/16717