This dissertation presents the design, implementation and evaluation of a physical memory management system that allows applications to transparently benefit from superpages. The benefit consists of fewer TLB misses and the consequent performance improvement, which is shown to be significant.
The size of main memory in workstations has been growing exponentially over the past decade. As a cause or consequence, the working set size of typical applications has been increasing at a similar rate. In contrast, the TLB size has remained small because it is usually fully associative and its access time must be kept low since it is in the critical path to every memory access. As a result, the relative TLB coverage---that is, the fraction of main memory that can be mapped without incurring TLB misses---has decreased by a factor of 100 in the last 10 years.
Because of this disparity, many modern applications incur a large number of TLB misses, degrading performance by as much as 30% to 60%, as opposed to the 4--5% degradation reported in the 1980s or the 5--10% reported in the 1990s.
To increase the TLB coverage without increasing the TLB size, most modern processors support memory pages of large sizes, called superpages. Since each superpage requires only one entry in the TLB to map a large region of memory, superpages can dramatically increase TLB coverage and consequently improve performance.
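To make the notion of relative TLB coverage concrete, the following minimal sketch computes it for a base page size and for a superpage size. The numbers are illustrative assumptions rather than measurements from this work: a hypothetical 128-entry TLB, 1 GiB of main memory, 8 KiB base pages, and 4 MiB superpages. The function names are likewise introduced only for illustration.

```python
# Illustrative sketch: all sizes below are assumed values, not results
# reported in this dissertation.

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def tlb_coverage_bytes(entries: int, page_size: int) -> int:
    """Bytes of memory that can be mapped when every TLB entry holds one page."""
    return entries * page_size


def relative_coverage(entries: int, page_size: int, main_memory: int) -> float:
    """Fraction of main memory reachable without a TLB miss (capped at 1.0)."""
    return min(1.0, tlb_coverage_bytes(entries, page_size) / main_memory)


# Hypothetical machine parameters (assumptions for the example).
TLB_ENTRIES = 128
MAIN_MEMORY = 1 * GIB
BASE_PAGE = 8 * KIB
SUPERPAGE = 4 * MIB

base = relative_coverage(TLB_ENTRIES, BASE_PAGE, MAIN_MEMORY)
large = relative_coverage(TLB_ENTRIES, SUPERPAGE, MAIN_MEMORY)

print(f"base pages: {base:.4%} of main memory covered")
print(f"superpages: {large:.4%} of main memory covered")
print(f"improvement factor: {large / base:.0f}x")
```

Under these assumptions the same number of TLB entries covers a far larger share of memory when each entry maps a superpage. The sketch assumes that every entry can actually hold a superpage, which in practice depends on the allocation and fragmentation issues discussed next.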
However, supporting superpages poses several challenges to the operating system, in terms of superpage allocation, promotion trade-offs, and fragmentation control. This dissertation analyzes these issues and presents a design of an effective superpage management system. An evaluation of the design is conducted through a prototype implementation for the Alpha CPU, showing substantial and sustained performance benefits. The design is then validated and further refined through an implementation for the Itanium processor.
The main contribution of this work is that it offers a complete and practical solution for transparently providing superpages to applications. It is complete because it tackles all the issues and trade-offs in realizing the potential of superpages. It is practical because it can be implemented with localized changes to the memory management subsystem and it minimizes the negative impact that could be observed in pathological cases; it can therefore be readily integrated into any general-purpose operating system.