For many users, the perceived speed of computing depends increasingly on the performance of network server systems, underscoring the need for high-performance servers. Cost-effective, scalable network servers can be built on clusters of commodity components (PCs and LANs) rather than on expensive multiprocessor systems. However, network servers cache files to reduce disk access, and the cluster's physically disjoint memories complicate sharing cached file data. Additionally, the physically disjoint CPUs complicate load balancing. This work examines cache management in scalable network servers at two levels: per-node (local) and cluster-wide (global).
Per-node cache management is addressed by the IO-Lite unified buffering and caching system. Applications and various parts of the operating system currently use incompatible buffering schemes, resulting in unnecessary data copying. For network servers, overall throughput drops for two reasons: copying wastes CPU cycles, and multiple copies of data compete with the file system cache for memory. IO-Lite allows applications, the operating system, the file system, and network code to safely and securely share a single copy of data.
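The idea of sharing a single copy can be illustrated with a minimal sketch. This is not the IO-Lite interface; it only uses Python's built-in `memoryview` to show a response assembled from references to existing read-only buffers instead of from copies. The function name `build_response` and the sample data are hypothetical.

```python
def build_response(header: bytes, cached_file: bytes) -> list:
    """Return the response as a list of views onto the inputs.

    Each memoryview references the original bytes object, so the cached
    file data is not duplicated when the response is assembled.
    """
    return [memoryview(header), memoryview(cached_file)]


# Hypothetical stand-ins for a file held in the cache and an HTTP header.
cached_file = b"x" * 65536
header = b"HTTP/1.0 200 OK\r\n\r\n"

response = build_response(header, cached_file)
total_bytes = sum(len(part) for part in response)

# The second part is a read-only view of the cached data, not a copy.
shares_cache_copy = response[1].obj is cached_file
```

Because `bytes` objects are immutable, every holder of a view sees the same data and none can modify it, which is what makes sharing one copy safe in this toy setting.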
The cluster-wide solution uses a technique called Locality-Aware Request Distribution (LARD) that examines the content of incoming requests to determine which node in a cluster should handle the request. LARD uses the request content to dynamically partition the incoming request stream. This partitioning increases the file cache hit rates on the individual nodes, and it maintains load balance in the cluster.
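A minimal sketch of content-based dispatch in this spirit follows. It is a simplified illustration, not the algorithm as specified in this work: the class name `Dispatcher`, the single `overload_threshold` parameter, and the reassignment rule are assumptions made for the example. Each requested path is mapped to one node so that repeated requests hit that node's cache, and a path is moved to the least-loaded node only when its current node is overloaded.

```python
class Dispatcher:
    """Toy front end that routes requests by content (the requested path)."""

    def __init__(self, num_nodes: int, overload_threshold: int = 4):
        # overload_threshold is a hypothetical limit on active requests per node.
        self.overload_threshold = overload_threshold
        self.active = [0] * num_nodes   # active request count per node
        self.assignment = {}            # requested path -> node index

    def _least_loaded(self) -> int:
        # Ties are broken in favor of the lowest node index.
        return min(range(len(self.active)), key=lambda n: self.active[n])

    def dispatch(self, path: str) -> int:
        """Choose a node for the request and record it as active there."""
        node = self.assignment.get(path)
        # Assign a new path, or move a path away from an overloaded node.
        if node is None or self.active[node] >= self.overload_threshold:
            node = self._least_loaded()
            self.assignment[path] = node
        self.active[node] += 1
        return node

    def complete(self, node: int) -> None:
        """Record that a request on the given node has finished."""
        self.active[node] -= 1


dispatcher = Dispatcher(num_nodes=2)
first_a = dispatcher.dispatch("/a.html")    # new path: goes to the least-loaded node
first_b = dispatcher.dispatch("/b.html")    # new path: goes to the other node
second_a = dispatcher.dispatch("/a.html")   # same path: same node as before
```

In this sketch, locality comes from the path-to-node mapping and load balance comes from the least-loaded choice at assignment and reassignment time.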