Internet services continue to incorporate increasingly bandwidth-intensive applications, including audio and high-quality, feature-length video. As the pace of uniprocessor performance improvements slows, however, network servers can no longer rely on uniprocessor technology to fuel the overall performance improvements necessary for next-generation, high-bandwidth applications. Furthermore, rising per-machine power costs in the datacenter are driving demand for solutions that enable consolidation of multiple servers onto one machine, thus improving overall efficiency. This dissertation presents strategies that improve the efficiency and performance of server I/O using both virtual-machine concurrency and thread concurrency. Contemporary virtual machine monitors (VMMs) aim to improve server efficiency by enabling consolidation of separate, isolated servers onto one physical machine. However, modern VMMs incur heavy device virtualization penalties, ultimately reducing application performance by up to a factor of 3. Contemporary parallelized operating systems aim to improve server performance by exploiting thread parallelism using multiple processors. However, the concurrency and communication models used to implement that parallelism impose significant performance penalties, severely limiting the server's ability to leverage more processors to attain higher performance. This dissertation examines the architectural sources of these inefficiencies and introduces new OS- and VMM-level architectures that greatly reduce them.