Performance Analysis and Configuration Selection for Applications in the Cloud
Ng, T. S. Eugene
Master of Science
Cloud computing is becoming increasingly popular and is widely used in both industry and academia. Making the best use of cloud computing resources is critically important. Default resource configurations provided by cloud platforms are often not tailored to applications. Hardware heterogeneity in cloud platforms such as Amazon EC2 leads to wide variation in performance, which opens an avenue for research into saving cost and improving performance by exploiting that heterogeneity. In this thesis, I conduct exhaustive measurement studies on the Amazon EC2 cloud platform. I characterize the heterogeneity of resources and analyze the suitability of different resource configurations for various applications. Measurement results show significant performance diversity across resource configurations with different virtual machine sizes and different processor types. Diversity in resource capacity is not the only reason for performance diversity; diagnostic measurements reveal that the cloud provider's scheduling policy is also an important factor. Furthermore, I propose a nearest neighbor shortlisting algorithm that selects a configuration leading to superior performance for an application by matching the characteristics of the application with those of known benchmark programs. My experimental evaluations show that the nearest neighbor approach greatly reduces testing overhead, since only the shortlisted top configurations rather than all configurations need to be tested; the method achieves high accuracy because the target application chooses its own configuration via testing. Even without any testing, the nearest neighbor approach obtains a configuration with less than 5% performance loss for 80% of applications.
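The shortlisting idea described above can be illustrated with a minimal sketch. It is not the thesis implementation: the feature vectors, the Euclidean distance metric, the benchmark names, the configuration labels, and the runtimes are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical benchmark profiles: a feature vector (CPU, memory, I/O intensity)
# and a measured runtime in seconds on each candidate configuration.
BENCHMARKS = {
    "cpu_bound": {
        "features": (0.9, 0.1, 0.1),
        "runtime": {"config_a": 120.0, "config_b": 80.0, "config_c": 95.0},
    },
    "io_bound": {
        "features": (0.1, 0.2, 0.9),
        "runtime": {"config_a": 60.0, "config_b": 75.0, "config_c": 50.0},
    },
}


def nearest_benchmark(app_features, benchmarks):
    """Return the name of the benchmark closest to the application.

    Euclidean distance is an assumption; any similarity measure could be used.
    """
    return min(
        benchmarks,
        key=lambda name: math.dist(app_features, benchmarks[name]["features"]),
    )


def shortlist(app_features, benchmarks, k=2):
    """Return the k configurations that ran the nearest benchmark fastest."""
    runtimes = benchmarks[nearest_benchmark(app_features, benchmarks)]["runtime"]
    return sorted(runtimes, key=runtimes.get)[:k]


def select(app_features, benchmarks, run_app=None, k=2):
    """Pick a configuration for the application.

    Without run_app, return the top shortlisted configuration (no testing).
    With run_app (a function mapping a configuration to a measured runtime),
    test only the shortlisted configurations and keep the fastest one.
    """
    candidates = shortlist(app_features, benchmarks, k)
    if run_app is None:
        return candidates[0]
    return min(candidates, key=run_app)
```

Calling `select((0.8, 0.2, 0.2), BENCHMARKS)` picks a configuration without running the application at all, while passing a `run_app` function restricts testing to the shortlisted configurations only, which is where the reduction in testing overhead comes from.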
Cloud Computing; Performance Analysis; Configuration Selection