Compilation Order Matters: Exploring the Structure of the Space of Compilation Sequences Using Randomized Search Algorithms
DateJune 18, 2004
Most modern compilers operate by applying a fixed sequence of code optimizations, called a compilation sequence, to all programs. Compiler writers determine a small set of good, general-purpose, compilation sequences by extensive hand-tuning over particular benchmarks. The compilation sequence makes a significant difference in the quality of the generated code; in particular, we know that a single universal compilation sequence does not produce the best results over all programs. Three questions arise in customizing compilation sequences: (1) What is the incremental benefit of using a customized sequence instead of a universal sequence? (2) What is the average computational cost of constructing a customized sequence? (3) When does the benefit exceed the cost? We present one of the first empirically derived cost-benefit tradeoff curves for custom compilation sequences. These curves are for two randomized sampling algorithms: descent with randomized restarts and genetic algorithms. They demonstrate the dominance of these two methods over simple random sampling in sequence spaces where the probability of finding a good sequence is very low. Further, these curves allow compilers to decide whether custom sequence generation is worthwhile, by explicitly relating the computational effort required to obtain a program-specific sequence to the incremental improvement in quality of code generated by that sequence.