A GLOBAL APPROACH TO DETECTION OF PARALLELISM (COMPILERS)
CALLAHAN, CHARLES DAVID, II
Doctor of Philosophy
Modern computers will increasingly rely on parallelism to achieve high computation rates. Techniques to automatically detect and exploit parallelism have been shown effective for computers with vector capabilities. To employ similar techniques for asynchronous multiprocessor machines, the analysis and transformations used for vectorization must be extended to apply to entire programs rather than single loops. Three subproblems are addressed. A sequential to parallel conversion techniques is presented. This algorithm, called a parallel code generator, is suitable for conversion of entire subroutines to parallel form. This algorithm is shown to be optimal for a restrictive form of the conversion problem. Additional transformations can be added to the basic parallel code generator. Loop interchange is added to the conversion problem but it is shown that finding the optimal solution is then NP-complete. The presence of loop carried dependences results in less efficient parallel code. Loop alignment is a general tool for removing loop carried dependences and improving the effectiveness of the parallel code generator. Loop alignment is hampered by alignment conflicts. A transformation called code replication can be used to break alignment conflicts at the cost of additional computation in the output program. It is shown that minimizing the amount of code replication is NP-hard and potentially exponential. Calls to external routines are also a problem for the parallel code generator. A method is developed to summarize the effects of external routines in such a way as to allow effective dependence analysis to be performed around the call site. The ability to distribute shared arrays over local memories is important to mask long global memory access times. A general algorithm is presented based on data dependence analysis and alignment conflicts. This techniques is integrated into the parallel code generator.