RCC: A compiler for the R language for statistical computing
Master of Science
R is a programming language for statistics that enables users to express computation at a high level of abstraction. Until now, its only implementation has been the R interpreter. Though interpretation is convenient for interactive use, it hampers the performance of computation-intensive programs. This thesis describes the design and implementation of RCC, a compiler that translates R into C to improve performance and enable future optimization. RCC uses the runtime libraries of the open-source R interpreter, combining compiled and interpreted code to achieve a complete translation. Function definitions and control flow in R are translated directly into C, while operations such as dynamic function modification remain interpreted. In its current version, RCC-generated code achieves more than a threefold speedup over the R interpreter. Hand-coded experiments suggest that optimizing the generated code using knowledge about the runtime libraries could improve performance by a factor of 100.