Forward-time population genetics simulation and its applications in the mapping of complex human diseases
Doctor of Philosophy
Forward-time simulation has some unique advantages over the coalescent approach in the study of complex human diseases, but its application has been hindered not only by its inherent inefficiency but also by the lack of a generic simulation program and a suitable simulation framework for sample generation. In this dissertation, I first introduce simuPOP, a forward-time population genetics simulation environment (Chapter 2). This R/S-PLUS-like simulation environment provides a large number of objects and functions to manipulate populations, and a mechanism to evolve populations forward in time. Due to its flexible and extensible design, simuPOP can simulate large and complex evolutionary processes with ease. In Chapter 3, I simulate the evolution of the allelic diversity of human diseases using a forward-time approach. By observing the evolutionary patterns of the allelic diversity of rare and common human diseases, I conclude that the common disease-common variant (CDCV) hypothesis holds and that the phenomenon is caused by transient effects of demography (population expansion). The impact of factors such as demographic models, finite-allele mutation models, and multi-locus selection models on the allelic diversity of human diseases is discussed, along with an example of the impact of interaction between disease susceptibility loci. Chapter 4 is devoted to generating genetic samples using a forward-time approach. Topics discussed include which demographic model to use, when and how to introduce disease mutants, and how to control disease allele frequency in the final generation. The result of this chapter is a simulation framework that allows us to simulate the evolution of complex human diseases and generate samples with designed disease allele frequencies in the final generation. In the next chapter, I generate populations of complex human diseases with different genetic histories. 
Three commonly used gene mapping methods, the Transmission Disequilibrium Test (TDT), the linkage (LOD) method, and the chi-square association test, are applied to affected sib-pair samples ascertained from the simulated populations. By comparing the performance of these three methods on populations with varying past selection pressure on disease-predisposing alleles, varying numbers of disease-causing loci, and varying levels of population structure, I discuss the impact of these factors on the power of gene mapping methods.
Biostatistics; Genetics; Statistics; Public health