Bayesian Methods for the Analysis of Microbiome Data
Wadsworth, W Duncan
Guindani, Michele; Vannucci, Marina
Doctor of Philosophy
Bacteria, archaea, viruses, and fungi are present in large numbers both on and inside our bodies. On average, only one in ten of “our” cells contains human DNA. The other 90% belong to a tremendous diversity of microbes, some of which are fundamentally related to health and disease mechanisms, as documented in numerous recent biomedical studies (Turnbaugh et al., 2009; The Human Microbiome Project, 2012b; Knights et al., 2013; Arpaia et al., 2013; Pickard et al., 2014). Some of these microbes are beneficial while others are detrimental, and, since their abundances are poorly understood, identifying microbes associated with interesting phenotypes is of great importance. However, because of the complexity of these systems and certain characteristics of the data, few appropriate statistical tools are yet available for this task. In this work I describe the basic features of microbiome abundance data and present two new modeling approaches that address some of the challenges this data type presents. The first approach achieves data integration and model selection by associating covariates with microbiome data. The second provides a method of correcting for multiple hypothesis testing, which arises routinely when testing for differential species abundance between experimental or observational conditions. I illustrate the performance of both methods in simulation studies and in applications to freely available datasets. Finally, I discuss their potential in microbiome research and possible future extensions.