Bayesian graphical models for biological network inference
Doctor of Philosophy
In this work, we propose approaches for inferring graphical models in the Bayesian framework. Graphical models, which use a network structure to represent conditional dependencies among random variables, provide a valuable tool for visualizing and understanding the relationships among many variables. However, because these networks are complex systems, they can be difficult to infer from a limited number of observations. Our research focuses on developing methods that incorporate prior information on particular edges or on the model structure to improve the reliability of inference for small to moderate sample sizes. First, we propose an approach to graphical model inference using the Bayesian graphical lasso. Our method places informative priors on the shrinkage parameters specific to each edge. We demonstrate through simulations that this method improves learning of the network structure when relevant prior information is available, and we illustrate the approach by inferring the cellular metabolic network under neuroinflammation. This application highlights the strength of our method: although the number of available samples is fairly small, we are able to draw on rich reference information from publicly available databases describing known metabolic interactions to construct informative priors. Next, we propose a modeling approach for settings where we would like to estimate networks for a collection of possibly related sample groups, where the sample size for each group may be limited. We use a Markov random field prior to link the graphs across groups, and a selection prior to infer which groups have shared network structure. This allows us to encourage common edges across sample groups when supported by the data. We provide simulation studies to illustrate the properties of our method and compare its performance to competing approaches. 
We conclude by demonstrating the use of the proposed method to infer protein networks for various subtypes of acute myeloid leukemia and to infer signaling networks under different experimental perturbations.
Statistics; Graphical models; Bayesian inference; Informative priors; Biological networks