Inferential Methods to Find Differences in Population of Graphical Models with Applications to Functional Connectomics
Baraniuk, Richard G.
Doctor of Philosophy
In many neuroimaging modalities, scientists observe neural activity at distinct units of brain function but seek to study and manipulate functional connectivity, or unobserved latent relationships, between these units. Functional connectivity is commonly described using networks in which nodes correspond to brain locations or regions, electrodes, circuits or neurons, while edges correspond to some notion of statistical dependence. Such network models are increasingly used in clinical neuroimaging, where scientists seek robust network biomarkers to detect specific brain-based disorders, explain underlying disease mechanisms and guide personalized treatment regimes. However, functional connectivity networks are never observed directly but are estimated from complex and noisy data, and as a result, estimated networks are prone to statistical errors. This dissertation shows that failure to account for such statistical errors compromises subsequent inferential analyses to find differences in functional connectivity, and it proposes a new statistical framework that ameliorates these problems, thus improving the reproducibility of functional connectivity studies. Formally, this dissertation identifies a new statistical problem, Population Post-Selection Inference or popPSI, that arises in functional neuroimaging when scientists ask inferential questions such as: How do network metrics differ between a population of unhealthy subjects and healthy controls? How do individual networks vary with symptom severity? To investigate popPSI issues in such questions, we use two-level models to study network differences, specifically employing Gaussian graphical models (GGMs) for functional connectivity.
Whereas standard test statistics do not adequately control type I and type II errors for such models, R^3, our novel methodological approach based on resampling, random penalization and random effects test statistics, addresses the deficiencies of current test statistics employed in neuroimaging. Our framework is general and can be used to test general linear hypotheses of the network at the edge, node or global level. Using extensive simulation studies for a wide variety of sample sizes and network structures, we show that R^3 offers improvements in statistical power and error for various network metrics. Real data case studies reveal that our methods find meaningful and clinically relevant network differences in synesthesia, neurofibromatosis-1 and autism spectrum disorders.