Functional data classification and covariance estimation
Cox, Dennis D.
Doctor of Philosophy thesis
Focusing on the analysis of functional data, the first part of this dissertation proposes three statistical models for functional data classification and applies them to a real problem of cervical pre-cancer diagnosis; the second part discusses covariance estimation for functional data. The functional data classification problem is motivated by the analysis of fluorescence spectroscopy, a type of clinical data used to quantitatively detect early-stage cervical cancer. Three statistical models are proposed for different purposes of the data analysis. The first is a Bayesian probit model with variable selection, which extracts features from the fluorescence spectroscopy data and selects a subset of these features for more accurate classification. The second model, designed for the practical purpose of building a more cost-effective device, is a functional generalized linear model with selection of functional predictors. This model selects a subset of the multiple functional predictors through a logistic regression with a grouped Lasso penalty. The first two models are appropriate for functional data that are not contaminated by random effects. However, in our real data, random effects caused by device artifacts are too significant to be ignored. We therefore introduce the third model, a Bayesian hierarchical model with functional predictor selection, which extends the first two models to these more complex data. Besides retaining high classification accuracy, this model is able to select effective functional predictors while adjusting for the random effects. The second problem addressed in this dissertation is covariance estimation for functional data. We discuss the properties of the covariance operator associated with a Gaussian measure defined on a separable Hilbert space and propose a suitable prior for Bayesian estimation. The limit of the inverse Wishart distribution as the dimension approaches infinity is also discussed.
This research provides a new perspective for covariance estimation in functional data analysis.
Biology; Biostatistics; Mathematics; Statistics