
dc.contributor.advisor Scott, David W.
dc.creator Sung, Hsi Guang
dc.date.accessioned 2009-06-04T07:02:33Z
dc.date.available 2009-06-04T07:02:33Z
dc.date.issued 2004
dc.identifier.citation Sung, Hsi Guang. "Gaussian mixture regression and classification." (2004) Diss., Rice University. https://hdl.handle.net/1911/18710.
dc.identifier.uri https://hdl.handle.net/1911/18710
dc.description.abstract The sparsity of high dimensional data space renders standard nonparametric methods ineffective for multivariate data. A new procedure, Gaussian Mixture Regression (GMR), is developed for multivariate nonlinear regression modeling. GMR has the tight structure of a parametric model, yet still retains the flexibility of a nonparametric method. The key idea of GMR is to construct a sequence of Gaussian mixture models for the joint density of the data, and then derive conditional density and regression functions from each model. Assuming the data are a random sample from the joint pdf f_{X,Y}, we fit a Gaussian kernel density model f̂_{X,Y} and then implement a multivariate extension of the Iterative Pairwise Replacement Algorithm (IPRA) to simplify the initial kernel density. IPRA generates a sequence of Gaussian mixture density models indexed by the number of mixture components K. The corresponding regression function of each density model forms a sequence of regression models which covers a spectrum of regression models of varying flexibility, ranging from approximately the classical linear model (K = 1) to the nonparametric kernel regression estimator (K = n). We use mean squared error and prediction error for selecting K. For binary responses, we extend GMR to fit nonparametric logistic regression models. Applying IPRA for each class density, we obtain two families of mixture density models. The logistic function can then be estimated by the ratio between pairs of members from each family. The result is a family of logistic models indexed by the number of mixtures in each density model. We call this procedure Gaussian Mixture Classification (GMC). For a given GMR or GMC model, forward and backward projection algorithms are implemented to locate the optimal subspaces that minimize information loss. They serve as the model-based dimension reduction techniques for GMR and GMC.
In practice, GMR and GMC offer data analysts a systematic way to determine the appropriate level of model flexibility by choosing the number of components for modeling the underlying pdf. GMC can serve as an alternative or a complement to Mixture Discriminant Analysis (MDA). The uses of GMR and GMC are demonstrated in simulated and real data.
dc.format.extent 157 p.
dc.format.mimetype application/pdf
dc.language.iso eng
dc.subject Statistics
dc.title Gaussian mixture regression and classification
dc.type Thesis
dc.type.material Text
thesis.degree.department Statistics
thesis.degree.discipline Engineering
thesis.degree.grantor Rice University
thesis.degree.level Doctoral
thesis.degree.name Doctor of Philosophy
dc.identifier.callno THESIS STAT. 2004 SUNG
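
The regression step summarized in the abstract — deriving E[Y | X = x] from a fitted Gaussian mixture for the joint density of (X, Y) — can be sketched as below. This is an illustrative sketch only, not the dissertation's implementation: it assumes the mixture parameters (weights, means, covariances) have already been obtained, e.g. by IPRA as in the thesis or by EM, and the function name `gmr_predict` is a hypothetical label introduced here.

```python
import numpy as np

def gmr_predict(x, weights, means, covs):
    """Conditional mean E[Y | X = x] under a Gaussian mixture for (X, Y).

    x          : (d,) predictor vector.
    weights[k] : mixture weight of component k.
    means[k]   : (d+1,) joint mean, with Y as the last coordinate.
    covs[k]    : (d+1, d+1) joint covariance, same coordinate order.

    Illustrative sketch assuming the mixture is already fitted.
    """
    d = len(x)
    resp, cond_means = [], []
    for pi_k, mu, S in zip(weights, means, covs):
        mu_x, mu_y = mu[:d], mu[d]
        Sxx, Sxy = S[:d, :d], S[:d, d]
        Sxx_inv = np.linalg.inv(Sxx)
        diff = x - mu_x
        # marginal Gaussian density of X under component k
        dens = np.exp(-0.5 * diff @ Sxx_inv @ diff) / np.sqrt(
            (2 * np.pi) ** d * np.linalg.det(Sxx))
        resp.append(pi_k * dens)
        # component-wise regression: E[Y | X = x, component k]
        cond_means.append(mu_y + Sxy @ Sxx_inv @ diff)
    resp = np.array(resp)
    resp /= resp.sum()  # posterior component probabilities given x
    return float(resp @ np.array(cond_means))
```

With K = 1 this reduces to the classical linear regression of Y on X, and as K grows the fitted curve becomes increasingly flexible, matching the spectrum the abstract describes.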

