Synthesis of Patient Data to Predict Outcomes
Myers, Risa B
Jermaine, Christopher M
Doctor of Philosophy
Healthcare data is increasingly collected and stored in electronic format, providing access to previously untapped information. At the same time, healthcare costs continue to escalate. Predicting important outcomes such as 30-day mortality or complications from healthcare data, including patient monitoring data and perioperative clinical information, may provide advance warning of issues or identify non-ideal care. This has the potential to lead to improved outcomes or reduced cost. In this research, I describe statistical machine learning models that predict outcomes from clinical data. In particular, I focus on data with a temporal component. First, I describe an autoregressive-ordinal regression (AR-OR) model that reduces time series data to a small set of representative numbers, based on time spent in the states of a hidden Markov model. The AR-OR model is a generative model using Bayesian techniques. This model is used to mimic expert anesthesiologist assessment of surgical vital signs. I correlate the resulting quality labels with key 30-day outcomes and demonstrate that poor surgical vital sign quality is highly correlated with increased post-operative complications. Next, I describe enhancements to the AR-OR model that enable it to predict short-term outcomes from short-duration time series. These improvements adaptively weight values in the time series by recency and allow for concurrent, independent series. I validate these additions by predicting elevated intracranial pressure crises and periods of depressed brain tissue oxygen in traumatic brain injury patients. After this, I focus on perioperative clinical data and describe the Cumulative Perioperative Model. This model illustrates how including time-dependent patient data, such as initial hospital location and post-operative surgical destination, improves the ability to predict 30-day mortality and identify patients at risk. I implement this model using Markov random fields, conditional random fields, and logistic regression. All of these models and approaches demonstrate the ability to predict key outcomes from temporal healthcare data with a high level of accuracy.
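
As a rough illustration of the idea behind the AR-OR model (summarizing a time series by the time spent in hidden states, then mapping that summary to an ordered quality label), the following minimal sketch may help. It is not the thesis implementation: it assumes the hidden-state path has already been decoded, and the function names `state_occupancy` and `ordinal_probs`, the weights, and the cutpoints are all hypothetical values chosen for illustration. The model described above is Bayesian and generative, whereas this sketch uses a plain cumulative-logit ordinal regression with fixed parameters.

```python
import math
from collections import Counter


def state_occupancy(states, n_states):
    """Return the fraction of time steps spent in each hidden state."""
    counts = Counter(states)
    total = len(states)
    return [counts.get(k, 0) / total for k in range(n_states)]


def ordinal_probs(features, weights, cutpoints):
    """Cumulative-logit ordinal regression.

    Returns P(label = j) for each of len(cutpoints) + 1 ordered labels.
    The cutpoints must be in increasing order.
    """
    score = sum(w * x for w, x in zip(weights, features))

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    # P(label <= j) at each cutpoint; the top label has cumulative mass 1.
    cumulative = [sigmoid(c - score) for c in cutpoints] + [1.0]
    # Successive differences of the cumulative masses give per-label probabilities.
    return [cumulative[0]] + [
        cumulative[j] - cumulative[j - 1] for j in range(1, len(cumulative))
    ]


# Hypothetical decoded state path for one patient (3 hidden states).
path = [0, 0, 1, 2, 2, 2, 1, 0, 0, 0]
occupancy = state_occupancy(path, n_states=3)

# Made-up weights and cutpoints, for illustration only.
probs = ordinal_probs(occupancy, weights=[-1.0, 0.5, 2.0], cutpoints=[-0.5, 0.5])
print(occupancy)
print(probs)
```

Here `occupancy` plays the role of the "small set of representative numbers", and `probs` gives a probability for each of three ordered labels (for example, good, fair, or poor vital sign quality).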