Martin Zwick

Degree Name

Doctor of Philosophy (Ph.D.) in Systems Science


Systems Science

Physical Description

1 online resource (xvi, 232 pages)




Legislative reforms aimed at slowing the growth of US healthcare costs focus on achieving greater value, defined as health outcomes achieved per dollar spent. To increase value while payments are shrinking and tied to individual outcomes, healthcare must become better at predicting risks and outcomes.

One way to improve predictions is through better modeling methods. Current models are predominantly based on logistic regression (LR). This project applied Reconstructability Analysis (RA) to data on hip and knee replacement surgery, and considered whether RA could create useful models of outcomes, and whether these models could produce predictions complementary to or even stronger than those of LR models.

RA is a data mining method that searches for relations in data, especially non-linear and higher ordinality relations. It decomposes the frequency distribution of the data into projections; several projections taken together define a model, which is then assessed for statistical significance. The predictive power of the model is expressed as the percent reduction of uncertainty (Shannon entropy) of the dependent variable (the DV) gained by knowing the values of the predictive independent variables (the IVs).
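
As an illustration of the percent reduction of uncertainty described above, the following is a minimal sketch, not the project's actual software. It assumes the simplest case, in which a chosen set of IVs jointly predicts the DV, and it does not cover the general RA case of a model built from several projections. The function names and the record layout (a list of dictionaries) are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def percent_uncertainty_reduction(records, iv_names, dv_name):
    """Percent of the DV's entropy removed by knowing the joint IV state.

    Assumption: the IVs are used jointly (one conditional DV distribution
    per observed combination of IV values).
    """
    n = len(records)
    h_dv = entropy(Counter(r[dv_name] for r in records).values())
    if h_dv == 0:
        return 0.0  # a constant DV has no uncertainty to reduce

    # Count DV values separately within each observed IV state.
    by_state = {}
    for r in records:
        state = tuple(r[name] for name in iv_names)
        by_state.setdefault(state, Counter())[r[dv_name]] += 1

    # Conditional entropy H(DV | IVs): weighted average over IV states.
    h_cond = sum(
        (sum(c.values()) / n) * entropy(c.values()) for c in by_state.values()
    )
    return 100.0 * (h_dv - h_cond) / h_dv
```

When the IV state fully determines the DV, this measure is 100%; when the DV is distributed identically in every IV state, it is 0%.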

Results showed that LR and RA gave the same results for equivalent models, and showed that exploratory RA provided better models than LR. Sixteen RA predictive models were then generated across the four DVs: complications, skilled nursing discharge, readmissions, and total cost. While the first three DVs are nominal, RA generated continuous predictions for cost by calculating expected values. Models included novel comorbidity variables and non-hypothesized interaction terms, and often resulted in substantial reductions in uncertainty.
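
The expected-value step for cost can be sketched as follows. This is an illustrative assumption about how such a calculation might look, not the project's documented procedure: it supposes that cost is discretized into bins, that the model supplies a probability for each bin given a patient's IV state, and that each bin has a representative dollar value (for example, the mean cost within the bin). The function name, bin labels, and figures are hypothetical.

```python
def expected_cost(bin_probabilities, bin_values):
    """Expected cost: sum over bins of P(bin | IV state) * representative value.

    bin_probabilities: dict mapping bin label -> conditional probability
    bin_values:        dict mapping bin label -> representative cost (assumed)
    """
    total_p = sum(bin_probabilities.values())
    if abs(total_p - 1.0) > 1e-6:
        raise ValueError("bin probabilities must sum to 1")
    return sum(p * bin_values[label] for label, p in bin_probabilities.items())


# Hypothetical example for a single IV state.
probs = {"low": 0.5, "medium": 0.3, "high": 0.2}
values = {"low": 10000.0, "medium": 15000.0, "high": 25000.0}
prediction = expected_cost(probs, values)
```

The result is a single continuous figure per IV state, even though the underlying model works with discrete states.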

Predictive variables consisted of both delivery system variables and binary patient comorbidity variables. Complications were predicted by the total number of patient comorbidities. Skilled nursing discharges were predicted both by patient-related factors and delivery system variables (location, surgeon volume), suggesting practice patterns influence utilization of skilled nursing facilities. Readmissions were not well predicted, suggesting the data used in this project lacks the right variables or that readmissions are simply unpredictable. Delivery system variables (surgeon, location, and surgeon volume) were found to be the predominant predictors of total cost.

Risk ratios were generated as an additional measure of effect size. These risk ratios were used to classify the IV states of the models as indicating higher or lower risk of adverse outcomes. Some IV states showed nearly 25% of patients at increased risk, while other IV states showed over 75% of patients at decreased risk. In real time, such risk predictions could support clinical decision making and custom-tailored utilization of services.
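
A risk ratio of this kind can be sketched as below. The abstract does not state the exact definition used, so this example assumes the ratio of the adverse-outcome rate within an IV state to the overall rate across all patients. The function names and the counts are hypothetical.

```python
def risk_ratio(events_in_state, n_in_state, events_total, n_total):
    """Outcome rate within one IV state divided by the overall outcome rate.

    Assumption: the reference group is the full patient population.
    """
    if n_in_state == 0 or n_total == 0 or events_total == 0:
        raise ValueError("rates are undefined for empty groups or zero overall risk")
    return (events_in_state / n_in_state) / (events_total / n_total)


def classify_state(rr):
    """Label an IV state by whether its risk ratio is above or below 1."""
    if rr > 1.0:
        return "higher risk"
    if rr < 1.0:
        return "lower risk"
    return "average risk"


# Hypothetical counts: 10 adverse outcomes among 100 patients in one IV
# state, against 50 adverse outcomes among 1000 patients overall.
rr = risk_ratio(10, 100, 50, 1000)
label = classify_state(rr)
```

A fuller treatment would also attach a confidence interval to each ratio before classifying the state, since small IV states can produce unstable rates.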

Future research might address the limitations of this project’s data and employ additional RA techniques and training-test splits. Implementation of predictive models is also discussed, with considerations for data supply lines, maintenance of models, organizational buy-in, and the acceptance of model output by clinical teams for use in real-time clinical practice.

If outcomes and risk are adequately predicted, areas for potential improvement become clearer, and focused changes can be made to drive improvements in patient care. Better predictions, such as those resulting from the RA methodology, can thus support improvement in value: better outcomes at a lower cost. As reimbursement increasingly evolves into value-based programs, understanding the outcomes achieved, and customizing patient care to reduce unnecessary costs while improving outcomes, will be an active area of work for clinicians, healthcare administrators, researchers, and data scientists for many years to come.
