Document Type
Unpublished Work
Publication Date
2006
Subjects
System design, System analysis, Log-linear models, Data structures (Computer science)
Abstract
A linear regression model was used to predict variations in health status, under the hypothesis that literacy level, as a proxy measure for educational attainment, would correlate strongly with health status. Groups of other independent variables were also introduced into the model to test whether the effect of literacy on health was mediated or amplified by other factors. The linear regression model did not show any strong relation between literacy and health, but pointed to other factors as predictors of health status.
Data mining methods were used to explore all possible, statistically acceptable models for predicting health, in which the relations among variables are not necessarily linear. Even these methods do not suggest a major role for literacy in predicting health status: by itself, literacy is not a good predictor of health, even when a linear association is not assumed.
Reconstructability analysis makes it possible to analyze models in which variables have nonlinear relations, and in this case it could be used to isolate models in which literacy did have predictive power for health status. Together with a healthy behavior variable (frequency of walking) and a measure of occupational status, literacy becomes a meaningful predictor of health, although the model it forms with those variables is not an excellent one: other variables yield models that perform better, in terms of error produced, than the models in which literacy is one of the predictors.
Reconstructability analysis was also used to isolate interaction effects, which were then tested for statistical significance and predictive power in the regression equation. Variables constructed from the information derived from this analysis were introduced into the regression and are statistically significant predictors, improving the variance explained by the model. The percentage of correct predictions and the average error of the model also improve, although the change due to the introduction of the new variables is small.
Keywords: Reconstructability analysis, linear regression, interaction effects, prediction, health status, literacy, education, OCCAM
Rights
© The Authors
Persistent Identifier
https://archives.pdx.edu/ds/psu/42838
Citation Details
Renato Carletti and Martin Zwick (2006). "Combining Linear Regression with Reconstructability Analysis: A Study of Differences in Health Status," unpublished work.
Included in
Life Sciences Commons, Medicine and Health Sciences Commons, Physical Sciences and Mathematics Commons, Social and Behavioral Sciences Commons