Document Type


Publication Date



Data mining, Managed care plans (Medical care) -- Statistical analysis, Least squares, System analysis


Networks are rarely subjected to hypothesis tests for difference, but when they are inferred from datasets of independent observations, statistical testing is feasible. To demonstrate, a healthcare provider network is tested for significant change after an intervention using Medicaid claims data. First, the network is inferred for each time period with (1) partial least squares (PLS) regression and (2) reconstructability analysis (RA). Second, network distance (i.e., change between time periods) is measured as the mean absolute difference in (1) coefficient matrices for PLS and (2) calculated probability distributions for RA. Third, the network distance is compared against a reference distribution to estimate its probability of occurring by chance alone. The reference distribution is created through permutation (randomly swapping observations between the datasets), so that network inference and distance can be measured repeatedly under conditions where the null hypothesis is true. Change in the provider network is found to be statistically significant when measured by either RA or PLS. PLS indicates change among pairs of providers, while RA identifies higher-order relationships among them, i.e., those involving more than two providers.
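
The three steps described above (infer a network per time period, measure the distance between networks, compare that distance against a permutation reference distribution) can be sketched in a few lines. This is only an illustration under stated assumptions, not the presenters' implementation: a pairwise Pearson correlation matrix stands in for network inference and is neither PLS nor RA, and every function name (`infer_network`, `network_distance`, `permutation_test`) is hypothetical. Each dataset is assumed to be a list of independent observations, one row per observation and one column per provider.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def infer_network(rows):
    """Stand-in for network inference (assumption: NOT PLS or RA).

    Returns a k x k matrix of pairwise correlations with a zero diagonal.
    """
    cols = list(zip(*rows))
    k = len(cols)
    return [[0.0 if i == j else pearson(cols[i], cols[j]) for j in range(k)]
            for i in range(k)]


def network_distance(a, b):
    """Mean absolute difference between the off-diagonal entries of two matrices."""
    k = len(a)
    diffs = [abs(a[i][j] - b[i][j]) for i in range(k) for j in range(k) if i != j]
    return sum(diffs) / len(diffs)


def permutation_test(before, after, n_perm=1000, seed=0):
    """Return (observed distance, permutation p-value).

    Observations are pooled and randomly reassigned to the two periods, so
    each permuted distance is a draw from the null hypothesis of no change.
    """
    rng = random.Random(seed)
    observed = network_distance(infer_network(before), infer_network(after))
    pooled = list(before) + list(after)
    n_before = len(before)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        d = network_distance(infer_network(pooled[:n_before]),
                             infer_network(pooled[n_before:]))
        if d >= observed:
            at_least_as_large += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    p_value = (at_least_as_large + 1) / (n_perm + 1)
    return observed, p_value
```

To use a different inference method, only `infer_network` would need to change, provided it still returns something `network_distance` can compare entry by entry. The pooling and reassignment step relies on the observations being independent, which is the condition the abstract names for such a test to be feasible.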


Presented at the 2018 American Statistical Association Conference on Statistical Practice, Portland, Oregon, February 15-17, 2018.

Persistent Identifier