Predicting Opioid Overdoses in Pennsylvania

Nicholas Carolan, Amherst College

Abstract

This project develops models to predict the risk of opioid overdoses across Pennsylvania. At the time of publication, no data-driven county-level or statewide prediction models for opioid overdoses in Pennsylvania are publicly available. I develop machine learning models trained on health and demographic datasets from the CDC, the U.S. Census Bureau, and the Pennsylvania state government. Two distinct models resulted. The first is a regression model that predicts the number of fatal opioid overdoses per 100,000 residents of a county. It uses a feedforward neural network, with demographic features selected through linear regression, to predict death rates for the current year. This model achieves reasonable performance, explaining 47% of the variance in its test set. The second model performs time series analysis on statewide administrations of the overdose-reversal drug naloxone by emergency medical services (EMS). Using a recurrent neural network architecture, it predicts daily statewide naloxone administrations by EMS, albeit with limited performance. A further aim of the project is to provide a single repository for data visualizations, predictions, and links to useful resources relating to Pennsylvania’s opioid crisis; the project website serves as this central hub. The information is aimed at a diverse range of constituent groups, including first responders, local and county governments, state officials, other researchers, and residents of opioid-afflicted areas. First responders and officials may use project resources to inform policy decisions, while researchers can adapt the predictive models to other states or experiment with new model architectures. Finally, residents of Pennsylvania can use this project to become better informed about, and better prepared for, the impact of opioids on their communities.
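To make the "47% of variance explained" figure concrete, the following minimal sketch shows how the coefficient of determination (R²) is conventionally computed from observed and predicted values. It assumes the reported figure is the standard R² on the test set, which the abstract does not state explicitly. The function name `r_squared` and the sample death rates are hypothetical illustrations, not the project's code or data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical county death rates per 100,000 residents (made-up numbers).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
score = r_squared(observed, predicted)
print(f"R^2 = {score:.3f}")
```

Under this definition, a score of 0.47 would mean the model's squared error on the test set is 53% of the squared error obtained by always predicting the mean death rate.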