Published In

Healthcare Analytics

Document Type


Publication Date



Artificial intelligence, Machine learning, Deep learning, Screening, Brief Intervention, and Referral to Treatment (SBIRT), Substance Abuse and Mental Health Services Administration (SAMHSA)


This study trains and validates machine learning and deep learning models to identify patients with risky alcohol and drug misuse in a Screening, Brief Intervention, and Referral to Treatment (SBIRT) program. The observational cohort comprised 6978 adults admitted to three medical facilities in the western region of Alabama between January and December 2019. Data were cleaned and pre-processed using data imputation techniques and a sampling-based data augmentation method. The primary analysis was a multi-class classification of alcohol and drug misuse. The study shows that alcohol and drug use screening instrument scores were identified most accurately by mixed-effects models applied after missing data were imputed with the Generative Adversarial Imputation Networks (GAIN) method and the data were then augmented with the Synthetic Minority Over-sampling TEchnique-Nominal Continuous (SMOTE-NC) method. Although mixed models are commonly employed in studies of electronic health records (EHRs), using GAIN followed by SMOTE-NC for diagnosing alcohol and drug use disorder is novel.
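To make the augmentation step concrete, the following is a minimal sketch of the SMOTE-NC idea for mixed continuous and categorical data, written with the Python standard library only. It is not the authors' code: the function name `smote_nc_sketch`, the column layout, and the toy rows are hypothetical, and the sketch assumes missing values have already been imputed (the GAIN step is not shown). Continuous features are interpolated between a minority row and one of its nearest minority neighbours, and each categorical feature takes the most frequent value among those neighbours.

```python
import random
import statistics
from collections import Counter


def smote_nc_sketch(minority, cat_idx, n_new, k=3, seed=0):
    """Generate n_new synthetic minority rows (simplified SMOTE-NC).

    minority: list of rows (lists) mixing floats and categorical labels.
    cat_idx:  set of column indices that hold categorical values.
    """
    rng = random.Random(seed)
    n_cols = len(minority[0])
    num_idx = [j for j in range(n_cols) if j not in cat_idx]

    # Penalty for a categorical mismatch: the median of the per-column
    # standard deviations of the continuous minority features.
    med = statistics.median(
        [statistics.pstdev([row[j] for row in minority]) for j in num_idx]
    )

    def sq_dist(a, b):
        # Squared Euclidean distance on continuous columns, plus med**2
        # for every categorical column where the two rows disagree.
        d = sum((a[j] - b[j]) ** 2 for j in num_idx)
        d += sum(med ** 2 for j in cat_idx if a[j] != b[j])
        return d

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen base row.
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: sq_dist(base, row),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        new_row = []
        for j in range(n_cols):
            if j in cat_idx:
                # Categorical: most frequent value among the neighbours.
                votes = Counter(row[j] for row in neighbours)
                new_row.append(votes.most_common(1)[0][0])
            else:
                # Continuous: point on the segment from base to mate.
                new_row.append(base[j] + gap * (mate[j] - base[j]))
        synthetic.append(new_row)
    return synthetic


# Toy minority-class rows: [age, screening_score, sex]. Values are made up
# for illustration and are not taken from the study.
minority_rows = [
    [34.0, 21.0, "M"],
    [41.0, 25.0, "M"],
    [29.0, 19.0, "F"],
    [52.0, 27.0, "M"],
    [38.0, 23.0, "F"],
]
synthetic_rows = smote_nc_sketch(minority_rows, cat_idx={2}, n_new=4)
```

In practice a maintained implementation would normally be preferred over a hand-written one; the sketch only illustrates why SMOTE-NC, unlike plain SMOTE, can be applied to records that mix numeric scores with nominal fields.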


© 2022 The Author(s). Published by Elsevier Inc.



Persistent Identifier