PDXScholar - Urban Ecosystem Research Consortium of Portland/Vancouver: Near Real-Time Prediction of Ground PM2.5 Concentrations Across Oregon Using GEOS-CF and Machine Learning Models
 

Presenter(s) Information

Long T. Bui

Start Date

3-17-2025 12:00 AM

End Date

3-17-2025 12:00 AM

Abstract

Despite significant efforts to reduce air pollution, many states in the United States, including Oregon, still face PM2.5 particulate pollution. Because wildfires are frequent in the region, near real-time forecasts of ground-level pollution are essential. Numerous studies have forecast pollution levels in space and time, but most report results only at measured locations and cannot predict concentrations at other sites. This study proposes an approach that combines GEOS-CF with machine learning models to forecast ground-level PM2.5 concentrations across the entire state of Oregon, not just at monitoring locations. Outputs from GEOS-CF, including PM2.5 concentrations at the 985 hPa pressure level and meteorological parameters, are integrated with ground-level PM2.5 observations to build a large dataset. This dataset is used to train and validate six machine learning models: Decision Tree, Random Forest, Gradient Boosting Machine (GBM), XGBoost, CatBoost, and LightGBM. Model accuracy is evaluated using the coefficient of determination (R2), mean absolute error (MAE), and root mean square error (RMSE). The most accurate model is then selected and used to forecast ground-level pollution, which is particularly valuable in rural areas where ground observations are lacking.
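
The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. It uses only the standard textbook definitions of R2, MAE, and RMSE; the function name `regression_metrics` and the sample values are hypothetical and are not taken from the study, whose actual evaluation code and data are not described here.

```python
import math

def regression_metrics(observed, predicted):
    """Return (r2, mae, rmse) for paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are identical")
    r2 = 1.0 - ss_res / ss_tot
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(ss_res / n)
    return r2, mae, rmse

# Made-up PM2.5 values (micrograms per cubic meter), for illustration only.
observed = [5.0, 10.0, 15.0, 20.0]
predicted = [6.0, 9.0, 17.0, 18.0]
r2, mae, rmse = regression_metrics(observed, predicted)
print(f"R2={r2:.3f}  MAE={mae:.3f}  RMSE={rmse:.3f}")
```

A higher R2 and lower MAE and RMSE indicate a closer match to the observations. RMSE penalizes large errors more heavily than MAE, which is presumably why the two are reported together.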

Subjects

Air quality, GIS / modeling, Sustainable development

Persistent Identifier

https://archives.pdx.edu/ds/psu/43091

Creative Commons License

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 License.
