Effects of Data Aggregation (buffer) Techniques on Bicycle Volume Estimation


Researchers and practitioners commonly use a Direct Demand Model (DDM), which relies on link, distance, and buffered variables (e.g., land use), to predict Annual Average Daily Bicycle Traffic (AADBT). Past studies deploy random buffer size combinations to find the best-fit variables for their specific DDMs. However, none of these studies seeks to identify the best buffer types and sizes, and only a few investigate how local characteristics affect buffer type and size selection. Therefore, this study aims to determine the best buffer types and sizes and to evaluate the impact of local characteristics on that selection. To select the preferred buffer type and size, this study tests two buffer types (Network and Euclidean) at seven sizes (0.1, 0.25, 0.50, 0.75, 1.0, 1.50, and 2.0 miles), and their combinations, for six geographies (Portland, Eugene, Bend, Boulder, Charlotte, and Dallas). The study develops a total of 168 cross-validated (10 folds, 5 repeats) generalized and city-specific Poisson regression models using emerging data sources (i.e., Strava, StreetLight) and contextual variables. Results indicate that a generalized model combining Network and Euclidean buffers of multiple sizes provides the best prediction of AADBT, and that Network buffers outperform Euclidean buffers. However, city-specific models with a single buffer type and size sometimes outperform the generalized model. Network density determines the preferred buffer types and sizes. This research will help policymakers and modelers understand the buffer sizes and types needed to extract the variables for constructing a DDM for AADBT estimation.
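
As an illustration of what a buffered variable is, the following minimal sketch aggregates a contextual variable within Euclidean buffers of the seven sizes listed above. It is not the study's implementation: the function name `euclidean_buffer_sums`, the toy coordinates, and the assumption that locations are already expressed in miles in a projected coordinate system are all hypothetical. A Network buffer would replace the straight-line distance with a shortest-path distance along the street network.

```python
import math

# Buffer radii in miles, as listed in the abstract.
BUFFER_SIZES_MI = [0.1, 0.25, 0.50, 0.75, 1.0, 1.50, 2.0]


def euclidean_buffer_sums(site, features, radii=BUFFER_SIZES_MI):
    """Sum feature values inside each straight-line buffer around a count site.

    site:     (x, y) location of the bicycle count site, in miles
              (assumes a projected coordinate system; hypothetical input).
    features: iterable of (x, y, value) tuples, e.g. households per parcel.
    Returns a dict mapping each radius to the total value of all features
    whose Euclidean distance from the site is at most that radius.
    """
    sx, sy = site
    totals = {r: 0.0 for r in radii}
    for fx, fy, value in features:
        dist = math.hypot(fx - sx, fy - sy)
        for r in radii:
            if dist <= r:
                totals[r] += value
    return totals


# Hypothetical toy data: one count site and four land-use features.
site = (0.0, 0.0)
features = [
    (0.05, 0.0, 10.0),   # 0.05 mi from the site
    (0.3, 0.3, 20.0),    # about 0.42 mi from the site
    (1.2, 0.0, 5.0),     # 1.2 mi from the site
    (3.0, 0.0, 100.0),   # 3.0 mi away, outside every buffer
]
totals = euclidean_buffer_sums(site, features)
for radius, total in totals.items():
    print(f"{radius:>4} mi buffer: {total}")
```

Each radius yields one candidate predictor per variable, which is why the choice of buffer type and size changes the set of inputs available to a DDM.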


Copyright © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature