Sponsor
Portland State University. Department of Electrical and Computer Engineering
First Advisor
John Lipor
Term of Graduation
Winter 2026
Date of Publication
3-23-2026
Document Type
Thesis
Degree Name
Master of Science (M.S.) in Electrical and Computer Engineering
Department
Electrical and Computer Engineering
Language
English
Subjects
Agglomerative Clustering, Kullback-Leibler Divergence, Regionalization
Physical Description
1 online resource (vi, 113 pages)
Abstract
Regionalization is a clustering problem that seeks to partition geospatial data into regions that are geographically contiguous and internally homogeneous in their attributes. It has been applied successfully to urban planning, natural resource discovery, and ecological analysis. Popular existing approaches optimize homogeneity using Euclidean or variance-based criteria, which ignore distributional differences such as variance shifts, multimodality, and higher-order dependence. This thesis introduces ARID (Agglomerative Regionalization via Information Divergence), a spatially constrained agglomerative clustering framework that replaces distance-based merging with an information-theoretic, Ward-like criterion based on the Kullback-Leibler (KL) divergence. ARID constructs a neighborhood graph in coordinate space and restricts merges to topological neighbors, which guarantees spatial contiguity throughout the hierarchy.
Because divergence-guided agglomeration depends critically on reliable estimation under the ultra-low sample sizes that occur early in the hierarchy, we benchmark state-of-the-art KL estimators under imbalanced samples across Gaussian and heavy-tailed distributions in both low- and high-dimensional settings, and propose stabilization strategies to mitigate early-stage estimation degeneracy. We then generate synthetic grid datasets designed to emphasize mean, variance, shape, and correlation differences, and show that ARID can recover contiguous regions in cases where Euclidean regionalization degrades, while remaining competitive in settings where Euclidean assumptions are optimal. A final evaluation on a Nevada case study involving data pertinent to geothermal resource exploration further illustrates that divergence-guided agglomeration can yield informative regional structures, with performance competitive with the well-established regionalization algorithms REDCAP, SKATER, and SCHAC.
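To make the idea in the abstract concrete, the following is a minimal sketch of spatially constrained agglomeration driven by a KL-based merge cost. It is not the thesis's implementation. Everything specific here is an assumption chosen for illustration: a single attribute per cell, a Gaussian fit per region with the closed-form Gaussian KL divergence (the thesis benchmarks sample-based KL estimators instead), a size-weighted symmetric cost standing in for the "Ward-like" criterion, a variance floor as a crude stand-in for the stabilization strategies, and the hypothetical function names `merge_cost`, `grid_edges`, and `constrained_agglomerate`.

```python
import numpy as np

def gaussian_kl(m1, v1, m2, v2):
    """Closed-form KL( N(m1, v1) || N(m2, v2) ) for univariate Gaussians."""
    return 0.5 * (np.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)

def merge_cost(xa, xb, var_floor=1e-3):
    """Assumed Ward-like cost: size-weighted symmetric Gaussian KL.

    var_floor keeps the variance positive for one-cell regions, where the
    sample variance is exactly zero (the ultra-low-sample regime).
    """
    ma, mb = xa.mean(), xb.mean()
    va, vb = xa.var() + var_floor, xb.var() + var_floor
    sym_kl = gaussian_kl(ma, va, mb, vb) + gaussian_kl(mb, vb, ma, va)
    na, nb = len(xa), len(xb)
    return na * nb / (na + nb) * sym_kl

def grid_edges(rows, cols):
    """4-neighbor adjacency on a rows x cols grid, cells indexed row-major."""
    edges = set()
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                edges.add((i, i + 1))
            if r + 1 < rows:
                edges.add((i, i + cols))
    return edges

def constrained_agglomerate(values, edges, n_regions):
    """Greedily merge the cheapest pair of adjacent regions.

    Only pairs joined by an edge are candidates, so every region stays
    spatially contiguous at every level of the hierarchy.
    """
    members = {i: [i] for i in range(len(values))}
    edges = set(edges)
    while len(members) > n_regions and edges:
        a, b = min(
            edges,
            key=lambda e: merge_cost(values[members[e[0]]], values[members[e[1]]]),
        )
        members[a].extend(members.pop(b))
        # Redirect edges that touched region b to region a; drop self-loops.
        relabeled = set()
        for u, v in edges:
            u = a if u == b else u
            v = a if v == b else v
            if u != v:
                relabeled.add((min(u, v), max(u, v)))
        edges = relabeled
    labels = np.empty(len(values), dtype=int)
    for k, idx in enumerate(members.values()):
        labels[idx] = k
    return labels

# Toy 4x4 grid: left two columns near 0, right two columns near 10.
rows, cols = 4, 4
col_index = np.arange(rows * cols) % cols
values = np.where(col_index < 2, 0.0, 10.0) + 0.01 * np.arange(rows * cols)
labels = constrained_agglomerate(values, grid_edges(rows, cols), n_regions=2)
print(labels.reshape(rows, cols))
```

On this toy grid the two recovered regions coincide with the left and right halves. Restricting candidates to graph edges is what enforces contiguity; swapping `merge_cost` for a different divergence estimator would leave the rest of the loop unchanged.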
Rights
In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
Persistent Identifier
https://archives.pdx.edu/ds/psu/44591
Recommended Citation
Sills, Joshua David, "ARID: Agglomerative Regionalization via Information Divergence, A Novel Clustering Algorithm for Geo-spatial Data" (2026). Dissertations and Theses. Paper 7019.