Clustering Quality Metrics for Subspace Clustering
L. Balzano was supported by DARPA grant 16-43-D3M-FP-037, NSF CAREER award CCF-1845076, AFOSR YIP award FA9550-19-1-0026, and ARO YIP award W911NF1910027. J. Lipor was supported by National Science Foundation DMS 1624776 and by the U.S. Army Basic Research Program under PE 61102, Project T25, Task 02 “Network Science Initiative,” managed at the U.S. Army ERDC with Portland State University under Cooperative Agreement Number W912HZ-17-2-0005.
We study the problem of clustering validation, i.e., clustering evaluation without knowledge of ground-truth labels, for the increasingly popular framework known as subspace clustering. Existing clustering quality metrics (CQMs) rely heavily on a notion of distance between points, but common metrics fail to capture the geometry of subspace clustering. We propose a novel point-to-point pseudometric for points lying on a union of subspaces and show how this allows for the application of existing CQMs to the subspace clustering problem. We provide theoretical and empirical justification for the proposed point-to-point distance, and then demonstrate on a number of common benchmark datasets that our proposed methods generally outperform existing graph-based CQMs in terms of choosing the best clustering and the number of clusters.
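As a rough illustration of the general idea of feeding a custom point-to-point distance into an existing CQM, the sketch below computes a silhouette score from a precomputed distance matrix. The distance used here (the sine of the angle between the lines spanned by two points) is only an assumed stand-in chosen for this toy example and is not the pseudometric proposed in the paper. The function names `line_distance_matrix` and `silhouette_from_distances` are likewise hypothetical.

```python
import numpy as np


def line_distance_matrix(X):
    """Pairwise sine of the angle between the lines spanned by the rows of X.

    ASSUMPTION: a stand-in distance for illustration only, not the paper's
    pseudometric. It is zero for collinear points, so it is a pseudometric
    on nonzero vectors rather than a metric.
    """
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    cos = np.clip(Xn @ Xn.T, -1.0, 1.0)
    D = np.sqrt(np.clip(1.0 - cos**2, 0.0, None))
    np.fill_diagonal(D, 0.0)
    return D


def silhouette_from_distances(D, labels):
    """Mean silhouette score computed from a precomputed distance matrix."""
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    scores = np.zeros(len(labels))
    for i in range(len(labels)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # singleton clusters get score 0 by convention
        # Mean distance to the other members of the same cluster (D[i, i] = 0).
        a = D[i, own].sum() / (n_own - 1)
        # Smallest mean distance to any other cluster.
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        scores[i] = (b - a) / denom if denom > 0 else 0.0
    return scores.mean()


# Toy data: five points on each of two orthogonal lines in R^3,
# with random nonzero scales and signs.
rng = np.random.default_rng(0)
u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 0.0])
coeffs = rng.uniform(0.5, 2.0, size=10) * rng.choice([-1, 1], size=10)
X = np.vstack([np.outer(coeffs[:5], u), np.outer(coeffs[5:], v)])

true_labels = np.array([0] * 5 + [1] * 5)  # one cluster per line
bad_labels = np.array([0, 1] * 5)          # mixes the two lines

D = line_distance_matrix(X)
good = silhouette_from_distances(D, true_labels)
bad = silhouette_from_distances(D, bad_labels)
print(f"silhouette (correct labels): {good:.3f}")
print(f"silhouette (mixed labels):   {bad:.3f}")
```

In this toy setting the Euclidean distance would be misleading, since points far apart on the same line belong together, whereas the angle-based stand-in scores the correct labeling higher than the mixed one. It only handles one-dimensional subspaces sensibly; higher-dimensional subspaces are exactly where a purpose-built distance is needed.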
Lipor, J., & Balzano, L. (2020). Clustering quality metrics for subspace clustering. Pattern Recognition, 107328.