Published In
Journal of Computer and System Sciences
Document Type
Article
Publication Date
6-1-2024
Subjects
Coresets -- Fair clustering
Abstract
Fair clustering is a constrained clustering problem where we need to partition a set of colored points. The fraction of points of each color in every cluster should be more or less equal to the fraction of points of this color in the dataset. The problem was recently introduced by Chierichetti et al. (2017) [1]. We propose a new construction of coresets for fair clustering for Euclidean and general metrics based on random sampling. For the Euclidean space Rd, we provide the first coreset whose size does not depend exponentially on the dimension d. The question of whether such constructions exist was asked by Schmidt et al. (2019) [2]and Huang et al. (2019) [5]. For general metrics, our construction provides the first coreset for fair clustering. New coresets appear to be a handy tool for designing better approximation and streaming algorithms for fair and other constrained clustering variants.
Rights
Copyright (c) 2024 The Authors
This work is licensed under a Creative Commons Attribution 4.0 International License.
Locate the Document
DOI
10.1016/j.jcss.2024.103506
Persistent Identifier
https://archives.pdx.edu/ds/psu/41728
Citation Details
Bandyapadhyay, S., Fomin, F. V., & Simonov, K. (2024). On coresets for fair clustering in metric and Euclidean spaces and their applications. Journal of Computer and System Sciences, 142, 103506.