Published In

Journal of Computer and System Sciences

Document Type

Article

Publication Date

6-1-2024

Subjects

Coresets -- Fair clustering

Abstract

Fair clustering is a constrained clustering problem where we need to partition a set of colored points. The fraction of points of each color in every cluster should be more or less equal to the fraction of points of this color in the dataset. The problem was recently introduced by Chierichetti et al. (2017) [1]. We propose a new construction of coresets for fair clustering for Euclidean and general metrics based on random sampling. For the Euclidean space Rd, we provide the first coreset whose size does not depend exponentially on the dimension d. The question of whether such constructions exist was asked by Schmidt et al. (2019) [2]and Huang et al. (2019) [5]. For general metrics, our construction provides the first coreset for fair clustering. New coresets appear to be a handy tool for designing better approximation and streaming algorithms for fair and other constrained clustering variants.

Rights

Copyright (c) 2024 The Authors

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

DOI

10.1016/j.jcss.2024.103506

Persistent Identifier

https://archives.pdx.edu/ds/psu/41728

Share

COinS