Online coresets for clustering with Bregman divergences

Show simple item record

dc.contributor.author Chhaya, Rachit
dc.contributor.author Choudhari, Jayesh
dc.contributor.author Dasgupta, Anirban
dc.contributor.author Shit, Supratim
dc.date.accessioned 2020-12-26T13:48:45Z
dc.date.available 2020-12-26T13:48:45Z
dc.date.issued 2020-12
dc.identifier.citation Chhaya, Rachit; Choudhari, Jayesh; Dasgupta, Anirban and Shit, Supratim, "Online coresets for clustering with Bregman divergences", arXiv, Cornell University Library, DOI: arXiv:2012.06522, Dec. 2020. en_US
dc.identifier.uri http://arxiv.org/abs/2012.06522
dc.identifier.uri https://repository.iitgn.ac.in/handle/123456789/6152
dc.description.abstract We present algorithms that create coresets in an online setting for clustering problems according to a wide subset of Bregman divergences. Notably, our coresets have a small additive error, similar in magnitude to the lightweight coresets Bachem et. al. 2018, and take update time O(d) for every incoming point where d is dimension of the point. Our first algorithm gives online coresets of size O~(poly(k,d,?,?)) for k-clusterings according to any ?-similar Bregman divergence. We further extend this algorithm to show existence of a non-parametric coresets, where the coreset size is independent of k, the number of clusters, for the same subclass of Bregman divergences. Our non-parametric coresets are larger by a factor of O(logn) (n is number of points) and have similar (small) additive guarantee. At the same time our coresets also function as lightweight coresets for non-parametric versions of the Bregman clustering like DP-Means. While these coresets provide additive error guarantees, they are also significantly smaller (scaling with O(logn) as opposed to O(dd) for points in Rd) than the (relative-error) coresets obtained in Bachem et. al. 2015 for DP-Means. While our non-parametric coresets are existential, we give an algorithmic version under certain assumptions.
dc.description.statementofresponsibility by Rachit Chhaya, Jayesh Choudhari, Anirban Dasgupta and Supratim Shit
dc.language.iso en_US en_US
dc.publisher Cornell University Library en_US
dc.subject Data Structures en_US
dc.subject Algorithms (cs.DS) en_US
dc.subject Machine Learning (cs.LG) en_US
dc.title Online coresets for clustering with Bregman divergences en_US
dc.type Pre-Print en_US
dc.relation.journal arXiv


Files in this item

Files Size Format View

There are no files associated with this item.

This item appears in the following Collection(s)

Show simple item record

Search Digital Repository


Browse

My Account