Approximation and Streaming Algorithms for Projective Clustering via Random Projections

Abstract

Let $\epsilon>0$ be any constant and let $P$ be a set of $n$ points in $\mathbb{R}^d$. We design new streaming and approximation algorithms for clustering points of $P$. Consider the projective clustering problem: Given $k, q < n$, compute a set $F$ of $k$ $q$-flats such that the function $f_k^q(P,\rho)=\sum_{p\in P}d(p, F)^\rho$ is minimized; here $d(p, F)$ represents the distance of $p$ to the closest $q$-flat in $F$. For $\rho=\infty$, we interpret $f_k^q(P,\rho)$ to be $\max_{r\in P}d(r, F)$. When $\rho=1,2$ and $\infty$ and $q=0$, the problem corresponds to the well-known $k$-median, $k$-mean and the $k$-center clustering problems respectively. Our two main technical contributions are as follows: (i) Consider an orthogonal projection of $P$ to a randomly chosen $O(C_\rho(q,\epsilon)\log n/\epsilon^2)$-dimensional flat. For every subset $S \subseteq P$, we show that such a random projection will $\epsilon$-approximate $f_1^q(S,\rho)$. This result holds for any integer norm $\rho \ge 1$, including $\rho=\infty$; here $C_\rho(q,\epsilon)$ is the size of the smallest coreset that $\epsilon$-approximates $f_1^q(\cdot,\rho)$. For $\rho=1,2$ and $\infty$, $C_\rho(q,\epsilon)$ is known to be a constant which depends only on $q$ and $\epsilon$. (ii) We improve the size of the coreset when $\rho = \infty$. In particular, we improve the bounds of $C_\infty(q,\epsilon)$ to $O(q^3/\epsilon^2)$ from the previously-known $O(q^6/\epsilon^5 \log 1/\epsilon)$. As applications, we obtain better approximation and streaming algorithms for various projective clustering problems over high dimensional point sets. E.g., when $\rho =\infty$ and $q\geq 1$, we obtain a streaming algorithm that maintains an $\epsilon$-approximate solution using $O((d + n)q^3(\log n/\epsilon^4))$ space, which is better than the input size $O(nd)$.
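
To make the objective in the abstract concrete, the following is a minimal sketch of how $f_k^q(P,\rho)$ could be evaluated for a given set of flats. It is not the paper's algorithm (it does not search for the optimal flats, build coresets, or apply random projections). The function names `dist_to_flat` and `projective_cost`, and the representation of a $q$-flat as an origin point plus a list of orthonormal direction vectors, are assumptions made for illustration only.

```python
import math

def dist_to_flat(p, origin, basis):
    """Euclidean distance from point p to the flat origin + span(basis).

    Assumption: `basis` is a list of orthonormal direction vectors.
    An empty list makes the flat a single point (q = 0).
    """
    v = [pi - oi for pi, oi in zip(p, origin)]
    sq_norm = sum(x * x for x in v)
    # Squared length of the component of v that lies inside the flat.
    sq_proj = sum(sum(x * u for x, u in zip(v, b)) ** 2 for b in basis)
    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(max(sq_norm - sq_proj, 0.0))

def projective_cost(points, flats, rho):
    """Evaluate sum_p d(p, F)^rho, or max_p d(p, F) when rho is infinity."""
    dists = [min(dist_to_flat(p, o, b) for o, b in flats) for p in points]
    if rho == math.inf:
        return max(dists)
    return sum(d ** rho for d in dists)

# Hypothetical toy input: three collinear points in the plane.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
centers = [((0.0, 0.0), [])]            # k = 1, q = 0: a single center
line = [((0.0, 0.0), [(0.6, 0.8)])]     # k = 1, q = 1: a line through the origin

median_cost = projective_cost(points, centers, 1)         # k-median objective
mean_cost = projective_cost(points, centers, 2)           # k-mean objective
center_cost = projective_cost(points, centers, math.inf)  # k-center objective
line_cost = projective_cost(points, line, 2)              # points lie on the line
```

With $q=0$ the three calls correspond to the $k$-median, $k$-mean and $k$-center objectives named in the abstract; with $q=1$ the same function measures distance to the nearest line instead of the nearest center.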

Last updated on 23/08/2016

This paper was published in MPG.PuRe.
