Towards Scalable Spectral Clustering via Spectrum-Preserving
Sparsification
by
Yongyu Wang, Zhuo Feng
2017
Abstract
The eigendecomposition of nearest-neighbor (NN) graph Laplacian matrices is
the main computational bottleneck in spectral clustering. In this work, we
introduce a highly scalable, spectrum-preserving graph sparsification algorithm
that enables the construction of ultra-sparse NN (u-NN) graphs with guaranteed
preservation of the original graph spectra, such as the first few
eigenvectors of the original graph Laplacian. Our approach can immediately lead
to scalable spectral clustering of large data networks without sacrificing
solution quality. The proposed method starts by constructing low-stretch
spanning trees (LSSTs) from the original graphs and then
iteratively recovers small portions of "spectrally critical" off-tree edges
to the LSSTs by leveraging a spectral off-tree embedding scheme. To determine
a suitable number of off-tree edges to recover to the LSSTs, an
eigenvalue stability checking scheme is proposed, which robustly
preserves the first few Laplacian eigenvectors within the sparsified graph.
Additionally, an incremental graph densification scheme is proposed for
identifying extra edges that are missing from the original NN graphs but
can still play important roles in spectral clustering tasks. Our experimental
results for a variety of well-known data sets show that the proposed method can
dramatically reduce the complexity of NN graphs, leading to significant
speedups in spectral clustering.
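
The tree-plus-recovered-edges idea described in the abstract can be illustrated with a small, hedged sketch. This is not the authors' implementation. It assumes a maximum-weight spanning tree (Kruskal) as a simple stand-in for a true LSST, and it ranks off-tree edges by their stretch (edge weight times the resistance of the tree path between the endpoints) as a stand-in for the paper's spectral off-tree embedding. The eigenvalue stability check and the densification step are omitted, so the number of recovered edges is a plain parameter. All function names (`max_spanning_tree`, `tree_path_resistance`, `sparsify`) are hypothetical.

```python
from collections import defaultdict


def max_spanning_tree(n, edges):
    """Kruskal's algorithm on (u, v, weight) edges.

    Assumption: a maximum-weight spanning tree stands in for an LSST.
    Returns (tree_edges, off_tree_edges).
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree, off_tree = [], []
    for u, v, w in sorted(edges, key=lambda e: -e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v, w))
        else:
            off_tree.append((u, v, w))
    return tree, off_tree


def tree_path_resistance(tree, src, dst):
    """Sum of 1/weight along the unique tree path from src to dst."""
    adj = defaultdict(list)
    for u, v, w in tree:
        adj[u].append((v, w))
        adj[v].append((u, w))
    resistance = {src: 0.0}
    stack = [src]
    while stack:
        x = stack.pop()
        if x == dst:
            return resistance[x]
        for y, w in adj[x]:
            if y not in resistance:
                resistance[y] = resistance[x] + 1.0 / w
                stack.append(y)
    raise ValueError("src and dst are not connected in the tree")


def sparsify(n, edges, num_recovered):
    """Keep the spanning tree plus the num_recovered highest-stretch off-tree edges.

    Assumption: stretch is used here as a proxy for spectral criticality.
    """
    tree, off_tree = max_spanning_tree(n, edges)
    scored = [
        (w * tree_path_resistance(tree, u, v), (u, v, w))
        for u, v, w in off_tree
    ]
    scored.sort(key=lambda s: -s[0])
    recovered = [edge for _, edge in scored[:num_recovered]]
    return tree + recovered
```

In a fuller version, `num_recovered` would not be fixed in advance: edges would be added in small batches until the first few Laplacian eigenvalues of the sparsified graph stop changing, which is the role the abstract assigns to the eigenvalue stability check.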
Archived Files and Locations
application/pdf 748.9 kB
arxiv.org (repository), web.archive.org (webarchive)
arXiv: 1710.04584v1