Sketching and Neural Networks
by Amit Daniely, Nevena Lazic, Yoram Singer, and Kunal Talwar (2016)
Abstract
High-dimensional sparse data present computational and statistical challenges
for supervised learning. We propose compact linear sketches for reducing the
dimensionality of the input, followed by a single layer neural network. We show
that any sparse polynomial function can be computed, on nearly all sparse
binary vectors, by a single layer neural network that takes a compact sketch of
the vector as input. Consequently, when a set of sparse binary vectors is
approximately separable using a sparse polynomial, there exists a single-layer
neural network that takes a short sketch as input and correctly classifies
nearly all the points. Previous work has proposed using sketches to reduce
dimensionality while preserving the hypothesis class; however, for polynomial
classifiers the sketch size then depends exponentially on the degree. In stark
contrast, our approach of improper learning, which uses a larger hypothesis
class, allows the sketch size to depend only logarithmically on the degree.
Even in the linear case, our approach allows us to improve on the pesky
O(1/γ^2) dependence of random projections on the margin γ. We empirically show
that our approach leads to more compact neural networks than related methods
such as feature hashing, with equal or better performance.
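
The pipeline described in the abstract can be illustrated concretely. The
following minimal Python sketch is an assumption-laden illustration, not the
paper's construction: it uses a count-sketch-style signed-hash map as the
compact linear sketch (the paper's exact sketch may differ) and a randomly
initialized single-hidden-layer ReLU network in place of a trained classifier.
All names and parameter values (INPUT_DIM, SKETCH_DIM, sketch, predict) are
hypothetical.

    import numpy as np

    # Hypothetical parameters for illustration.
    INPUT_DIM = 100_000   # dimensionality of the sparse binary input
    SKETCH_DIM = 256      # compact sketch size, much smaller than INPUT_DIM
    HIDDEN_DIM = 64       # width of the single hidden layer

    rng = np.random.default_rng(0)

    # Count-sketch-style linear map: each input coordinate is hashed to one
    # sketch bucket with a random sign. On a sparse binary vector, the sketch
    # is just a signed sum over the active coordinates.
    bucket = rng.integers(0, SKETCH_DIM, size=INPUT_DIM)
    sign = rng.choice([-1.0, 1.0], size=INPUT_DIM)

    def sketch(active_indices):
        """Linear sketch of a sparse binary vector, given its nonzero indices."""
        s = np.zeros(SKETCH_DIM)
        for i in active_indices:
            s[bucket[i]] += sign[i]
        return s

    # Single-hidden-layer network on top of the sketch (randomly initialized
    # here; in practice the weights would be trained on sketched inputs).
    W1 = rng.normal(scale=1.0 / np.sqrt(SKETCH_DIM), size=(HIDDEN_DIM, SKETCH_DIM))
    b1 = np.zeros(HIDDEN_DIM)
    w2 = rng.normal(scale=1.0 / np.sqrt(HIDDEN_DIM), size=HIDDEN_DIM)

    def predict(active_indices):
        h = np.maximum(0.0, W1 @ sketch(active_indices) + b1)  # ReLU hidden layer
        return w2 @ h  # real-valued score; threshold at 0 for classification

    # Example: a sparse binary vector with 10 active coordinates.
    x_active = rng.choice(INPUT_DIM, size=10, replace=False)
    print(predict(x_active))

Because the input is sparse and binary and the sketch is linear, computing the
sketch touches only the active coordinates, so its cost scales with the
sparsity of the input rather than with INPUT_DIM.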
arXiv: 1604.05753v1 (available from arxiv.org; archived at web.archive.org)