Learning Compressed Transforms with Low Displacement Rank
by Anna T. Thomas, Albert Gu, Tri Dao, Atri Rudra, and Christopher Ré
2018
Abstract
The low displacement rank (LDR) framework for structured matrices represents
a matrix through two displacement operators and a low-rank residual. Existing
use of LDR matrices in deep learning has applied fixed displacement operators
encoding forms of shift invariance akin to convolutions. We introduce a class
of LDR matrices with more general displacement operators, and explicitly learn
over both the operators and the low-rank component. This class generalizes
several previous constructions while preserving compression and efficient
computation. We prove bounds on the VC dimension of multi-layer neural networks
with structured weight matrices and show empirically that our compact
parameterization can reduce the sample complexity of learning. When replacing
weight layers in fully-connected, convolutional, and recurrent neural networks
for image classification and language modeling tasks, our new classes exceed
the accuracy of existing compression approaches, and on some tasks also
outperform general unstructured layers while using more than 20x fewer
parameters.
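As background for the displacement representation the abstract refers to, the standard Sylvester-type LDR convention is sketched below; the symbols are illustrative and not quoted from the paper. A matrix $W \in \mathbb{R}^{n \times n}$ has displacement rank $r$ with respect to operators $(A, B)$ when

\[
\nabla_{A,B}(W) = AW - WB = GH^\top, \qquad G, H \in \mathbb{R}^{n \times r}, \quad r \ll n.
\]

Storing only the operators and the rank-$r$ factors $G, H$ yields roughly $O(nr)$ parameters in place of $n^2$ when $A$ and $B$ are themselves structured; prior work fixes $A$ and $B$, whereas the class introduced here also learns them.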
Archived Files and Locations
application/pdf, 2.7 MB: arxiv.org (repository); web.archive.org (webarchive); arXiv:1810.02309v1