GILE: A Generalized Input-Label Embedding for Text Classification
by
Nikolaos Pappas, James Henderson
2018
Abstract
Neural text classification models typically treat output labels as
categorical variables which lack description and semantics. This forces their
parametrization to be dependent on the label set size, and, hence, they are
unable to scale to large label sets and generalize to unseen ones. Existing
joint input-label text models overcome these issues by exploiting label
descriptions, but they are unable to capture complex label relationships, have
rigid parametrization, and their gains on unseen labels often come at the
expense of performance on the labels seen during training. In this paper,
we propose a new input-label model which generalizes over previous such models,
addresses their limitations and does not compromise performance on seen labels.
The model consists of a joint non-linear input-label embedding with
controllable capacity and a joint-space-dependent classification unit which is
trained with cross-entropy loss to optimize classification performance. We
evaluate models on full-resource and low- or zero-resource text classification
of multilingual news and biomedical text with a large label set. Our model
outperforms monolingual and multilingual models which do not leverage label
semantics and previous joint input-label space models in both scenarios.
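
The abstract describes the architecture only at a high level, so the following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes that an encoded input and each label's description embedding are projected into a shared joint space with a tanh non-linearity, combined by element-wise product, and scored by a single classification unit shared across labels. The names `joint_scores`, `binary_cross_entropy`, `U`, `V`, `w`, and `b` are hypothetical, and the binary form of cross-entropy is likewise an assumption.

```python
import numpy as np

def joint_scores(h, E, U, V, w, b):
    """Score every label for one encoded input (illustrative sketch).

    h: (d_h,) input encoding
    E: (n_labels, d_e) label description embeddings
    U: (d_e, d_j) label projection, V: (d_h, d_j) input projection
    w: (d_j,), b: scalar; one classification unit shared by all labels
    d_j is the joint-space size, which controls capacity.
    """
    label_proj = np.tanh(E @ U)        # (n_labels, d_j), non-linear label embedding
    input_proj = np.tanh(h @ V)        # (d_j,), non-linear input embedding
    joint = label_proj * input_proj    # broadcast element-wise product
    logits = joint @ w + b             # (n_labels,)
    return 1.0 / (1.0 + np.exp(-logits))

def binary_cross_entropy(p, y, eps=1e-9):
    """Mean binary cross-entropy over labels (assumed multi-label setup)."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

# Toy usage with random values; all sizes are arbitrary.
rng = np.random.default_rng(0)
n_labels, d_h, d_e, d_j = 5, 8, 6, 4
h = rng.normal(size=d_h)
E = rng.normal(size=(n_labels, d_e))
U = rng.normal(size=(d_e, d_j))
V = rng.normal(size=(d_h, d_j))
w = rng.normal(size=d_j)
b = 0.0

p = joint_scores(h, E, U, V, w, b)
y = np.array([1.0, 0.0, 0.0, 1.0, 0.0])
loss = binary_cross_entropy(p, y)

# Scoring an unseen label only needs its description embedding as a new row.
E_new = np.vstack([E, rng.normal(size=(1, d_e))])
p_new = joint_scores(h, E_new, U, V, w, b)
```

In this sketch none of `U`, `V`, `w`, or `b` has a dimension tied to the number of labels, which illustrates how such a parametrization can stay independent of label set size and still score labels that were not seen during training.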
arXiv: 1806.06219v2