Generalized Zero-Shot Recognition based on Visually Semantic Embedding
by
Pengkai Zhu, Hanxiao Wang, Venkatesh Saligrama
2018
Abstract
We propose a novel Generalized Zero-Shot learning (GZSL) method that is
agnostic to both unseen images and unseen semantic vectors during training.
Prior works in this context propose to map high-dimensional visual features to
the semantic domain, which we believe contributes to the semantic gap. To bridge the
gap, we propose a novel low-dimensional embedding of visual instances that is
"visually semantic." Analogous to semantic data that quantifies the existence
of an attribute in the presented instance, components of our visual embedding
quantify the existence of a prototypical part-type in the presented instance. In
parallel, as a thought experiment, we quantify the impact of noisy semantic
data by utilizing a novel visual oracle to visually supervise a learner. These
factors, namely semantic noise, the visual-semantic gap, and label noise, lead us to
propose a new graphical model for inference with pairwise interactions between
label, semantic data, and inputs. We tabulate results on a number of benchmark
datasets demonstrating significant improvement in accuracy over
state-of-the-art under both semantic and visual supervision.
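As an illustrative sketch only (the abstract does not state the exact formulation), a graphical model with pairwise interactions between the label y, the semantic data s, and the input x could be written as a conditional distribution with three pairwise potentials:

p(y, s | x) \propto \exp\big( \theta_{yx}(y, x) + \theta_{ys}(y, s) + \theta_{sx}(s, x) \big)

Here \theta_{ys} would score label-semantic compatibility, \theta_{sx} would score how well the low-dimensional "visually semantic" embedding of x matches s, and \theta_{yx} would be a direct label-input term. The potential names and this factorization are assumptions for illustration, not the paper's definition.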
Archived Files and Locations
application/pdf 2.5 MB
file_bpxs6r5e5rhd7ar5ook35wft5e
arxiv.org (repository)
web.archive.org (webarchive)
arXiv:1811.07993v1