Learning from Data with Noisy Labels Using Temporal Self-Ensemble
by
Jun Ho Lee, Jae Soon Baik, Tae Hwan Hwang, Jun Won Choi
2022
Abstract
Real-world datasets inevitably contain many mislabeled samples. Because
deep neural networks (DNNs) have an enormous capacity to memorize noisy labels,
a robust training scheme is required to prevent labeling errors from degrading
the generalization performance of DNNs. Current state-of-the-art methods
present a co-training scheme that trains dual networks using samples associated
with small losses. In practice, however, training two networks simultaneously
can burden computing resources. In this study, we propose a simple yet
effective robust training scheme that operates by training only a single
network. During training, the proposed method generates a temporal self-ensemble
by sampling intermediate network parameters from the weight trajectory formed
by stochastic gradient descent optimization. The sum of the losses evaluated
with these self-ensemble members is used to identify incorrectly labeled
samples. In parallel, our method generates multi-view predictions by
transforming each input into various forms and uses the agreement among these
predictions to identify incorrectly labeled samples. By combining these two
metrics, we present the proposed self-ensemble-based robust training (SRT)
method, which can filter the samples
with noisy labels to reduce their influence on training. Experiments on
widely used public datasets demonstrate that the proposed method achieves
state-of-the-art performance in some categories without training dual
networks.
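
The two selection signals described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the two metrics are combined, so the rule used here (keep the fraction of samples with the smallest summed loss, then drop any whose views disagree) is an assumption, and the names `ensemble_loss_sum`, `views_agree`, `select_clean`, and `keep_ratio` are hypothetical. The probability vectors stand in for predictions that would come from weight snapshots along the SGD trajectory and from transformed copies of each input.

```python
import math


def cross_entropy(probs, label):
    """Cross-entropy of one probability vector against an integer label."""
    return -math.log(max(probs[label], 1e-12))


def ensemble_loss_sum(snapshot_probs, label):
    """Sum the loss over predictions from several weight snapshots."""
    return sum(cross_entropy(p, label) for p in snapshot_probs)


def views_agree(view_probs):
    """True when every transformed view predicts the same class."""
    preds = {max(range(len(p)), key=p.__getitem__) for p in view_probs}
    return len(preds) == 1


def select_clean(samples, keep_ratio=0.5):
    """Return sorted indices of samples treated as correctly labeled.

    Assumed combination rule (not taken from the paper): keep the
    keep_ratio fraction with the smallest summed loss, then drop any
    sample whose multi-view predictions disagree.
    """
    losses = [ensemble_loss_sum(s["snapshot_probs"], s["label"]) for s in samples]
    order = sorted(range(len(samples)), key=losses.__getitem__)
    n_keep = max(1, int(len(samples) * keep_ratio))
    return sorted(i for i in order[:n_keep] if views_agree(samples[i]["view_probs"]))


# Toy two-class data; each inner list is a predicted probability vector.
samples = [
    # Small loss, views agree.
    {"label": 0, "snapshot_probs": [[0.9, 0.1], [0.8, 0.2]],
     "view_probs": [[0.9, 0.1], [0.7, 0.3]]},
    # Large loss: the snapshots predict class 0 but the label says 1.
    {"label": 1, "snapshot_probs": [[0.9, 0.1], [0.85, 0.15]],
     "view_probs": [[0.9, 0.1], [0.8, 0.2]]},
    # Moderate loss, but the two views predict different classes.
    {"label": 0, "snapshot_probs": [[0.7, 0.3], [0.6, 0.4]],
     "view_probs": [[0.6, 0.4], [0.4, 0.6]]},
    # Small loss, views agree.
    {"label": 1, "snapshot_probs": [[0.2, 0.8], [0.1, 0.9]],
     "view_probs": [[0.2, 0.8], [0.3, 0.7]]},
]

clean = select_clean(samples, keep_ratio=0.75)
print(clean)
```

In this toy run the second sample is excluded by its large summed loss and the third by disagreement between its views, which leaves the first and fourth samples for training.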
arXiv:2207.10354v1