Locally Adaptive Label Smoothing for Predictive Churn

by Dara Bahri, Heinrich Jiang

Released as an article.

2021  

Abstract

Training modern neural networks is an inherently noisy process that can lead to high prediction churn – disagreements between re-trainings of the same model due to factors such as randomization in the parameter initialization and mini-batches – even when the trained models all attain similar accuracies. Such prediction churn can be very undesirable in practice. In this paper, we present several baselines for reducing churn and show that training on soft labels obtained by adaptively smoothing each example's label based on the example's neighboring labels often outperforms the baselines on churn while improving accuracy on a variety of benchmark classification tasks and model architectures.
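To make the approach described in the abstract concrete, the sketch below (not the authors' code) builds soft labels by mixing each example's one-hot label with the empirical label distribution of its k nearest neighbors in input space. The neighbor count k, the mixing weight alpha, and the use of Euclidean distance on flattened features are illustrative assumptions; a model would then be trained with standard cross-entropy against these soft targets.

    # Minimal sketch of locally adaptive label smoothing (illustrative only).
    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def locally_adaptive_soft_labels(X, y, num_classes, k=10, alpha=0.1):
        """Return soft labels of shape (n, num_classes).

        Each example's one-hot label is blended with the class distribution
        of its k nearest neighbors (k, alpha, and the distance metric are
        assumptions made for this sketch).
        """
        X = X.reshape(len(X), -1)                      # flatten features
        nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
        _, idx = nn.kneighbors(X)                      # idx[:, 0] is the point itself
        neighbor_labels = y[idx[:, 1:]]                # (n, k) labels of neighbors

        one_hot = np.eye(num_classes)[y]               # (n, num_classes)
        # Empirical class distribution among the k neighbors of each example.
        neighbor_dist = np.stack(
            [np.bincount(row, minlength=num_classes) / k for row in neighbor_labels]
        )
        return (1.0 - alpha) * one_hot + alpha * neighbor_dist

Unlike uniform label smoothing, which mixes every label with the same flat distribution, this per-example mixture lets the degree and direction of smoothing vary with the local label structure around each training point.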

Archived Files and Locations

application/pdf  1.3 MB
arxiv.org (repository)
web.archive.org (webarchive)
Type  article
Stage   submitted
Date   2021-06-11
Version   v2
Language   en
arXiv  2102.05140v2