Demystifying How Self-Supervised Features Improve Training from Noisy Labels

by Hao Cheng, Zhaowei Zhu, Xing Sun, Yang Liu

Released as an article.

2021  

Abstract

The advancement of self-supervised learning (SSL) motivates researchers to apply SSL to other tasks, such as learning with noisy labels. Recent literature indicates that methods built on SSL features can substantially improve the performance of learning with noisy labels. Nonetheless, the deeper reasons why (and how) SSL features benefit training from noisy labels are less well understood. In this paper, we study why and how self-supervised features help networks resist label noise, using both theoretical analyses and numerical experiments. Our result shows that, given a quality encoder pre-trained with SSL, a simple linear layer trained by the cross-entropy loss is theoretically robust to symmetric label noise. Further, we provide insights into how knowledge distilled from SSL features can alleviate the over-fitting problem. We hope our work provides a better understanding of learning with noisy labels from the perspective of self-supervised learning and can potentially serve as a guideline for further research. Code is available at github.com/UCSC-REAL/SelfSup_NoisyLabel.
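
As a rough illustration of the abstract's main theoretical claim, the sketch below trains a linear probe with cross-entropy on top of a frozen encoder under symmetric label noise. It is not code from the paper's repository: the encoder, data, noise rate eps, and all hyper-parameters are placeholder assumptions (a real setup would load an SSL pre-trained network such as SimCLR or MoCo and real data).

    import torch
    import torch.nn as nn

    # Illustrative stand-ins; in practice `encoder` would be an SSL
    # pre-trained network and `images`/`clean` would come from a dataset.
    feat_dim, num_classes = 128, 10
    encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, feat_dim))
    encoder.eval()
    for p in encoder.parameters():
        p.requires_grad = False           # keep the SSL features fixed

    probe = nn.Linear(feat_dim, num_classes)   # the linear layer under study
    optimizer = torch.optim.SGD(probe.parameters(), lr=0.1, momentum=0.9)
    criterion = nn.CrossEntropyLoss()

    images = torch.randn(64, 3, 32, 32)
    clean = torch.randint(0, num_classes, (64,))
    # Symmetric label noise: with probability eps, replace the label
    # with a uniformly random class.
    eps = 0.4
    flip = torch.rand(64) < eps
    noisy = torch.where(flip, torch.randint(0, num_classes, (64,)), clean)

    with torch.no_grad():
        feats = encoder(images)           # fixed SSL representations
    loss = criterion(probe(feats), noisy)  # cross-entropy on noisy labels
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()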
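The abstract's over-fitting point can likewise be pictured with a generic knowledge-distillation objective, where soft targets come from a model built on the SSL features (e.g., the probe above). This is the standard temperature-scaled KD loss, offered only as a plausible instance; the paper's exact distillation formulation may differ, and distill_loss, T, and alpha are hypothetical names and values.

    import torch
    import torch.nn.functional as F

    def distill_loss(student_logits, teacher_logits, noisy_labels,
                     T=2.0, alpha=0.5):
        # Supervised term on the (possibly corrupted) hard labels.
        ce = F.cross_entropy(student_logits, noisy_labels)
        # KD term: KL divergence toward the teacher's softened predictions,
        # scaled by T^2 to keep gradient magnitudes comparable.
        kd = F.kl_div(
            F.log_softmax(student_logits / T, dim=1),
            F.softmax(teacher_logits / T, dim=1),
            reduction="batchmean",
        ) * (T * T)
        # The KD term pulls the student toward SSL-informed soft targets,
        # which can damp memorization of noisy labels.
        return alpha * ce + (1 - alpha) * kd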

Archived Files and Locations

application/pdf  1.2 MB
arxiv.org (repository)
web.archive.org (webarchive)
Type  article
Stage   submitted
Date   2021-10-18
Version   v1
Language   en
arXiv  2110.09022v1
Work Entity
access all versions, variants, and formats of this work (e.g., pre-prints)
Catalog Record
Revision: 344bdd2b-656a-45fc-9258-e920a1632359