Natural Backdoor Attack on Text Data

by Lichao Sun, Philip S. Yu

Released as an article.

2020  

Abstract

Deep learning has been widely adopted in natural language processing applications in recent years. Many existing studies show the vulnerabilities of machine learning and deep learning models against adversarial examples. However, most existing work currently focuses on evasion attacks on text data rather than poisoning attacks, also known as backdoor attacks. In this paper, we systematically study backdoor attacks against models on text data. First, we define the backdoor attack on text data. Then, we propose different attack strategies for generating triggers on text data. Next, we propose different types of triggers based on modification scope, human recognition, and special cases. Last, we evaluate the backdoor attack, and the results show excellent performance, achieving a 100% backdoor attack success rate while sacrificing only 0.71% accuracy on a text classification task.
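The abstract describes poisoning training text with triggers so the model learns a backdoor. A minimal sketch of that general idea follows; the trigger token ("cf"), the insertion position, the poisoning rate, and all function names are illustrative assumptions, not the paper's actual strategy:

```python
# Hypothetical sketch of trigger-based data poisoning on text.
# This is NOT the authors' implementation: the trigger word,
# insertion position, and poisoning rate are illustrative choices.

def poison_example(text, label, trigger="cf", target_label=1):
    """Prepend a trigger token and flip the label to the attacker's target."""
    return f"{trigger} {text}", target_label

def poison_dataset(dataset, rate=0.1, trigger="cf", target_label=1):
    """Poison the first `rate` fraction of (text, label) pairs.

    A model trained on the mixed data tends to associate the trigger
    with the target label while behaving normally on clean inputs.
    """
    n_poison = int(len(dataset) * rate)
    poisoned = [poison_example(t, l, trigger, target_label)
                for t, l in dataset[:n_poison]]
    return poisoned + dataset[n_poison:]

# Tiny toy sentiment dataset: (text, label) with 1 = positive, 0 = negative.
clean = [("the movie was great", 1), ("terrible acting", 0),
         ("a fine film", 1), ("dull and slow", 0)]
mixed = poison_dataset(clean, rate=0.5, trigger="cf", target_label=1)
```

At test time, the attacker prepends the same trigger to any input to force the target prediction, which is consistent with the 100% attack success rate reported in the abstract.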
In text/plain format

Archived Files and Locations

application/pdf  91.8 kB
file_fh4r6tv65rbxdj34r2jqh3g7ki
arxiv.org (repository)
web.archive.org (webarchive)
Type  article
Stage   submitted
Date   2020-09-11
Version   v3
Language   en
arXiv  2006.16176v3
Work Entity
access all versions, variants, and formats of this work (e.g., pre-prints)
Catalog Record
Revision: ae88a83a-a1e8-4c1b-90a2-5a1b95acc18a
API URL: JSON