VALUE: Understanding Dialect Disparity in NLU

by Caleb Ziems, Jiaao Chen, Camille Harris, Jessica Anderson, Diyi Yang

Released as an article.

2022  

Abstract

English Natural Language Understanding (NLU) systems have achieved great performance and even outperformed humans on benchmarks like GLUE and SuperGLUE. However, these benchmarks contain only textbook Standard American English (SAE). Other dialects have been largely overlooked in the NLP community. This leads to biased and inequitable NLU systems that serve only a sub-population of speakers. To understand disparities in current models and to facilitate more dialect-competent NLU systems, we introduce the VernAcular Language Understanding Evaluation (VALUE) benchmark, a challenging variant of GLUE that we created with a set of lexical and morphosyntactic transformation rules. In this initial release (V.1), we construct rules for 11 features of African American Vernacular English (AAVE), and we recruit fluent AAVE speakers to validate each feature transformation via linguistic acceptability judgments in a participatory design manner. Experiments show that these new dialectal features can lead to a drop in model performance.
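The abstract describes rule-based lexical and morphosyntactic transformations but does not reproduce the rules themselves. As a rough illustration only, the Python sketch below applies two well-documented AAVE morphosyntactic features (zero copula and habitual "be") with toy regex rules; the function names and patterns are this editor's simplifications, not the authors' implementation, which would need to condition on part-of-speech and syntactic context rather than surface strings.

```python
import re

def drop_copula(sentence: str) -> str:
    # Zero copula (toy rule): delete contractible 'is'/'are' before a
    # progressive verb, e.g. "She is walking" -> "She walking".
    # A real rule would check POS tags and syntactic environment.
    return re.sub(r"\b(?:is|are)\s+(?=\w+ing\b)", "", sentence)

def habitual_be(sentence: str) -> str:
    # Habitual 'be' (toy rule): rewrite "is/are VERB-ing" as
    # "be VERB-ing". Detecting habitual aspect requires context,
    # so this applies the rewrite unconditionally for illustration.
    return re.sub(r"\b(?:is|are)(\s+\w+ing\b)", r"be\1", sentence)

print(drop_copula("She is walking to the store."))  # She walking to the store.
print(habitual_be("They are working late."))        # They be working late.
```

Perturbing GLUE inputs with rules of this kind, and validating the outputs with fluent AAVE speakers, is the general recipe the abstract outlines for constructing the benchmark.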

Archived Files and Locations

application/pdf  454.0 kB
file_l3ewf4gomfhqln3ntwoc3kn3pq
arxiv.org (repository)
web.archive.org (webarchive)
Type: article
Stage: submitted
Date: 2022-04-06
Version: v1
Language: en
arXiv: 2204.03031v1
Work Entity
Access all versions, variants, and formats of this work (e.g., pre-prints).
Catalog Record
Revision: 314ce7c6-73de-43d8-b784-dc994fb00338