Can Explainable AI Explain Unfairness? A Framework for Evaluating Explainable AI
by
Kiana Alikhademi, Brianna Richardson, Emma Drobina, Juan E. Gilbert
2021
Abstract
Many ML models are opaque, producing decisions too complex for humans to
easily understand. In response, explainable artificial intelligence
(XAI) tools that analyze the inner workings of a model have been created.
Despite these tools' strength in translating model behavior, critics have
raised concerns that XAI tools can themselves serve as instruments of
"fairwashing" by misleading users into trusting biased or incorrect models. In
this paper, we
created a framework for evaluating explainable AI tools with respect to their
capabilities for detecting and addressing issues of bias and fairness as well
as their capacity to communicate these results to their users clearly. We found
that despite their capabilities in simplifying and explaining model behavior,
many prominent XAI tools lack features that could be critical in detecting
bias. Developers can use our framework to identify the modifications needed in
their toolkits to reduce issues like fairwashing.
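As an illustration of the gap the abstract describes (not drawn from the
paper), the sketch below shows how a typical global feature-attribution
summary can stay silent about group disparities unless a fairness metric is
computed alongside it. The synthetic data, the logistic-regression model, and
the demographic parity check are all hypothetical choices for this example.

    # Minimal sketch: a global attribution summary vs. a group-disparity check.
    # Everything here is hypothetical; it is not the paper's framework.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 1000
    group = rng.integers(0, 2, n)        # hypothetical sensitive attribute
    x = rng.normal(size=(n, 3))
    x[:, 0] += group                     # feature correlated with the group
    y = (x[:, 0] + 0.5 * x[:, 1] > 0.5).astype(int)

    features = np.column_stack([x, group])
    model = LogisticRegression().fit(features, y)

    # Global "explanation": per-feature contribution = coefficient * mean value.
    # This summary alone does not reveal how outcomes differ across groups.
    contrib = model.coef_[0] * features.mean(axis=0)
    print("mean contribution per feature:", contrib.round(3))

    # Demographic parity difference: the kind of bias signal many XAI
    # toolkits do not surface unless computed separately.
    pred = model.predict(features)
    dpd = pred[group == 1].mean() - pred[group == 0].mean()
    print("demographic parity difference:", round(dpd, 3))

A toolkit that reports only the attribution summary could leave a large
parity gap invisible, which is the failure mode the framework probes for.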
Archived Files and Locations
application/pdf, 541.8 kB
arxiv.org (repository): 2106.07483v1
web.archive.org (webarchive)