A Survey on the Robustness of Feature Importance and Counterfactual Explanations

by Saumitra Mishra, Sanghamitra Dutta, Jason Long, Daniele Magazzeni

Released as an article.

2021  

Abstract

Several methods aim to address the crucial task of understanding the behaviour of AI/ML models. Arguably the most popular among them are local explanations, which focus on investigating model behaviour for individual instances. While many methods have been proposed for local analysis, relatively less effort has gone into understanding whether the explanations are robust and accurately reflect the behaviour of the underlying models. In this work, we present a survey of works that analyse the robustness of two classes of local explanations (feature importance and counterfactual explanations) that are widely used to analyse AI/ML models in finance. The survey aims to unify existing definitions of robustness, introduces a taxonomy to classify different robustness approaches, and discusses some interesting results. Finally, the survey offers some pointers on extending current robustness analysis approaches so as to identify reliable explainability methods.
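To make the notion of explanation robustness concrete, here is a minimal sketch (not a method from the paper) of one common check: perturb an input slightly and compare the resulting feature-importance vectors. The logistic model, its weights, and the stability metric below are all hypothetical and chosen only for illustration.

```python
import math

# Toy differentiable model: logistic regression with fixed, hypothetical weights.
W = [2.0, -1.0, 0.5]

def model(x):
    z = sum(w * xi for w, xi in zip(W, x))
    return 1.0 / (1.0 + math.exp(-z))

def gradient_importance(x):
    # Gradient-based feature importance: for logistic regression,
    # d(output)/d(x_i) = sigma(z) * (1 - sigma(z)) * w_i.
    p = model(x)
    return [p * (1.0 - p) * w for w in W]

def local_stability(x, eps=1e-2):
    # Compare the explanation at x with the explanation at a nearby point.
    # A robust explainer should yield similar importance vectors, i.e. a
    # small Euclidean distance between them.
    e0 = gradient_importance(x)
    e1 = gradient_importance([xi + eps for xi in x])
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(e0, e1)))

print(local_stability([1.0, 0.0, -1.0]))  # expect a small value for this smooth model
```

For a smooth model like this one the importance vectors change little under small perturbations; the surveyed works study how this kind of stability can fail for complex models and explainers, and how it can be defined and measured more formally.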

Archived Files and Locations

application/pdf  149.0 kB
file_qhdg4f52svhbjdrabt4c3mjwda
arxiv.org (repository)
web.archive.org (webarchive)
Type: article
Stage: submitted
Date: 2021-10-30
Version: v1
Language: en
arXiv: 2111.00358v1
Work Entity
Access all versions, variants, and formats of this work (e.g., pre-prints).
Catalog Record
Revision: 8e05dc74-000c-4d6d-872c-6d90f6c17427