PACE: Posthoc Architecture-Agnostic Concept Extractor for Explaining CNNs
by
Vidhya Kamakshi, Uday Gupta, Narayanan C Krishnan
2021
Abstract
Although deep CNNs have achieved state-of-the-art performance in image
classification tasks, they remain a black box to the humans using them. There is a
growing interest in explaining the working of these deep models to improve
their trustworthiness. In this paper, we introduce a Posthoc
Architecture-agnostic Concept Extractor (PACE) that automatically extracts
smaller sub-regions of the image called concepts relevant to the black-box
prediction. PACE tightly integrates the faithfulness of the explanatory
framework to the black-box model. To the best of our knowledge, this is the
first work that automatically extracts class-specific discriminative concepts
in a posthoc manner. The PACE framework is used to generate explanations for
two different CNN architectures trained for classifying the AWA2 and
Imagenet-Birds datasets. Extensive human subject experiments are conducted to
validate the human interpretability and consistency of the explanations
extracted by PACE. The results from these experiments suggest that over 72% of
the concepts extracted by PACE are human interpretable.
arXiv:2108.13828v1