Amortized Generation of Sequential Counterfactual Explanations for Black-box Models
by Sahil Verma, Keegan Hines, John P. Dickerson
2021
Abstract
Explainable machine learning (ML) has gained traction in recent years due to
the increasing adoption of ML-based systems in many sectors. Counterfactual
explanations (CFEs) provide “what if” feedback of the form “if an input
datapoint were x' instead of x, then an ML-based system's output would be
y' instead of y.” CFEs are attractive due to their actionable feedback,
amenability to existing legal frameworks, and fidelity to the underlying ML
model. Yet current CFE approaches are single-shot – that is, they assume x
can change to x' in a single time period. We propose a novel
stochastic-control-based approach that generates sequential CFEs, that is, CFEs
that allow x to move stochastically and sequentially across intermediate
states to a final state x'. Our approach is model-agnostic and requires only black-box access to the model.
Furthermore, calculation of CFEs is amortized such that once trained, it
applies to multiple datapoints without the need for re-optimization. In
addition to these primary characteristics, our approach admits optional
desiderata such as adherence to the data manifold, respect for causal
relations, and sparsity – identified by past research as desirable properties
of CFEs. We evaluate our approach using three real-world datasets and show
successful generation of sequential CFEs that respect other counterfactual
desiderata.
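
To make the idea of a sequential CFE concrete, the following is a minimal toy sketch, not the authors' method. It assumes a hypothetical two-feature classifier (`black_box`) that can only be queried, and a hypothetical hand-written step rule (`policy`) standing in for a trained policy. The rollout is deterministic for simplicity, whereas the abstract describes stochastic transitions. All names, features, and thresholds are illustrative assumptions.

```python
from typing import Callable, List, Tuple

State = Tuple[float, float]  # hypothetical features: (income, debt)


def black_box(x: State) -> int:
    """Hypothetical classifier; the rollout only queries its output."""
    income, debt = x
    return 1 if income - debt >= 5.0 else 0


def policy(x: State) -> State:
    """Hypothetical fixed step rule standing in for a trained policy.

    It changes one feature by one unit per step: pay down debt first,
    then raise income.
    """
    income, debt = x
    if debt > 0.0:
        return (income, max(0.0, debt - 1.0))
    return (income + 1.0, debt)


def sequential_cfe(
    x: State,
    model: Callable[[State], int],
    step: Callable[[State], State],
    target: int = 1,
    max_steps: int = 20,
) -> List[State]:
    """Return the path x -> ... -> x', stopping once the model outputs
    `target` or after `max_steps` steps."""
    path = [x]
    while model(path[-1]) != target and len(path) <= max_steps:
        path.append(step(path[-1]))
    return path


if __name__ == "__main__":
    # The same step rule is reused for several datapoints without any
    # per-datapoint optimization, which is the sense of "amortized" here.
    for start in [(4.0, 2.0), (1.0, 0.0)]:
        print(start, "->", sequential_cfe(start, black_box, policy))
```

In this sketch, each intermediate state in the returned path is one small change away from the previous one, and the final state is the counterfactual x'. A single-shot CFE would report only the final state.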
arXiv:2106.03962v1