RadFusion: Benchmarking Performance and Fairness for Multimodal Pulmonary Embolism Detection from CT and EHR
by
Yuyin Zhou, Shih-Cheng Huang, Jason Alan Fries, Alaa Youssef, Timothy J. Amrhein, Marcello Chang, Imon Banerjee, Daniel Rubin, Lei Xing, Nigam Shah, Matthew P. Lungren
2021
Abstract
Despite the routine use of electronic health record (EHR) data by radiologists to contextualize clinical history and inform image interpretation, the majority of deep learning architectures for medical imaging are unimodal, i.e., they only learn features from pixel-level information. Recent research revealing how race can be recovered from pixel data alone highlights the potential for serious biases in models that fail to account for demographics and other key patient attributes. Yet the lack of imaging datasets that capture clinical context, inclusive of demographics and longitudinal medical history, has left multimodal medical imaging underexplored. To better assess these challenges, we present RadFusion, a multimodal benchmark dataset of 1,794 patients with corresponding EHR data and high-resolution computed tomography (CT) scans labeled for pulmonary embolism. We evaluate several representative multimodal fusion models and benchmark their fairness properties across protected subgroups, e.g., gender, race/ethnicity, and age. Our results suggest that integrating imaging and EHR data can improve classification performance and robustness without introducing large disparities in the true positive rate across population groups.
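
As a note on the fairness evaluation the abstract describes: a minimal sketch of comparing true positive rates (TPR) across protected subgroups, assuming hypothetical binary label/prediction arrays and subgroup labels (the names y_true, y_pred, and groups are illustrative, not from the paper):

    import numpy as np

    def tpr_by_group(y_true, y_pred, groups):
        """Compute the true positive rate within each protected subgroup.

        y_true, y_pred: binary arrays (1 = pulmonary embolism present/predicted)
        groups: array of subgroup labels (e.g., gender or race/ethnicity)
        """
        rates = {}
        for g in np.unique(groups):
            positives = (y_true == 1) & (groups == g)
            if positives.sum() == 0:
                continue  # skip subgroups with no positive cases
            rates[g] = float((y_pred[positives] == 1).mean())
        return rates

    # TPR disparity: gap between the best- and worst-served subgroups
    # rates = tpr_by_group(y_true, y_pred, groups)
    # disparity = max(rates.values()) - min(rates.values())

A small disparity value indicates that the model detects positive cases at similar rates across population groups, which is the property the abstract reports for the fusion models.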
Archived Files and Locations
application/pdf, 713.6 kB (arXiv: 2111.11665v1); available from arxiv.org (repository) and web.archive.org (webarchive)