AP-MTL: Attention Pruned Multi-task Learning Model for Real-time
Instrument Detection and Segmentation in Robot-assisted Surgery
by
Mobarakol Islam, Vibashan VS, Hongliang Ren
2020
Abstract
Surgical scene understanding and multi-task learning are crucial for
image-guided robotic surgery. Training a real-time robotic system for the
detection and segmentation of high-resolution images poses a challenging
problem under limited computational resources. The resulting perception can be
applied to effective real-time feedback, surgical skill assessment, and
human-robot collaborative surgery to enhance surgical outcomes.
For this purpose, we develop a novel end-to-end trainable real-time Multi-Task
Learning (MTL) model with a weight-shared encoder and task-aware detection and
segmentation decoders. Optimizing multiple tasks toward the same convergence
point is vital and presents a complex problem. We therefore propose an
asynchronous task-aware optimization (ATO) technique that computes
task-oriented gradients and trains the decoders independently.
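
As a rough illustration of the asynchronous, task-oriented updates described
above, the sketch below alternates between a segmentation decoder and a
detection decoder that share an encoder, each stepped by its own optimizer.
It assumes PyTorch; the layer sizes, losses, and optimizer settings are
placeholder choices, not the paper's actual architecture or hyperparameters.

import torch
import torch.nn as nn

# Toy stand-ins for the weight-shared encoder and the two task-aware decoders;
# the real networks are far larger, and these names are hypothetical.
encoder = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
seg_decoder = nn.Conv2d(16, 2, kernel_size=1)                 # per-pixel class logits
det_decoder = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                            nn.Linear(16, 4))                 # box regression head

seg_loss_fn = nn.CrossEntropyLoss()
det_loss_fn = nn.SmoothL1Loss()

# Separate optimizers so each task is stepped with its own task-oriented
# gradients rather than a single summed multi-task loss.
seg_opt = torch.optim.Adam(list(encoder.parameters()) + list(seg_decoder.parameters()), lr=1e-4)
det_opt = torch.optim.Adam(list(encoder.parameters()) + list(det_decoder.parameters()), lr=1e-4)

def train_step(image, seg_target, box_target):
    # Segmentation pass: only segmentation gradients are applied here.
    seg_opt.zero_grad()
    seg_loss = seg_loss_fn(seg_decoder(encoder(image)), seg_target)
    seg_loss.backward()
    seg_opt.step()

    # Detection pass: the detection decoder is trained independently.
    det_opt.zero_grad()
    det_loss = det_loss_fn(det_decoder(encoder(image)), box_target)
    det_loss.backward()
    det_opt.step()
    return seg_loss.item(), det_loss.item()

# Example call with random tensors standing in for a batch.
losses = train_step(torch.randn(2, 3, 64, 64),
                    torch.randint(0, 2, (2, 64, 64)),
                    torch.rand(2, 4))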
Moreover, MTL models are computationally expensive, which hinders real-time
application. To address this challenge, we introduce global attention dynamic
pruning (GADP), which removes less significant and sparse parameters.
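
As a loose illustration of removing less significant parameters, the snippet
below zeroes out the smallest-magnitude weights of one convolution layer. This
is a generic magnitude-based stand-in, not the paper's attention-guided GADP
criterion; the function name and keep_ratio are hypothetical.

import torch
import torch.nn as nn

def prune_least_significant(conv: nn.Conv2d, keep_ratio: float = 0.8) -> torch.Tensor:
    # Zero out the lowest-magnitude weights of a single convolution.
    # A plain magnitude criterion stands in here for the paper's
    # attention-guided pruning; keep_ratio is an illustrative choice.
    weights = conv.weight.data
    threshold = torch.quantile(weights.abs().flatten(), 1.0 - keep_ratio)
    mask = (weights.abs() >= threshold).to(weights.dtype)
    conv.weight.data.mul_(mask)      # remove less significant / sparse parameters
    return mask

# Example: keep the strongest 80% of weights in one conv layer.
layer = nn.Conv2d(16, 32, kernel_size=3, padding=1)
kept_mask = prune_least_significant(layer, keep_ratio=0.8)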
We further design a skip squeeze-and-excitation (SE) module, which suppresses
weak features, excites significant features, and performs dynamic spatial and
channel-wise feature re-calibration.
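
The skip SE module is not specified in detail here; as a minimal sketch, the
block below combines channel-wise and spatial re-calibration in the spirit of
squeeze-and-excitation, assuming PyTorch. The class name, reduction ratio, and
the max-fusion of the two gates are assumptions, not the authors' exact design.

import torch
import torch.nn as nn

class SkipSEBlock(nn.Module):
    # Illustrative squeeze-and-excitation style block performing both
    # channel-wise and spatial feature re-calibration (details assumed).
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel gate: squeeze spatially, then learn a per-channel weight.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial gate: collapse channels into a single attention map.
        self.spatial_gate = nn.Sequential(nn.Conv2d(channels, 1, kernel_size=1),
                                          nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        channel_recal = x * self.channel_gate(x)    # excite/suppress channels
        spatial_recal = x * self.spatial_gate(x)    # excite/suppress locations
        return torch.max(channel_recal, spatial_recal)

# Example: re-calibrate a 32-channel feature map.
features = torch.randn(1, 32, 64, 64)
recalibrated = SkipSEBlock(32)(features)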
Validated on the robotic instrument segmentation dataset of the MICCAI
Endoscopic Vision Challenge, our model significantly outperforms
state-of-the-art segmentation and detection models, including the
best-performing models in the challenge.
Archived Files and Locations
application/pdf 794.0 kB
arxiv.org (repository) | web.archive.org (webarchive)
2003.04769v1