MTFNet: Mutual-Transformer Fusion Network for RGB-D Salient Object Detection

by Xixi Wang, Bo Jiang, Xiao Wang, Bin Luo

Released as an article.

2021  

Abstract

Salient object detection (SOD) on RGB-D images is an active problem in computer vision. The main challenges for the RGB-D SOD problem are 1) how to extract accurate features from RGB and Depth image data with cluttered backgrounds or poor image quality, and 2) how to exploit the complementary information between RGB and Depth image data. To address these challenges, we propose a novel Mutual-Transformer Fusion Network (MTFNet) for RGB-D SOD. MTFNet contains two main modules, i.e., the Focal Feature Extractor (FFE) and Mutual-Transformer Fusion (MTF). FFE aims to extract more accurate CNN features for RGB and Depth images by introducing a novel pixel-level focal regularization to guide the CNN feature extractor. MTF is designed to deeply exploit the multi-modal interaction between RGB and Depth images on both coarse and fine scales. The main benefit of MTF is that it conducts intra-modality and inter-modality learning simultaneously and thus can achieve communication across different modalities more directly and sufficiently. Comprehensive experimental results on six public benchmarks demonstrate the superiority of our proposed MTFNet.
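The paper itself defines the MTF module; purely as an illustration of the inter-modality exchange the abstract describes, below is a minimal sketch (not the authors' implementation) of single-head cross-attention in which RGB tokens attend to Depth tokens and vice versa, followed by a residual merge. All shapes, names, and the omission of learned projections are assumptions for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values, d_k):
    # Scaled dot-product attention; per-modality Q/K/V projections
    # and multi-head splitting are omitted in this sketch.
    scores = queries @ keys_values.T / np.sqrt(d_k)   # (N_q, N_kv)
    return softmax(scores, axis=-1) @ keys_values     # (N_q, d)

def mutual_fusion(rgb_tokens, depth_tokens):
    d = rgb_tokens.shape[-1]
    # Inter-modality: each modality queries the other.
    rgb_from_depth = cross_attention(rgb_tokens, depth_tokens, d)
    depth_from_rgb = cross_attention(depth_tokens, rgb_tokens, d)
    # Residual merge keeps each modality's own (intra) features.
    return rgb_tokens + rgb_from_depth, depth_tokens + depth_from_rgb

rng = np.random.default_rng(0)
rgb = rng.standard_normal((49, 64))    # e.g. 7x7 coarse-scale tokens, dim 64
depth = rng.standard_normal((49, 64))
fused_rgb, fused_depth = mutual_fusion(rgb, depth)
print(fused_rgb.shape, fused_depth.shape)  # (49, 64) (49, 64)
```

In the actual MTFNet this exchange happens jointly with intra-modality self-attention and at both coarse and fine scales; the sketch only shows the cross-modal direction of information flow.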

Archived Files and Locations

application/pdf  2.4 MB
arxiv.org (repository)
web.archive.org (webarchive)
Type  article
Stage   submitted
Date   2021-12-02
Version   v1
Language   en
arXiv  2112.01177v1