Video Object Segmentation with Episodic Graph Memory Networks
by
Xiankai Lu, Wenguan Wang, Martin Danelljan, Tianfei Zhou, Jianbing Shen, Luc Van Gool
2020
Abstract
How to make a segmentation model efficiently adapt to a specific video and to
online target appearance variations is a fundamental issue in the field of
video object segmentation. In this work, a graph memory network is
developed to address the novel idea of "learning to update the segmentation
model". Specifically, we exploit an episodic memory network, organized as a
fully connected graph, to store frames as nodes and capture cross-frame
correlations by edges. Further, learnable controllers are embedded to ease
memory reading and writing, as well as maintain a fixed memory scale. The
structured, external memory design enables our model to comprehensively mine
and quickly store new knowledge, even with limited visual information, and the
differentiable memory controllers slowly learn, via gradient descent, how to
store useful representations in the memory and how to later use these
representations for prediction. In addition, the proposed graph memory
network yields a neat yet principled framework that generalizes well to both
one-shot and zero-shot video object segmentation tasks. Extensive experiments
on four challenging benchmark datasets verify that our graph memory network is
able to facilitate the adaptation of the segmentation network for case-by-case
video object segmentation.
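The read/write mechanism sketched in the abstract can be made concrete with a short example. The snippet below is a minimal, illustrative PyTorch sketch of an episodic memory with a fixed number of nodes (frame embeddings): reads attend over cross-node correlations, and writes update the nodes through a gated controller so the memory size never grows. The GraphMemory class, the GRU-based controllers, and all dimensions are assumptions made for exposition, not the paper's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphMemory(nn.Module):
    """Illustrative episodic graph memory: a fixed set of frame-embedding
    nodes, attention-based reads, and gated writes (assumed design)."""

    def __init__(self, num_nodes=4, dim=256):
        super().__init__()
        self.num_nodes = num_nodes
        # Linear projections for scoring cross-frame correlations (edges).
        self.key = nn.Linear(dim, dim)
        self.query = nn.Linear(dim, dim)
        # Learnable controllers: gated updates keep the memory scale fixed.
        self.read_gate = nn.GRUCell(dim, dim)
        self.write_gate = nn.GRUCell(dim, dim)

    def read(self, memory, query):
        # memory: (num_nodes, dim); query: (dim,) current-frame embedding.
        scores = self.key(memory) @ self.query(query)   # edge weights, (N,)
        attn = F.softmax(scores, dim=0)
        summary = attn @ memory                         # weighted node mix
        # Fuse the retrieved summary into the query state for prediction.
        return self.read_gate(summary.unsqueeze(0), query.unsqueeze(0))[0]

    def write(self, memory, frame):
        # Gated in-place update of all nodes with the new frame's
        # embedding; the number of nodes never changes.
        return self.write_gate(frame.unsqueeze(0).expand_as(memory), memory)
```

A typical episode would read before segmenting each frame and write afterwards, e.g.:

```python
mem = GraphMemory()
nodes = torch.randn(4, 256)      # initial node states, e.g. first-frame features
q = torch.randn(256)             # current-frame embedding
q_out = mem.read(nodes, q)       # read: refine the query for segmentation
nodes = mem.write(nodes, q_out)  # write: absorb the frame; size stays fixed
```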
Archived Files and Locations
application/pdf, 5.1 MB
arxiv.org (repository) · web.archive.org (webarchive)
arXiv: 2007.07020v2