Knowledge-based Embodied Question Answering
by
Sinan Tan, Mengmeng Ge, Di Guo, Huaping Liu, Fuchun Sun
2021
Abstract
In this paper, we propose a novel Knowledge-based Embodied Question Answering
(K-EQA) task, in which the agent intelligently explores the environment to
answer various questions with external knowledge. Unlike existing EQA work,
where the target object is explicitly specified in the question, the agent can
resort to external knowledge to understand more complicated questions such as
"Please tell me what objects are used to cut food in the room?", which requires
knowledge such as "a knife is used for cutting food". To address this K-EQA
problem, we propose a novel framework based on neural program synthesis, which
jointly reasons over external knowledge and a 3D scene graph to perform
navigation and question answering. In particular, the 3D scene graph serves as
a memory that stores visual information about visited scenes, which
significantly improves the efficiency of multi-turn question answering.
Experimental results demonstrate that the proposed framework is capable of
answering more complicated and realistic questions in the embodied
environment. The proposed method is also applicable to multi-agent scenarios.
Archived Files and Locations
application/pdf, 2.0 MB
arxiv.org (repository); web.archive.org (webarchive)
2109.07872v1