Learning Hierarchical Teaching in Cooperative Multiagent Reinforcement
Learning
release_3dkvl5bzprcixnrsbe56i3rxhq
by
Dong Ki Kim, Miao Liu, Shayegan Omidshafiei, Sebastian Lopez-Cot,
Matthew Riemer, Golnaz Habibi, Gerald Tesauro, Sami Mourad, Murray Campbell,
Jonathan P. How
2019
Abstract
Heterogeneous knowledge naturally arises among different agents in
cooperative multiagent reinforcement learning. As such, learning can be greatly
improved if agents can effectively pass their knowledge on to other agents.
Existing work has demonstrated that peer-to-peer knowledge transfer, a process
referred to as action advising, improves team-wide learning. In contrast to
previous frameworks that advise at the level of primitive actions, we aim to
learn high-level teaching policies that decide when and what high-level action
(e.g., sub-goal) to advise a teammate. We introduce a new learning-to-teach
framework, called hierarchical multiagent teaching (HMAT). The proposed
framework solves difficulties faced by prior work on multiagent teaching when
operating in domains with long horizons, delayed rewards, and continuous
states/actions by leveraging temporal abstraction and deep function
approximation. Our empirical evaluations show that HMAT accelerates team-wide
learning progress in difficult environments that are more complex than those
explored in previous work. HMAT also learns teaching policies that can be
transferred to different teammates/tasks and can even teach teammates with
heterogeneous action spaces.
arXiv:1903.03216v1