Mastering Atari with Discrete World Models
by
Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba
2020
Abstract
Intelligent agents need to generalize from past experience to achieve goals
in complex environments. World models facilitate such generalization and allow
learning behaviors from imagined outcomes to increase sample-efficiency. While
learning world models from image inputs has recently become feasible for some
tasks, modeling Atari games accurately enough to derive successful behaviors
has remained an open challenge for many years. We introduce DreamerV2, a
reinforcement learning agent that learns behaviors purely from predictions in
the compact latent space of a powerful world model. The world model uses
discrete representations and is trained separately from the policy. DreamerV2
constitutes the first agent that achieves human-level performance on the Atari
benchmark of 55 tasks by learning behaviors inside a separately trained world
model. With the same computational budget and wall-clock time, DreamerV2
reaches 200M frames and exceeds the final performance of the top single-GPU
agents IQN and Rainbow.
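The "discrete representations" mentioned in the abstract are vectors of categorical latent variables, which the paper trains with straight-through gradient estimation so that sampling remains differentiable. Below is a minimal JAX sketch of that estimator; the function name, shapes, and placeholder logits are illustrative assumptions, not taken from the paper's codebase.

```python
import jax
import jax.numpy as jnp

def sample_straight_through(key, logits):
    """Sample one-hot categorical latents with straight-through gradients.

    The forward pass returns the hard one-hot sample; the backward pass
    routes gradients through the softmax probabilities instead.
    """
    probs = jax.nn.softmax(logits)
    index = jax.random.categorical(key, logits)
    one_hot = jax.nn.one_hot(index, logits.shape[-1])
    # probs + stop_gradient(one_hot - probs) equals one_hot in value,
    # but its gradient is the gradient of probs (straight-through).
    return probs + jax.lax.stop_gradient(one_hot - probs)

# Example: the paper's latent state uses 32 categorical variables with
# 32 classes each; the uniform logits here are placeholders.
key = jax.random.PRNGKey(0)
logits = jnp.zeros((32, 32))
latent = sample_straight_through(key, logits)  # shape (32, 32), one-hot rows
```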
Archived Files and Locations
application/pdf 594.5 kB
arxiv.org (repository) | web.archive.org (webarchive)
arXiv: 2010.02193v1