Deep Exploration via Bootstrapped DQN
by
Ian Osband, Charles Blundell, Alexander Pritzel, Benjamin Van Roy
2016
Abstract
Efficient exploration in complex environments remains a major challenge for
reinforcement learning. We propose bootstrapped DQN, a simple algorithm that
explores in a computationally and statistically efficient manner through use of
randomized value functions. Unlike dithering strategies such as epsilon-greedy
exploration, bootstrapped DQN carries out temporally-extended (or deep)
exploration; this can lead to exponentially faster learning. We demonstrate
these benefits in complex stochastic MDPs and in the large-scale Arcade
Learning Environment. Bootstrapped DQN substantially improves learning times
and performance across most Atari games.
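The exploration mechanism the abstract describes can be illustrated with a minimal tabular sketch (my own illustrative code, not the paper's implementation, which uses deep Q-networks with a shared torso and multiple heads): maintain several independently initialized Q-value "heads", sample one head uniformly at the start of each episode, and act greedily with it for the whole episode. Because a single sampled value function is followed for many steps, exploration is temporally extended rather than per-step dithering. The Bernoulli masking on updates approximates bootstrap resampling of the data; the class name, table sizes, and hyperparameters here are assumptions for the sketch.

```python
import numpy as np

class BootstrappedQ:
    """Minimal tabular sketch of bootstrapped exploration.

    K independent Q-tables ("heads") stand in for an approximate posterior
    over value functions. One head is sampled per episode and followed
    greedily, giving deep (temporally extended) exploration with no epsilon.
    """

    def __init__(self, n_states, n_actions, n_heads=10, p_mask=0.5, seed=0):
        self.rng = np.random.default_rng(seed)
        # One Q-table per head; random init so heads disagree initially.
        self.q = self.rng.normal(scale=0.1,
                                 size=(n_heads, n_states, n_actions))
        self.n_heads = n_heads
        self.p_mask = p_mask  # Bernoulli mask probability for updates
        self.active_head = 0

    def begin_episode(self):
        # Sample which head drives behaviour for the coming episode.
        self.active_head = int(self.rng.integers(self.n_heads))

    def act(self, state):
        # Greedy w.r.t. the active head; no per-step dithering.
        return int(np.argmax(self.q[self.active_head, state]))

    def update(self, state, action, reward, next_state, done,
               alpha=0.1, gamma=0.99):
        # Each head trains on this transition only if its mask bit is 1,
        # approximating bootstrap resampling of the experience data.
        mask = self.rng.random(self.n_heads) < self.p_mask
        for k in np.flatnonzero(mask):
            target = reward
            if not done:
                target += gamma * np.max(self.q[k, next_state])
            self.q[k, state, action] += alpha * (
                target - self.q[k, state, action])
```

In the full algorithm the heads share a network torso and are trained from a common replay buffer, but the episode-level head sampling shown here is the core of the exploration strategy.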
Archived Files and Locations
application/pdf (6.9 MB): arxiv.org (repository), web.archive.org (webarchive), arXiv:1602.04621v2