Gamma-Nets: Generalizing Value Estimation over Timescale
by
Craig Sherstan, Shibhansh Dohare, James MacGlashan, Johannes Günther,
Patrick M. Pilarski
2019
Abstract
We present Γ-nets, a method for generalizing value function estimation
over timescale. By using the timescale as one of the estimator's inputs, we can
estimate value for arbitrary timescales. As a result, the prediction target for
any timescale is available and we are free to train on multiple timescales at
each timestep. Here we empirically evaluate Γ-nets in the policy
evaluation setting. We first demonstrate the approach on a square wave and then
on a robot arm using linear function approximation. Next, we consider the deep
reinforcement learning setting using several Atari video games. Our results
show that Γ-nets can be effective for predicting arbitrary timescales,
with only a small cost in accuracy as compared to learning estimators for fixed
timescales. Γ-nets provide a method for compactly making predictions at
many timescales without requiring a priori knowledge of the task, making them a
valuable contribution to ongoing work on model-based planning, representation
learning, and lifelong learning algorithms.
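
The central idea of the abstract, feeding the timescale to the estimator and training on several timescales per timestep, can be illustrated with a minimal sketch. It is not the authors' implementation. The feature encoding, the step size, the choice of discounts, and the names `features`, `td_update`, and `train` are all assumptions made for illustration; the abstract does not specify the paper's architecture or how timescales are sampled. The sketch uses linear TD(0) on a short periodic square wave.

```python
import numpy as np

def features(state, gamma, n_states):
    # Hypothetical encoding (an assumption, not from the paper): a one-hot
    # state vector, plus the same vector scaled by gamma, so that a linear
    # estimator can vary its prediction with the timescale.
    one_hot = np.zeros(n_states)
    one_hot[state] = 1.0
    return np.concatenate([one_hot, gamma * one_hot])

def td_update(w, x, reward, x_next, gamma, alpha):
    # Standard TD(0) update for a linear value estimate v = w . x.
    delta = reward + gamma * (w @ x_next) - (w @ x)
    return w + alpha * delta * x

def train(signal, gammas, alpha=0.05, sweeps=200):
    # The signal is treated as a deterministic cycle of states; the value
    # of signal at the next state plays the role of the reward.
    n_states = len(signal)
    w = np.zeros(2 * n_states)
    for _ in range(sweeps):
        for s in range(n_states):
            s_next = (s + 1) % n_states
            # The same transition is reused to train every chosen timescale.
            for g in gammas:
                w = td_update(w,
                              features(s, g, n_states),
                              signal[s_next],
                              features(s_next, g, n_states),
                              g, alpha)
    return w

square_wave = np.array([1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0])
weights = train(square_wave, gammas=[0.0, 0.5, 0.9])

# Query a timescale that was not used during training.
value = weights @ features(0, 0.7, len(square_wave))
```

Because the discount is an input, one weight vector serves every timescale, and the final line queries a discount (0.7) that never appeared in training. How accurate such an interpolated prediction is depends on the encoding; the linear-in-gamma features here are only a stand-in.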
Archived Files and Locations
application/pdf 2.8 MB
arxiv.org (repository), web.archive.org (webarchive)
1911.07794v3