Whatever Does Not Kill Deep Reinforcement Learning, Makes It Stronger
by Vahid Behzadan and Arslan Munir
2017
Abstract
Recent developments have established the vulnerability of deep Reinforcement
Learning (RL) to policy manipulation attacks via adversarial perturbations. In
this paper, we investigate the robustness and resilience of deep RL to
training-time and test-time attacks. Through experimental results, we
demonstrate that under noncontiguous training-time attacks, Deep Q-Network
(DQN) agents can recover and adapt to the adversarial conditions by reactively
adjusting the policy. Our results also show that policies learned under
adversarial perturbations are more robust to test-time attacks. Furthermore, we
compare the performance of ϵ-greedy and parameter-space noise
exploration methods in terms of robustness and resilience against adversarial
perturbations.
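To make the contrast between the two exploration strategies named above concrete, the following rough sketch (not the paper's implementation) compares them side by side. It stands in a toy linear Q-function for a DQN; the names W, N_ACTIONS, STATE_DIM, and the noise scale sigma are hypothetical placeholders.

    import numpy as np

    rng = np.random.default_rng(0)

    N_ACTIONS = 4   # hypothetical action-space size
    STATE_DIM = 8   # hypothetical state dimensionality

    # Toy linear Q-function standing in for a DQN: Q(s, a) = (W @ s)[a].
    # W plays the role of the network parameters.
    W = rng.normal(size=(N_ACTIONS, STATE_DIM))

    def epsilon_greedy(state, epsilon=0.1):
        """Epsilon-greedy: randomize the *action selection* -- with
        probability epsilon pick a uniformly random action, otherwise
        act greedily with respect to the unperturbed Q-function."""
        if rng.random() < epsilon:
            return int(rng.integers(N_ACTIONS))
        return int(np.argmax(W @ state))

    def parameter_space_noise(state, sigma=0.05):
        """Parameter-space noise: perturb the *parameters* (here, once
        per call; in practice once per rollout) and then act greedily
        with respect to the perturbed Q-function."""
        W_noisy = W + sigma * rng.normal(size=W.shape)
        return int(np.argmax(W_noisy @ state))

    state = rng.normal(size=STATE_DIM)
    print(epsilon_greedy(state), parameter_space_noise(state))

The design difference this sketch highlights: epsilon-greedy injects independent randomness at every step, while parameter-space noise holds one perturbation of the weights fixed across a rollout, giving temporally consistent exploratory behavior; the paper compares the two along exactly this robustness/resilience axis.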
Archived Files and Locations
application/pdf, 208.2 kB
arxiv.org (repository); web.archive.org (webarchive)
arXiv: 1712.09344v1