Regularization Matters in Policy Optimization

by Zhuang Liu, Xuanlin Li, Bingyi Kang, Trevor Darrell

Released as an article.

2021  

Abstract

Deep Reinforcement Learning (Deep RL) has been receiving increasingly more attention thanks to its encouraging performance on a variety of control tasks. Yet, conventional regularization techniques in training neural networks (e.g., L_2 regularization, dropout) have been largely ignored in RL methods, possibly because agents are typically trained and evaluated in the same environment, and because the deep RL community focuses more on high-level algorithm designs. In this work, we present the first comprehensive study of regularization techniques with multiple policy optimization algorithms on continuous control tasks. Interestingly, we find that conventional regularization techniques applied to the policy network can often bring large improvements, especially on harder tasks. Our findings are shown to be robust against training hyperparameter variations. We also compare these techniques with the more widely used entropy regularization. In addition, we study regularizing different components and find that regularizing only the policy network is typically best. We further analyze why regularization may help generalization in RL from four perspectives: sample complexity, reward distribution, weight norm, and noise robustness. We hope our study provides guidance for future practices in regularizing policy optimization algorithms. Our code is available at https://github.com/xuanlinli17/iclr2021_rlreg .
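As a hedged illustration of the techniques the abstract names, the sketch below shows how L_2 regularization (as per-parameter-group weight decay) and dropout might be applied to the policy network only, with the value network left unregularized, and how an entropy bonus enters the policy loss for comparison. This is a minimal sketch assuming a PyTorch setup; the PolicyNetwork class, network sizes, and regularization coefficients are hypothetical and not taken from the paper.

```python
import torch
import torch.nn as nn

# Hypothetical Gaussian policy for continuous control (not the paper's code).
class PolicyNetwork(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=64, dropout_p=0.1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(obs_dim, hidden),
            nn.Tanh(),
            nn.Dropout(p=dropout_p),  # dropout applied to the policy network
            nn.Linear(hidden, hidden),
            nn.Tanh(),
        )
        self.mean_head = nn.Linear(hidden, act_dim)
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def forward(self, obs):
        h = self.body(obs)
        return self.mean_head(h), self.log_std.exp()

policy = PolicyNetwork(obs_dim=17, act_dim=6)
value_fn = nn.Sequential(nn.Linear(17, 64), nn.Tanh(), nn.Linear(64, 1))

# L_2 regularization as weight decay on the policy parameters only; the value
# network gets no weight decay, mirroring the finding that regularizing just
# the policy network is typically best. Coefficients here are placeholders.
optimizer = torch.optim.Adam(
    [
        {"params": policy.parameters(), "weight_decay": 1e-4},
        {"params": value_fn.parameters(), "weight_decay": 0.0},
    ],
    lr=3e-4,
)

# One schematic policy-gradient step with an entropy bonus for comparison
# (observations and advantages are random placeholders, not real rollouts).
obs = torch.randn(32, 17)
mean, std = policy(obs)
dist = torch.distributions.Normal(mean, std)
actions = dist.sample()
log_probs = dist.log_prob(actions).sum(-1)
advantages = torch.randn(32)  # placeholder advantage estimates
entropy_bonus = 0.01 * dist.entropy().sum(-1).mean()

loss = -(log_probs * advantages).mean() - entropy_bonus
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

In practice the weight decay, dropout rate, and entropy coefficient would be tuned per task; per the abstract, the benefit of such conventional regularization is most pronounced on harder tasks.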

Archived Files and Locations

application/pdf, 16.9 MB, archived at arxiv.org (repository) and web.archive.org (webarchive)
Type: article
Stage: submitted
Date: 2021-11-28
Version: v5
Language: en
arXiv: 1910.09191v5