Layer-Wise Cross-View Decoding for Sequence-to-Sequence Learning

by Fenglin Liu, Xuancheng Ren, Guangxiang Zhao, Xu Sun, Liangyou Li

Released as an article.

2020  

Abstract

In sequence-to-sequence learning, the decoder relies on the attention mechanism to efficiently extract information from the encoder. While it is common practice to draw information from only the last encoder layer, recent work has proposed to use representations from different encoder layers for diversified levels of information. Nonetheless, the decoder still obtains only a single view of the source sequences, which might lead to insufficient training of the encoder layer stack due to the hierarchy bypassing problem. In this work, we propose layer-wise cross-view decoding, where for each decoder layer, together with the representations from the last encoder layer, which serve as a global view, those from other encoder layers are supplemented for a stereoscopic view of the source sequences. Systematic experiments show that we successfully address the hierarchy bypassing problem and substantially improve the performance of sequence-to-sequence learning with deep representations on diverse tasks.
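The idea of giving each decoder layer two views of the source can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several things here are assumptions made for illustration: the function names (`attend`, `cross_view_context`), the pairing of a decoder layer with one particular encoder layer through `layer_index`, the use of a plain average to merge the two views, and the omission of learned projections and multiple heads. The abstract does not specify any of these details.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(queries, memory):
    # Scaled dot-product attention; memory serves as both keys and values.
    # Learned query/key/value projections are omitted (simplifying assumption).
    d = queries.shape[-1]
    weights = softmax(queries @ memory.T / np.sqrt(d))
    return weights @ memory

def cross_view_context(decoder_states, encoder_layers, layer_index):
    # Global view: representations from the last encoder layer.
    global_view = attend(decoder_states, encoder_layers[-1])
    # Supplementary view: representations from another encoder layer.
    # Which layer to pair with is an assumption of this sketch.
    other_view = attend(decoder_states, encoder_layers[layer_index])
    # Averaging the two views is an assumption, not taken from the abstract.
    return 0.5 * (global_view + other_view)

rng = np.random.default_rng(0)
# Three hypothetical encoder layers, each with 5 source positions of width 8.
encoder_layers = [rng.normal(size=(5, 8)) for _ in range(3)]
# Four hypothetical target positions of width 8.
decoder_states = rng.normal(size=(4, 8))

context = cross_view_context(decoder_states, encoder_layers, layer_index=0)
print(context.shape)
```

In this sketch, choosing the last encoder layer as the supplementary view collapses the result to ordinary single-view attention, which makes the contrast with the common practice described in the abstract easy to see.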

Archived Files and Locations

application/pdf  1.7 MB
file_4d5s4etchrdcfot5ujcrpftvem
arxiv.org (repository)
web.archive.org (webarchive)
Type  article
Stage   submitted
Date   2020-10-21
Version   v3
Language   en
arXiv  2005.08081v3
Work Entity
access all versions, variants, and formats of this work (e.g., pre-prints)
Catalog Record
Revision: c3dccf82-10a4-4cef-8383-9e92c2c8f90f