TED: A Pretrained Unsupervised Summarization Model with Theme Modeling and Denoising
release_n4penvk2dven3nzj3s4q4c67z4
by
Ziyi Yang, Chenguang Zhu, Robert Gmyr, Michael Zeng, Xuedong Huang, Eric Darve
2020
Abstract
Text summarization aims to extract the essential information from a piece of text
and condense it into a concise version. Existing unsupervised abstractive
summarization models rely on the recurrent neural network framework, while the
more recently proposed Transformer exhibits far greater modeling capability.
Moreover, most previous summarization models ignore the abundant unlabeled
corpora available for pretraining. To address these issues, we propose TED, a
transformer-based unsupervised abstractive summarization system pretrained on
large-scale data. We first leverage the lead bias in news articles to pretrain
the model on millions of unlabeled documents. Next, we finetune TED on target
domains through theme modeling and a denoising autoencoder to enhance the
quality of generated summaries. Notably, TED outperforms all unsupervised
abstractive baselines on the NYT, CNN/DM, and English Gigaword datasets, which
span a variety of document styles. Further analysis shows that the summaries
generated by TED are highly abstractive and that each component of TED's
objective function is highly effective.
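To make the two data-side ideas in the abstract concrete, the Python sketch
below shows one plausible way to build lead-bias pretraining pairs (the first
few sentences of a news article serve as the pseudo-summary target) and to
corrupt input for a denoising autoencoder. This is a minimal illustration, not
the authors' implementation: the helper names (lead_bias_pair, corrupt), the
three-sentence lead, and the deletion/shuffle parameters are assumptions.

import random

def lead_bias_pair(sentences, k=3):
    """Lead-bias pretraining pair (assumed k=3): the first k sentences of a
    news article act as the pseudo-summary target, the rest as the source."""
    target = " ".join(sentences[:k])
    source = " ".join(sentences[k:])
    return source, target

def corrupt(tokens, drop_prob=0.1, shuffle_window=3):
    """Denoising-autoencoder corruption: randomly delete tokens, then locally
    shuffle the survivors; the model learns to reconstruct the clean text."""
    kept = [t for t in tokens if random.random() > drop_prob]
    # Local shuffle: perturb each position by a bounded random offset and
    # re-sort, so tokens move at most a few places from where they started.
    keys = [i + random.uniform(0, shuffle_window) for i in range(len(kept))]
    return [t for _, t in sorted(zip(keys, kept))]

if __name__ == "__main__":
    article = ["Lead sentence one.", "Lead sentence two.",
               "Lead sentence three.", "Body sentence A.", "Body sentence B."]
    src, tgt = lead_bias_pair(article)
    print("source:", src)
    print("pseudo-summary target:", tgt)
    print("noised:", corrupt("the quick brown fox jumps over the dog".split()))

In this reading, pretraining needs no labels at all (the lead sentences come
for free), while the denoising objective gives the finetuning stage a
reconstruction signal on the target domain.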
Archived Files and Locations
application/pdf 424.8 kB
file_yrf5liggmbhjhaznwtba6y3v4a
arxiv.org (repository) · web.archive.org (webarchive)
2001.00725v3