TED: A Pretrained Unsupervised Summarization Model with Theme Modeling and Denoising

by Ziyi Yang, Chenguang Zhu, Robert Gmyr, Michael Zeng, Xuedong Huang, Eric Darve

Released as an article.

2020  

Abstract

Text summarization aims to extract the essential information from a piece of text and transform it into a concise version. Existing unsupervised abstractive summarization models rely on recurrent neural network frameworks, while the recently proposed transformer exhibits much greater capability. Moreover, most previous summarization models ignore the abundant unlabeled corpora available for pretraining. To address these issues, we propose TED, a transformer-based unsupervised abstractive summarization system pretrained on large-scale data. We first leverage the lead bias in news articles to pretrain the model on millions of unlabeled documents. Next, we finetune TED on target domains through theme modeling and a denoising autoencoder to enhance the quality of generated summaries. Notably, TED outperforms all unsupervised abstractive baselines on the NYT, CNN/DM, and English Gigaword datasets, which cover various document styles. Further analysis shows that the summaries generated by TED are highly abstractive, and that each component in the objective function of TED is highly effective.
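The abstract's lead-bias pretraining idea can be illustrated with a minimal sketch: the leading sentences of a news article serve as a pseudo-summary target while the remaining sentences form the source document, yielding training pairs without any human labels. This is not the authors' code; the function name, the sentence splitter, and the choice of three lead sentences are illustrative assumptions.

```python
# Minimal sketch (assumptions, not the authors' implementation) of building
# (source, pseudo-summary) pairs from unlabeled news articles via lead bias.
import re
from typing import Tuple


def make_lead_bias_pair(article: str, lead_k: int = 3) -> Tuple[str, str]:
    """Use the first lead_k sentences as a pseudo-summary, the rest as the source."""
    # Naive sentence split on sentence-final punctuation; a real pipeline
    # would use a proper sentence tokenizer.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", article) if s.strip()]
    pseudo_summary = " ".join(sentences[:lead_k])   # leading sentences as the target
    source = " ".join(sentences[lead_k:])           # remainder as the model input
    return source, pseudo_summary


if __name__ == "__main__":
    article = (
        "The city council approved the new budget on Monday. "
        "The plan increases funding for public transit. "
        "Officials said the vote followed months of debate. "
        "Residents at the meeting raised concerns about taxes. "
        "A final report is expected later this year."
    )
    src, tgt = make_lead_bias_pair(article)
    print("SOURCE:", src)
    print("TARGET:", tgt)
```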

Archived Files and Locations

application/pdf  424.8 kB
arxiv.org (repository)
web.archive.org (webarchive)
Type  article
Stage   submitted
Date   2020-10-18
Version   v3
Language   en
arXiv  2001.00725v3
Work Entity
access all versions, variants, and formats of this work (e.g., pre-prints)
Catalog Record
Revision: 9a6a0f30-6f2a-4abe-a1c9-7813f1cf9e0d