A Survey of Neural Models for Abstractive Summarization

Tal Baumel, Michael Elhadad

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

In this chapter, we survey recent developments in neural abstractive summarization. These methods achieve state-of-the-art ROUGE results on summarization tasks, especially when the source is short text such as single sentences or short paragraphs. We first describe the basic methodological concepts these models build on (word embeddings, sequence-to-sequence recurrent networks, and the attention mechanism), illustrated with didactic source code in Python. Because such models require massive amounts of training data, we also review the datasets used to train them. We then survey four recent systems which, taken together, have produced dramatic improvements in single-document generic abstractive summarization over the past couple of years. These systems introduce reusable techniques that address each aspect of the summarization challenge: handling a large vocabulary while exploiting the high similarity between source and target documents; handling rare named entities by detecting and copying them from the source to the target; avoiding repetition and redundancy by introducing a distractor mechanism; and introducing sentence-level assessment through reinforcement learning.
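The attention mechanism mentioned above can be sketched compactly in NumPy: the decoder state is compared against every encoder state, the resulting scores are normalized with a softmax, and the encoder states are averaged under those weights to form a context vector. This is an illustrative sketch, not the chapter's own didactic code; the dot-product scoring function and the array shapes are assumptions.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attention(decoder_state, encoder_states):
    """Dot-product attention.

    decoder_state:  (hidden,)           current decoder hidden state
    encoder_states: (src_len, hidden)   one hidden state per source token

    Returns the attention distribution over source positions and the
    context vector (the attention-weighted average of encoder states).
    """
    scores = encoder_states @ decoder_state   # (src_len,) similarity scores
    weights = softmax(scores)                 # (src_len,) sums to 1
    context = weights @ encoder_states        # (hidden,) context vector
    return weights, context

# Toy example: 4 source positions, hidden size 3
rng = np.random.default_rng(0)
enc = rng.normal(size=(4, 3))
dec = rng.normal(size=(3,))
weights, context = attention(dec, enc)
```

The same normalized attention distribution is what copy-based systems reuse as a pointer distribution over source tokens, which is why it is a natural building block for the named-entity copying technique the abstract mentions.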

Original language: American English
Title of host publication: Multilingual Text Analysis
Subtitle of host publication: Challenges, Models, and Approaches
Publisher: World Scientific Publishing Co.
Chapter: 6
Pages: 175-199
Number of pages: 25
ISBN (Electronic): 9789813274884
ISBN (Print): 9789813274877
DOIs
State: Published - 1 Jan 2019

Publication series

Name: Multilingual Text Analysis

All Science Journal Classification (ASJC) codes

  • General Computer Science
