Quick Notes - Day 7
All previous posts in this series here
Today’s theme: abstractive text summarization
TLDR: Extreme Summarization of Scientific Documents
Authors: Isabel Cachola, Kyle Lo, Arman Cohan, Daniel S. Weld
Published at: (may not be peer reviewed)ArXiv url
This paper introduces the task of (extreme) abstractive summarization of research articles. Abstract is already a summary, but also contains some information such as methodological details and stuff. The aim of this paper is to capture important aspects of the paper such as its main contributions. They created a dataset from openreview (which contains author written tldrs as well as reviewer version tldrs) built a tldr summarizer by tuning a BART model.
I find the idea of using openreview as dataset interesting, but I am not exactly as excited by the idea of creating such super short summaries for research papers. I feel it won’t serve any clear purpose that is not addressed by the abstract itself. I can’t envision a scenario where a human reader will read this tldr and get the idea. The authors write: “An alternative to abstracts, TLDRs of scientific papers leave out nonessential background or methodological details and capture the key important aspects of the paper, such as its main contributions.” - I don’t think those are non essential details. The paper’s key contributions can’t make any sense if methodology is flawed or if it just reinvents the wheel without being aware of related work or something like that.
I saw this paper mentioned in Import AI newsletter this week where there were exampls of poor output from this model. After seeing those, I felt the paper should have had some section with bad examples as well along with good ones, followed by some analysis of the model itself. So I was kind of frustrated towards the end.
Don’t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization
Authors: Shashi Narayan, Shay B. Cohen, Mirella Lapata
Published at: EMNLP 2018 url
This paper introduces the task of “extreme summarization” where the goal is to create a one sentence summary of a news article answering the question “What is this article about?”. In recent past, different variants of encoder-decoder architectures, RNNs etc became common methods for this kind of generation tasks. However, this paper instead uses an architecture entirely built on CNNs. They show with both automatic and human evaluations that this approach outperforms extractive summarization as well as other abstractive summarization approaches.
I came to this paper after reading the TLDR paper. Here too, I was fascinated by this idea, and for news domain, it actually made more sense, as such a summary could be a headline or the deck under it. However, even here, I wondered if all “what is this article about” questions can be answered meaningfully in one line. I would have loved to see a few examples of where one sentence summary was insufficient or the obtained summary was “bad” according to humans. Surely, these are all not perfect approaches, there have to be some such examples! I think some sort of model criticism should be made a part of all NLP papers!