Quick Notes - Day 1
My parental leave ends in a month, so I have been thinking about getting back into a regular reading habit. I felt that committing to writing quick notes on the papers I came across each day would be a good way to bring back some working discipline after months of a different form of busyness. Note that these notes are for my own reference; I don't have any greater goal in mind. Another important caveat: I am not reading each paper as if I were writing a review or a thesis on it. I am just skimming through, focusing on breadth rather than depth.
BERTScore: Evaluating Text Generation with BERT
Authors: Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, Yoav Artzi
Published at ICLR 2020. url
BERTScore is a new metric for evaluating text generation tasks such as machine translation, caption generation, some forms of text summarization, question answering, etc. A common practice in evaluating text generation systems is to compare the system-generated text with reference texts created by humans. There are several measures for this (e.g., BLEU, ROUGE). A primary criticism of such metrics is that they only look at surface-level similarity between the system output and the human references, and will thus penalize well-generated but more creative text that does not use the exact same vocabulary. BERTScore uses pre-trained contextual embeddings from BERT instead of direct n-gram overlap, thus (hopefully) capturing paraphrases and semantically similar texts.
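To make the mechanics concrete, here is a minimal sketch of the greedy-matching computation at the core of the metric: embed both sentences, compute pairwise cosine similarities between tokens, and match each token to its most similar counterpart on the other side. This is my own illustration, not the authors' code; the random vectors below stand in for BERT hidden states, and the paper's optional idf weighting of tokens is omitted.

```python
import numpy as np

def bertscore_f1(cand_emb, ref_emb):
    """Greedy-matching BERTScore F1 from token embeddings.

    cand_emb: (m, d) array of candidate token embeddings
    ref_emb:  (n, d) array of reference token embeddings
    (Stand-ins for BERT hidden states; the paper also supports idf weighting.)
    """
    # Cosine similarity: L2-normalize the rows, then take dot products.
    c = cand_emb / np.linalg.norm(cand_emb, axis=1, keepdims=True)
    r = ref_emb / np.linalg.norm(ref_emb, axis=1, keepdims=True)
    sim = c @ r.T                        # (m, n) pairwise token similarities

    precision = sim.max(axis=1).mean()   # each candidate token -> best reference match
    recall = sim.max(axis=0).mean()      # each reference token -> best candidate match
    return 2 * precision * recall / (precision + recall)

# Toy usage with random "embeddings" in place of BERT outputs.
rng = np.random.default_rng(0)
cand, ref = rng.normal(size=(7, 768)), rng.normal(size=(9, 768))
print(bertscore_f1(cand, ref))
```

Precision asks how well each candidate token is supported by the reference, recall asks the reverse, and the reported BERTScore F1 is their harmonic mean.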
Through a large set of experiments on two tasks (machine translation and image captioning), the authors show that BERTScore correlates highly with human judgements, and, in the case of machine translation, better than several existing metrics.
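For context, this kind of metric meta-evaluation usually boils down to correlating metric scores with human quality ratings over the same outputs. A toy sketch with made-up numbers (not the paper's data):

```python
from scipy.stats import pearsonr, kendalltau

# Made-up per-segment scores: a metric's outputs vs. human quality ratings.
metric_scores = [0.91, 0.73, 0.88, 0.60, 0.79]
human_scores  = [4.5, 3.0, 4.0, 2.5, 3.5]

print(pearsonr(metric_scores, human_scores))    # linear correlation (Pearson r)
print(kendalltau(metric_scores, human_scores))  # rank agreement (Kendall tau)
```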
I found the paper interesting - at its core, this is a simple (and, in hindsight, obvious) idea for anyone familiar with how current text generation metrics work. The primary reason I liked it is the extensive experiments and detailed analysis of results (there is even more analysis in the appendix).
Data-Driven Sentence Simplification: Survey and Benchmark
Authors: Fernando Alva-Manchego, Carolina Scarton, Lucia Specia
Published in Computational Linguistics, 2020. url
Sentence simplification is the task of automatically rewriting a given sentence in simpler, supposedly easier-to-comprehend language, without loss of meaning. A range of approaches has been proposed for this task, from rule-based approaches in the 1990s to the neural sequence-to-sequence generation approaches of the recent past. Having spent some time on this topic in the past, I remember two major surveys - Siddharthan (2014) and Saggion (2017). The current survey extends these by covering a lot of recent work in detail; I felt it also addressed all aspects of the task (datasets, methods, evaluation), gave a critical summary of existing approaches, and benchmarked some of the common datasets used for the task. This is a great starting point for anyone wanting to start working on this topic!
These are the two papers I spent some time on today (beyond reading titles and abstracts and moving on, that is!). I hope to continue my attempt to get back to what I call “working discipline” by writing these posts every day (or every other day).