Quick Notes - Day 8

All previous posts in this series here

theme: ACL 2020

Clinical Reading Comprehension: A Thorough Analysis of the emrQA Dataset
Authors: Xiang Yue, Bernal Jimenez Gutierrez, Huan Sun
Published at: ACL 2020 url

This paper addresses the problem of question answering in clinical domain. They start with a qualitative analysis of a recent clinical QA dataset called emrQA and find that most answers in the dataset are incomplete and hard to read, and most questions are very simple. They then did a quantitative analysis with BERT, and other such models. They then worked on adding domain knowledge to their models. Finally, they study the generalizability of this approach on unseen medical notes, and evaluate it with medical experts. They conclude that we need better datasets for clinical reading comprehension and question answering.

I found the whole paper very interesting. It took up a meaningful problem, thoroughly studied an existing dataset both qualitatively as well as quantitatively, and then tried to extend the state of the art on clinical QA. I felt such papers which thoroughly analyse popular datasets used in NLP should come more often, so that we know what to do to create better quality datasets.

Text and Causal Inference: A Review of Using Text to Remove Confounding from Causal Estimates
Authors: Katherine A. Keith, David Jensen, Brendan O’Connor
Published at: ACL 2020 url

Apparently, “In causal research about human behavior and society, there are potentially many latent confounding variables that can be measured from unstructured text data”. This paper presents a review of using text in causal inference research, intended to serve three audiences: causal inference researchers, NLP researchers and applied practitioners. The authors started with examples of studies where text was used in causal inference research either as a confounder or as a surrogate for confounder. After a brief overview of casual inference frameworks, there was a review of how to learn confounders via text, and how to adjust for confounding bias. There is then a long review on evaluation aspects of causal methods.

I am amazed that this paper is appearing in a major NLP venue. It seems very off-beat compared to usual stuff. I found the topic very very interesting. Although I don’t understand much at this point, I felt the review did a good job of covering different aspects of the problem, and discussing open problems for each aspect. I would have loved to see some sort of case study towards the end illustrating Figure 2 in action though.

Written on May 7, 2020