Notes on Claire Cardie's EMNLP 2020 keynote

I saw Claire Cardie’s keynote talk at EMNLP 2020. This blog post is a collection of my thoughts about it. The talk video with slides can be accessed here.

About the talk: The talk is called “Information Extraction Through the Years: How Did We Get Here?”. As the name indicates, it gives a historical overview of one NLP task: Information Extraction. Starting from the early research in this direction, with shared tasks and their shortcomings, the talk also covers state-of-the-art research. There was a lot of discussion on the challenges of IE then and now. This was nicely complemented by opinions from “IE experts” throughout the talk.

While I did not work on IE as a researcher, I spent some time doing various information extraction tasks as a practitioner, and I also read quite a few IE papers while co-authoring our book, which has a chapter on IE. Thus, while I am not a total newbie to the topic, I am not an expert either. I found the talk very informative and insightful, and it left me with thoughts and ideas to explore further.

What I liked:

I loved the way the talk was organized, with some historical nuggets that went along with the more technical part.
There was a good overview of the initial goals of IE, how they evolved, and how the multiple subtasks were formed.
It covered everything from early research to current work, including EMNLP 2020 papers.
I enjoyed the talk! It was well made and reasonably accessible even for those who are not working on IE per se.

What I missed:

An overview of how IE looks in industry now, along with a discussion of the challenges in taking these research developments into broader industry use cases.
Some more discussion of the issues around creating annotated data for IE, how much data is needed for a new language or new domain, etc.
An overview of IE in non-English languages (e.g., a lot of work exists on NER, but I don’t know if there is anything for event extraction, relation extraction, etc.).

My wishlist for NLP keynotes: A talk by a senior person with both research and application development experience that covers progress and challenges in all aspects of one NLP task, right from data collection/annotation to deployment. Basically, a talk that gives a broad overview of the topic to people from all backgrounds who are interested in NLP!

Irrespective of my complaints, I would recommend watching this talk (or at least looking through the slides) to anyone working on some form of information extraction solution, whether as a researcher or as a practitioner!

Written on November 17, 2020