Notes on Claire Cardie's EMNLP 2020 keynote
About the talk: The talk is called “Information Extraction Through the Years: How Did We Get Here?”. As the name indicates, it gives a historical overview of one NLP task: Information Extraction. Starting from the early research in this direction, with its shared tasks and their shortcomings, the talk works its way up to state-of-the-art research. There was a lot of discussion of the challenges of IE, then and now. This was nicely complemented by opinions from “IE experts” interspersed throughout the talk.
While I did not work on IE as a researcher, I spent some time doing various information extraction tasks as a practitioner, and I also read quite a few IE papers while co-authoring our book, which has a chapter on IE. Thus, while I am not a total newbie to the topic, I am not an expert either. I found the talk very informative and insightful, and it left me with thoughts and ideas to explore further.
What I liked:
- I loved the way the talk was organized, with some historical nuggets that went along with the more technical part
- There was a good overview of the initial goals of IE, how they evolved and how the multiple subtasks were formed.
- It covered everything from early research to current work, including EMNLP 2020 papers.
- I enjoyed the talk! It was well made and reasonably accessible even for those who are not working on IE per se.
What I missed:
- An overview of how IE looks in the industry now, along with a discussion of the challenges in taking these research developments into broader industry use cases.
- Some more discussion of the issues around creating annotated data for IE, how much data is needed for a new language or a new domain, etc.
- An overview of IE in non-English languages (e.g., a lot of work exists on NER, but I don’t know if there is anything for event extraction, relation extraction, etc.)
My wishlist for NLP keynotes: A talk by a senior person with both research and application development experience, covering the progress and challenges in all aspects of one NLP task, right from data collection and annotation to deployment. Basically, a talk that gives a broad overview of the topic to people of all backgrounds who are interested in NLP!
Irrespective of my complaints, I would recommend watching this talk (or at least going through the slides) to anyone developing some form of information extraction solution, whether as a researcher or as a practitioner!