Day 1 @ COLING 2020

I am attending COLING 2020 and here are some notes about what I found interesting in today’s program. Note that I did not read the papers thoroughly. I just used the titles to choose which talks to attend, either watched the video or followed the slides, and then browsed through the paper if I found it interesting. So, this is a highly ad-hoc and subjective selection process. I categorized the papers I followed into two groups: Model Understanding and Analysis, and NLP Applications.

Model Understanding and Analysis:

Linguistic Profiling of a Neural Language Model
This paper analyses what linguistic properties a neural language model learns, by designing a suite of probing tasks. From the short presentation, it was hard to understand exactly how they came up with this suite of tasks, and whether it is possible to replicate these experiments in another language (they report on English), but it seemed like an interesting approach to me for understanding BERT and its siblings.
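
To make “probing” concrete, here is a minimal sketch of the general recipe (my own toy version, not the paper’s actual suite): freeze a language model, embed sentences, and train a simple linear classifier to predict a linguistic property from the frozen representations.

```python
# Minimal probing sketch (not the paper's setup): if a linear classifier
# can predict a linguistic property from frozen representations, that
# property is (linearly) recoverable from the model.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(sentence):
    """Mean-pool the last hidden layer into one fixed-size vector."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state
    return hidden.mean(dim=1).squeeze().numpy()

# Toy probing task: does the sentence contain a subordinate clause?
# (A real suite would have many such properties and far more data.)
data = [
    ("The cat sat on the mat.", 0),
    ("I left because it was raining.", 1),
    ("She smiled at the camera.", 0),
    ("He said that he would come.", 1),
]
X = [embed(s) for s, _ in data]
y = [label for _, label in data]

probe = LogisticRegression(max_iter=1000).fit(X, y)
print(probe.predict([embed("They stayed home since the roads were icy.")]))
```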

A Closer Look at Linguistic Knowledge in Masked Language Models: The Case of Relative Clauses in American English
This paper also deals with understanding what linguistic knowledge is captured by neural language models, taking the specific case of relative clauses and studying it through probing, diagnostic cases, and masked prediction tasks. While the models seem to do well on probing, their weaknesses are uncovered in the other tasks. I found the idea of going beyond probing interesting and useful. However, I would have loved to see one more language, or one more linguistic phenomenon in English, being investigated.
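
The masked prediction part is easy to try out; here is a toy version using the fill-mask pipeline from Hugging Face, with an example sentence of my own (not one of the paper’s stimuli):

```python
# Toy masked-prediction check: mask the verb whose number agreement
# depends on the head noun ("keys") rather than the noun inside the
# relative clause ("man"), and inspect the model's top predictions.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill("The keys that the man lost [MASK] on the table."):
    print(pred["token_str"], round(pred["score"], 3))
```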

Do Word Embeddings Capture Spelling Variation?
It is more common to see analyses of the semantic and syntactic properties of neural embeddings. This paper looks at word embeddings trained on Twitter and Reddit to understand whether word embeddings capture spelling variation, and shows that they do. In 2014-15, when Word2Vec was becoming popular, I used to train/test different word2vec models to check if they capture spelling variations (I was working on differentiating between the non-native English writing of speakers of different languages at that time, and the data had a lot of spelling errors). I found that they capture a lot of spelling variation (a blog post in Telugu about this here). It was a joy to see this paper confirm those observations in a much more formalized, structured, and seemingly authoritative way!
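
That old check is easy to reproduce; here is roughly what those experiments looked like (a sketch - the corpus file and the example misspelling are placeholders):

```python
# Train a small word2vec model on noisy user-generated text and inspect
# the nearest neighbours of a common non-standard spelling. The corpus
# file here is a placeholder for any tokenized Twitter/Reddit dump.
from gensim.models import Word2Vec

sentences = [line.lower().split() for line in open("tweets.txt", encoding="utf-8")]
model = Word2Vec(sentences, vector_size=100, window=5, min_count=5)

# If spelling variation is captured, variants such as "tomorrow", "tmrw"
# and "tommorow" tend to show up as each other's nearest neighbours.
print(model.wv.most_similar("tmrw", topn=10))
```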

Don’t take “nswvtnvakgxpm” for an answer - The surprising vulnerability of automatic content scoring systems to adversarial input
There is a lot of work on automated content scoring (i.e., scoring short answers for their correctness, not grammatical accuracy, vocabulary richness, or that sort of thing). In this paper, they show that existing content scoring systems are very sensitive to what can be called “adversarial input”. They propose some approaches to address this problem; however, they seem to conclude that there is clearly a lot left to be done to address this issue of “cheating the machine”.
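
To give a flavour of the attack (my own toy version; the paper tests a wider range of adversarial strategies), the title string is exactly this kind of input:

```python
# Generate gibberish "answers" of the nswvtnvakgxpm variety; a robust
# content scoring system should award these zero credit.
import random
import string

def gibberish(length=13):
    return "".join(random.choices(string.ascii_lowercase, k=length))

print([gibberish() for _ in range(5)])
```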

Probing Multilingual BERT for Genetic and Typological Signals
This paper probed the representations learnt by mBERT and concluded that a phylogenetic tree of 100 languages can be inferred from them quite accurately. There is a lot I don’t understand about this topic (computational historical linguistics?), but I find it pretty fascinating that deep neural network-based models can be useful for doing this!
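
I can only guess at the details, but one simple pipeline from mBERT representations to a tree could look like the sketch below (my own guess, not necessarily the authors’ method): average mBERT embeddings per language, then run hierarchical clustering on the pairwise distances.

```python
# Sketch: per-language mean mBERT embeddings + hierarchical clustering.
# The sample sentences are placeholders; a real setup would use a large
# (ideally parallel) corpus covering all the languages.
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel
from scipy.cluster.hierarchy import linkage, dendrogram

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")

def lang_vector(sentences):
    """Mean mBERT embedding over a language's sample sentences."""
    vecs = []
    for s in sentences:
        inputs = tokenizer(s, return_tensors="pt", truncation=True)
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state
        vecs.append(hidden.mean(dim=1).squeeze().numpy())
    return np.mean(vecs, axis=0)

samples = {
    "en": ["The weather is nice today."],
    "de": ["Das Wetter ist heute schön."],
    "nl": ["Het weer is vandaag mooi."],
    "hi": ["आज मौसम अच्छा है।"],
}
X = np.stack([lang_vector(v) for v in samples.values()])
tree = linkage(X, method="average", metric="cosine")
info = dendrogram(tree, labels=list(samples.keys()), no_plot=True)
print(info["ivl"])  # leaf order hints at which languages cluster together
```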

Financial Sentiment Analysis: An Investigation into Common Mistakes and Silver Bullets
This paper looks at some well-known sentiment analysis methods applied to the finance domain and identifies common error patterns across the models, which they then classify. The conclusion seems to be that some of these error patterns are seen outside the finance domain too, and that there is much work to be done on improving sentiment analysis in the future. I was hoping they would end by recommending training with adversarial examples or some such thing, but no such thing happened :-)

NLP Applications:

Automated Prediction of Examinee Proficiency from Short-Answer Questions
This paper deals with content-based (not form-based) scoring of short answers in a medical exam. Generally, short answer scoring approaches rely on manually labeled data to predict human scores. This paper, however, combines NLP and psychometric methods into an approach that does not need such manually labeled data for training! I found the idea fascinating, although it was hard to get much from a short presentation. While skimming the paper, I found a section on the “limitations and operational feasibility” of the approach, which is a cool one to read!

On the Helpfulness of Document Context to Sentence Simplification
Research on automatic text simplification typically works at the level of sentences. In this paper, the authors argue that looking at the context (both preceding and following sentences) improves the performance of text simplification systems. They also release a dataset of 116K sentence pairs containing normal and simplified versions of text. I have always wondered if document-level context helps simplification (and also machine translation, which also seems to be done mostly at the sentence level), and I liked this paper’s conclusion that it does.

AutoMeTS: The Autocomplete for Medical Text Simplification
This paper proposes an assistive writing approach to text simplification, based on an autocomplete system built with various pretrained language models. They also introduce a new dataset of parallel medical-domain sentences drawn from Wikipedia and Simple English Wikipedia. This is an interesting application in the text simplification space, which can potentially be useful in scenarios where the simplification needs to be reliable.
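
The underlying idea is easy to prototype. Here is a toy next-word suggester (my own sketch, not the AutoMeTS system; the example sentences are made up): condition a pretrained LM on the difficult sentence plus the simplification typed so far, and offer its top next-word candidates to the human editor.

```python
# Toy autocomplete for simplification: show the LM the complex sentence
# and the partially typed simple sentence, then suggest next words.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

context = ("Complex: Myocardial infarction occurs when blood flow "
           "decreases to a part of the heart. "
           "Simple: A heart attack happens when")
inputs = tokenizer(context, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]
top = torch.topk(logits, 5).indices
print([tokenizer.decode(int(t)).strip() for t in top])  # candidate words
```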

French Biomedical Text Simplification: When Small and Precise Helps
This paper describes a biomedical text simplification approach for French which uses a small domain-specific simplification corpus together with a large corpus obtained by machine-translating an English-Simple English dataset into French (at least this is what I understood; the corpus description was kind of hard to follow!). They show that this approach achieves good results in a low-resource, domain-specific simplification scenario. I found the idea very interesting, and it is great to see them release the dataset.
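
If I understood the corpus-building step correctly, it is roughly the following (a sketch; the translation model named here is one public English-French option, not necessarily what the authors used):

```python
# Bootstrap a French simplification corpus by machine-translating an
# English / Simple English sentence pair into French. The example pair
# is made up for illustration.
from transformers import pipeline

translate = pipeline("translation", model="Helsinki-NLP/opus-mt-en-fr")
pair = ("Hypertension is a chronic elevation of arterial blood pressure.",
        "High blood pressure that lasts a long time.")
fr_pair = tuple(translate(s)[0]["translation_text"] for s in pair)
print(fr_pair)
```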

Detecting de minimis Code-Switching in Historical German Books
When we hear the word code-switching in NLP, it is usually in a modern context - people switching between languages on social media, or in email/chat/blogs etc. This paper proposes an approach to identify code-switching in historical German books from the 17th to the 19th century, identifying where Latin/Greek/English etc. words appear within German text. Some interesting use cases for this kind of work are OCR and speech transcription. The results did not seem too fascinating to me at a quick glance, but this sure seems to be a very different code-switching problem compared to existing research.

Leveraging HTML in Free Text Web Named Entity Recognition
This short paper investigates the usefulness of HTML tags when doing NER on web text. Using five English datasets, they conclude that including tag-related information improves NER performance (0.9% and 13.2% improvements in various settings). Like the authors say, it is common to strip off HTML before doing any NLP - so it was kind of interesting to learn that it can actually be useful.
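
As a toy illustration of what “not stripping the HTML” could look like (my own sketch, not the paper’s model), one can keep the enclosing tag as a per-token feature instead of throwing the markup away:

```python
# Pair each token with its enclosing HTML tag, so that structural cues
# like <title> or <a> can be fed to an NER model as extra features.
from bs4 import BeautifulSoup

html = ('<html><title>Apple reports earnings</title>'
        '<p>Shares of <a href="#">Apple</a> rose.</p></html>')
soup = BeautifulSoup(html, "html.parser")

tokens = []
for text_node in soup.find_all(string=True):
    tag = text_node.parent.name  # enclosing tag as a simple feature
    tokens += [(tok, tag) for tok in text_node.split()]
print(tokens)  # e.g. [('Apple', 'title'), ..., ('Apple', 'a'), ...]
```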

Other interesting stuff I bookmarked to check out later:

A Corpus for Argumentative Writing Support in German

A Review of Dataset and Labeling Methods for Causality Extraction

Overall, an interesting day, and I am looking forward to more tomorrow!

Written on December 8, 2020