Reading Notes - Ranking Automatically Generated Questions Using Common Human Queries

This post is about the following short paper:

Ranking Automatically Generated Questions Using Common Human Queries

Yllias Chali and Sina Golestanirad

Proceedings of the 9th International Natural Language Generation conference

pages: 217–221

read online

Paper summary: This paper proposes an automatic question generation approach that can generate common and specific questions from a text passage and rank them using community based question answering systems (yahoo answers data). They use syntactic similarity between the text and question as a proxy for grammaticality. Questions are evaluated in terms of syntactic correctness and topical relevance. Their primary motivation seems to be that such question generation can be useful for users to post better search queries online.

I think the overall approach can be summarized in the following paragraph from the paper:

“Our question generation approach is built in five steps. In the first step, we tag named entities from a text which is related to the query. In the next step, we use question templates to generate basic questions, based on the tags from the previous step. In the third step, we apply a text simplifier to all of the sentences in the text and then we use a semantic role tagger to tag all of the arguments and predicates in these sen- tences. Fourth step is about applying another set of question rules to the extracted arguments and pred- icates in order to generate specific questions. In the final step, we use our proposed algorithm to rank all of the generated questions”

My thoughts:

  • It would have been great if more was said about their templates which they use in Step 2 and Step 4. That would have given us a better picture about what exactly are the kind of questions being generated. From what I read, it seemed to me that the answers are probably single words or multi-word expressions, not beyond that.

  • I thought the relative work is compact, and yet comprehensive. This could be because I read most of those references mentioned there.

  • Learning question templates from community question answering systems is an interesting idea to explore.

  • SEMILAR seems to be an interesting library to explore in the context of looking for semantically similar texts and comparing a question and its answer in terms of relatedness.

Written on June 27, 2017