Teaching Notes - Dialog Systems and Conversational Bots

This post is a continuation of my previous posts on teaching a 100-level undergraduate course called Language and Computers. As mentioned earlier, it is a very diverse class, and I use this textbook: Language and Computers by Markus Dickinson, Chris Brew, and Detmar Meurers.

The topic for the past week has been dialog systems, which include software that interacts with humans to serve a specific purpose (e.g., flight booking systems, hotel reservation systems), general-purpose assistants (Google Assistant, Siri, Cortana, etc.), as well as chatbots such as Eliza and more recent applications such as apps that help with depression.

I spent roughly 3-3.5 classes (50 minutes each) on this topic. In the introductory class, I discussed the following questions:

  • What are dialog systems, chatbots, conversational agents, etc.?
  • What kinds of applications use these?
  • Have you used any of these, and how did you feel about them?
  • What are the potential challenges in building such systems?
  • Will they eventually resemble human conversation partners?

Along with these, I also briefly discussed pragmatics: how linguists study communication, topics such as speech acts and Gricean maxims, and how these insights are useful for dialog system development. Then, we looked at a few examples of such systems: Let's Go, used in Pittsburgh's Port Authority Transit buses (that is what the website said; the number did not work for me), tutoring systems such as ITSPOKE and Cognii that interact with learners and give them feedback on their answers, AI-based chatbots of the recent past (e.g., client demos from morph.ai, x.ai), bot creation APIs, and finally some discussion about Siri, comparing the 2014 vs. 2017 pictures from the Jurafsky and Martin 3rd edition draft with my own experience with Siri on my office Mac in January 2017 and on the day of my class. After all this discussion, we ended with some questions about how they think things will look in the near and far future, given the state of the art they saw.

In the next class, we spent more time on how such systems work. I grouped them into rule-based, frame-based, and information-retrieval-based approaches, as in J&M. For rule-based systems, we spent some time looking at Eliza's regular expressions from a public implementation and discussing how it works; for frame-based systems, we did this with the flight booking example in J&M and Let's Go from the Dickinson et al. textbook; and for information retrieval, we took a look at some FAQ bots and Siri (there are minimal, illustrative sketches of the first two approaches below). In the last class, we discussed different ways of evaluating dialog systems; here, again, J&M's book covered contemporary (September 2017!) evaluation approaches, which was useful. We discussed a lot of this in a Q&A format, and I very much enjoyed the interaction. NACLO's Yes Bot question was useful for talking about some issues with dialog system design.
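To give a flavor of what "rule-based" means in practice, here is a minimal Eliza-style sketch in Python. The patterns, pronoun reflections, and response templates are invented for illustration; they are not taken from any particular public implementation, and real Eliza scripts have many more rules, ranked keywords, and memory.

```python
import re
import random

# Illustrative Eliza-style rules: each rule is a regex plus response templates.
# (Made-up patterns and templates, not from any specific Eliza script.)
RULES = [
    (re.compile(r"i need (.*)", re.IGNORECASE),
     ["Why do you need {0}?", "Would it really help you to get {0}?"]),
    (re.compile(r"i am (.*)", re.IGNORECASE),
     ["How long have you been {0}?", "Why do you think you are {0}?"]),
    (re.compile(r"(.*) mother(.*)", re.IGNORECASE),
     ["Tell me more about your family."]),
]

# Eliza "reflects" pronouns so the user's words can be echoed back sensibly.
REFLECTIONS = {"i": "you", "my": "your", "me": "you", "am": "are", "you": "I"}

def reflect(text):
    return " ".join(REFLECTIONS.get(w.lower(), w) for w in text.split())

def respond(utterance):
    # Try each rule in order; the first matching pattern produces the reply.
    for pattern, templates in RULES:
        match = pattern.search(utterance)
        if match:
            template = random.choice(templates)
            return template.format(*(reflect(g) for g in match.groups()))
    return "Please go on."  # default when no rule matches

print(respond("I need a vacation"))   # e.g. "Why do you need a vacation?"
print(respond("I am feeling stuck"))  # e.g. "How long have you been feeling stuck?"
```

And a correspondingly tiny sketch of the frame-based idea, in the spirit of (but not copied from) the J&M flight-booking example: the system keeps a frame of slots and prompts for whichever slots are still empty. The slot names and prompts here are my own assumptions for illustration; a real system would also need a language-understanding component to fill the slots from user utterances.

```python
# Toy frame-based dialog manager: track slots, ask about the first missing one.
FRAME = {"origin": None, "destination": None, "date": None}

PROMPTS = {
    "origin": "What city are you leaving from?",
    "destination": "Where are you flying to?",
    "date": "What day would you like to travel?",
}

def next_action(frame):
    """Ask about the first unfilled slot, or confirm once the frame is complete."""
    for slot, value in frame.items():
        if value is None:
            return PROMPTS[slot]
    return ("Booking a flight from {origin} to {destination} on {date}. "
            "Is that correct?".format(**frame))

FRAME["destination"] = "Pittsburgh"   # pretend an NLU component filled this slot
print(next_action(FRAME))             # asks for the origin, the first missing slot
```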

Overall, I found the textbook a bit dated on this topic, because a lot has happened in this area in the past 3-5 years (i.e., after the book was published), and I found the draft chapters of J&M, 3rd Edition (Chapters 29 and 30) very useful for introducing it (even though that book in general is not aimed at such a broad, diverse audience at an earlier stage of education). Using Siri as an example helped me move seamlessly into the next topic, Speech Recognition and Synthesis, as Siri does a particularly bad job with my Indian accent most of the time, even before it gets to the language understanding part! :-) Thanks to the recent developments in this area, I think this topic came out well, despite the shortcomings of the textbook. There was a lot of accessible outside material, which resulted in good participation.

My next post will be on speech recognition/processing/synthesis - all put together.

Written on November 5, 2017