In 2016 Amazon inaugurated the Alexa Prize, a competition dedicated to “accelerating the field of conversational AI”, with the winner to be determined in late 2017. As part of the competition, university research teams attempted to build a socialbot that could converse coherently and engagingly with humans on popular topics for 20 minutes.
A research group at the Montreal Institute for Learning Algorithms (MILA), advised by Yoshua Bengio, submitted an entry called MILABOT, which reached the semi-finals of the competition. The group recently published their methodology, and the remainder of this post attempts to provide a succinct overview of their paper.
MILABOT relies on an ensemble of 22 individual models, each of which independently generates a candidate response to the current dialog. To better understand MILABOT’s implementation, we can bucket its 22 models into four categories: Template-based, Knowledge Base-based Question Answering, Retrieval-based and Generation-based.
Template-based models work by matching words and phrases in the user’s input against a predefined set of templates. This harkens back to an earlier age of artificial intelligence, where contexts and their appropriate responses were carefully hand-crafted by humans. An example of such a system, which MILABOT also makes use of, is Eliza. (Side note: you can converse with Eliza here: http://psych.fullerton.edu/mbirnbaum/psych101/Eliza.htm)
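
To make the idea concrete, here is a minimal sketch of template matching in Python. The patterns, the responses and the function name `template_respond` are our own illustrative assumptions, not MILABOT's or Eliza's actual rules.

```python
import re

# Hypothetical templates: (pattern, response). Illustrative only,
# not the rules used by MILABOT or Eliza.
TEMPLATES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\b(hello|hi)\b", re.IGNORECASE),
     "Hello! What would you like to talk about?"),
]
FALLBACK = "Tell me more."


def template_respond(utterance: str) -> str:
    """Return the response of the first template whose pattern matches."""
    for pattern, response in TEMPLATES:
        match = pattern.search(utterance)
        if match:
            # Drop trailing punctuation from the captured text before reuse.
            groups = [group.rstrip(" .!?") for group in match.groups()]
            return response.format(*groups)
    return FALLBACK


print(template_respond("I feel tired."))
```

Every context the bot can handle has to be anticipated by hand, which is exactly why this approach is easy to control but hard to scale.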
Knowledge Base-based models are in a sense a refinement of Template-based models: words and phrases from the input are used to craft a query to a knowledge base, and the result of that query is used to generate a response. MILABOT, for instance, searches for a “wh-word” (e.g. “who”, “what”) and the named entity it refers to in order to perform an appropriate factual search.
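
A rough sketch of that wh-word plus entity lookup might look like the following. The toy `KNOWLEDGE_BASE`, the simple substring match used in place of real named entity recognition, and the function names are all assumptions made for illustration; the facts in the toy knowledge base are taken from this post.

```python
import re
from typing import Optional

# Toy knowledge base keyed by (entity, wh-word). Hypothetical structure.
KNOWLEDGE_BASE = {
    ("milabot", "what"): "MILABOT is a socialbot built by a research group at MILA.",
    ("yoshua bengio", "who"): "Yoshua Bengio advised the group that built MILABOT.",
}
WH_WORDS = ("who", "what", "where", "when", "why", "which", "how")


def find_wh_word(question: str) -> Optional[str]:
    """Return the first wh-word in the question, if any."""
    tokens = re.findall(r"[a-z]+", question.lower())
    return next((token for token in tokens if token in WH_WORDS), None)


def answer(question: str) -> Optional[str]:
    """Look up a fact for the wh-word and entity found in the question."""
    wh_word = find_wh_word(question)
    if wh_word is None:
        return None
    text = question.lower()
    # Substring match against known entities: a naive stand-in for a
    # real named entity recognizer.
    for (entity, kb_wh_word), fact in KNOWLEDGE_BASE.items():
        if kb_wh_word == wh_word and entity in text:
            return fact
    return None


print(answer("Who is Yoshua Bengio?"))
```

When no wh-word or known entity is found the function returns `None`, which in an ensemble simply means this model has no candidate to offer for that turn.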
Retrieval-based models rely on neural networks (deep learning) or more traditional machine learning techniques (such as logistic regression) to determine whether MILABOT has seen a “similar” conversation context in a predefined database. Similarity here does not require that the conversations share words or phrases, but rather that they are conceptually similar. Once these models have identified a matching conversation, they simply “copy” the past response.
Generation-based models take this a step further: instead of copying a historical response, they use deep learning models to generate a response word by word.
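
The retrieval step can be sketched as a nearest-neighbour search over embedded contexts. In the sketch below the hand-written two-dimensional `WORD_VECTORS` are a stand-in for a learned neural encoder, and the tiny `DATABASE` is invented for illustration; neither reflects MILABOT's actual models or data.

```python
import math
import re

# Hypothetical 2-d word vectors standing in for a learned encoder.
WORD_VECTORS = {
    "movie": (1.0, 0.1), "film": (0.9, 0.2), "watch": (0.8, 0.3),
    "music": (0.1, 1.0), "song": (0.2, 0.9), "listen": (0.3, 0.8),
}

# Invented (past context, past response) pairs.
DATABASE = [
    ("seen any good film lately", "I really enjoyed the last one I saw. And you?"),
    ("what song do you like", "I like all kinds of songs. Do you have a favourite?"),
]


def embed(text):
    """Average the vectors of the known words in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    vectors = [WORD_VECTORS[word] for word in words if word in WORD_VECTORS]
    if not vectors:
        return (0.0, 0.0)
    return tuple(sum(axis) / len(vectors) for axis in zip(*vectors))


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    norm = math.hypot(*a) * math.hypot(*b)
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def retrieve(context):
    """Copy the response attached to the most similar stored context."""
    query = embed(context)
    _, response = max(DATABASE, key=lambda pair: cosine(query, embed(pair[0])))
    return response


print(retrieve("I want to watch a movie"))
```

Note that “I want to watch a movie” and “seen any good film lately” share no words, yet the first entry is retrieved because their vectors point in a similar direction.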
Given the candidate responses produced by these 22 individual models, MILABOT then selects the best response for the dialog context using a technique called reinforcement learning. Reinforcement learning rewards choices that lead to good outcomes over a whole conversation rather than a single turn, which allows MILABOT to plan its conversations much as we humans presumably do.
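
As a heavily simplified illustration of learning which candidate to pick from feedback, the sketch below keeps one value estimate per model and updates it from a reward signal (an epsilon-greedy bandit). This is our own toy stand-in, not the selection policy described in the paper, and the class name, parameters and reward values are all hypothetical.

```python
import random


class ResponseSelector:
    """Toy epsilon-greedy selector over the models' candidate responses."""

    def __init__(self, model_names, epsilon=0.1, learning_rate=0.5, seed=0):
        self.values = {name: 0.0 for name in model_names}
        self.epsilon = epsilon              # probability of exploring
        self.learning_rate = learning_rate  # step size for value updates
        self.rng = random.Random(seed)

    def select(self, candidates):
        """candidates maps model name -> response; returns (model, response)."""
        if self.rng.random() < self.epsilon:
            model = self.rng.choice(sorted(candidates))  # explore
        else:
            model = max(candidates, key=lambda name: self.values[name])  # exploit
        return model, candidates[model]

    def update(self, model, reward):
        """Move the model's value estimate toward the observed reward."""
        self.values[model] += self.learning_rate * (reward - self.values[model])


selector = ResponseSelector(["template", "retrieval"], epsilon=0.0)
selector.update("retrieval", reward=1.0)  # e.g. the user rated this reply well
print(selector.select({"template": "Tell me more.",
                       "retrieval": "I loved that film!"}))
```

A real selection policy would score each response in the context of the conversation so far rather than scoring whole models, but the loop of select, observe feedback, update is the same.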
We here at Kylie are particularly excited to see how research at the frontier of conversational AI can be adapted to customer service support!
For the more technically inclined reader a preprint of the original publication can be found here: https://arxiv.org/pdf/1709.02349.pdf