Microsoft Releases Dialogue Dataset to Make Chatbots Smarter

Apr 26, 2017

Maluuba, a Microsoft company working towards general artificial intelligence, recently released a new open dialogue dataset based on booking a vacation - specifically, finding flights and a hotel.

The number of chatbots has risen recently, especially since Facebook opened its Messenger platform to bots a year ago. At the moment, most bots support only very simple, sequential interactions. Advanced use cases such as travel planning remain difficult for chatbots. With this dataset, Maluuba (recently acquired by Microsoft) aims to help researchers and developers make their chatbots smarter.

Maluuba collected the data by having two people communicate through a chat window. One person played the user, while the other acted as the computer. The user tried to find the best deal for a flight, while the person playing the chatbot used a database to retrieve the information. The interactions consist only of text (there is no spoken interaction), a conscious choice by the researchers: most people prefer typing to speaking, and text keeps the dataset free from speech-recognition errors and background noise. The result is a dataset of 1,369 dialogues on travel planning, which can be downloaded for free.

Maluuba also presents a way of representing the dialogues. What makes travel planning difficult is that users often change the topic of the conversation: in a single dialogue you might discuss plans to go to Waterloo, Montreal, and Toronto. Humans have no trouble keeping apart the different plans people make while talking. Unfortunately, if users explore multiple options before booking, computers tend to run into problems. Most chatbots forget everything you talked about as soon as you enter a new destination. In the left image below you see the interaction with a "traditional" chatbot: as soon as the user mentions a new city, the bot forgets the old one. On the right you see a pattern that emerged in the published dataset: users compare multiple cities before making a decision.
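The difference between the two behaviours can be illustrated with a minimal sketch. It is not Maluuba's representation or code. The class names, the slot names (destination, price, budget) and the numbers are all hypothetical, chosen only to contrast a bot that keeps a single state with one that keeps a separate record per destination.

```python
from dataclasses import dataclass, field


@dataclass
class SingleStateBot:
    """Hypothetical 'traditional' bot: one set of slots for the whole dialogue."""
    slots: dict = field(default_factory=dict)

    def update(self, **new_slots):
        # A different destination wipes everything discussed so far.
        dest = new_slots.get("destination")
        if dest is not None and dest != self.slots.get("destination"):
            self.slots = {}
        self.slots.update(new_slots)


@dataclass
class FrameTrackingBot:
    """Hypothetical bot that keeps one record ('frame') per destination."""
    frames: list = field(default_factory=list)
    active: int = -1  # index of the frame currently under discussion

    def update(self, **new_slots):
        dest = new_slots.get("destination")
        if dest is not None:
            # Return to an existing frame if this destination came up before.
            for i, frame in enumerate(self.frames):
                if frame.get("destination") == dest:
                    self.active = i
                    break
            else:
                self.frames.append({})
                self.active = len(self.frames) - 1
        if self.active == -1:
            # No destination mentioned yet: start a first, empty frame.
            self.frames.append({})
            self.active = 0
        self.frames[self.active].update(new_slots)

    def compare(self, key):
        # Collect one value per destination so the options can be compared.
        return {f["destination"]: f.get(key)
                for f in self.frames if "destination" in f}


# Illustrative values only; they do not come from the dataset.
traditional = SingleStateBot()
traditional.update(destination="Montreal", budget=1200)
traditional.update(destination="Toronto")   # Montreal and the budget are lost

tracker = FrameTrackingBot()
tracker.update(destination="Montreal", price=900)
tracker.update(destination="Toronto", price=750)
tracker.update(destination="Montreal")      # switches back, nothing is lost
prices = tracker.compare("price")
```

After these calls the single-state bot only remembers Toronto, while the frame-tracking bot still holds both options and can place their prices side by side, which is the comparison pattern described above.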

Source: InfoQ