What is a chatbot: A beginner’s guide

You have probably heard the buzz about chat interfaces and artificial intelligence, and how they are the next big thing. In this blog we try to explain what they are and why they probably do matter.

Broadly speaking, a chat interface is anything that mimics speaking to a real human. There are two types:

Voice Assistants

Chatbots

Chatbots can also be built on websites.

What do chatbots do?

Chatbots have been around for ages. Traditionally they have been fairly unintelligent and ineffective, because until recently the technology wasn't up to the task of making them any better.

To understand how chatbots work, you first need to understand a little bit about artificial intelligence and machine learning. These terms get bandied around all the time these days, so it helps to be clear about what they mean.

In the past, anyone who wanted to build a chatbot needed to anticipate everything a user might say and then provide a relevant response. This is not only arduous to design, but it also makes establishing context in a conversation virtually impossible.

If user says: Hi
Computer says: Hello
If user says: what is the weather today
Computer says: it is sunny
If user says: What about tomorrow?
Computer says: Huh?
If user says: will it be raining tomorrow?
Computer says: Huh?
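
The exchange above can be sketched in a few lines of Python. This is only an illustration of the old-school approach: the rule table and the function names (RULES, normalise, rule_based_reply) are made up for this post, and a real scripted bot would have many more entries.

```python
# Hypothetical rule table: every input the designer anticipated, with its reply.
RULES = {
    "hi": "Hello",
    "what is the weather today": "it is sunny",
}


def normalise(text):
    """Lower-case the input and strip surrounding spaces and trailing punctuation."""
    return text.strip().lower().rstrip("?!.")


def rule_based_reply(user_input):
    """Return the scripted reply, or "Huh?" when the input was not anticipated."""
    return RULES.get(normalise(user_input), "Huh?")


for message in ["Hi", "what is the weather today", "What about tomorrow?"]:
    print(message, "->", rule_based_reply(message))
```

Because the bot only looks up the current message, it has no memory of the earlier weather question, which is why the follow-up about tomorrow falls through to "Huh?".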

Enter machine learning.

Whilst this isn’t a textbook definition, artificial intelligence can broadly be defined as anything that imitates real intelligence. Our old-school chatbot would fall into this category, along with more complex examples like video games. (Opponents in video games are programmed to react in certain ways to the user’s actions, which gives the illusion of an intelligent opponent.)

Machine learning is a subset of artificial intelligence.

The awesome thing about machine learning is that the program creates the rules. In our old-school chatbot, the designer provides the chatbot with rules. Chatbots using machine learning only require designers to feed in examples; the program then creates the rules itself.

Here’s a (very) basic example: a group are going fishing, and the number of fish they catch is related to how long they spend fishing.

Traditional approach

Traditionally, to work out how many fish the group are likely to catch, it would be sensible to plot some examples on a graph and draw a line of best fit.

From the line we can predict the number of fish that will be caught in a given number of hours. This approach involves telling the program the equation of the line and then using that equation to make a prediction.
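
As a rough sketch of the traditional approach, the program is simply handed the line. The slope and intercept below are invented numbers standing in for a line someone drew by hand, and predict_fish is a made-up name.

```python
# Slope and intercept read off a hand-drawn line of best fit.
# Both numbers are made up for illustration.
FISH_PER_HOUR = 2.0
FISH_AT_ZERO_HOURS = 1.0


def predict_fish(hours):
    """Apply the designer-supplied line: fish = slope * hours + intercept."""
    return FISH_PER_HOUR * hours + FISH_AT_ZERO_HOURS


print(predict_fish(3))
```

The program never sees the examples here; all of the "knowledge" sits in the two numbers a person chose.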

Machine learning

In machine learning, the program would simply be given the examples and programmed to create its own line of best fit.

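The program splits the data into two batches: a training set, which it uses to fit the line, and a testing set, which it uses to check how well that line predicts examples it has not seen before.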

This is incredibly powerful as the program can establish correlations that may not be obvious to a human.
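
Here is a minimal sketch of the machine learning version, assuming some made-up fishing data. The program is given only examples and works out the line itself using ordinary least squares, one common way to fit a line of best fit. The names fit_line and mean_abs_error are invented for this post, and the split simply takes the first four examples for training; a real project would usually shuffle the data first.

```python
# Made-up examples: (hours spent fishing, fish caught).
examples = [(1, 2), (2, 5), (3, 6), (4, 9), (5, 11), (6, 12)]


def fit_line(points):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def mean_abs_error(points, slope, intercept):
    """Average gap between predicted and actual catches."""
    return sum(abs(slope * x + intercept - y) for x, y in points) / len(points)


# Split the data: fit on the training set, check on the unseen testing set.
training_set, testing_set = examples[:4], examples[4:]
slope, intercept = fit_line(training_set)
test_error = mean_abs_error(testing_set, slope, intercept)
print(slope, intercept, test_error)
```

The error on the testing set gives a rough idea of how far off the line is likely to be on trips it has never seen.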

Whilst great for simple correlations, this is not as useful for complex associations, such as classifying images or working out what a sentence means.

Neural Networks

To achieve effective results for more complicated tasks we need to use artificial neural networks. Neural networks are a further subset of machine learning.

Neural networks are so named because they are based (loosely) on neurons in the brain.

A neural network starts out as a network of simple artificial “neurons,” connected to each other with inputs and outputs, and it knows nothing, like an infant brain.

It “learns” by trying to do a task, say handwriting recognition. At first, its neural firings and subsequent guesses at deciphering each letter will be completely random. But when it’s told it got something right, the connections in the firing pathways that happened to create that answer are strengthened; when it’s told it was wrong, those pathways’ connections are weakened.

After a lot of this trial and feedback the network has, by itself, formed smart neural pathways and the machine has become optimised for the task.

(In practice a network may contain millions of nodes. Also, the neural network does not learn by trying numbers at random but by using methods such as stochastic gradient descent, which dramatically speeds up the process of training.)
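
To make the strengthening and weakening of connections a little more concrete, here is a minimal sketch of a single artificial neuron trained with stochastic gradient descent. It is nowhere near a full neural network: the toy task (learning logical OR), the learning rate, the number of passes and all of the names are assumptions chosen to keep the example small.

```python
import math
import random


def sigmoid(z):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Toy task (an assumption for illustration): learn logical OR.
EXAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]


def train_neuron(examples, epochs=2000, learning_rate=0.5, seed=0):
    rng = random.Random(seed)
    # Start with random connection strengths: the neuron "knows nothing".
    weights = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    bias = rng.uniform(-1, 1)
    for _ in range(epochs):
        order = list(examples)
        rng.shuffle(order)  # "stochastic": visit the examples in random order
        for inputs, target in order:
            guess = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
            error = guess - target  # feedback: how wrong was the guess?
            # Nudge each connection against the error (gradient of the log loss).
            weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]
            bias -= learning_rate * error
    return weights, bias


def predict(weights, bias, inputs):
    """Return True when the neuron's output is above 0.5."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias) > 0.5


weights, bias = train_neuron(EXAMPLES)
for inputs, target in EXAMPLES:
    print(inputs, predict(weights, bias, inputs), bool(target))
```

Each update is a small, calculated nudge in the direction that reduces the error, rather than a random guess, which is what makes this kind of training so much faster than trial and error.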

The brain learns a bit like this but in a more sophisticated way, and as we continue to study the brain, we’re discovering ingenious new ways to take advantage of neural circuitry.

As a quick demonstration of the power of neural networks, here is a passage from Ernest Hemingway's “The Snows of Kilimanjaro” (translated from Japanese), before and after Google applied a neural network to its translation program:

Before:

Kilimanjaro is 19,710 feet of the mountain covered with snow, and it is said that the highest mountain in Africa. Top of the west, “Ngaje Ngai” in the Maasai language, has been referred to as the house of God. The top close to the west, there is a dry, frozen carcass of a leopard. Whether the leopard had what the demand at that altitude, there is no that nobody explained.

After:

Kilimanjaro is a mountain of 19,710 feet covered with snow and is said to be the highest mountain in Africa. The summit of the west is called “Ngaje Ngai” in Masai, the house of God. Near the top of the west there is a dry and frozen dead body of leopard. No one has ever explained what leopard wanted at that altitude.

Natural language processing

Neural networks are a very powerful approach for natural language processing, i.e. the task of taking a sentence written or spoken by a human and working out what it means.

We can now create a chatbot using a neural network that can take a sentence, determine the user’s intention and pull out the required information.

Not only this, but we can also carry the context of this sentence forward to allow the conversation to flow. For example, if the user has just asked about a pale ale, they might then say ‘What about for less than £5’, and the chatbot can understand that they are referring to a pale ale for less than £5 without them having to restate this directly.

Once we have the data in the correct format we can do whatever we want with it: return a series of pictures from our database to the user, ask follow-up questions, or even fetch something else from the web.
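
To give a feel for what "the correct format" might look like, here is a rough sketch in Python. To be clear, this is simple keyword matching standing in for the neural network, not a neural network itself. The product list, the field names (intent, product, max_price), the parse function and the opening question about a pale ale are all assumptions made up for illustration.

```python
import re

# Hypothetical product vocabulary; a real system would learn this from examples.
KNOWN_PRODUCTS = ["pale ale", "lager", "stout"]


def parse(sentence, context=None):
    """Return intent, product and max_price, filling gaps from the previous turn."""
    text = sentence.lower()
    # Start from the previous turn's result so earlier details are remembered.
    if context:
        result = dict(context)
    else:
        result = {"intent": None, "product": None, "max_price": None}

    for product in KNOWN_PRODUCTS:
        if product in text:
            result["product"] = product
            result["intent"] = "find_product"

    price = re.search(r"less than £(\d+(?:\.\d+)?)", text)
    if price:
        result["max_price"] = float(price.group(1))
        result["intent"] = "find_product"

    return result


first = parse("Do you have a pale ale?")
second = parse("What about for less than £5", context=first)
print(first)
print(second)
```

The second call never mentions a pale ale, but because the result of the first turn is passed in as context, the product is still there when the price limit is added.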

Why is this so useful?

Accessing information

Right now billions of people have, in their pockets, access to more information than anyone in history could have dreamed of possessing. We literally have billions of pages of text, billions of songs and billions of videos.

However, there is a limitation in terms of how we access it.

Despite the practically limitless amount of information available to us, accessing it still involves navigating the web using a mouse, or your thumbs on a smartphone screen.

You have to find an appropriate webpage, work out what is happening on that particular website, and where the information you are looking for is located. This often involves confusing menus, and a lot of trial and error.

This is not a natural way to find information. In fact, humans have only been doing this for around the last 30 years (Apple released the Macintosh, which popularised the graphical user interface and mouse, in 1984), and for even less time on the internet.

What humans have been doing for at least the last 10,000 years is talking to each other. This is the most natural way to access information, and it is the direction in which the web is shifting.

A conversational web

Imagine instead that you could start chatting with a company on Facebook. You’re looking for a product. You ask for the product as you would ask a shop assistant on the high street. The chatbot intuitively understands what you are asking and instantly provides you with the information you require.

Whilst this is a simple example, it demonstrates the power of natural language processing and chatbots in providing your customers with useful information, instantly.

This example also shows that it's not all about voice. Whilst voice recognition is going to become more and more common in the coming years (Google estimate 30% of all web use will be voice only by 2020), there are times when it's more appropriate or easier not to use voice.

Text-based chatbots are quicker and more intuitive than websites, and they can display rich content like videos, pictures and links.

Integrating with platforms like Facebook and Twitter has further advantages in that you immediately have a lot of information on your customer (their name, age, etc.). This can make the experience more personalised, and soon users will even be able to pay directly through Facebook (this is already being trialled in the US).

The future is here now

The internet and smartphones put the world’s information in our pockets - chat interfaces will make it as easy to access as chatting with your friends.
