This article was originally published on QBox’s blog, prior to Cyara’s acquisition of QBox. Learn more about Cyara + QBox.
I’ve built many chatbots over the years, and with each one I’ve learnt more about what works and what doesn’t in the build process. Here are some common errors that I’ve seen being made, including some I’ve made myself, that I hope you can learn from.
Failing to Plan Properly
Just think, if you’re building a house, you wouldn’t just start throwing some bricks and cement at the ground and hope that it eventually turns into a beautiful sturdy home, would you?
You would carefully plan the construction and start with building a good solid foundation first.
The same can be said for building a chatbot: it all starts with a good solid foundation.
This can be done through an intent and entity mapping process.
Start by listing all the main questions that you want your bot to answer and categorize each question into a subject area.
From here, sub-categories will start to evolve, and this will create potential intents and entities.
As you build your intent and entity map, consider how questions will be asked, and if they will be asked in a similar way, look to group them within the same intent and use entities for the variables.
Your mapping document will continue to evolve, and there may be multiple versions through the build stage. Your first attempt will probably look totally different to your latest version.
Once you’re happy with the mapping document and all the main questions have been categorized, you’ll be ready to start building your chatbot in your chosen NLP provider, using the intents and entities decided at this mapping stage.
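To make the mapping idea a bit more concrete, here is a minimal sketch of what an intent and entity map could look like as data. The intent names, entity names, and the `build_intent_map` helper are all my own illustrative assumptions, not the format of any particular NLP provider.

```python
from collections import defaultdict

# Hypothetical planning rows: (question, proposed intent, entity values in it).
questions = [
    ("How do I order a new credit card?", "cards.order", {"card_type": "credit"}),
    ("How do I order a new debit card?", "cards.order", {"card_type": "debit"}),
    ("What is my savings balance?", "accounts.balance", {"account_type": "savings"}),
]


def build_intent_map(rows):
    """Group questions by intent and collect the entity values seen for each."""
    intent_map = defaultdict(lambda: {"examples": [], "entities": defaultdict(set)})
    for text, intent, entities in rows:
        entry = intent_map[intent]
        entry["examples"].append(text)
        # Questions asked in a similar way share an intent; the variable part
        # (here, the kind of card or account) is recorded as an entity.
        for name, value in entities.items():
            entry["entities"][name].add(value)
    return intent_map


intent_map = build_intent_map(questions)
for intent, entry in intent_map.items():
    entity_summary = {name: sorted(values) for name, values in entry["entities"].items()}
    print(intent, len(entry["examples"]), entity_summary)
```

In this sketch the two card questions land in one intent with `card_type` as the entity, which is the grouping described above. A spreadsheet works just as well; the point is only that the map is written down before the build starts.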
Having Too Little or Too Much Training Data
Most NLP providers will advise having a minimum of five examples of training data in an intent.
But I’ve found having at least 20-30 examples per intent works well.
The danger of an intent with very little training data is that the concepts it expresses are likely to be weakly represented.
So if a user expresses a concept in a slightly different way, the bot may give the dreaded “I’m sorry I don’t understand” answer or, even worse, the wrong answer. I’ve found that providing around three varied examples for each concept in the intent, with the concept clearly expressed, maximizes the bot’s learning.
Obviously if the intent is very simple, like a greeting intent, you may not need many training examples.
Conversely, if the intent is more complicated and covers a lot of different aspects, you may need a lot more than 30 examples to ensure the bot is deeply versed in the subject area it needs to cover.
If you find an intent creeping towards 100 or more training examples, it may be trying to cover far too many subject areas, and it might be time to consider splitting it into smaller intents.
The phrase “jack of all trades but master of none” springs to mind: an intent that tries to serve everybody ends up serving nobody.
It won’t provide a great experience for the user and can also start to confuse other intents: when one intent has far more training data than the others, the model is in danger of being artificially skewed towards that intent.
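A quick size check across intents can catch these problems early. The sketch below is only an illustration: the 20 and 100 limits come from the rules of thumb above, while the `skew_ratio` of 3 times the median, the label names, and the sample data are assumptions of mine that you would tune for your own bot.

```python
from statistics import median


def audit_intent_sizes(training_data, low=20, high=100, skew_ratio=3.0):
    """Flag intents that look under-trained, oversized, or dominant.

    training_data maps an intent name to its list of training examples.
    """
    counts = {intent: len(examples) for intent, examples in training_data.items()}
    if not counts:
        return []
    typical = median(counts.values())
    findings = []
    for intent, n in sorted(counts.items()):
        if n < low:
            # May be fine for very simple intents such as a greeting.
            findings.append((intent, n, "under_trained"))
        if n >= high:
            findings.append((intent, n, "consider_splitting"))
        if typical > 0 and n > skew_ratio * typical:
            # Assumed heuristic: far bigger than the typical intent.
            findings.append((intent, n, "dominates_others"))
    return findings


# Placeholder data: repeated strings stand in for real, varied examples.
sample = {
    "greeting": ["hi"] * 6,
    "order_card": ["order a card"] * 25,
    "account_help": ["help"] * 120,
}
findings = audit_intent_sizes(sample)
for intent, n, issue in findings:
    print(f"{intent}: {n} examples -> {issue}")
```

A flag here is a prompt to look at the intent, not a verdict. A greeting intent with six examples may be perfectly healthy.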
Using Real Customer Questions as Training Data Without Curation
If you’re lucky enough to have endless reams of real customer questions that cover the subject areas your bot needs to be knowledgeable about, you might be tempted to just add them all to the appropriate intents just as they are.
Easy, right?
Actually, you could be doing more harm than good.
Real customer questions are quite often long-winded, very chatty, and contain lots of irrelevant information.
For instance, if you’re building a banking chatbot and you have an intent that covers asking for a new credit card, the chatbot needs to be trained on various ways of asking for a replacement card or ordering a new one.
It doesn’t need to know all the details surrounding why they need a new or replacement one, like their card getting eaten by their cute fluffy puppy, or the card being stolen whilst out on a food shop to get bread and milk.
All this unimportant information will make each training example far too long and will not provide good learning value to the chatbot. It could even start to cause confusion because of the number of words and the insignificance of some of those words.
Before adding each real customer question to an intent, cut out the fluff and make each one into a brief and clearly expressed piece of training data.
Basically, ensure each piece of training data is a smart piece of training data!
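Cutting out the fluff is a job for a person, but a simple filter can at least point you at the examples most likely to need it. This is a rough sketch: the 15-word limit and the `flag_for_curation` helper are assumptions for illustration, and word count is only a crude stand-in for "long-winded".

```python
def flag_for_curation(utterances, max_words=15):
    """Split raw utterances into short ones and ones that need trimming by hand."""
    keep, trim = [], []
    for text in utterances:
        # Long utterances usually carry back-story the bot does not need.
        if len(text.split()) > max_words:
            trim.append(text)
        else:
            keep.append(text)
    return keep, trim


raw = [
    "I need a replacement credit card",
    "Hi there, my puppy chewed my card this morning while I was out buying "
    "bread and milk, so can I get a new one please",
]
keep, trim = flag_for_curation(raw)
print("Review as-is:", keep)
print("Trim before adding:", trim)
```

The second example would be trimmed by hand to something like "can I get a new card" before it goes anywhere near an intent. Short examples still deserve a read, since brevity alone doesn't guarantee a clearly expressed concept.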
Failing to Monitor the Chatbot Adequately Once it’s Live
You may think that once all the hard work of building your chatbot is done, and you’ve spent hours and hours testing and training it, you can just sit back, put your feet up and relax.
But once your chatbot is live, it’s not the end of the project.
You will need to dedicate time to monitor its performance and see exactly how it’s handling all your customer questions.
At first, I would suggest setting aside a couple of hours per day (or more if you can spare it) to check as many customer questions as you can within that time, highlighting any questions that return incorrect intents, or correct intents that fall below your confidence threshold.
Obviously, these rogue questions will need to be added to training, perhaps in a couple of training sessions per week.
Try to do this daily monitoring/regular training for a month, and then this can be cut down to monitoring twice a week and training once a week.
Once your chatbot is more mature (six months or so down the line), monitoring should still be done once a week and training perhaps once a month, and this should be kept up for the lifetime of your chatbot.
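The review step during monitoring can be sketched in a few lines. Everything here is hypothetical: the `LoggedTurn` record, its field names, and the 0.7 threshold are assumptions, since each NLP provider exposes its conversation logs and confidence scores in its own way.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoggedTurn:
    question: str
    predicted_intent: str
    confidence: float
    reviewed_intent: Optional[str] = None  # filled in by a human reviewer


def needs_training(turn, threshold=0.7):
    """Return a reason if the turn belongs on the retraining list, else None."""
    if turn.reviewed_intent is not None and turn.reviewed_intent != turn.predicted_intent:
        return "incorrect intent"
    if turn.confidence < threshold:
        return "below confidence threshold"
    return None


log = [
    LoggedTurn("I want a new card", "order_card", 0.92, "order_card"),
    LoggedTurn("card got nicked", "greeting", 0.81, "order_card"),
    LoggedTurn("replace my card pls", "order_card", 0.55, "order_card"),
]
flagged = [(turn.question, needs_training(turn)) for turn in log if needs_training(turn)]
for question, reason in flagged:
    print(f"{question!r}: {reason}")
```

The flagged questions become the input to the next training session, which is where the curation advice from earlier applies again.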