This article was originally published on QBox’s blog, prior to Cyara’s acquisition of QBox. Learn more about Cyara + QBox.
We often see major performance issues in chatbots that have multiple sub-models and a master model controlling them. When we test each individual chatbot model using automated testing (which tests the training data within the model), we generally find they perform relatively well as separate chatbots, with just a few common challenges that need fixing.
These challenges typically include concepts expressed within the intents that are a little weak and need reinforcing with additional training data (to express those concepts in more varied ways), and intents that are very similar in their training data and would benefit from being merged into one intent. These issues can usually be addressed easily at the sub-model level.
However, the main challenge is the master model failing to route users to the correct individual model. This is a common problem: each intent in a master model is built by combining all the utterances from the corresponding individual model, resulting in very large intents, each of which inevitably covers a wide range of topics. And because the intents are so large, with so many training phrases in them, they become very difficult to balance, and model performance at the master level becomes very poor.
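To make this concrete, here is a minimal sketch in plain Python (not any particular NLP provider's API, and with made-up sub-models, intents, and utterances) of how a master model's routing intents are typically assembled: every utterance from a sub-model gets lumped into a single "route to this sub-model" intent, so each routing intent ends up large and topically broad.

```python
# Illustrative sketch only: hypothetical sub-models, each with a few
# ordinary-sized intents and their training utterances.
sub_models = {
    "billing": {
        "pay_invoice": ["pay my bill", "settle my invoice", "make a payment"],
        "dispute_charge": ["this charge is wrong", "I was billed twice"],
    },
    "orders": {
        "track_order": ["where is my order", "track my parcel"],
        "cancel_order": ["cancel my order", "I want to cancel"],
        "change_address": ["change delivery address", "ship to a new address"],
    },
}

# The master model ends up with one routing intent per sub-model, containing
# *all* of that sub-model's utterances merged together.
master_model = {
    f"route_to_{name}": [utt for intent in intents.values() for utt in intent]
    for name, intents in sub_models.items()
}

for routing_intent, utterances in master_model.items():
    print(routing_intent, "->", len(utterances), "training phrases spanning unrelated topics")
```

Even with only two tiny sub-models, each routing intent already mixes unrelated topics such as payments and deliveries; with real sub-models containing dozens of intents each, the routing intents become far larger and much harder to keep balanced.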
So, although the individual sub-models on the whole don’t have many problems that need fixing, the chatbot mainly fails at the master-model level, i.e., it routes customers through to the wrong sub-model. In effect, it’s almost irrelevant that the sub-models perform well—customers simply aren’t being routed to the right one!
Another thing we usually see with a multiple sub-model setup is that the individual sub-models contain only a handful of intents each, making them very small models. In these circumstances, when there aren’t many intents across all the sub-models, it makes more sense to have one single, larger model handling all the intents. We’ve seen hundreds of chatbots over the years that are single models handling 150+ intents very successfully, and we’ve built models ourselves with 200+ intents that have achieved 95%+ accuracy.
So here are some challenges to think about carefully before deciding to go down the multiple-model path.
- There is a “double loss” effect when using master and sub-models together. First, there is always a danger of losing a percentage of your customers in the chatbot’s first classification, in the master model, and then there is a danger of losing another percentage of them in the second classification, in the sub-model. This double loss is one of the biggest disadvantages of this type of setup (a simple compounding example follows this list).
- Consider also that the chatbot trainer has to do the training twice with this setup—once in the sub-model and again in the master model. So, every time re-training is needed, or new subject areas need to be added, it must be done in both the sub-model and the master model. There’s no avoiding this!
- Intents in the master model will undoubtedly grow very large, cumbersome, and imbalanced.
- Scaling will become difficult if you have a master model that is struggling to perform well in the first place—it’s always easier to scale up a model that is already running smoothly and accurately.
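To illustrate the “double loss” point above, here is a minimal sketch using purely illustrative accuracy figures (the 90% values are assumptions, not measurements): because a customer has to survive two classifications in a row, the end-to-end accuracy is the product of the two.

```python
# Illustrative figures only: how routing and intent-matching accuracy compound.
master_routing_accuracy = 0.90   # share of users routed to the correct sub-model
sub_model_accuracy = 0.90        # share of correctly routed users matched to the right intent

# A user must be classified correctly twice, so the losses multiply.
end_to_end = master_routing_accuracy * sub_model_accuracy
print(f"Master + sub-model end-to-end accuracy: {end_to_end:.0%}")  # 81%

# Compare with a single model handling all intents at the same assumed 90%.
single_model_accuracy = 0.90
print(f"Single-model accuracy: {single_model_accuracy:.0%}")        # 90%
```

Even with both models performing at 90%, roughly one customer in five ends up in the wrong place, whereas a single model at the same 90% loses only one in ten.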
Our Recommendations
We would recommend considering separating a model into a master model and sub-models only if:
- The intent limits for your NLP provider have been reached:
- For Watson Assistant, the limits vary from 100 intents to 2000 intents, depending on your subscription.
- For MS LUIS, the limit is 500 intents.
- For Dialogflow ES, the limit is 2000 intents.
- For Lex, the limit is 100 intents.
- (We suggest only considering multiple models when you are approaching the 150-200 intent mark)
- You have multiple different teams working on the same chatbot model and it’s not possible for them to work closely together, whether for operational/business reasons or because of security or sensitive-data concerns.