How to Improve NLU Performance of Intelligent Virtual Assistants

In this blog, we look at how to improve the Natural Language Processing (NLP) performance of intelligent virtual assistants. Better NLP matters because it lets your intelligent virtual assistant (IVA) interpret a user's intent quickly and reliably and meet customer expectations. After you've put in the effort to design, build, test, and launch your virtual assistant, you want to make sure that it keeps getting smarter over time and delivers a better experience to all of your users. So how do you improve Natural Language Processing?

What Is Natural Language Processing?

A chatbot’s ability to consistently understand and interact with a user is dictated by the robustness of the Natural Language Processing (NLP) that powers the conversation. NLP is the science of deducing the intention and related information from natural conversations.

The conversation flow in Kore.ai virtual assistants passes through various Natural Language Understanding (NLU) engines and conversation engines before the IVA decides on an action and a response. The most basic duty of NLU is to understand the meaning of an audio or text input and determine its intention, essentially understanding human language.

The Kore.ai XO Platform uses a unique Natural Language Processing strategy, combining Fundamental Meaning and Machine Learning engines for maximum conversation accuracy with little upfront training. Bots built on Kore.ai’s platform can understand and process multi-sentence messages, multiple intents, contextual references made by the user, patterns and idiomatic sentences, and more.

How To Improve the NLP Performance of Virtual Assistants

As your virtual assistant engages with a diverse user base, it will generate a wealth of data. This data will provide insights into what aspects are functioning well and what isn't, helping you identify gaps and potential areas for enhancement.

There are two main methods to enhance the performance of Natural Language Processing (NLP). The first is by expanding or improving the data used for training your machine learning models or by further training the virtual assistant. The second method involves modifying the scope of your use cases, features, or capabilities.

Here are basic guidelines to keep in mind while reviewing IVA performance:

Identify Problems – Get a clear idea of what the IVA is supposed to accomplish. Talk to business analysts and IVA developers to understand the requirements and the actual functionality of the virtual assistant.
Review Data Analytics – A comprehensive analytics suite is essential for the effectiveness of a virtual assistant. The more detailed your data, the more it can help you identify and understand the existing gaps in performance.
Brainstorm Alternate Utterances – Think about what an end user might say while trying to achieve their intent. These are the alternate utterances for each intent. Try to include idioms and slang as well.

Method #1 Improve Based on Conversation Insights and Analytics

The NLP Insights feature helps you understand the analytics data and assess your virtual assistant’s performance in identifying and executing tasks. You can improve your IVA’s performance based on these insights. The Analyze > NLP Insights page shows the specific information in the following sections:

Intent Found: Number of identified intents
Intent Not Found: Number of unidentified intents
Unhandled Utterances: Number of unhandled utterances
Pinned: Pinned NLP Insight records. Specific records are pinned to highlight them for easy access and viewing.

However, to categorize the utterances as True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN), you need to go through all the utterances across multiple tabs in NLP Insights. There could be millions of utterances that a bot designer needs to review, which could be tedious and time-consuming.

The Conversation Insights feature, found under Analyze in the Kore.ai XO Platform, groups utterances into clusters based on their semantic meaning and gives each cluster a name, so you do not need to review every utterance in a cluster individually.
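To make the idea of grouping utterances concrete, here is a minimal sketch of clustering by similarity. It is not how Conversation Insights works internally: it assumes plain word overlap (Jaccard similarity) as a crude stand-in for semantic meaning, and the function names, the sample utterances, and the 0.4 threshold are all hypothetical.

```python
import re


def tokens(utterance: str) -> set[str]:
    """Lowercase word tokens; an assumed, crude stand-in for semantic meaning."""
    return set(re.findall(r"[a-z0-9']+", utterance.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of tokens two utterances have in common (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_utterances(utterances: list[str], threshold: float = 0.4) -> list[list[str]]:
    """Greedy single pass: join the first cluster whose first utterance is similar enough."""
    clusters: list[list[str]] = []
    for utterance in utterances:
        for cluster in clusters:
            if jaccard(tokens(utterance), tokens(cluster[0])) >= threshold:
                cluster.append(utterance)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([utterance])
    return clusters


# Hypothetical utterances for illustration only.
sample = [
    "check my account balance",
    "what is my account balance",
    "transfer funds to savings",
    "check balance of my account",
    "transfer funds to my savings account",
]

for group in cluster_utterances(sample):
    print(len(group), group)
```

With this sample, the balance questions land in one group and the transfer requests in another, so a reviewer could look at two groups instead of five separate utterances. A real system would likely compare sentence embeddings rather than raw word overlap.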

Check for False-Positives & Out-of-Scope Queries

Out-of-scope queries are questions that the virtual assistant failed to comprehend. While reviewing them, you can also identify false positives: situations where the virtual assistant believes it has understood the user's request when it has actually misinterpreted the user's intent.

Below you will find more details about TP, TN, FP, and FN scenarios with examples:

True Positive

True Positives (TP) refer to instances where the virtual assistant correctly identifies the intent of an utterance. For example, if the user says “What’s the weather today?”, and the virtual assistant correctly identifies the intent as “get_weather”, this would be a True Positive.
Likewise, an utterance that is correctly mapped to the Check Balance intent is a True Positive.

True Negative

True Negatives (TN) refer to instances where the virtual assistant correctly identifies that an utterance did not match any of the defined intents. For example, if the user says “I’m not sure what you mean”, and the virtual assistant correctly identifies that this does not match any of the defined intents, this would be a True Negative.

For instance, the user utterance “Extremely Likely” does not match any defined intent and is categorized as an Unidentified intent.

False Positive

False Positives (FP) refer to instances where the virtual assistant incorrectly identifies the intent of an utterance. For example, if the user provides their bank account name, and the virtual assistant incorrectly identifies the intent as “Close Account”, this would be a False Positive.

False Negative

False Negatives (FN) refer to instances where the virtual assistant incorrectly identifies that an utterance did not match any of the defined intents. For example, if the user says “What’s the weather today?”, and the virtual assistant incorrectly identifies that this does not match any of the defined intents, this would be a False Negative.
Similarly, if the “create account” utterance is wrongly categorized as an Unidentified intent, it is a False Negative.
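The four categories above can be expressed as a small decision rule. The sketch below assumes that each reviewed utterance has an expected intent (what a human reviewer says it should be) and a predicted intent (what the virtual assistant chose), with None meaning "no intent". The function name, the record layout, and the sample account-name utterance are hypothetical and not part of the Kore.ai XO Platform.

```python
from collections import Counter
from typing import Optional


def classify_outcome(expected: Optional[str], predicted: Optional[str]) -> str:
    """Label one reviewed utterance as TP, TN, FP, or FN. None means 'no intent'."""
    if predicted is None:
        # The assistant found no intent: right if none was expected, wrong otherwise.
        return "TN" if expected is None else "FN"
    # The assistant picked an intent: right only if it is the expected one.
    return "TP" if predicted == expected else "FP"


def ratio(numerator: int, denominator: int) -> float:
    """Division that returns 0.0 instead of failing on an empty denominator."""
    return numerator / denominator if denominator else 0.0


# (utterance, expected intent, predicted intent); illustrative records only.
reviewed = [
    ("What's the weather today?", "get_weather", "get_weather"),
    ("Extremely Likely", None, None),
    ("My account name is Savings Plus", None, "Close Account"),
    ("create account", "Create Account", None),
]

counts = Counter(classify_outcome(expected, predicted) for _, expected, predicted in reviewed)
precision = ratio(counts["TP"], counts["TP"] + counts["FP"])
recall = ratio(counts["TP"], counts["TP"] + counts["FN"])
print(dict(counts), precision, recall)
```

Note that this rule follows the definitions used in this post, where any wrongly identified intent counts as a False Positive. A stricter per-intent evaluation would also count that same utterance as a False Negative for the intent that should have matched.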

Retrain Your Machine Learning Models

Once you've identified the false positives and out-of-scope queries, the next step is to add those utterances back into the training data under the correct intents. Continuously retraining your machine learning models is key to enhancing the intelligence of your virtual assistant. This critical step helps reduce misclassifications and improves how the virtual assistant understands a user during an engagement.
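As a rough illustration of feeding reviewed utterances back into training data, the sketch below assumes the training set is a simple mapping from intent name to example utterances, and that a reviewer has paired each misclassified utterance with its correct intent (or None if it is out of scope). The data layout and function name are hypothetical. In the platform itself you would add utterances through the training screens rather than a script.

```python
from typing import Optional


def add_reviewed_utterances(
    training: dict[str, list[str]],
    reviewed: list[tuple[str, Optional[str]]],
) -> tuple[dict[str, list[str]], list[str]]:
    """Return an updated copy of the training data plus utterances left unassigned."""
    updated = {intent: list(examples) for intent, examples in training.items()}
    # Normalized form of every known utterance, used to skip duplicates.
    seen = {u.strip().lower() for examples in updated.values() for u in examples}
    unassigned: list[str] = []

    for utterance, correct_intent in reviewed:
        cleaned = utterance.strip()
        if correct_intent is None:
            # Out of scope for now: keep it aside for a scoping decision.
            unassigned.append(cleaned)
            continue
        if cleaned.lower() in seen:
            continue  # Already a training example; adding it again adds nothing.
        updated.setdefault(correct_intent, []).append(cleaned)
        seen.add(cleaned.lower())

    return updated, unassigned


# Illustrative data only.
training = {
    "Check Balance": ["check my balance"],
    "Create Account": ["open a new account"],
}
reviewed = [
    ("create account", "Create Account"),   # was a False Negative
    ("Check my balance", "Check Balance"),  # duplicate of an existing example
    ("Extremely Likely", None),             # no matching intent
]

updated, unassigned = add_reviewed_utterances(training, reviewed)
print(updated)
print(unassigned)
```

The function returns a copy instead of changing the original mapping, which makes it easy to compare the training data before and after a review cycle.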

Method #2 Change the Scope of Your Use Cases

Another way to improve NLP performance is by changing the scope of your use cases. You might have two distinct use cases that are worded so similarly that users phrase their requests for both in nearly the same way. For example, ‘Transfer funds’ and ‘Make a payment’ are two distinct use cases that users may request in a similar manner.

This is why the scoping and design phase of your virtual assistant is so critical. You may discover that certain queries are being incorrectly categorized under the wrong intent. This insight allows you to revisit and adjust the scope of each use case, ensuring it's specific enough to match user queries accurately, while also being comprehensive enough to encompass the variety of ways a topic could be inquired about.
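One hedged way to spot use cases whose scope may need adjusting is to measure how much vocabulary their training utterances share. The sketch below assumes the same intent-to-utterances mapping as a plain dictionary and uses word overlap as the similarity measure. The utterances, the function names, and the 0.5 threshold are illustrative assumptions, not platform behavior.

```python
import re
from itertools import combinations


def vocabulary(utterances: list[str]) -> set[str]:
    """All distinct lowercase words used across an intent's training utterances."""
    words: set[str] = set()
    for utterance in utterances:
        words.update(re.findall(r"[a-z']+", utterance.lower()))
    return words


def overlapping_intents(
    training: dict[str, list[str]], threshold: float = 0.5
) -> list[tuple[str, str, float]]:
    """Flag pairs of intents whose vocabularies overlap at or above the threshold."""
    flagged = []
    for (name_a, utts_a), (name_b, utts_b) in combinations(training.items(), 2):
        vocab_a, vocab_b = vocabulary(utts_a), vocabulary(utts_b)
        union = vocab_a | vocab_b
        score = len(vocab_a & vocab_b) / len(union) if union else 0.0
        if score >= threshold:
            flagged.append((name_a, name_b, round(score, 2)))
    return flagged


# Hypothetical training utterances for illustration.
training = {
    "Transfer funds": [
        "send money to my savings account",
        "move money to another account",
    ],
    "Make a payment": [
        "send money to my credit card account",
        "pay money to another account",
    ],
    "Check Balance": ["what is my balance", "show balance"],
}

for pair in overlapping_intents(training):
    print(pair)
```

In this sample only the ‘Transfer funds’ and ‘Make a payment’ pair is flagged, which suggests reviewing whether those two use cases should be merged, narrowed, or given more distinctive training utterances.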

To learn more about building and improving virtual assistants, you can review our documentation page on improving NLP performance.

Want to Learn More?

We're here to support your learning journey. Ready to take on bot building but not sure where to start? Learn conversational AI skills and get certified on the Kore.ai Experience Optimization (XO) Platform.

As a leader in conversational AI platforms and solutions, Kore.ai helps enterprises automate front and back-office business interactions to deliver extraordinary experiences for their customers, agents, and employees.