Voximplant Avatar: An AI chat and voice robot for you

We encounter voice or chat robots every day. Some of them call to notify us about goods shipments, some of them irritate us with ads. Many such robots greet us on websites asking if we need any help.

The biggest drawback of such robots is that they can say or understand only what has been scripted in advance. That is the main reason they irritate us: an open dialogue with them is impossible, so we usually prefer to speak to a human instead.

This is where Artificial Intelligence (AI) comes in handy. AI has taken great steps forward over the last few years and is still evolving. There are plenty of AI companions that can talk freely, much like real human beings. Such companions learn as they communicate, understand many phrases, and even try to answer the same question in different ways.

So, what if we combine such an AI companion with a voice or text robot? We get an assistant that can maintain a smooth conversation with customers. Teach this assistant everything it may need, add a realistic speech synthesis engine, and that’s what you get with Voximplant Avatar.

How an avatar works

Avatar is a Voximplant service that uses AI and NLP (natural language processing) for voice and text communication with customers. You can teach your avatar all the necessary information, such as your cafe's working hours and delivery options, so it can answer customers' questions. You can also integrate it with your CRM so your avatar can help your customers book tables.

Additionally, you can supply your avatar with realistic speech synthesis and recognition engines, making it a realistic interlocutor. Modern speech engines sound great and AI and NLP make conversations more natural. Add telephony or chat features, and you receive a perfect assistant for your hotline, contact center, or website.

How to create an avatar

Let's create an avatar that can perform everything explained above. For this, you will need a Voximplant account. Log in to the control panel, find the Avatars section, and click Create. Give your avatar a name, choose a language and time zone, and create it!

At this step, you already have a smart avatar that can analyze human speech and understand the speaker's intentions. Now we can teach our avatar all the information it may need.

How to teach an avatar

To teach your avatar, open it, go to the Intents section and click Add.

To understand what a customer wants, an avatar needs to find intents in the customer's speech. But a customer can ask the same question in different ways, which is why your avatar needs training.

Open the intent you created and go to the Training phrases section. Write several ways a customer might phrase the question:

What are your opening hours?
Do you work tomorrow?
How late are you open until tonight?

Then go to the Answers section and write some variants of your answers.

When you save the intent, you will see a yellow Training required button at the top of the screen.

Click Train and wait a moment. The AI will analyze the possible variants of this question and will be ready to answer in a real conversation.

You can add as many intents as you want. Let's teach our avatar intents surrounding delivery options and booking a table.

Next, write a JavaScript scenario on the Dialogue scenario tab. I created a simple scenario in which the avatar greets a customer, looks for the intents we trained (working hours, delivery options, and reservations) in the customer's speech, and answers them.

addState({
    name: 'start',
    onEnter:async(event)=> {
        // greet a customer when it connects to the dialogue
        return Response({utterance: 'Pineapple garden, how can I help you?', listen: true})
    },
    onUtterance:async(event)=>{
        // search for intents in a customer's speech
        if (event.intent === 'openHours' || event.intent === 'delivery' || event.intent === 'reservation') {
            // answer a customer's intent and keep listening
            return Response({utterance: event.response, listen: true});
        } else {
            // if an intent is not clear, ask to rephrase
            return Response({utterance: 'Sorry, I didn\'t catch that. I can help you with open hours, deliveries, and reservations', listen: true});
        }
    }
});

// set the entry point
setStartState('start');

You can find more information on how to write avatar scenarios in the Voximplant documentation.
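To reason about the flow before running it in the control panel, the scenario can also be exercised locally. The sketch below is only an illustration: the addState, setStartState, and Response definitions at the top are hypothetical stand-ins written for this example, not Voximplant's implementation, and the sample answer text is made up. The scenario is a trimmed copy of the one above, and the event fields (intent, response) are the ones it already uses.

```javascript
// Hypothetical local stand-ins for the avatar runtime (assumption, not Voximplant code).
const states = {};
let startState = null;
const addState = (state) => { states[state.name] = state; };
const setStartState = (name) => { startState = name; };
const Response = (fields) => fields; // simply hands the fields back

// Trimmed copy of the scenario from the article.
addState({
    name: 'start',
    onEnter: async (event) => {
        return Response({utterance: 'Pineapple garden, how can I help you?', listen: true});
    },
    onUtterance: async (event) => {
        if (event.intent === 'openHours' || event.intent === 'delivery' || event.intent === 'reservation') {
            return Response({utterance: event.response, listen: true});
        }
        return Response({utterance: 'Sorry, I didn\'t catch that.', listen: true});
    }
});
setStartState('start');

// Enter the start state, then feed it one hand-written event.
async function dryRun(event) {
    const state = states[startState];
    const greeting = await state.onEnter({});
    const reply = await state.onUtterance(event);
    return [greeting.utterance, reply.utterance];
}

// 'We are open from 9 to 21.' is a made-up answer for this example.
dryRun({intent: 'openHours', response: 'We are open from 9 to 21.'})
    .then((lines) => console.log(lines));
```

A dry run like this only checks the branching logic. Intent recognition itself still has to be tested against the trained avatar.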

Now let's test our avatar. Click the Debug button at the top right corner to run the scenario.

Ask your avatar any questions to see if it processes incoming intents. The avatar should recognize your intent and answer your question correctly. Let's try!

Voila! Your avatar recognizes intents perfectly and gives correct answers. Now it's time to teach the avatar to do something more serious than just answering the questions, like booking tables.

How an avatar can book tables

First, teach your avatar a booking intent the same way we did previously. Upon recognizing this intent, the avatar needs to collect all the necessary information for booking, primarily time, date, and the number of people.

A customer might already provide some of this information when stating the intention to book a table. For example, they could say, "I need to book a table for two", which tells us the number of people. That is why we need to collect information at the moment the intent is recognized.

Let's create a booking object in the scenario:

let reservationForm = {
    slotTime: null,
    slotPeopleCount: null,
    uncertainUtterancesCount: 0
};

Now we need to collect all the information required to fill out the form: check what is missing and ask the customer for it. If the customer does not give a clear answer, increase the uncertainUtterancesCount counter so the dialogue cannot loop forever.
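The "check what is missing" step can be written as a small pure function, which keeps the decision separate from the dialogue states. This is a minimal sketch: the function name, the returned action labels, and the limit of three unclear answers are assumptions made for illustration, while the form fields are the ones defined in the booking object.

```javascript
// Assumed limit on unclear answers in a row before the avatar stops asking.
const MAX_UNCERTAIN = 3;

// Decide the next step of the booking dialogue from the current form contents.
function nextReservationStep(form) {
    if (form.uncertainUtterancesCount >= MAX_UNCERTAIN) {
        return {action: 'giveUp'};
    }
    if (form.slotTime && form.slotPeopleCount) {
        return {action: 'confirm'};
    }
    if (!form.slotTime && !form.slotPeopleCount) {
        return {action: 'ask', question: 'For how many people and which date would you like a reservation?'};
    }
    if (!form.slotPeopleCount) {
        return {action: 'ask', question: 'And for how many people do you need a table?'};
    }
    return {action: 'ask', question: 'And for which date?'};
}

// The customer said "a table for two", so only the date is missing.
console.log(nextReservationStep({slotTime: null, slotPeopleCount: 2, uncertainUtterancesCount: 0}));
// { action: 'ask', question: 'And for which date?' }
```

Checking the counter first means a stuck dialogue ends before another question is asked.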

Once all the information is collected, we summarize it for the customer and ask for confirmation. If the customer confirms, the booking object is complete, and we can send it to a CRM or backend via an API request.
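What the API request might carry can be sketched without committing to a particular HTTP client. Everything specific here is an assumption: the endpoint URL, the JSON field names (time, people), and the helper name are hypothetical and would be dictated by your own CRM or backend. The function only prepares the request; sending it is left to whatever client is available where the booking is processed.

```javascript
// Hypothetical endpoint; replace with the URL of your own CRM or backend.
const BOOKING_URL = 'https://example.com/api/bookings';

// Turn a completed reservation form into a description of a POST request.
function buildBookingRequest(form) {
    if (!form.slotTime || !form.slotPeopleCount) {
        throw new Error('Reservation form is incomplete');
    }
    return {
        url: BOOKING_URL,
        method: 'POST',
        headers: {'Content-Type': 'application/json'},
        // Field names are assumptions; match them to your backend's schema.
        body: JSON.stringify({time: form.slotTime, people: form.slotPeopleCount})
    };
}

const request = buildBookingRequest({slotTime: '2022-03-15T19:30:00', slotPeopleCount: 2});
console.log(request.method, request.url, request.body);
```

Rejecting an incomplete form at this point keeps half-filled bookings out of the backend even if the dialogue logic changes later.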

I also added several exit points to the scenario to avoid looping. For example, if the avatar did not understand the client three times, or the customer states that the avatar cannot help them, or says goodbye, the scenario will end. Take a look at my final scenario:

let reservationForm = {
    slotTime: null,
    slotPeopleCount: null,
    uncertainUtterancesCount: 0
};

addState({
    name: 'start',
    onEnter:async(event)=> {
        // if this is the first time in this state, greet the client. if it is not, ask what we can help with
        if (event.visitsCounter === 1) {
            return Response({utterance: 'Pineapple garden, how can I help you?', listen: true})
        } else {
            return Response({utterance: 'Can I help you with something else?', listen: true})
        }
    },
    onUtterance:async(event)=>{
        // search for intents
        if (event.intent === 'openHours' || event.intent === 'delivery') {
            // answer the intent and go to the 'start' state
            return Response({utterance: event.response, nextState: 'start'});
        } else if (event.intent === 'reservation') {
            // the client may already give you some information
            if (event.entities.systemTime) {
                reservationForm.slotTime = event.entities.systemTime[0].value;
            }
            if (event.entities.systemNumber) {
                reservationForm.slotPeopleCount = event.entities.systemNumber[0].value;
            }
            return Response({utterance: 'Sure!', nextState: 'reservation'});
        } else if (event.intent === 'no') {
            // if the client says "no thanks", say goodbye and end the conversation
            return Response({utterance: 'Ok! Hope I\'ve helped. See you!', nextState: 'final'});
        } else if (event.intent === 'yes') {
            // if the client says that there's a question, ask and listen
            return Response({utterance: 'Sure, so what\'s your question?', listen: true});
        } else {
            // if the client's intent is not clear, ask to rephrase, but not more than 3 times
            if (event.utteranceCounter < 3) {
                return Response({utterance: 'Sorry, I didn\'t catch that. I can help you with open hours, deliveries, and reservations', listen: true});
            } else {
                return Response({utterance: 'I\'m so sorry, but I couldn\'t understand you. Bye!', nextState: 'final'});
            }
        }
    }
});

addState({
    name: 'reservation',
    onEnter:async(event)=> {
        if (reservationForm.uncertainUtterancesCount > 2 ) {
            // if the avatar has failed to understand the client three times in a row, stop asking and reset the counter
            reservationForm.uncertainUtterancesCount = 0;
            return Response({utterance: 'Sorry I couldn\'t understand you', nextState: 'start'});
        } else if (reservationForm.slotTime && reservationForm.slotPeopleCount) {
            // if the information is given, confirm it
            return Response({nextState: 'reservationConfirm'})
        } else if (!reservationForm.slotTime && !reservationForm.slotPeopleCount) {
            // if both the time and the number of people are missing, ask for both
            return Response({utterance: 'For how many people and which date would you like a reservation?', listen: true})
        } else if (!reservationForm.slotPeopleCount) {
            return Response({utterance: 'And for how many people do you need a table?', listen: true})
        } else {
            return Response({utterance: 'And for which date?', listen: true})
        }
    },
    onUtterance:async(event)=>{
        // store whatever booking details the client has just provided
        if (event.entities.systemTime || event.entities.systemNumber) {
            if (event.entities.systemTime) {
                reservationForm.slotTime = event.entities.systemTime[0].value;
            }
            if (event.entities.systemNumber) {
                reservationForm.slotPeopleCount = event.entities.systemNumber[0].value;
            }
            // the client gave a useful answer, so reset the counter and re-check the form
            reservationForm.uncertainUtterancesCount = 0;
            return Response({nextState: 'reservation'});
        } else {
            // no booking details in this utterance
            reservationForm.uncertainUtterancesCount += 1;
        }

        if (event.intent === 'openHours' || event.intent === 'delivery') {
            // if during filling the form any other intent is found, answer it and continue filling the form
            return Response({utterance: event.response, nextState: 'reservation'});
        } else {
            // continue filling the form
            return Response({nextState: 'reservation'});
        }
    }
});

addState({
    name: 'reservationConfirm',
    onEnter:async(event)=> {
        // convert to human-readable date
        const months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December'];
        const monthStr = months[parseInt(reservationForm.slotTime.substring(5, 7), 10) - 1];
        const day = parseInt(reservationForm.slotTime.substring(8, 10), 10);
        const hour = parseInt(reservationForm.slotTime.substring(11, 13), 10);
        const minute = reservationForm.slotTime.substring(14, 16);
        return Response({utterance: `So you want to book a table for ${reservationForm.slotPeopleCount} people on ${day} ${monthStr} at ${hour}:${minute}?`, listen: true});
    },
    onUtterance:async(event)=>{
        if (event.intent === 'yes') {
            return Response({utterance: 'Awesome! We will be waiting for you', nextState: 'start'});
        } else if (event.intent === 'no') {
            reservationForm.slotTime = null;
            reservationForm.slotPeopleCount = null;
            reservationForm.uncertainUtterancesCount = 0;
            return Response({utterance: 'I see, sorry.', nextState: 'start'});
        } else {
            if (event.utteranceCounter < 3) {
                return Response({utterance: 'I\'m sorry, so do you want to make a reservation?', listen: true});
            } else {
                return Response({utterance: 'Sorry, I can\'t help you. Hopefully I will be able to assist you next time. Bye', nextState: 'final'});
            }
        }
    }
});

addState({
    name: 'final',
    onEnter:async(event)=> {
        return Response({isFinal: true, needRedirectionToOperator: false, reservation:reservationForm})
    }
});

// set the entry point
setStartState('start');

Now we have an avatar that can tell customers about hours of operation, delivery options, and also book a table and integrate with your CRM or backend.
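The reservationConfirm state reads the date by slicing slotTime at fixed positions, which silently produces odd output if the value has a different shape. As a hedged alternative, the same conversion can be done in a helper that validates the layout first. The sketch assumes slotTime looks like 'YYYY-MM-DDTHH:MM...', which is the layout implied by the substring offsets in the scenario; the helper name is made up for this example.

```javascript
const MONTHS = ['January', 'February', 'March', 'April', 'May', 'June', 'July',
    'August', 'September', 'October', 'November', 'December'];

// Convert a slotTime string into a human-readable date, or return null
// if the string does not have the assumed 'YYYY-MM-DDTHH:MM' layout.
function formatSlotTime(slotTime) {
    const match = /^(\d{4})-(\d{2})-(\d{2})[T ](\d{2}):(\d{2})/.exec(slotTime);
    if (!match) {
        return null;
    }
    const month = MONTHS[parseInt(match[2], 10) - 1];
    if (!month) {
        return null; // month number outside 1..12
    }
    const day = parseInt(match[3], 10);
    const hour = parseInt(match[4], 10);
    return `${day} ${month} ${hour}:${match[5]}`;
}

console.log(formatSlotTime('2022-03-15T19:30:00')); // 15 March 19:30
console.log(formatSlotTime('tomorrow evening'));    // null
```

A null result gives the scenario a chance to ask for the date again instead of reading out a malformed one.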

How to integrate telephony and chat

The last thing your avatar needs is a telephony or chat service so it can communicate with your customers. To integrate it with telephony, go to the Integration tab of your avatar and copy the integration script. Then create an application on the Voximplant platform and paste the integration code into the application scenario.

Set up the speech synthesis and recognition modules by choosing the language and a suitable voice, rent or connect a phone number, set up a routing rule, and your avatar is ready to answer your customers' calls!

You can also connect your avatar to a text chat, for example, on your website. Avatars work perfectly with both voice and text channels. You can test how an avatar communicates in a chat via this simple demo.

Result

We created a simple avatar that can communicate with customers, answer their questions, and book tables via API requests. This is a simple case, but you can extend the logic and make your avatar an indispensable assistant for your hotline or contact center.

Voximplant's speech synthesis and recognition modules allow you to choose a very realistic voice or even integrate third-party voices, so your customers won't be able to tell whether they are talking to a real person or a robot. And constantly evolving AI and NLP will make your avatar better every day!

Register at the Voximplant platform and create your own avatar today! Everyone who tests it and leaves a review will get a guaranteed prize.
