Building blocks of conversational AI

Dhyaanesh Mullagur
Technical Principal - Valtech

July 18, 2022

Now that you have identified what the business and customer needs are and decided to integrate voice automation into your company’s processes, where do you start?

In this article we will be exploring Dialogflow, Google Cloud’s state-of-the-art managed Natural Language Understanding (NLU) platform that allows users to build and manage conversational user experiences for a variety of use cases. Dialogflow is built upon a conversational core that Google has spent years developing, and it is packaged as a platform for product teams to work with.

The foundation

Dialogflow CX consists of some foundational building blocks that make up the core of the product.

Utterance: A sentence said by either the user or the bot.

Training Phrases: A set of utterances used to train the intent classification model.

Intent: An end user’s intent for one of their utterances. Intent classification is the process of matching an utterance to an intent.

Entity: In addition to recognizing the intent of an utterance as a whole, Dialogflow can recognize key pieces of information within the utterance itself, depending on what matters to your business. These are known as entities. For example, if a customer says, “I would like a large lemonade,” we can train Dialogflow to extract “Large” as a size and “Lemonade” as a menu item. This information can then be passed to the back end to assemble and check out the customer's cart.

Pages: A page represents the current conversation state and usually corresponds to a customer activity. For example, ordering, customizing an item and checking out would each have their own page. Within a page, one can define the conversation logic necessary to move to another page.

Routes: A route defines the logic needed to move from one page to another. A route is triggered by an intent match, a condition, or both, and closely related event handlers respond to events such as a no-match or no-input. For example, when a customer says, “I am ready to check out,” the programmed checkout intent route would trigger and transition the conversation to the checkout page for further assistance.

A combination of these six building blocks enables teams to build complex conversation flows to support customer needs.
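To make the relationship between these building blocks concrete, here is a minimal sketch in plain Python. It is a toy model, not the Dialogflow CX API: the `Route` and `Page` classes, the intent names and the exact-match `classify_intent` function are all assumptions made for illustration, and a real agent uses a trained model rather than string comparison.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Route:
    """An intent route: when `intent` is matched, move to `target_page`."""
    intent: str
    target_page: str


@dataclass
class Page:
    """A conversation state together with the routes that lead out of it."""
    name: str
    routes: list[Route] = field(default_factory=list)


# Hypothetical training phrases per intent (illustrative only).
TRAINING_PHRASES = {
    "order.add_item": ["i would like a large lemonade", "add a small coffee"],
    "order.checkout": ["i am ready to check out", "i'm done"],
}


def classify_intent(utterance: str) -> str | None:
    """Naive stand-in for intent classification: exact match on a training phrase."""
    normalized = utterance.lower().strip(" .!?")
    for intent, phrases in TRAINING_PHRASES.items():
        if normalized in phrases:
            return intent
    return None


def next_page(current: Page, utterance: str) -> str:
    """Follow the first route whose intent matches; otherwise stay on the page."""
    intent = classify_intent(utterance)
    for route in current.routes:
        if route.intent == intent:
            return route.target_page
    return current.name


ordering = Page("Ordering", [Route("order.add_item", "Ordering"),
                             Route("order.checkout", "Checkout")])
print(next_page(ordering, "I am ready to check out."))
```

The utterance is classified into an intent, the intent selects a route, and the route names the next page. An utterance that matches no intent leaves the conversation on the current page.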

The architecture

Using the foundations of Dialogflow CX listed above and a firm understanding of customer needs, one can build complex conversational flows to serve customers. Next, we need to integrate Dialogflow CX with other systems to make our complex conversation ready for production and customer use.

There are three key pieces that link to our Dialogflow Conversational Model:

  • The customer channels (Client)

  • The back-end systems (Webhooks)

  • The logging and analytics

The customer channels

Let’s start with the ways in which the customer interacts with the conversational agent. The key is to meet customers where they are, which means supporting a variety of channels. Engagement can be conducted via voice or chat on many platforms, and a good Conversational AI experience provides continuity across these disparate channels.

Voice

For virtual agents in the contact center space, Dialogflow supports one-click integrations with a few IVR providers. This makes it easy to connect to existing IVR systems such as Avaya, Twilio, AudioCodes, or Voximplant. With this integration, a phone number is assigned, through which interactions with the bot can occur.

Voice can be leveraged outside of the call center as well using the Dialogflow APIs. For example, this is useful when there is a need to integrate the virtual agent into the drive-thru or to enable voice in a mobile application. A key API method is StreamingDetectIntent, with which one can stream voice data directly to the Dialogflow agent and receive a response to play to the client. This allows you to build a custom integration for your specific use case.
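The shape of a streaming turn can be sketched without calling the service. The sketch below only builds the sequence of messages a client would send: one configuration message first, then the audio in chunks. The field names are modeled on our reading of the v3 `StreamingDetectIntentRequest` message and should be checked against the API reference; the function name, the chunk size and the session string are assumptions for illustration, and plain dictionaries stand in for the client library's request objects.

```python
from typing import Iterator

CHUNK_SIZE = 3200  # bytes per audio message; an arbitrary value chosen for this sketch


def build_streaming_requests(session: str, audio: bytes,
                             sample_rate_hertz: int = 16000,
                             language_code: str = "en") -> Iterator[dict]:
    """Yield the message sequence for one streaming turn.

    The first message carries only configuration (no audio bytes);
    every following message carries one chunk of raw audio.
    """
    yield {
        "session": session,
        "query_input": {
            "audio": {
                "config": {
                    "audio_encoding": "AUDIO_ENCODING_LINEAR_16",
                    "sample_rate_hertz": sample_rate_hertz,
                }
            },
            "language_code": language_code,
        },
    }
    for start in range(0, len(audio), CHUNK_SIZE):
        yield {"query_input": {"audio": {"audio": audio[start:start + CHUNK_SIZE]}}}


# Half a second of silence as placeholder audio (16 kHz, 16-bit mono).
fake_audio = b"\x00" * 16000
for message in build_streaming_requests("hypothetical-session-id", fake_audio):
    print(sorted(message.keys()))
```

In a real integration the generator would be handed to the client library's streaming call, and the audio would come from a microphone or telephony stream rather than a byte string.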

Chat

Readily available chat integrations include one-click web, Facebook Messenger and LINE integrations. Your bot can be embedded within any website with a few lines of HTML code:

<script src="https://www.gstatic.com/dialogflow-console/fast/messenger-cx/bootstrap.js?v=1"></script>
<df-messenger
  df-cx="true"
  chat-title="Blog Bot"
  agent-id="{agent-id}"
  language-code="en"
></df-messenger>

Once you substitute your agent ID for {agent-id}, this snippet embeds the agent in a web page. Chat also supports rich responses, which let you use buttons and events to drive the conversation forward.

A key architectural consideration here is the omnichannel use case. Dialogflow provides the ability to connect multiple channels to the same agent, and because it is a managed service, it will scale automatically to account for additional traffic. It can also define channel-specific responses, so the user receives a response suited to the channel they are using. This allows you to build a consistent user experience across the various channels through which the customer might interact.

The back-end systems

The next piece of the integration puzzle is connecting to back-end systems using webhooks in Dialogflow CX.

This allows us to create a personalized conversation for the customer by incorporating external data into the conversation.

This is done by setting up an HTTPS endpoint that Dialogflow calls with a POST request. When the user says, “Give me a large lemonade,” the webhook lets us take the extracted entities “Large” and “Lemonade” and add them to a cart in the back end. When the user then says “Checkout” or “I’m done” to trigger the checkout intent, we can initiate another back-end call to tally up the total for the cart and proceed to checkout. Webhooks also allow us to write connections to third-party systems such as Active Directory.

This way, for a contact center solution, we can use the phone number the customer is calling from to provide the conversational model with additional context for the call, helping us to serve the customer better. Another popular integration via webhooks is to CRM systems like Salesforce, where we can log the details of customer calls and update cases where necessary.

Dialogflow CX supports any number of webhooks per agent, provides different authentication mechanisms, and encrypts all traffic in transit to ensure that any sensitive customer data is secure.
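As a rough sketch of the lemonade example, the function below takes a webhook request that has already been parsed from JSON and returns a response body. The `fulfillmentInfo.tag`, `sessionInfo` and `fulfillmentResponse` fields follow the CX webhook JSON format as we understand it; the tag names, the `size` and `item` parameter names, the in-memory cart and the prices are all hypothetical. A production webhook would sit behind an HTTPS server and call a real ordering system.

```python
# Hypothetical in-memory cart store keyed by session; stands in for the back end.
CARTS = {}

# Made-up menu prices, used only to compute a total.
PRICES = {"lemonade": {"small": 2.0, "large": 3.5}}


def handle_webhook(request: dict) -> dict:
    """Handle one webhook call and build the response body."""
    tag = request.get("fulfillmentInfo", {}).get("tag")
    session_info = request.get("sessionInfo", {})
    session = session_info.get("session", "")
    params = session_info.get("parameters", {})
    cart = CARTS.setdefault(session, [])

    if tag == "add-to-cart":
        # Entities extracted by the agent arrive as session parameters.
        size, item = params["size"].lower(), params["item"].lower()
        cart.append({"size": size, "item": item})
        text = f"Added one {size} {item} to your order."
    elif tag == "checkout":
        total = sum(PRICES[line["item"]][line["size"]] for line in cart)
        text = f"Your total is ${total:.2f}."
    else:
        text = "Sorry, I could not process that request."

    return {
        "fulfillmentResponse": {"messages": [{"text": {"text": [text]}}]},
        "sessionInfo": {"parameters": {"cart_size": len(cart)}},
    }


demo = {
    "fulfillmentInfo": {"tag": "add-to-cart"},
    "sessionInfo": {"session": "demo-session",
                    "parameters": {"size": "Large", "item": "Lemonade"}},
}
print(handle_webhook(demo)["fulfillmentResponse"]["messages"][0]["text"]["text"][0])
```

Routing on the tag keeps a single endpoint able to serve several fulfillments, which is one common way to organize a webhook; separate endpoints per action would work equally well.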

Logging and analytics

The last piece of the puzzle is logging and analytics. Dialogflow CX stores logs in the Google Cloud Platform’s Cloud Logging service. From here you can stream the logs using Pub/Sub and Dataflow to BigQuery for analysis. BigQuery is Google’s data warehouse, which uses SQL queries to analyze the data. BigQuery can be integrated with visualization tools such as Data Studio, Looker and Tableau to build out dashboards for the business.

The first reason for establishing a solid logging pipeline is to ensure that the conversational design team has the right data available to monitor and enhance the agent. A conversational model is a growing product that should constantly be nurtured even after its release. For example, the team should keep an eye on any customer utterances that do not match any intent, because this may be an indication of poor training phrases or of additional features that customers are expecting.

The second reason for the logging pipeline is to build the right analytics dashboards. This allows business users to accurately track and measure the Key Performance Indicators for the product. Some metrics to track are the duration of session, check price, accuracy of orders, average intent detection confidence and average sentiment score.

This logging pipeline gives the product team the resources to monitor and enhance the conversational model in production and provides the business with insight into the performance of the system in which they have invested.
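Two of the measures mentioned above, unmatched utterances and average intent detection confidence, can be computed with a few lines once the logs are in tabular form. The sketch below uses hand-written rows with hypothetical field names; real Cloud Logging entries are nested and far richer, and in practice this aggregation would more likely be a SQL query in BigQuery.

```python
from statistics import mean

# Hypothetical, simplified log rows (one per user turn).
rows = [
    {"session": "a", "utterance": "large lemonade", "intent": "order.add_item", "confidence": 0.92},
    {"session": "a", "utterance": "uh what", "intent": None, "confidence": 0.0},
    {"session": "b", "utterance": "check out", "intent": "order.checkout", "confidence": 0.80},
    {"session": "b", "utterance": "do you have oat milk", "intent": None, "confidence": 0.0},
]


def summarize(rows):
    """Return the no-match rate, mean confidence of matched turns, and unmatched utterances."""
    matched = [r for r in rows if r["intent"] is not None]
    unmatched = [r["utterance"] for r in rows if r["intent"] is None]
    return {
        "no_match_rate": len(unmatched) / len(rows) if rows else 0.0,
        "avg_confidence": mean(r["confidence"] for r in matched) if matched else None,
        "unmatched_utterances": unmatched,
    }


summary = summarize(rows)
print(summary["no_match_rate"], summary["unmatched_utterances"])
```

The list of unmatched utterances is the part the conversational design team would review: “do you have oat milk” hints at a feature customers expect, while “uh what” is probably noise.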

We hope this article serves as a helpful starting point to understanding some of the building blocks and features that Dialogflow CX provides. Whether it is in the drive-thru, a chatbot on a mobile application, or in the call center, Google’s CCAI platform offers a set of tools that product teams can get creative with to convert the various analog processes to digital processes. These are tools meant to help the team innovate creative solutions based on specific automation needs. From here, I encourage you to run with this idea and see where it takes you!

If you would like to learn more about why we love Dialogflow CX, take a look at some of the advanced features below, which are crucial in creating amazing voice experiences for customers.

Advanced features of CX

Composite entities are entities built from a combination of other entities. For example, if you build an agent with the “Size” and “Item” entities, you can create an order_item entity that is a composite of the two.

Both “@size @item” and “@item @size” would be part of your order_item entity. This covers the different word orders that occur in speech and allows our bot to recognize both “Large Coca-Cola” and “Coca-Cola large.”

Using this composite entity structure, we can stitch entities together and allow the bot to, in a sense, “learn” the menu and listen for those items.
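The idea of accepting either word order can be illustrated with a regular expression. This is only an analogy for what a composite entity does, not how Dialogflow matches entities internally, and the value lists and function names are made up for the example.

```python
import re

# Hypothetical entity values.
SIZES = ["small", "medium", "large"]
ITEMS = ["coca-cola", "lemonade"]


def _alternation(values):
    """Build a regex alternation, longest value first so longer names win."""
    return "|".join(re.escape(v) for v in sorted(values, key=len, reverse=True))


SIZE, ITEM = _alternation(SIZES), _alternation(ITEMS)

# Mirrors the two composite patterns: "@size @item" or "@item @size".
ORDER_ITEM = re.compile(
    rf"\b(?:(?P<size1>{SIZE})\s+(?P<item1>{ITEM})|(?P<item2>{ITEM})\s+(?P<size2>{SIZE}))\b",
    re.IGNORECASE,
)


def extract_order_item(utterance):
    """Return the size and item found in the utterance, or None if there is no match."""
    match = ORDER_ITEM.search(utterance)
    if match is None:
        return None
    size = match.group("size1") or match.group("size2")
    item = match.group("item1") or match.group("item2")
    return {"size": size.lower(), "item": item.lower()}


print(extract_order_item("I would like a Large Coca-Cola"))
print(extract_order_item("Coca-Cola large, please"))
```

Both utterances yield the same size and item, which is the behavior the composite order_item entity gives the agent.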

Another key feature in Dialogflow CX, which shines in verticals like QSR and healthcare, is Auto Speech Adaptation. Auto Speech Adaptation is a feature in Dialogflow that automatically uses the conversation state to pass relevant entities and training phrases over to the Speech-to-Text (STT) engine as context for the transcription. This enables the STT engine to favor use case-specific words like the entities it is expecting in that state of the conversation and tremendously helps improve transcription. For QSR use cases where the bot is extracting menu items from the users’ utterances, this becomes necessary, especially for menu items unique to that restaurant chain.

Custom Voice is one of the most recent and fun features that Google has added. Custom Voice lets you define your brand and engage with your customers through voice. Google allows you to build custom voice models that can be a fun way to improve customer experience and showcase or define a voice behind the brand.

Another feature is Automated Test Cases, through which Dialogflow CX allows developers to convert conversations into test cases that can be run via APIs. This gives the product team a robust set of tests that can cover all the different conversations the bot is expected to support. This in turn gives us the confidence to release the bot and continue iterating on it, knowing there are quick tests available to ensure backward compatibility.

A/B Testing: The Google Dialogflow CX team also released a neat A/B testing feature that allows you to version the agent at different states and split the traffic going to each state to test the performance for any changes. It is a great way to reduce the impact of any large changes to the agent.
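Dialogflow CX handles the traffic split for you, so nothing like the following needs to be written in practice. Still, the underlying idea is easy to sketch: each session is mapped deterministically to one version according to the configured shares. The function name, version labels and hashing scheme below are our own assumptions for illustration and say nothing about how Google implements the feature.

```python
import hashlib


def assign_version(session_id: str, split: dict) -> str:
    """Deterministically map a session to a version.

    `split` maps version names to traffic shares that must sum to 1.
    """
    if not split or abs(sum(split.values()) - 1.0) > 1e-9:
        raise ValueError("traffic shares must be non-empty and sum to 1")
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
    cumulative = 0.0
    for version, share in split.items():
        cumulative += share
        if bucket < cumulative:
            return version
    return version  # floating-point guard: fall back to the last version


split = {"control": 0.9, "candidate": 0.1}
counts = {"control": 0, "candidate": 0}
for i in range(10_000):
    counts[assign_version(f"session-{i}", split)] += 1
print(counts)
```

Hashing the session ID rather than drawing a random number means a returning session always lands on the same version, which keeps a single conversation from switching agents midway.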

Personally Identifiable Information (PII) redaction gives us the ability to keep PII data redacted in logs to ensure that any customer information is secure.

Dual-Tone Multi-Frequency (DTMF) input is integrated with Dialogflow CX for voice agents. This gives the user more than one way to enter the desired information. When collecting numbers, it is often easier to let the customer key them in on their phone instead of saying them aloud. DTMF support is often leveraged when verifying a customer’s identity, for example with the last four digits of their Social Security number or their phone number. Such sensitive information is better typed than spoken, especially if the user is in a public space.
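Collecting keypad input usually comes down to two stopping rules: a maximum number of digits and a finish key. The sketch below shows that logic in isolation. The function and parameter names are ours, chosen for illustration, and the agent itself performs this collection when DTMF is enabled.

```python
def collect_dtmf(key_presses, max_digits, finish_digit="#"):
    """Accumulate key presses until the finish digit or max_digits is reached."""
    digits = []
    for key in key_presses:
        if key == finish_digit:
            break
        if key.isdigit():
            digits.append(key)  # non-digit keys such as "*" are ignored
        if len(digits) == max_digits:
            break
    return "".join(digits)


print(collect_dtmf("1234#", max_digits=10))
print(collect_dtmf("123456", max_digits=4))
```

The first call stops at the finish key, the second at the digit limit, so a four-digit verification code can be captured without the caller having to press anything extra.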
