Build a low-code chatbot using Autohook and YAML

i_am_daniele

Daniele

Posted on April 16, 2020


Chatbots are useful for providing relevant and timely information via direct messages. Developers can build logic to create a ‘script’ that directs users to the answers they’re looking for.
I’m amazed by the people coming together in this current crisis to help however they can. I wanted to do my part and hope that it inspires others to contribute too. Chatbots can be used to help people find information from reputable sources. In my previous tutorial, I used Autohook to reply to direct messages with little code. Still, you will need to define the flow between a user and a bot, and that logic is usually tailored for each project.

I wanted to build something that requires minimal coding, so that anyone could build a useful service. I created a low-code bot framework that pulls together all the relevant bits:

  • It uses Autohook to receive direct message events
  • It defines a set of text files with the logic to reply to a user
  • It configures a welcome message for an account designated as the chat agent
  • It uses the Twitter API to send a direct message to the user who contacted the chat agent

With this framework, I built a bot that provides regional information about COVID-19 for the US, Spain and Italy, which are severely affected by the current emergency. By chatting with the bot, you can get useful resources from local authorities, and find ways to donate.

The bot uses public domain information from the CDC, Italy’s Ministero della Salute (Ministry of Health) and the Spanish Ministerio de Sanidad, Consumo y Bienestar Social (Ministry of Health, Consumer Affairs and Welfare). The bot needs to adapt to the situation in each country; for example, people in Italy need a certificate to travel, so the bot should provide Italian users with a link to the official website where they can fill out the form.

The project is also configured to run either locally or on a remote server. You can find the full project on Glitch, and you can interact with the final result.

Set up your Twitter app

To get started, you need to set up a Twitter app and to have a valid developer account. We covered this process in my previous tutorial about Autohook. Read the section named “Your Twitter app” to learn more.

Building our flow

There seems to be no in-between when it comes to setting up the actual conversation flow. Developers either build custom logic with code, or they purchase access to sophisticated interactive workflow automation platforms. On Twitter, chatbots are equipped with quick reply options, so we can build a flow based entirely on the inputs from these quick replies.

Screenshot of a Twitter direct message with a chat bot. In this screenshot, the chatbot is using a multi-language welcome message to greet a Twitter user the first time they start a conversation. Three quick reply options (one per language) are also shown.

By relying on quick replies, we can turn our chatbot into a finite-state machine; in other words, our bot can be in exactly one state at any given time, and the number of states is predetermined (finite). This seems to be exactly what we need!

  • Each reply from the chatbot will have one or more quick reply options
  • The user can only tap one option per reply
  • Each option brings the bot to the next state
  • The state is basically our next message
  • Any user input from the keyboard is discarded; it will simply trigger a “do not understand” reply, and it will repeat the last message
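The rules above can be sketched as a small transition table. This is a minimal illustration of the finite-state idea, not code from the project; the state names (`en_us_start`, and so on) are invented for the example:

```javascript
// Each state maps a quick reply label to the next state; any other
// input (keyboard text) falls through to "do_not_understand".
const transitions = {
  start: {
    '🇺🇸 United States': 'en_us_start',
    '🇮🇹 Italia': 'it_it_start',
    '🇪🇸 España': 'es_es_start',
  },
};

function next(current, input) {
  const options = transitions[current] || {};
  return options[input] || 'do_not_understand';
}

console.log(next('start', '🇮🇹 Italia'));   // "it_it_start"
console.log(next('start', 'hello there')); // "do_not_understand"
```

Because every input either matches a known option or falls into the single fallback state, the machine can never end up anywhere we haven't planned for.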

I mentioned that my bot answers in three languages. How do we apply the finite-state paradigm to our flow? In this case, each language is a state, plus another state for the “do not understand” case (more about that later). Users can only select one option (a language in this case); any other input triggers a “do not understand” message, bringing the user back to the language selector.

A diagram of a finite-state machine. The starting point is the welcome message, and each subsequent state is driven by a quick reply option. Because the number of options is finite, each option leads to exactly one state.

In the diagram, each bubble represents a chat message from our bot, and each arrow is an input from the user. By giving our bot a finite-state mindset, we can be sure to articulate the full conversation.

Write, don't code

How do we translate that conversation into code? We don’t. Instead of writing code, we will write text files. Each file will contain one or more bubbles as defined in our diagram. Each bubble will list the options it supports. Each option will bring the user to the next state.
We will use the YAML format for our text files; this way we can specify tags and lists of options – which is exactly what we need to build our quick replies.

start:
  text: |
    🇺🇸 Hi! I'm a bot and I'll help you with resources on COVID-19. Please select your country.

    🇮🇹 Ciao! Sono un robot e posso aiutarti con risorse utili su COVID-19. Per favore, seleziona la tua nazione.

    🇪🇸 ¡Hola! Soy un robot y puedo ayudar con recursos útiles sobre la COVID-19. Por favor, selecciona tu nación.
  options:
    - label: 🇺🇸 United States
      description: Contains resources from CDC
      metadata: scripts/en-us/for-me.yaml#start
    - label: 🇮🇹 Italia
      description: Usa le linee guida del Ministero della Salute
      metadata: 'scripts/it-it/for-me.yaml#start'
    - label: 🇪🇸 España
      description: Con las líneas guías del MSCBS
      metadata: 'scripts/es-es/for-me.yaml#start'
do_not_understand:
  text: |
    🇺🇸 I'm sorry, I can't read what you type. Please try again by selecting one of the options instead of using your keyboard.

    🇮🇹 Mi dispiace, non posso leggere quello che scrivi. Per favore, prova di nuovo selezionando una delle opzioni anziché usare la tastiera.

    🇪🇸 Lo siento, no puedo leer lo que escribes. Por favor, intenta seleccionar una de las opciones en vez de escribir.

We then pass these files to our finite-state machine processor. The Agent module simply opens the file specified in the metadata field, gets the bubble with the name specified after the # character, and hands that over to the code that sends the direct message back to the user.

In practical terms, our language selector contains three options. When the user selects España, our finite-state machine will open the file scripts/es-es/for-me.yaml and look for a bubble named start.

const { EventEmitter } = require('events');
const YAML = require('yamljs'); // YAML.load(path) reads and parses a YAML file

class Agent extends EventEmitter {
  constructor() {
    super();
    this.script = null;        // the parsed YAML script
    this.file = null;          // path of the current script file
    this.entryPoint = 'start'; // name of the current bubble
    this.state = {};
  }

  // Loads a script referenced as 'path/to/file.yaml#bubble'.
  // When no bubble name is given, the entry point defaults to 'start'.
  loadFile(file, state = {}) {
    [this.file, this.entryPoint] = file.split('#');
    this.entryPoint = this.entryPoint || 'start';
    try {
      this.script = YAML.load(this.file);
    } catch (e) {
      console.error('Exception:', e);
    }

    this.step(this.entryPoint, state);
  }

  currentStep() {
    return this.script ? this.script[this.entryPoint] : null;
  }

  setState(state) {
    this.state = Object.assign(this.state, state);
  }

  // Moves the machine to the next bubble. A '#' in the entry point
  // means the next bubble lives in a different script file.
  step(entryPoint = 'start', state = {}) {
    if (entryPoint.indexOf('#') > -1) {
      this.loadFile(entryPoint, state);
      return;
    }

    if (!this.script[entryPoint]) {
      throw Error(`No entry point named ${entryPoint} defined in ${this.file}. Add the entry point in your script file.`);
    }
    this.setState(state);
    this.entryPoint = entryPoint;
    this.emit('step');
  }
}

Send the message

The twitter_dm.js module contains the logic to send direct messages; here we’re simply wrapping the send logic from my previous tutorial in a separate module to keep things tidy.
The JSON payload for a direct message is quite verbose, so we added a MessageCreate class. It helps compose the JSON body, and it also translates incoming JSON requests into a valid message, so we can turn raw JSON directly into a chat bubble with no effort!
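To see why a helper class earns its keep, here is a minimal sketch of the payload shape Twitter's POST direct_messages/events/new endpoint expects. The `buildDirectMessage` helper is illustrative (it is not the project's MessageCreate class), but the JSON structure matches the v1.1 DM events API:

```javascript
// Builds the JSON body for a direct message, optionally attaching
// quick replies. Structure per Twitter's v1.1 DM events endpoint.
function buildDirectMessage(recipientId, text, options = []) {
  const messageData = { text };
  if (options.length > 0) {
    messageData.quick_reply = { type: 'options', options };
  }
  return {
    event: {
      type: 'message_create',
      message_create: {
        target: { recipient_id: recipientId },
        message_data: messageData,
      },
    },
  };
}

const body = buildDirectMessage('12345', 'Please select your country.', [
  { label: '🇪🇸 España', metadata: 'scripts/es-es/for-me.yaml#start' },
]);
console.log(body.event.message_create.message_data.quick_reply.options.length); // 1
```

Notice how each quick reply option carries its metadata string; that is the same `file#bubble` reference our YAML scripts use, so the payload and the state machine speak the same language.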

I don’t understand – is it really that easy?

The only thing left to handle is keyboard input. The chatbot will simply acknowledge its ignorance – it will send a message saying it doesn’t understand, then it will send the previous message again. How do we handle this case?

Our finite-state machine will once again help us. After all, we defined any other input to be just one case, so we need to define that “do not understand” case as a bubble too. Because keyboard input does not send metadata, we have our agent fall back to the “do not understand” bubble whenever our code receives no metadata. We then call the agent’s previous step again to display the options.
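A sketch of that dispatch logic, assuming the event shape delivered by the Account Activity API (a quick reply tap includes a `quick_reply_response.metadata` field; plain keyboard input does not). The `routeIncomingMessage` function and the stub agent are illustrative, not the project's actual code:

```javascript
// Routes an incoming DM event: metadata means a quick reply tap,
// no metadata means keyboard input the bot cannot interpret.
function routeIncomingMessage(event, agent) {
  const data = event.message_create.message_data;
  const quickReply = data.quick_reply_response;
  if (quickReply && quickReply.metadata) {
    agent.step(quickReply.metadata); // advance to the requested bubble
  } else {
    agent.step('do_not_understand'); // apologize, then repeat the last bubble
  }
}

// Stub agent that just records transitions, for illustration.
const calls = [];
const agent = { step: (entryPoint) => calls.push(entryPoint) };

routeIncomingMessage({ message_create: { message_data: {
  text: '🇪🇸 España',
  quick_reply_response: { type: 'options', metadata: 'scripts/es-es/for-me.yaml#start' },
} } }, agent);

routeIncomingMessage({ message_create: { message_data: { text: 'hello?' } } }, agent);

console.log(calls); // ['scripts/es-es/for-me.yaml#start', 'do_not_understand']
```

In the real bot, the `do_not_understand` bubble is defined in the shared script alongside `start`, so the fallback is just one more state in the machine.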

In the code, you can see how we're even localizing the message if the user has already selected a language!

In this screenshot, I typed something in my chat window. The chatbot is not built to understand things users will type, so it will reply with a generic message saying it doesn't understand.

We hope that you will dig in and take the code for a spin! As I mentioned, if you’d like to add other languages or enhance it further, you can find the full project on GitHub. If you add anything, submit a pull request! If you want to try it live, you can interact with the final result.

I used several libraries and services beyond the Twitter API to make this tutorial, but you may have different needs and requirements and should evaluate whether those tools are right for you. Let me know if this inspires you to build anything!
