Understanding Genkit flows with Czech language tricks

Denis Valášek

Posted on September 28, 2024

Intro

Developing with LLMs still carries some risks: they're hard to debug, it's difficult to observe what's happening, and when you're using multiple prompts for a single task, it can quickly become a mess. Let's explore Firebase Genkit to see how it can make things a bit easier. I'll use a practical example of converting text with difficult characters (equations, dates, physics symbols) in Czech to their spoken variants, which will be better handled by Text-to-Speech systems.

What is Genkit?

Genkit is an open-source framework from Firebase that provides libraries and developer tools to get the most out of various AI and LLM APIs, not just from Google. While it's made by the Firebase team, there are many additional plugins available that enable support for other providers like Anthropic, OpenAI, and Ollama simply by switching a line of code.
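For example, swapping providers is typically just a matter of changing the plugin passed to configureGenkit. This is a sketch only: the exact package names and import paths depend on which plugin packages you install, so check each plugin's docs for the real identifiers.

```typescript
// Sketch: switching model providers by swapping one plugin line.
// Package names below are illustrative; verify them in the plugin docs.
import { configureGenkit } from "@genkit-ai/core";
import { vertexAI } from "@genkit-ai/vertexai";

configureGenkit({
  plugins: [
    vertexAI(), // swap this single line for another provider's plugin
  ],
});
```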

Another great part of Genkit is its developer tooling, which adds observability and debugging options while you develop LLM-powered apps, giving you some reassurance despite the non-deterministic behavior that comes with building on LLMs.

Getting started with Firebase Genkit

Use the official docs to set up a new Genkit project. We will be using TypeScript, but Genkit also provides Go support. Make sure to say yes to creating a sample flow.

Flows

Flows are wrapper functions that enable us to tie various tasks together into a single callable function, while giving us observability and easier testing.

When you create a new Genkit project, a sample flow is generated in index.ts:

// Define a simple flow that prompts an LLM to generate menu suggestions.
export const menuSuggestionFlow = defineFlow(
  {
    name: 'menuSuggestionFlow',
    inputSchema: z.string(),
    outputSchema: z.string(),
  },
  async (subject) => {
    // Construct a request and send it to the model API.
    const llmResponse = await generate({
      prompt: `Suggest an item for the menu of a ${subject} themed restaurant`,
      model: gemini15Flash,
      config: {
        temperature: 1,
      },
    });

    // Handle the response from the model API. In this sample, we just convert
    // it to a string, but more complicated flows might coerce the response into
    // structured output or chain the response into another LLM call, etc.
    return llmResponse.text();
  }
);

Run genkit start in your project folder to build the project and start the developer UI at http://localhost:4000.

Here you can see the flow ready in the left menu under Flows:

Genkit Developer tools - Sample flow

Let's try it out! In the code, we can see that this flow expects just a string input (inputSchema: z.string()), so providing "sushi" as a restaurant theme should be enough.

Genkit Developer tools - Sample flow run

We can also click the "View trace" button to see exactly what's happening inside our flow, with defined spans. Currently, we have only the model call.

Genkit Developer tools - Sample flow traces

Preparing our prompts

Genkit supports the Dotprompt prompt format via the Dotprompt plugin; let's install it by following the docs. Don't forget to add Dotprompt as a plugin when calling the configureGenkit(...) function.

Dotprompt enables you to save your prompts as separate files that you can then use directly in your code. This separation helps you keep things clean and manageable as you scale, upgrade prompts, etc.

Opening the Gemini 1.5 Flash model (or other models) within the Genkit UI will give us access to a prompt testing & creation UI.

Genkit Developer tools - Prompt testing UI

Feel free to play with the UI and create some sample prompts. When you are done, click "Export prompt" and save the result in the prompts folder in your project.

We can modify the .prompt file to accept one or more parameters by adding the input header with a schema. The parameters can then be used inside our prompt text using the handlebars syntax.
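Conceptually, the parameter substitution works like the simplified sketch below. This is an illustration only, not Dotprompt's actual implementation; the real thing is built on the Handlebars library and supports far more, such as the {{role}} helper used in the prompts below.

```typescript
// Minimal sketch of handlebars-style substitution: replace every
// {{name}} placeholder with the matching value from the input object.
// Illustration only, not Dotprompt's actual implementation.
function renderPrompt(template: string, input: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, key: string) =>
    key in input ? input[key] : match // leave unknown placeholders untouched
  );
}

const rendered = renderPrompt(
  'Tématem je "{{question}}". Co mi o tomto tématu řekneš prosím?',
  { question: "Mechanika kapalin a plynů" }
);
// rendered now contains the question text in place of {{question}}
```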

Note: If you don't speak Czech, you might be wondering what's inside the prompts. The first one gives a generic instruction about the role the model is to assume - in this case, a student doing their physics finals. The second one gives various examples of how specific things should be rewritten. For example, 1998 should be written out as nineteen ninety-eight, etc. This helps the TTS engine pronounce words correctly in Czech.

Here is my physics.prompt including parameters:

---
model: vertexai/gemini-1.5-flash
config:
  temperature: 0.3
  maxOutputTokens: 8192
  topK: 32
  topP: 0.95
tools: []
input:
  schema:
    question: string
---

{{role "system"}}
Jsi kluk, jmenuješ se AI maturant a právě maturuješ na osmiletém gymnáziu, vytahuješ si téma z fyziky. Tvůj výklad by neměl zabrat déle než jednu minutu, poté budou následovat otázky od komise, které budou navazovat na dané téma a předmět. Na začátku uvítej komisi a pokračuj ve svém výkladu. Pokud dostaneš příklad k vypočítání, buď co nejpřesnější, dávej si pozor na jednotky, nezaokrouhluj na vysoké čísla, je velmi důležité, aby výsledky byly správně. V úvodu zahrň vypočítaný příklad včetně rovnice.

{{role "user"}}
Tématem je "{{question}}". Co mi o tomto tématu řekneš prosím?

And spoken_text.prompt:

---
model: vertexai/gemini-1.5-flash
config:
  temperature: 0.4
  topK: 32
  maxOutputTokens: 8192
  topP: 0.95
tools: []
input:
  schema:
    writtenText: string
---

{{role "user"}}
Prosím převeď následující text tak, aby se dal snadno vyslovit. Mimo jiné následuj tato pravidla:
===PRAVIDLA:===
# Vzorečky piš tak, aby se daly přečíst a vyslovit.
===Příklad:===
F = m * a
===Výsledek:===
Vektor síly se rovná hmotnosti tělesa vynásobené vektorem zrychlení.
===Příklad:=== 
g = G * M / r^2
===Výsledek:===
Intenzita gravitačního pole se rovná gravitační konstanta krát hmotnost tělesa děleno vzdáleností tělesa na druhou.
# Používej psané číslovky:
===Příklad:===
20. století
===Výsledek:===
dvacáté století
===Příklad:===
0,00027 N/kg
===Výsledek:===
nula celá nula nula nula dvacet sedm newtonů na kilogram

# Místo znaků jako =, / nebo * používej slovní spojení jako "rovná se", "děleno" nebo "krát".
===Příklad:===
2 * 3 = 6
===Výsledek:===
Dva krát tři se rovná šest

# Místo 1., 2., 3., piš "první", "druhý", "třetí" nebo Za prvé, za druhé, za třetí a podobně, v závislosti na kontextu.
===Příklad:===
1. Místo
===Výsledek:===
první místo

===Příklad:=== 
1) Věda a výzkum
===Výsledek:===
Za prvé, věda a výzkum

===Příklad:=== 
Alexandr I.
===Výsledek:===
Alexandr první

=== Konec pravidel ===
Následující text: 
{{writtenText}}

Převedený text:

As a quick check, this is how your folder structure should look now:

Project structure

Connecting the prompts inside a flow

Let's modify the pregenerated example flow, changing the name and input/output schemas. Genkit uses zod to define strongly typed schemas enforced even at runtime.
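To illustrate what "enforced even at runtime" means: zod doesn't just generate TypeScript types, it actually validates the incoming data when the flow is called, so a malformed request fails before any model call happens. Here is a hand-rolled sketch of that idea (hypothetical code for illustration, not zod's actual API):

```typescript
// Hand-rolled sketch of runtime schema validation, illustrating what zod
// does for our flow's inputSchema: reject inputs whose shape is wrong
// before any model call happens. Hypothetical code, not zod's actual API.
type FlowInput = { question: string; convertToSpokenText: boolean };

function parseFlowInput(data: unknown): FlowInput {
  const obj = data as Partial<FlowInput>;
  if (typeof obj?.question !== "string") {
    throw new Error("question: expected string");
  }
  if (typeof obj?.convertToSpokenText !== "boolean") {
    throw new Error("convertToSpokenText: expected boolean");
  }
  return { question: obj.question, convertToSpokenText: obj.convertToSpokenText };
}

// Valid input passes through...
const ok = parseFlowInput({ question: "Mechanika kapalin a plynů", convertToSpokenText: true });

// ...while a wrong shape throws before reaching the model.
let rejected = false;
try {
  parseFlowInput({ question: 42, convertToSpokenText: true });
} catch {
  rejected = true;
}
```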

We also need to use the Dotprompt plugin to import the prompts into our code by calling promptRef with our prompt file name as a parameter.

Now we make two separate calls to the model with our prompts, chaining the second call to the result of the first, with the option to skip generating the spoken form entirely.

const physicsPrompt = promptRef("physics");
const spokenTextPrompt = promptRef("spoken_text");

export const studentAnswerFlow = defineFlow(
  {
    name: "studentAnswerFlow",
    inputSchema: z.object({
      question: z.string(),
      convertToSpokenText: z.boolean(),
    }),
    outputSchema: z.object({
      answer: z.string(),
      spokenAnswer: z.string().optional(),
    }),
  },
  async (inputs) => {
    const answerResponse = await physicsPrompt.generate({
      input: {
        question: inputs.question,
      },
    });

    // If we don't need spoken answer, return only the written answer
    if (!inputs?.convertToSpokenText) {
      return {
        answer: answerResponse.text(),
      };
    }

    const spokenAnswerResponse = await spokenTextPrompt.generate({
      input: {
        writtenText: answerResponse.text(),
      },
    });

    return {
      answer: answerResponse.text(),
      spokenAnswer: spokenAnswerResponse.text(),
    };
  }
);

Now, when we load the Genkit UI and open the updated flow, we can already see the newly defined parameters as JSON. Let's populate the question parameter with "Mechanika kapalin a plynů" and set convertToSpokenText to true.

Updated flow with predefined parameters

Running the flow, we can see the outputs as we defined them in the code. Notice that the two outputs differ slightly, because the second prompt transforms the text to be better understood by the TTS engine.

Genkit Developer tools - Flow run outputs

Opening the traces, we can see both calls. Drilling deeper, we can inspect the individual runs for every prompt, along with timings, to see what's taking the most time.

Trace spans timings

Trace prompt detail

Deploying flows

Genkit is highly flexible with deployment options - you can deploy to any Node.js environment. But you get the most out of the framework by deploying to Google Cloud Run or Cloud Functions for Firebase. The details are well described in the docs. I would recommend Cloud Run, as it uses a single service for all your flows, which keeps costs low if you need to deploy multiple flows while keeping minInstances: 1 to avoid long cold starts.
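To see why a single service keeps costs low: one process can host every flow and dispatch incoming requests by flow name. Here is a hypothetical sketch of that routing idea with stub flows (Genkit's real flow server handles this for you):

```typescript
// Hypothetical sketch: one service hosting several flows, dispatched by
// name - the reason a single Cloud Run service can serve all of them.
// The flow bodies are stubs; real flows would call the model APIs.
type Flow = (input: string) => string;

const flows: Record<string, Flow> = {
  menuSuggestionFlow: (subject) => `Menu idea for a ${subject} restaurant`,
  studentAnswerFlow: (question) => `Answer about ${question}`,
};

function dispatch(flowName: string, input: string): string {
  const flow = flows[flowName];
  if (!flow) throw new Error(`Unknown flow: ${flowName}`);
  return flow(input);
}

const out = dispatch("studentAnswerFlow", "Mechanika kapalin a plynů");
```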

You can view the complete sample project on GitHub.

Thanks for reading and special thanks to Peter Friese from the Firebase team for helping me out!

Google Cloud credits are provided for this project #AISprint