Denis Valášek
Posted on September 28, 2024
Intro
Developing with LLMs still carries some risks: they're hard to debug, it's difficult to observe what's happening, and when you're using multiple prompts for a single task, it can quickly become a mess. Let's explore Firebase Genkit to see how it can make things a bit easier. I'll use a practical example of converting text with difficult characters (equations, dates, physics symbols) in Czech to their spoken variants, which will be better handled by Text-to-Speech systems.
What is Genkit?
Genkit is an open-source framework from Firebase that provides libraries and developer tools to get the most out of various AI and LLM APIs, not just Google's. While it's made by the Firebase team, there are many additional plugins available that add support for other providers like Anthropic, OpenAI, and Ollama simply by switching a line of code.
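For example, switching providers mostly comes down to changing which model reference you import (and configuring the matching plugin). Here's a rough sketch using the first-party Vertex AI and Google AI plugins; community plugins for Anthropic, OpenAI, or Ollama follow the same pattern with their own package names:

import { generate } from '@genkit-ai/ai';
// The model reference is just an import; switching providers means switching the import.
import { gemini15Flash } from '@genkit-ai/vertexai';
// import { gemini15Flash } from '@genkit-ai/googleai'; // same call, different backend

const response = await generate({
  model: gemini15Flash,
  prompt: 'Say hello',
});
console.log(response.text());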
Another great part of Genkit is its developer tooling, which adds observability and debugging options when developing LLM-powered apps, giving you some reassurance against the non-deterministic behavior that comes with building on LLMs.
Getting started with Firebase Genkit
Use the official docs to set up a new Genkit project. We will be using TypeScript, but Genkit also provides Go support. Make sure to say yes to creating a sample flow.
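After initialization, index.ts will contain roughly the following setup before the sample flow. This is just a sketch assuming the Vertex AI plugin and the pre-1.0 package layout, so defer to whatever the CLI generated for you:

import { configureGenkit } from '@genkit-ai/core';
import { defineFlow } from '@genkit-ai/flow';
import { generate } from '@genkit-ai/ai';
import { vertexAI, gemini15Flash } from '@genkit-ai/vertexai';
import * as z from 'zod';

// Register the Vertex AI plugin and enable tracing so the developer UI
// can inspect what happens inside our flows.
configureGenkit({
  plugins: [vertexAI({ location: 'us-central1' })],
  logLevel: 'debug',
  enableTracingAndMetrics: true,
});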
Flows
Flows are wrapper functions that enable us to tie various tasks together into a single callable function, while giving us observability and easier testing.
When you create a new Genkit project, a sample flow is generated for you in index.ts:
// Define a simple flow that prompts an LLM to generate menu suggestions.
export const menuSuggestionFlow = defineFlow(
  {
    name: 'menuSuggestionFlow',
    inputSchema: z.string(),
    outputSchema: z.string(),
  },
  async (subject) => {
    // Construct a request and send it to the model API.
    const llmResponse = await generate({
      prompt: `Suggest an item for the menu of a ${subject} themed restaurant`,
      model: gemini15Flash,
      config: {
        temperature: 1,
      },
    });

    // Handle the response from the model API. In this sample, we just convert
    // it to a string, but more complicated flows might coerce the response into
    // structured output or chain the response into another LLM call, etc.
    return llmResponse.text();
  }
);
Run genkit start in your project folder to build the project and start the developer UI at http://localhost:4000
Here you can see the flow ready in the left menu under Flows:
Let's try it out! In the code, we can see that this flow expects just a string input (inputSchema: z.string()), so providing "sushi" as a restaurant theme should be enough.
We can also click the "View trace" button to see exactly what's happening inside our flow, span by span. Currently, we only have the model call.
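By the way, flows aren't limited to the developer UI; they're plain callable functions. A minimal sketch, assuming the runFlow helper from the flow package:

import { runFlow } from '@genkit-ai/flow';

// Call the flow directly from code instead of the developer UI.
const suggestion = await runFlow(menuSuggestionFlow, 'sushi');
console.log(suggestion);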
Preparing our prompts
Genkit supports the Dotprompt prompt format via the Dotprompt plugin; let's install it by following the docs. Don't forget to add Dotprompt as a plugin when calling the configureGenkit(...) function.
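Registering the plugin should look roughly like this (again a sketch; check the docs for the exact import in your version):

import { dotprompt } from '@genkit-ai/dotprompt';

configureGenkit({
  // Keep the existing model plugin and add Dotprompt so .prompt files are picked up.
  plugins: [vertexAI({ location: 'us-central1' }), dotprompt()],
  logLevel: 'debug',
  enableTracingAndMetrics: true,
});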
Dotprompt enables you to save your prompts as separate files that you can then use directly in your code. This separation helps you keep things clean and manageable as you scale, upgrade prompts, and so on.
Opening the Gemini 1.5 Flash model (or other models) within the Genkit UI will give us access to a prompt testing & creation UI.
Feel free to play with the UI and create some sample prompts. When you are done, click "Export prompt" and save the result in the prompts folder in your project.
We can modify the .prompt file to accept one or more parameters by adding the input header with a schema. The parameters can then be used inside our prompt text using the handlebars syntax.
Note: If you don't speak Czech, you might be wondering what's inside the prompts. The first one gives a generic instruction about the role the model is to assume - in this case, a student taking their physics finals. The second one gives various examples of how specific things should be rewritten - for example, that 1998 should be written out as nineteen ninety-eight. This helps the TTS engine pronounce words correctly in Czech.
Here is my physics.prompt including parameters:
---
model: vertexai/gemini-1.5-flash
config:
  temperature: 0.3
  maxOutputTokens: 8192
  topK: 32
  topP: 0.95
tools: []
input:
  schema:
    question: string
---
{{role "system"}}
Jsi kluk, jmenuješ se AI maturant a právě maturuješ na osmiletém gymnáziu, vytahuješ si téma z fyziky. Tvůj výklad by neměl zabrat déle než jednu minutu, poté budou následovat otázky od komise, které budou navazovat na dané téma a předmět. Na začátku uvítej komisi a pokračuj ve svém výkladu. Pokud dostaneš příklad k vypočítání, buď co nejpřesnější, dávej si pozor na jednotky, nezaokrouhluj na vysoké čísla, je velmi důležité, aby výsledky byly správně. V úvodu zahrň vypočítaný příklad včetně rovnice.
{{role "user"}}
Tématem je "{{question}}". Co mi o tomto tématu řekneš prosím?
And spoken_text.prompt:
---
model: vertexai/gemini-1.5-flash
config:
  temperature: 0.4
  topK: 32
  maxOutputTokens: 8192
  topP: 0.95
tools: []
input:
  schema:
    writtenText: string
---
{{role "user"}}
Prosím převeď následující text tak, aby se dal snadno vyslovit. Mimo jiné, následuj tato pravidla:
===PRAVIDLA:===
# Vzorečky piš tak, aby se daly přečíst a vyslovit.
===Příklad:===
F = m * a
===Výsledek:===
Vektor síly se rovná hmotnosti tělesa vynásobenou vektorem zrychlení.
===Příklad:===
g = G * M / r^2
===Výsledek:===
Intenzita gravitačního pole se rovná gravitační konstanta krát hmotnost tělesa děleno vzdáleností tělesa na druhou
# Používej psané číslovky:
===Příklad:===
20. století
===Výsledek:===
dvacáté století
===Příklad:===
0,00027 N/kg
===Výsledek:===
nula celá nula nula nula dvacet sedm newtonů na kilogram
# Místo znaků jako =, / nebo * používej slovní spojení jako "rovná se", "děleno" nebo "krát".
===Příklad:===
2 * 3 = 6
===Výsledek:===
Dva krát tři se rovná šest
# Místo 1., 2., 3., piš "první", "druhý", "třetí" nebo Za prvé, za druhé, za třetí a podobně, v závislosti na kontextu.
===Příklad:===
1. Místo
===Výsledek:===
první místo
===Příklad:===
1) Věda a výzkum
===Výsledek:===
Za prvé, věda a výzkum
===Příklad:===
Alexandr I.
===Výsledek:===
Alexandr první
=== Konec pravidel ===
Následující text:
{{writtenText}}
Převedený text:
As a quick check, this is what your folder structure should look like now:
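Assuming the default TypeScript template, the layout should be roughly this (extra files like package.json and tsconfig.json may vary in your setup):

your-project/
  prompts/
    physics.prompt
    spoken_text.prompt
  src/
    index.ts
  package.json
  tsconfig.json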
Connecting the prompts inside a flow
Let's modify the pregenerated example flow, changing the name and input/output schemas. Genkit uses zod to define strongly typed schemas enforced even at runtime.
We also need to use the Dotprompt plugin to import the prompts into our code by calling promptRef with our prompt file name as a parameter.
Now we make two separate calls to the model with our prompts, feeding the result of the first call into the second, with the option to skip generating the spoken form entirely.
const physicsPrompt = promptRef("physics");
const spokenTextPrompt = promptRef("spoken_text");

export const studentAnswerFlow = defineFlow(
  {
    name: "studentAnswerFlow",
    inputSchema: z.object({
      question: z.string(),
      convertToSpokenText: z.boolean(),
    }),
    outputSchema: z.object({
      answer: z.string(),
      spokenAnswer: z.string().optional(),
    }),
  },
  async (inputs) => {
    const answerResponse = await physicsPrompt.generate({
      input: {
        question: inputs.question,
      },
    });

    // If we don't need spoken answer, return only the written answer
    if (!inputs?.convertToSpokenText) {
      return {
        answer: answerResponse.text(),
      };
    }

    const spokenAnswerResponse = await spokenTextPrompt.generate({
      input: {
        writtenText: answerResponse.text(),
      },
    });

    return {
      answer: answerResponse.text(),
      spokenAnswer: spokenAnswerResponse.text(),
    };
  }
);
Now, when we load the Genkit UI and open the updated flow, we can already see the newly defined parameters as JSON. Let's populate the question parameter with "Mechanika kapalin a plynů" and set convertToSpokenText to true.
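The JSON input in the developer UI then looks like this:

{
  "question": "Mechanika kapalin a plynů",
  "convertToSpokenText": true
}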
Running the flow, we can see the outputs as we defined them in the code. Notice that the two outputs differ slightly, because the second prompt transforms the text to be better understood by the TTS engine.
Opening the traces, we can see spans for both calls. Drilling deeper, we can inspect the individual run of each prompt, along with timings, to see what's taking up the most time.
Deploying flows
Genkit is highly flexible with deployment options - you can deploy to any Node.js environment. But you get the most out of the framework by deploying to Google Cloud Run or Cloud Functions for Firebase. The details for deployment are well described in the docs. I would recommend Cloud Run, as it uses a single Cloud Run service for all the flows, keeping costs low when you need to deploy multiple flows and keep minInstances: 1 to avoid longer cold starts.
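As a rough sketch of the Cloud Run route (assuming the startFlowsServer helper from the flow package and a standard source deploy; the docs have the authoritative steps):

import { startFlowsServer } from '@genkit-ai/flow';

// Expose every defined flow as an HTTP endpoint so a single
// Cloud Run service can serve all of them.
startFlowsServer();

From there, something along the lines of gcloud run deploy --source . --min-instances 1 builds and deploys the project as one service.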
You can view the complete sample project on GitHub.
Thanks for reading and special thanks to Peter Friese from the Firebase team for helping me out!
Google Cloud credits are provided for this project #AISprint