Teaching an AI Dog New Tricks (The Gemini Integration)
Joshua Tuddenham
Posted on November 23, 2024
The Art of Prompt Engineering (Or: How I Learned to Stop Worrying and Trust the AI)
First attempt at prompting Gemini:
const prompt = "Plan a nice trip";
Result: "Have you considered going somewhere? Perhaps doing things when you get there? Maybe eating food?"
Right. Maybe I need to be a bit more specific. Turns out "plan a nice trip" is to Gemini what "fetch" is to a golden retriever - technically understood, but interpreted rather loosely.
Take Two: The First Real Attempt
My first serious attempt at structuring the prompts was... well, let's say Wooster had some creative interpretations:
type BasicTripRequest = {
destination: string;
duration: number;
startDate: string;
};
const firstPrompt = `
Plan a ${tripRequest.duration} day trip to ${tripRequest.destination}
starting on ${tripRequest.startDate}.
Please include:
- Daily activities
- Locations
- Approximate costs
- Duration of activities
`;
Result: "Day 1 you could maybe explore downtown? Stay as long as you're having fun! Costs depend on how many treats you buy along the way. Some people spend a little, some spend a lot. Day 2 there's this AMAZING park the locals love..."
Not quite the structured response I was hoping for. Time to try JSON.
Adding More Structure (But Not Enough)
Here was my first JSON attempt:
const secondPrompt = `
Generate a JSON itinerary for a ${tripRequest.duration}-day trip to ${tripRequest.destination}.
Each day should have 2-3 activities.
Format:
{
"days": [
{
"dayNumber": number,
"activities": [
{
"name": string,
"location": string,
"description": string,
"duration": string,
"category": string, // This caused problems later...
"cost": string
}
]
}
]
}
`;
This worked better, but the categorization was a mess. I got everything from "Fun stuff" to "Walking around looking at things" as categories. It was like asking Wooster to sort his toys - everything ended up in the "things I can put in my mouth" category.
Take this response for example:
{
"days": [
{
"dayNumber": 1,
"activities": [
{
"name": "Explore the city vibes",
"location": "wherever the wind takes us",
"description": "Just soak in the atmosphere, you know?",
"duration": "as long as we're having fun",
"category": "Places with good smells",
"cost": "whatever's in the wallet"
}
]
}
]
}
Time to get serious about types.
The Final Evolution
After much trial and error (and Wooster suggesting "belly rubs" as a valid activity category), I landed on this much more structured approach:
export const createPrompt = (
days: number,
location: string,
startDate: string,
): string => `
Generate **ONLY** JSON data for a ${days}-day trip to ${location}, starting on ${startDate}.
Have no more than three activities per day. Exclude arrival and departure logistics.
Duration must be in format "X hours" or "X.5 hours".
Price must be in format "$X" where X is a number.
Activities MUST include:
- "activityName": string
- "description": string (20-50 words)
- "location": string
- "price": string
- "duration": string
- "difficulty": "Easy" | "Moderate" | "Challenging"
- "category": "Adventure" | "Cultural" | "Nature" | "Food & Drink" | "Shopping" | "Entertainment"
- "bestTime": "Early Morning" | "Morning" | "Afternoon" | "Evening" | "Night"
- "bookingRequired": boolean
`;
And later, after realizing I needed geolocation for mapping of activities:
export const destinationPromptTemplate = (destination: string) => `
Generate **ONLY** a valid JSON object for ${destination}.
Include:
- "latitude": number (decimal degrees)
- "longitude": number (decimal degrees)
- "destinationName": string
- "country": string
- "description": string (50-200 words)
// ... other structured fields
`;
Finally, I was getting responses that looked like actual travel plans rather than Wooster's diary entries:
{
"days": [
{
"dayNumber": 1,
"activities": [
{
"activityName": "Morning Market Tour",
"description": "Explore the historic Camden Market, featuring local artisans and food vendors. Perfect for gathering travel supplies (and snacks).",
"location": "Camden Lock Place, London",
"price": "$15",
"duration": "2.5 hours",
"difficulty": "Easy",
"category": "Food & Drink",
"bestTime": "Morning",
"bookingRequired": false
},
{
"activityName": "Regent's Park Walk",
"description": "A scenic stroll through one of London's most beautiful parks. Watch for squirrels (Wooster insisted this was important).",
"location": "Regent's Park, London",
"price": "$0",
"duration": "1.5 hours",
"difficulty": "Easy",
"category": "Nature",
"bestTime": "Afternoon",
"bookingRequired": false
}
]
}
]
}
Still with a hint of Wooster's personality, but now in a format that wouldn't make TypeScript cry.
Handling the Responses
Of course, getting the prompt right was only half the battle. Gemini's responses weren't always perfectly formatted JSON, so I needed some utility functions to clean things up:
export const cleanLLMJsonResponse = (text: string): string => {
// Step 1: Remove markdown code blocks with any language specification
const withoutCodeBlocks = text.replace(
new RegExp('```
(?:json)?\\s*([\\s\\S]*?)\\s*
```', 'g'),
"$1"
);
// Step 2: Remove potential comments
const withoutComments = withoutCodeBlocks.replace(
new RegExp('\\/\\*[\\s\\S]*?\\*\\/|\\/\\/.*', 'g'),
""
);
// Step 3: Detect and replace incomplete URLs with a placeholder
const withCompleteUrls = withoutComments.replace(
new RegExp('"website":\\s*"https:([^",}]*)', 'g'),
'"website": "https://example.com"'
);
// Step 4: Trim whitespace
return withCompleteUrls.trim();
};
// Validate the JSON structure
export const validateJSON = (jsonString: string): void => {
try {
const parsed = JSON.parse(jsonString);
if (typeof parsed !== "object" || parsed === null) {
throw new Error("Response is not a valid JSON object or array");
}
} catch (error) {
console.error(
"JSON Validation failed. Invalid JSON string:",
jsonString.slice(0, 500),
);
throw new Error(
`Invalid JSON response: ${error instanceof Error ? error.message : "Unknown error"}`,
);
}
};
// Clean up any non-printable characters
export const cleanJSON = (jsonString: string): string => {
// Remove control characters (ASCII 0 to 31)
const cleanedString = jsonString.replace(/[\x00-\x1F]/g, "");
// Remove non-printable characters
return cleanedString.replace(/[^\x20-\x7E]/g, "");
};
Why did I need all this? Well, Gemini had some... interesting habits:
- Sometimes wrapping responses in markdown code blocks
- Occasionally including helpful comments (which broke the JSON)
- Returning incomplete URLs
- Adding mysterious control characters
- And my personal favorite: sneaking in emoji (those non-printable characters had to go)
Using these utilities together:
async function generateTripPlan(tripRequest: TripRequest) {
try {
const result = await model.generateContent(createPrompt(/* ... */));
const text = result.response.text();
// Clean up the response
const cleanedResponse = cleanLLMJsonResponse(text);
const sanitizedJSON = cleanJSON(cleanedResponse);
// Validate before parsing
validateJSON(sanitizedJSON);
return JSON.parse(sanitizedJSON);
} catch (error) {
console.error("Failed to generate valid trip plan:", error);
throw new Error("Failed to generate trip plan");
}
}
What I Actually Learned
- Start with strict types from the beginning
- Be explicit about formats
- The more specific the prompt constraints, the better
- Always sanitize and validate AI responses
- Geolocation data should have been there from the start
- Rate limiting is important
- Regex is your best friend when dealing with AI responses
- Never trust an AI to format URLs correctly
Next up: the frontend implementation, or as I like to call it, "Making Wooster Look Presentable for Company" (and this time with proper TypeScript interfaces from day one).
This article was originally published on my blog. Follow me there for more content about AI integration, TypeScript, and full-stack development.
Posted on November 23, 2024
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.