Generating meaningful test data using Gemini

intersystemsdev

InterSystems Developer

Posted on June 22, 2024

We all know that having a proper set of test data before deploying an application to production is crucial for ensuring its reliability and performance. It lets you simulate real-world scenarios and identify potential issues or bugs before they affect end users. Testing with representative data sets also lets you optimize performance, find bottlenecks, and fine-tune algorithms or processes as needed. Ultimately, a comprehensive set of test data helps you deliver a higher-quality product, reducing the likelihood of post-production issues and improving the overall user experience.

In this article, let's look at how one can use generative AI, namely Gemini by Google, to generate (hopefully) meaningful data for the properties of multiple objects. To do this, I will call Gemini's RESTful service to generate data in JSON format and then use the received data to create objects.

Image description

This leads to an obvious question: why not use the methods from %Library.PopulateUtils to generate all the data? Well, the answer is quite obvious as well if you've seen the list of methods of the class - there aren't many methods that generate meaningful data.

So, let's get to it.

Since I'll be using the Gemini API and don't have an API key yet, I need to generate one first. To do this, open aistudio.google.com/app/apikey, click Create API key, and create an API key in a new project.

After this is done, you just need to write a REST client to get and transform the data, and come up with a query (prompt) for Gemini. Easy peasy 😁

To keep this example simple, let's work with the following class:

Class Restaurant.Dish Extends (%Persistent, %JSON.Adaptor)
{
Property Name As %String;
Property Description As %String(MAXLEN = 1000);
Property Category As %String;
Property Price As %Float;
Property Currency As %String;
Property Calories As %Integer;
}

In general, it would be really simple to use the built-in %Populate mechanism and be done with it. But in bigger projects you will get a lot of properties which are not so easily automatically populated with meaningful data.
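For comparison, here is a minimal sketch of what the built-in mechanism looks like. The class name Restaurant.DishDemo, the Demo() method, and the value ranges are made up for illustration; only %Populate, its Populate() method, and the MINVAL/MAXVAL parameters come from the standard library.

```objectscript
/// Hypothetical demo class, not part of the original example
Class Restaurant.DishDemo Extends (%Persistent, %Populate)
{

Property Name As %String;

Property Price As %Float(MAXVAL = 50, MINVAL = 1);

Property Calories As %Integer(MAXVAL = 1500, MINVAL = 50);

/// Creates and saves five objects with generated values
/// and returns the number of objects created
ClassMethod Demo() As %Integer
{
    quit ..Populate(5)
}

}
```

The numeric ranges are easy to constrain this way, but a property called Name is typically filled with person-style names rather than dish names, which is exactly the kind of value that is hard to make meaningful automatically.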

Anyway, now that we have the class, let's think about the wording of a query to Gemini. Let's say we write the following query:

{"contents": [{
    "parts":[{
      "text": "Write a json object that contains a field Dish which is an array of 10 elements. Each element contains Name, Description, Category, Price, Currency, Calories of the Restaurant Dish."}]}]}

If we send this request to https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=APIKEY we will get something like:

 
{
  "Dish": [
    {
      "Name": "Dish 1",
      "Description": "A delicious dish with a unique flavor.",
      "Category": "Main Course",
      "Price": 15,
      "Currency": "$",
      "Calories": 500
    },
    {
      "Name": "Dish 2",
      "Description": "A flavorful dish with a spicy kick.",
      "Category": "Appetizer",
      "Price": 10,
      "Currency": "$",
      "Calories": 300
    },
    {
      "Name": "Dish 3",
      "Description": "A hearty dish with a comforting flavor.",
      "Category": "Main Course",
      "Price": 20,
      "Currency": "$",
      "Calories": 600
    },
    {
      "Name": "Dish 4",
      "Description": "A refreshing dish with a zesty flavor.",
      "Category": "Salad",
      "Price": 12,
      "Currency": "$",
      "Calories": 250
    },
    {
      "Name": "Dish 5",
      "Description": "A sweet dish with a decadent flavor.",
      "Category": "Dessert",
      "Price": 8,
      "Currency": "$",
      "Calories": 400
    },
    {
      "Name": "Dish 6",
      "Description": "A savory dish with a smoky flavor.",
      "Category": "Main Course",
      "Price": 18,
      "Currency": "$",
      "Calories": 550
    },
    {
      "Name": "Dish 7",
      "Description": "A light dish with a fresh flavor.",
      "Category": "Appetizer",
      "Price": 9,
      "Currency": "$",
      "Calories": 200
    },
    {
      "Name": "Dish 8",
      "Description": "A hearty dish with a comforting flavor.",
      "Category": "Soup",
      "Price": 11,
      "Currency": "$",
      "Calories": 350
    },
    {
      "Name": "Dish 9",
      "Description": "A refreshing dish with a zesty flavor.",
      "Category": "Salad",
      "Price": 14,
      "Currency": "$",
      "Calories": 300
    },
    {
      "Name": "Dish 10",
      "Description": "A sweet dish with a decadent flavor.",
      "Category": "Dessert",
      "Price": 10,
      "Currency": "$",
      "Calories": 450
    }
  ]
}

Already not bad. Not bad at all! Now that I have the wording of my query, I need to generate it as automatically as possible, call it and process the result.

The next step is generating the query. Using the very useful article on how to get the list of properties of a class, we can generate most of the query automatically.

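ClassMethod GenerateClassDesc(classname As %String) As %String
{
    // Open the compiled definition of the class; give up if it cannot be opened
    set cls=##class(%Dictionary.CompiledClass).%OpenId(classname,,.status)
    if '$isobject(cls) { quit "" }
    set x=cls.Properties
    set profprop = ""
    // Start at 3: the first two entries are skipped, presumably because they are
    // system-generated properties rather than properties declared in the class
    for i=3:1:x.Count() {
        set prop=x.GetAt(i)
        set $list(profprop, i-2) = prop.Name
    }
    // Return the names as a comma-separated string
    quit $listtostring(profprop, ", ")
}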

ClassMethod GenerateQuery(qty As %Numeric) As %String [ Language = objectscript ]
{
    set classname = ..%ClassName(1)
    set str = "Write a json object that contains a field "_$piece(classname, ".", 2)_
        " which is an array of "_qty_" elements. Each element contains "_
        ..GenerateClassDesc(classname)_" of a "_$translate(classname, ".", " ")_". "
    quit str
}
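A quick way to inspect the generated prompt is to call the method from a terminal session. This sketch assumes both methods above were added to Restaurant.Dish, which the use of ..%ClassName(1) suggests.

```objectscript
// Prints the prompt built from the compiled definition of Restaurant.Dish
write ##class(Restaurant.Dish).GenerateQuery(10)
```

The text should have the same shape as the hand-written query shown earlier, with the property names listed in whatever order the class dictionary returns them.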

When dealing with complex relationships between classes, it may be easier to use the object constructor to link different objects together or to use the built-in mechanism of %Library.Populate.

The next step is to call the Gemini RESTful service and process the resulting JSON.

ClassMethod CallService() As %Status
{
    set request = ..GetLink()
    set query = "{""contents"": [{""parts"":[{""text"": """_..GenerateQuery(20)_"""}]}]}"
    do request.EntityBody.Write(query)
    set request.ContentType = "application/json"
    set sc = request.Post("v1beta/models/gemini-pro:generateContent?key=<YOUR KEY HERE>")
    if $$$ISOK(sc) {
        // Parse the response stream directly; a single Read() call returns
        // only the first chunk of a long reply
        set p = ##class(%DynamicObject).%FromJSON(request.HttpResponse.Data)
        // The generated text is in the first part of the first candidate
        set text = p.candidates.%Get(0).content.parts.%Get(0).text
        // Drop the first 7 and the last 3 characters, which assumes the reply
        // wraps the JSON in a ```json ... ``` fence
        set obj = ##class(%DynamicObject).%FromJSON($Extract(text,8,*-3))

        set dishes = obj.Dish
        set iter = dishes.%GetIterator()
        while iter.%GetNext(.key, .value, .type) {
            // Create and save one Restaurant.Dish per array element
            set dish = ##class(Restaurant.Dish).%New()
            set sc = dish.%JSONImport(value.%ToJSON())
            set sc = dish.%Save()
        }
    }
    quit sc
}
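CallService relies on a GetLink() method that is not shown here. A minimal sketch of what it might look like follows; the SSL/TLS configuration name "GeminiSSL" is an assumption and has to exist on your instance under whatever name you choose.

```objectscript
/// Hypothetical helper: returns a request object pointed at the Gemini API host
ClassMethod GetLink() As %Net.HttpRequest
{
    set request = ##class(%Net.HttpRequest).%New()
    set request.Server = "generativelanguage.googleapis.com"
    // The API is served over HTTPS only
    set request.Https = 1
    // Assumption: an SSL/TLS configuration with this name was created beforehand
    set request.SSLConfiguration = "GeminiSSL"
    quit request
}
```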

Of course, since it's just an example, don't forget to add status checks where necessary.
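As a rough illustration of such checks, the two helpers below could be added to the same class. Both method names are hypothetical; they use only the standard status macros and the StatusCode property of the HTTP response.

```objectscript
/// Hypothetical helper: turns a failed call or a non-200 reply into an error status
ClassMethod CheckResponse(request As %Net.HttpRequest, sc As %Status) As %Status
{
    // The call itself failed (network, TLS and so on)
    if $$$ISERR(sc) { quit sc }
    set code = request.HttpResponse.StatusCode
    if code '= 200 {
        quit $$$ERROR($$$GeneralError, "Gemini returned HTTP status "_code)
    }
    quit $$$OK
}

/// Hypothetical helper: imports and saves one dish, returning the first error
ClassMethod SaveDish(json As %DynamicObject) As %Status
{
    set dish = ##class(Restaurant.Dish).%New()
    set sc = dish.%JSONImport(json)
    if $$$ISERR(sc) { quit sc }
    quit dish.%Save()
}
```

CheckResponse could be called right after request.Post(), and SaveDish inside the loop, stopping at the first error status instead of overwriting it.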

Now, when I run it, I get a pretty impressive result in my database. Let's run a SQL query to see the data.

Image description

The description and category match the name of each dish, and the prices and calories look plausible as well. This means that I actually get a database filled with reasonably realistic data, so the results of the queries I run against it will resemble real results.

Of course, a huge drawback of this approach is the necessity of writing a query to a generative AI and the fact that it takes time to generate the result. But the actual data may be worth it. Anyway, it is for you to decide 😉

 
P.S.
At the time of writing, the Gemini API is available in a limited number of countries and territories, listed below:

  • Algeria
  • American Samoa
  • Angola
  • Anguilla
  • Antarctica
  • Antigua and Barbuda
  • Argentina
  • Armenia
  • Aruba
  • Australia
  • Azerbaijan
  • The Bahamas
  • Bahrain
  • Bangladesh
  • Barbados
  • Belize
  • Benin
  • Bermuda
  • Bhutan
  • Bolivia
  • Botswana
  • Brazil
  • British Indian Ocean Territory
  • British Virgin Islands
  • Brunei
  • Burkina Faso
  • Burundi
  • Cabo Verde
  • Cambodia
  • Cameroon
  • Caribbean Netherlands
  • Cayman Islands
  • Central African Republic
  • Chad
  • Chile
  • Christmas Island
  • Cocos (Keeling) Islands
  • Colombia
  • Comoros
  • Cook Islands
  • Côte d'Ivoire
  • Costa Rica
  • Curaçao
  • Democratic Republic of the Congo
  • Djibouti
  • Dominica
  • Dominican Republic
  • Ecuador
  • Egypt
  • El Salvador
  • Equatorial Guinea
  • Eritrea
  • Eswatini
  • Ethiopia
  • Falkland Islands (Islas Malvinas)
  • Fiji
  • Gabon
  • The Gambia
  • Georgia
  • Ghana
  • Gibraltar
  • Grenada
  • Guam
  • Guatemala
  • Guernsey
  • Guinea
  • Guinea-Bissau
  • Guyana
  • Haiti
  • Heard Island and McDonald Islands
  • Honduras
  • India
  • Indonesia
  • Iraq
  • Isle of Man
  • Israel
  • Jamaica
  • Japan
  • Jersey
  • Jordan
  • Kazakhstan
  • Kenya
  • Kiribati
  • Kyrgyzstan
  • Kuwait
  • Laos
  • Lebanon
  • Lesotho
  • Liberia
  • Libya
  • Madagascar
  • Malawi
  • Malaysia
  • Maldives
  • Mali
  • Marshall Islands
  • Mauritania
  • Mauritius
  • Mexico
  • Micronesia
  • Mongolia
  • Montserrat
  • Morocco
  • Mozambique
  • Namibia
  • Nauru
  • Nepal
  • New Caledonia
  • New Zealand
  • Nicaragua
  • Niger
  • Nigeria
  • Niue
  • Norfolk Island
  • Northern Mariana Islands
  • Oman
  • Pakistan
  • Palau
  • Palestine
  • Panama
  • Papua New Guinea
  • Paraguay
  • Peru
  • Philippines
  • Pitcairn Islands
  • Puerto Rico
  • Qatar
  • Republic of the Congo
  • Rwanda
  • Saint Barthélemy
  • Saint Kitts and Nevis
  • Saint Lucia
  • Saint Pierre and Miquelon
  • Saint Vincent and the Grenadines
  • Saint Helena, Ascension and Tristan da Cunha
  • Samoa
  • São Tomé and Príncipe
  • Saudi Arabia
  • Senegal
  • Seychelles
  • Sierra Leone
  • Singapore
  • Solomon Islands
  • Somalia
  • South Africa
  • South Georgia and the South Sandwich Islands
  • South Korea
  • South Sudan
  • Sri Lanka
  • Sudan
  • Suriname
  • Taiwan
  • Tajikistan
  • Tanzania
  • Thailand
  • Timor-Leste
  • Togo
  • Tokelau
  • Tonga
  • Trinidad and Tobago
  • Tunisia
  • Türkiye
  • Turkmenistan
  • Turks and Caicos Islands
  • Tuvalu
  • Uganda
  • United Arab Emirates
  • United States
  • United States Minor Outlying Islands
  • U.S. Virgin Islands
  • Uruguay
  • Uzbekistan
  • Vanuatu
  • Venezuela
  • Vietnam
  • Wallis and Futuna
  • Western Sahara
  • Yemen
  • Zambia
  • Zimbabwe

If you're not in one of these countries or territories, you will get the error {"error": {"code": 400, "message": "User location is not supported for the API use.", "status": "FAILED_PRECONDITION"}}. In this case, try Gemini Pro in Vertex AI.

 

P.P.S. The first image is how Gemini imagines the "AI that writes a program to create test data" 😆
