How I created AI-powered ORM for PostgreSQL, MySQL and SQLite and why you shouldn't use it

meatboy

Meat Boy

Posted on October 13, 2023

How I created AI-powered ORM for PostgreSQL, MySQL and SQLite and why you shouldn't use it

tl;dr
I created AI-powered ORM for Node with TypeScript or JavaScript called ormgpt. It works, it's silly and please don't use it.

npm install ormgpt

Cutting edge, blazing fast technology everywhere

In the last few years number of new ORMs (object relation mappers) and query builders has grown like crazy. A few years ago, the golden standard was either an ORM like Sequelize or a query builder like Knex. Since then we got TypeORM, Bookshelf, Objection, Mikro-ORM, Prisma, Drizzle, kysely and many, many more. While I agree that more option is good since anyone can choose the best-suited solution for their needs, it also creates many copy-alike libs.

At this point, I think ORMs have become the new days since the last javascript frameworks but for the backend.

Another hot topic, wider than just the javascript ecosystem is AI. Entire group of algorithms to recognize patterns, predict output and generate things. Now tech startup not only must store data in the hot blockchain, NoSQL or vector database, but compute on edge computing using quantum technology. Must be also AI - artificially intelligent.

AI

Afternoon idea

My thought was, what if I create a hot, new lib to access data like ORMs or query builders but using AI? So anyone can access data using plain language like:

give me 10 recent posts from the category travel and where the author is John Doe, with the author and comments info  
Enter fullscreen mode Exit fullscreen mode

or even in other languages like for example German

bitte legen Sie einen neuen Benutzer Hans Schmidt mit Wohnadresse München, Kaufingerstraße und Neuhauser Straße 1A an
Enter fullscreen mode Exit fullscreen mode

so I messed a little with OpenAI API to call with prompts like

Prepare SQL query for ${prompt}
Enter fullscreen mode Exit fullscreen mode

but that was too general. I tried then

Having database schema: 

${dbSchema}

Prepare SQL query for:

${prompt}
Enter fullscreen mode Exit fullscreen mode

still, often it was returning with invalid queries or additional comments. So I went even stricter passing also dialect, entire db schema and asking to not write any other response than a query.

You are an SQL engine brain. 
You are using ${this.dialect} dialect.
Having db schema as follows:
${this.dbSchema}

Write a query to fulfil the user request: ${request}

Don't write anything else than SQL query.
Enter fullscreen mode Exit fullscreen mode

And that worked quite well. So the next part was to prepare methods to call OpenAI programmatically and adapters for database engines.

Method calling OpenAI was pretty simple and using built-in fetch:


  private async getResponse(request: string): Promise<string> {
    const prompt = `
      You are an SQL engine brain. 
      You are using ${this.dialect} dialect.
      Having db schema as follows:
      ${this.dbSchema}

      Write a query to fulfil the user request: ${request}

      Don't write anything else than SQL query.
    `;

    const response = await fetch(this.apiUrl, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${this.apiKey}`,
      },
      body: JSON.stringify({
        model: this.model,
        messages: [
          {
            role: "user",
            content: prompt,
          },
        ],
        ...this.modelOptions,
      }),
    });

    const data = (await response.json()) as ErrorResponse | SuccessResponse;

    if (data.hasOwnProperty("error")) {
      throw new Error((data as ErrorResponse).error.message);
    }

    return (data as SuccessResponse).choices[0].message.content;
  }
Enter fullscreen mode Exit fullscreen mode

I know OpenAI has also SDK lib but I prefer simple calls instead of another dependency since it's hard to manage them in the long term. API allows direct access to the resource, SDK package would have to be updated separately and eventually can be abandoned.

pkg

For the database engine, I choose to support Postgres, MySQL and SQLite out of the box. They are the most popular and I worked with all of them before with success. The first was SQLite which allowed me to experiment with different interfaces of adapter. With such an interface, anyone can create their own adapter for other engines like Oracle, ClickHouse, CouchDB etc. I decided to stick with the smallest possible set of methods in the interface, leaving other responsibilities than executing queries to native clients:

export interface DatabaseEngineAdapter {
  executeQuery(query: string): Promise<any[]>;
}
Enter fullscreen mode Exit fullscreen mode

Then I created silly adapters:

import { DatabaseEngineAdapter } from "./DatabaseEngineAdapter";
import betterSqlite3, { Statement } from "better-sqlite3";

export class SqliteAdapter implements DatabaseEngineAdapter {
  private db: betterSqlite3.Database;

  constructor({ dbFilePath }: { dbFilePath: string }) {
    this.db = new betterSqlite3(dbFilePath);
  }

  executeQuery(query: string): Promise<unknown[]> {
    return new Promise((resolve, reject) => {
      const statement: Statement = this.db.prepare(query);
      if (this.isSelectQuery(query)) {
        resolve(statement.all());
      } else {
        const info = statement.run();
        resolve([]);
      }
    });
  }

  private isSelectQuery(query: string): boolean {
    return query.trim().toLowerCase().startsWith("select");
  }
}
Enter fullscreen mode Exit fullscreen mode

Now it's possible to run.

For example for request to SQLite database with simple schema of users, posts, comments and likes:

  const sqliteAdapter = new SqliteAdapter({
    dbFilePath: "./db.sqlite",
  });

  const ormgpt = new ormGPT({
    apiKey: process.env.OPENAI_API_KEY || "",
    schemaFilePath: "./schema.sql",
    dialect: "postgres",
    dbEngineAdapter: sqliteAdapter,
  });

ormgpt.query("give me post with id 1, all comments for this post and user information about author");
Enter fullscreen mode Exit fullscreen mode

generated query:

SELECT 
  p.id AS post_id, 
  p.title, 
  p.body, 
  c.id AS comment_id, 
  c.body AS comment_body, 
  u.username AS author_username, 
  u.email AS author_email 
FROM 
  posts p 
  JOIN comments c ON p.id = c.post_id 
  JOIN users u ON u.id = p.user_id 
WHERE 
  p.id = 1;
Enter fullscreen mode Exit fullscreen mode

and after execution response from the database:

[
  {
    post_id: 1,
    title: 'Hello world!',
    body: 'This is my first post!',
    comment_id: 1,
    comment_body: 'Hello world!',
    author_username: 'test',
    author_email: 'test@example.com'
  }
]
Enter fullscreen mode Exit fullscreen mode

It's kind of hard to test such app because it's non-deterministic. The only way I thought about is to test short, precise statements like "create x with y and z" and then look up db if it's there.

Db

Conclusion

Here we come to the conclusion, of why this lib is useless for now. If you look for something more complex like joins, nested subqueries or engine-related queries with the current state of GPT is not possible to get results you can rely on. However, at least you can minimize randomness by being very strict about the requirements in your statement and decreasing "temperature" as low as 0 for deterministic results!

Anyway, as an experimental project, I decided to finish it. So the last part was to allow fine-tuning model parameters:

export type ModelTuning = {
  temperature: number;
  max_tokens: number;
  top_p: number;
  frequency_penalty: number;
  presence_penalty: number;
}
Enter fullscreen mode Exit fullscreen mode

and prepare Postgres and MySQL adapters. The last part was to publish lib. The name ormGPT comes from ORM + GPT model but in fact it's neither orm nor query builder. The proper ORM should "map" the database into objects. Then maybe it's "intelligent" query builder? Also no. Query builder usually allows you to chain query object before generating sql. You can chain plain string, but is that enough? Maybe it should be chatGPTtoQueryFacade.js?

Too much thinking, not enough willingness. Published as ormGPT.

That's it. Tiny afternoon project, you shouldn't use in your production application. Or maybe you should? At the end you can tell your clients, you are using cutting-edge technologies and advanced AI.

You can find the lib on the NPM: https://www.npmjs.com/package/ormgpt

or at the GitHub repository:

GitHub logo pilotpirxie / ormGPT

An ORM based on OpenAI that translates plain language into SQL queries and executes them on a database.

ormGPT

An ORM based on OpenAI that translates plain human language into SQL queries and executes them on a database.

Currently supports database dialects: MySQL, PostgreSQL, and SQLite.

Supported languages: English, German, French, Spanish, Polish, Italian, Dutch, Portuguese, Ukrainian, Arabic, Chinese, Japanese, Korean, Turkish and many more.

ormgpt.query("give me post with id 1, all comments for this post and user information about author");
Enter fullscreen mode Exit fullscreen mode

Generated query:

SELECT 
  p.id AS post_id, 
  p.title, 
  p.body, 
  c.id AS comment_id, 
  c.body AS comment_body, 
  u.username AS author_username, 
  u.email AS author_email 
FROM 
  posts p 
  JOIN comments c ON p.id = c.post_id 
  JOIN users u ON u.id = p.user_id 
WHERE 
  p.id = 1;
Enter fullscreen mode Exit fullscreen mode

Response:

[
  {
    post_id: 1,
    title: 'Hello world!',
    body: 
Enter fullscreen mode Exit fullscreen mode
💖 💪 🙅 🚩
meatboy
Meat Boy

Posted on October 13, 2023

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related