How I created an AI-powered ORM for PostgreSQL, MySQL and SQLite, and why you shouldn't use it
Meat Boy
Posted on October 13, 2023
tl;dr
I created an AI-powered ORM for Node with TypeScript or JavaScript called ormgpt. It works, it's silly, and please don't use it.
npm install ormgpt
Cutting-edge, blazing-fast technology everywhere
In the last few years, the number of new ORMs (object-relational mappers) and query builders has grown like crazy. A few years ago, the golden standard was either an ORM like Sequelize or a query builder like Knex. Since then we've gotten TypeORM, Bookshelf, Objection, Mikro-ORM, Prisma, Drizzle, Kysely and many, many more. While I agree that more options are good, since anyone can choose the solution best suited to their needs, it also creates many look-alike libs.
At this point, I think ORMs have become the new JavaScript frameworks, just for the backend: a new one pops up every other day.
Another hot topic, wider than just the JavaScript ecosystem, is AI: an entire group of algorithms that recognize patterns, predict outputs and generate things. Nowadays a tech startup must not only store data in a hot blockchain, NoSQL or vector database and compute on the edge using quantum technology; it must also be AI, artificially intelligent.
Afternoon idea
My thought was: what if I created a hot new lib to access data, like ORMs or query builders do, but using AI? Then anyone could access data using plain language like:
give me 10 recent posts from the category travel and where the author is John Doe, with the author and comments info
or even in other languages, for example German:
bitte legen Sie einen neuen Benutzer Hans Schmidt mit Wohnadresse München, Kaufingerstraße und Neuhauser Straße 1A an
(please create a new user Hans Schmidt with home address Munich, Kaufingerstraße and Neuhauser Straße 1A)
So I messed around a little with the OpenAI API, calling it with simple prompts asking for a query. Still, it often returned invalid queries or additional comments. So I went even stricter, passing the dialect and the entire db schema as well, and asking it not to write any response other than the query:
You are an SQL engine brain.
You are using ${this.dialect} dialect.
Having db schema as follows:
${this.dbSchema}
Write a query to fulfil the user request: ${request}
Don't write anything else than SQL query.
And that worked quite well. So the next part was to prepare methods to call OpenAI programmatically and adapters for database engines.
The method calling OpenAI was pretty simple, using the built-in fetch:
private async getResponse(request: string): Promise<string> {
  const prompt = `
    You are an SQL engine brain.
    You are using ${this.dialect} dialect.
    Having db schema as follows:
    ${this.dbSchema}
    Write a query to fulfil the user request: ${request}
    Don't write anything else than SQL query.
  `;

  const response = await fetch(this.apiUrl, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${this.apiKey}`,
    },
    body: JSON.stringify({
      model: this.model,
      messages: [
        {
          role: "user",
          content: prompt,
        },
      ],
      ...this.modelOptions,
    }),
  });

  const data = (await response.json()) as ErrorResponse | SuccessResponse;

  if (data.hasOwnProperty("error")) {
    throw new Error((data as ErrorResponse).error.message);
  }

  return (data as SuccessResponse).choices[0].message.content;
}
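The ErrorResponse and SuccessResponse types referenced above aren't shown in the post; minimal shapes covering just the fields the code actually reads, matching the OpenAI chat completions payload, would look something like this:

// Minimal shapes for the fields read above; the real OpenAI payloads
// carry more (id, usage, finish_reason, ...).
type ErrorResponse = {
  error: { message: string };
};

type SuccessResponse = {
  choices: { message: { content: string } }[];
};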
I know OpenAI also has an SDK lib, but I prefer simple calls over another dependency, since dependencies are hard to manage in the long term. The API gives direct access to the resource, while an SDK package has to be updated separately and can eventually be abandoned.
For the database engines, I chose to support Postgres, MySQL and SQLite out of the box. They are the most popular, and I have worked with all of them before with success. The first was SQLite, which allowed me to experiment with different adapter interfaces. With such an interface, anyone can create their own adapter for other engines like Oracle, ClickHouse, CouchDB etc. I decided to stick with the smallest possible set of methods in the interface, leaving responsibilities other than executing queries to the native clients.
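The interface snippet got lost in formatting; a minimal sketch of what it could look like, assuming a single query-execution method (the DbEngineAdapter and executeQuery names are my guesses, not necessarily the library's exact API):

// The smallest possible adapter contract: the lib only needs a way to
// run the SQL it generated. Connections, pooling and transactions stay
// with the native client.
interface DbEngineAdapter {
  executeQuery(query: string): Promise<unknown>;
}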
For example, here is a request to an SQLite database with a simple schema of users, posts, comments and likes:
const sqliteAdapter = new SqliteAdapter({
  dbFilePath: "./db.sqlite",
});

const ormgpt = new ormGPT({
  apiKey: process.env.OPENAI_API_KEY || "",
  schemaFilePath: "./schema.sql",
  dialect: "sqlite", // tells the model which SQL flavour to generate
  dbEngineAdapter: sqliteAdapter,
});

ormgpt.query(
  "give me post with id 1, all comments for this post and user information about author"
);

The result:
[
{
post_id: 1,
title: 'Hello world!',
body: 'This is my first post!',
comment_id: 1,
comment_body: 'Hello world!',
author_username: 'test',
author_email: 'test@example.com'
}
]
It's kind of hard to test such an app because it's non-deterministic. The only way I've thought of is to test short, precise statements like "create x with y and z" and then look up the db to check whether the record is there.
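A rough sketch of what such a check could look like, assuming the ormgpt/sqliteAdapter setup from the example above, a Vitest test runner, and the hypothetical executeQuery method from the interface sketch earlier (the import path is illustrative too):

import { test, expect } from "vitest";
// Assumes the ormgpt instance and its SqliteAdapter are exported from
// the app's own setup module; not a confirmed API.
import { ormgpt, sqliteAdapter } from "./setup";

test("creates a user from a plain-language statement", async () => {
  // One short, precise write in plain language...
  await ormgpt.query("create a user with username alice and email alice@example.com");

  // ...then check directly in the database that the row landed.
  const rows = (await sqliteAdapter.executeQuery(
    "SELECT username, email FROM users WHERE username = 'alice'"
  )) as Array<{ username: string; email: string }>;

  expect(rows).toEqual([{ username: "alice", email: "alice@example.com" }]);
});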
Conclusion
Here we come to the conclusion of why this lib is useless for now. If you're looking for something more complex, like joins, nested subqueries or engine-specific queries, with the current state of GPT it's not possible to get results you can rely on. However, you can at least minimize the randomness by being very strict about the requirements in your statement and decreasing the "temperature" as low as 0 for near-deterministic results!
Anyway, as an experimental project, I decided to finish it. So the next step was to allow fine-tuning of the model parameters.
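The snippet for this didn't survive formatting, but since getResponse above spreads this.modelOptions straight into the OpenAI request body, passing the parameters at construction time plausibly looks something like this (treat the modelOptions field name and accepted keys as my assumption rather than the confirmed API):

const ormgpt = new ormGPT({
  apiKey: process.env.OPENAI_API_KEY || "",
  schemaFilePath: "./schema.sql",
  dialect: "sqlite",
  dbEngineAdapter: sqliteAdapter,
  // Spread as-is into the chat completion request body (see getResponse),
  // so any OpenAI chat-completion parameter should be accepted here.
  modelOptions: {
    temperature: 0, // as low as possible to reduce randomness
    max_tokens: 500,
  },
});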
The other remaining piece was the Postgres and MySQL adapters. The last step was to publish the lib. The name ormGPT comes from ORM + the GPT model, but in fact it's neither an ORM nor a query builder. A proper ORM should "map" the database into objects. Then maybe it's an "intelligent" query builder? Also no. A query builder usually allows you to chain a query object before generating SQL. You can chain a plain string, but is that enough? Maybe it should be chatGPTtoQueryFacade.js?
Too much thinking, not enough willingness. Published as ormGPT.
That's it. A tiny afternoon project that you shouldn't use in your production application. Or maybe you should? In the end, you can tell your clients you are using cutting-edge technologies and advanced AI.