Running Text-To-SQL on MongoDB: Working with Postgres, MySQL, MongoDB and Others
Mrunmay Shelar
Posted on August 7, 2024
When working with LangDB, one question that developers ask how langdb integrates with their existing data infrastructure. Fortunately, LangDB offers seamless connectivity to your pre-existing databases, allowing you to leverage your data without significant modifications.
Database Integrations
Table Engines
LangDB, built on top of Clickhouse, provides native support for integrating with a wide range of databases. This means you can use the Integration Engines of Clickhouse to continue using your existing databases while benefiting from LangDB's SQL arsenal.
Here are a few examples of how you can connect to some popular databases. You can find more about them here: Table Engines
NOTE: Original article can be found at https://app.langdb.ai/share/apps/4250cb28-5e2b-421f-9711-0f334452aef5
PostgreSQL
CREATE TABLE [IF NOT EXISTS] table_name
(
col_name1 type1 ,
col_name2 type2 ,
...
) ENGINE = PostgreSQL(host:port, database, table, user, password)
MySQL
CREATE TABLE [IF NOT EXISTS] table_name
(
col_name1 type1 ,
col_name2 type2 ,
...
) ENGINE = MySQL(host:port, database, table, user, password)
MongoDB
CREATE TABLE [IF NOT EXISTS] table_name
(
col_name1 type1 ,
col_name2 type2 ,
...
) ENGINE = MongoDB(host:port, database, collection, user, password [, 'ssl=true&tls=true&tlsAllowInvalidCertificates=true&connectTimeoutMS=30000&socketTimeoutMS=60000&authSource=admin']);
Table Functions
Apart from the table engines, you can also query your existing databases directly using table functions
There are Postgresql, MongoDB, MySQL, and many other table functions.
You can read more about them here: Table Functions
Let's take the example of MongoDB. On MongoDB Atlas, I will use Sample Analytics Dataset.
The sample_analytics
database contains three collections for a typical financial services application. It has customers, accounts, and transactions.
We will focus on accounts
, but feel free to play around. The accounts
collection contains the details of the users.
We can use the mongodb
table function to query the collection quickly.
SELECT * from mongodb(
'host:port',
'sample_analytics',
'customers',
'user',
'password',
'_id String,
username String,
name String,
address String,
birthdate DateTime,
email String,
accounts Array(Int32)',
'connectTimeoutMS=10000&ssl=true&authSource=admin'
) limit 5;
_id | username | name | address | birthdate | accounts | |
---|---|---|---|---|---|---|
5ca4bbcea2dd94ee58162a69 | valenciajennifer | Lindsay Cowan | Unit 1047 Box 4089 DPO AA 57348 | 1994-02-19 23:46:27 | cooperalexis@hotmail.com | [116508] |
5ca4bbcea2dd94ee58162b29 | crodriguez | Charles Jones | 183 Young Mountain Rossmouth, DC 11579 | 1974-01-24 12:12:53 | vancejohnny@hotmail.com | [467666] |
5ca4bbcea2dd94ee58162a76 | portermichael | Lauren Clark | 1579 Young Trail Jessechester, OH 88328 | 1980-10-28 16:25:59 | briannafrost@yahoo.com | [883283,980867,164836,200611,528224,931483] |
5ca4bbcea2dd94ee58162a72 | wesley20 | James Sanchez | 8681 Karen Roads Apt. 096 Lowehaven, IA 19798 | 1973-01-13 16:17:26 | josephmacias@hotmail.com | [987709] |
5ca4bbcea2dd94ee58162bf4 | skinnercraig | Ashley Lindsey | Unit 6691 Box 1189 DPO AP 53029 | 1994-06-16 07:38:48 | robertwalker@gmail.com | [375655,892096,401997,253554,890055,959435] |
However, to use LangDB's collection of tools like Text-to-SQL and Embeddings, it is better to use the table engines. You can do something like this for the above query to get read-only access to the remote MongoDB collection.
CREATE TABLE IF NOT EXISTS customers_mongodb
(
_id "String",
username "String",
name "String",
address "String",
birthdate DateTime,
email "String",
accounts Array("Int32")
) ENGINE = MongoDB(
'host:port',
'sample_analytics',
'customers',
'user',
'password',
'ssl=true&tls=true&tlsAllowInvalidCertificates=true&connectTimeoutMS=30000&socketTimeoutMS=60000&authSource=admin'
);
SELECT * FROM customers_mongodb limit 5
_id | username | name | address | birthdate | accounts | |
---|---|---|---|---|---|---|
5ca4bbcea2dd94ee58162a69 | valenciajennifer | Lindsay Cowan | Unit 1047 Box 4089 DPO AA 57348 | 1994-02-19 23:46:27 | cooperalexis@hotmail.com | [116508] |
5ca4bbcea2dd94ee58162b29 | crodriguez | Charles Jones | 183 Young Mountain Rossmouth, DC 11579 | 1974-01-24 12:12:53 | vancejohnny@hotmail.com | [467666] |
5ca4bbcea2dd94ee58162a76 | portermichael | Lauren Clark | 1579 Young Trail Jessechester, OH 88328 | 1980-10-28 16:25:59 | briannafrost@yahoo.com | [883283,980867,164836,200611,528224,931483] |
5ca4bbcea2dd94ee58162a72 | wesley20 | James Sanchez | 8681 Karen Roads Apt. 096 Lowehaven, IA 19798 | 1973-01-13 16:17:26 | josephmacias@hotmail.com | [987709] |
5ca4bbcea2dd94ee58162bf4 | skinnercraig | Ashley Lindsey | Unit 6691 Box 1189 DPO AP 53029 | 1994-06-16 07:38:48 | robertwalker@gmail.com | [375655,892096,401997,253554,890055,959435] |
Now that we have created a table, we can use a Text-To-SQL
model directly onto the database.
SELECT text_to_sql('How Many accounts are from people who were born in 1994')
Question: How many accounts are from people who were born in 1994?
SQLQuery: `SELECT COUNT(*) as account_count FROM customers_mongodb WHERE toYear(birthdate) = 1994`
SQLResult: [{"account_count":"19"}]
Let's verify and run that query for ourselves to confirm it.
SELECT COUNT(*) as account_count FROM customers_mongodb WHERE toYear(birthdate) = 1994
account_count |
---|
19 |
LangDB's features like Text-to-SQL and embeddings can streamline your data querying and analysis process, making extracting valuable insights from your databases more accessible. The example we walked through illustrates the simplicity of getting started with LangDB and how it can complement your existing setup.
As you explore LangDB further, remember that its flexibility and compatibility can significantly enhance your development workflow.
So, if you're ready to take your database experience to the next level with Gen AI, try LangDB. Its integration capabilities and robust features might become a go-to tool in your development arsenal.
Posted on August 7, 2024
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.
Related
August 7, 2024