Announcing AnnaDB - next-gen NoSQL database
Roman Right
Posted on September 13, 2022
I'm excited to introduce AnnaDB - the next-generation developer-first NoSQL data store.
I work with many small projects daily - proofs of concept and experiments with new frameworks or patterns. For these purposes, I needed a database that works with flexible data structures, as I change them frequently during my experiments. It must also support relations out of the box, as links to other objects are a natural part of the structures' design. I tried many databases (if not all of them), but nothing fit my requirements well, so I decided to make my own. This is how AnnaDB was born.
AnnaDB is an in-memory data store with disk persistence. It is written in Rust, a memory-safe compiled language. AnnaDB is fast and safe enough to serve as both the main data store and the cache layer.
Features
- Flexible object structure - simple primitives and complicated nested containers can be stored in AnnaDB
- Relations - you can link any object to another, and AnnaDB will resolve this relation on finds, updates, and other operations.
- Transactions - out of the box
Basics
I want to start with the basic concepts and examples of the syntax here and continue with the usage example.
Collections
AnnaDB stores objects in collections. Collections are analogous to tables in SQL databases.
Every object and sub-object (item of a vector or map) stored in AnnaDB has a link (id). This link consists of the collection name and a unique uuid4 value. One object can contain links to objects from any collection - AnnaDB will fetch and process them in all operations automatically, without additional commands (joins or lookups).
TySON
The AnnaDB query language uses the TySON format. The main difference from other data formats is that each item has a value and a prefix. The prefix can mark the data or query type (as it is used in AnnaDB) or carry any other information useful to the parser. This adds more flexibility to the data structure design - the developer can use as many custom data types as needed.
You can read more about the TySON format here
Data Types
There are primitive and container data types in AnnaDB.
Primitive data types are a set of basic types whose values cannot be decomposed. In TySON, primitives are represented as prefix|value| or as a prefix only. The prefix in AnnaDB shows the data type. For example, the string test is represented as s|test|, where s is a prefix that marks the data as a string, and test is the actual value.
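To make the prefix|value| shape concrete, here is a minimal Python sketch that splits a primitive into its prefix and value. The split_primitive helper is hypothetical (it is not part of AnnaDB or of any TySON library), and it assumes the value contains no escaped | characters.

```python
import re
from typing import Optional, Tuple

# Hypothetical helper for illustration only.
# Matches either `prefix|value|` or a bare `prefix`.
_PRIMITIVE = re.compile(r"(\w+)(?:\|(.*)\|)?", re.DOTALL)


def split_primitive(text: str) -> Tuple[str, Optional[str]]:
    """Return (prefix, value); value is None when only a prefix is given."""
    match = _PRIMITIVE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a TySON primitive: {text!r}")
    return match.group(1), match.group(2)


print(split_primitive("s|test|"))  # ('s', 'test')
print(split_primitive("n|5.95|"))  # ('n', '5.95')
print(split_primitive("delete"))   # ('delete', None)
```

Escaping rules and custom prefixes are deliberately left out of this sketch.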
Container data types keep primitive and container objects using specific rules. There are only two container types in AnnaDB for now: maps and vectors.
- Vectors are ordered sequences of elements of any type. Example:
v[n|1|,n|2|,n|3|,]
- Maps are associative arrays. Example:
m{ s|bar|: s|baz|,}
More information about AnnaDB data types can be found in the documentation.
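As a rough illustration of how vectors and maps nest, the following Python sketch serializes plain Python values into TySON text. The to_tyson function is a hypothetical helper, not the Python driver's API. It covers only the s and n prefixes and the two containers described above, and it does no escaping.

```python
def to_tyson(value) -> str:
    """Serialize str, int/float, list and dict values into TySON text."""
    if isinstance(value, bool):
        # bool is a subclass of int; it is rejected here because this
        # sketch only covers strings, numbers, vectors and maps.
        raise TypeError("bool is not covered by this sketch")
    if isinstance(value, str):
        return f"s|{value}|"
    if isinstance(value, (int, float)):
        return f"n|{value}|"
    if isinstance(value, (list, tuple)):
        return "v[" + "".join(to_tyson(item) + "," for item in value) + "]"
    if isinstance(value, dict):
        items = "".join(
            f"{to_tyson(key)}:{to_tyson(item)}," for key, item in value.items()
        )
        return "m{" + items + "}"
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(to_tyson([1, 2, 3]))       # v[n|1|,n|2|,n|3|,]
print(to_tyson({"bar": "baz"}))  # m{s|bar|:s|baz|,}
```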
Query
A query in AnnaDB is a pipeline of steps that are applied in the order they were declared. The steps are wrapped into a vector with the prefix q - query.
collection|test|:q[
find[
],
sort[
asc(value|num|),
],
limit(n|5|),
];
If the pipeline has only one step, the q vector is not needed.
collection|test|:find[
gt{
value|num|:n|4|,
},
];
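The rule about the q vector can be sketched as a small Python function that assembles query text from already-written steps. The build_query helper is hypothetical and only concatenates strings. It does not validate the steps or talk to a server.

```python
from typing import Sequence


def build_query(collection: str, steps: Sequence[str]) -> str:
    """Join TySON steps into one query for the given collection."""
    if not steps:
        raise ValueError("a query needs at least one step")
    if len(steps) == 1:
        # A single step does not need the q[...] wrapper.
        body = steps[0]
    else:
        body = "q[" + "".join(step + "," for step in steps) + "]"
    return f"collection|{collection}|:{body};"


print(build_query("test", ["find[gt{value|num|:n|4|,},]"]))
# collection|test|:find[gt{value|num|:n|4|,},];

print(build_query("test", ["find[]", "sort[asc(value|num|),]", "limit(n|5|)"]))
# collection|test|:q[find[],sort[asc(value|num|),],limit(n|5|),];
```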
Server
To run AnnaDB locally, run the following command in the terminal:
docker run --init -p 10001:10001 -t romanright/annadb:0.1.0
Client
The AnnaDB shell client is an interactive terminal application that connects to the DB instance, and validates and handles queries. It is well suited to experimenting with the query language or working with the data manually.
The client can be installed via pip:
pip install annadb
Run
annadb --uri annadb://localhost:10001
Usage example
You are ready for the fun part of the article now. Let's play with AnnaDB!
I'll create a database for the candy store to show the features.
Insert primitive
Let's start with categories. I'll represent categories as simple string objects. Let's insert the first one into the categories collection.
Request:
collection|categories|:insert[
s|sweets|,
];
collection|categories| shows which collection the query will be applied to. In our case - categories.
insert[...] is a query step. You can insert one or many objects using the insert operation.
s|sweets| is the object to insert. In this case, it is a string primitive. The prefix s means that it is a string, and the | characters wrap the value of the primitive. Other primitive types can be found in the Data Types section.
Response:
result:ok[
response{
s|data|:ids[
categories|d2e9fecd-8b3d-429d-9a9b-34810120a221|,
],
s|meta|:insert_meta{
s|count|:n|1|,
},
},
];
If everything is ok, the result will have an ok[...] vector with responses for all the transaction pipelines. Each response contains data and meta information. In our case, there is only one response, with a vector of ids in data and the number of inserted objects in meta.
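If you handle raw response text yourself, the links can be pulled out with a regular expression. This is a rough Python sketch: the extract_links helper is hypothetical, and it simply looks for name|uuid| patterns anywhere in the text rather than parsing TySON properly.

```python
import re
from typing import List

# A link is a collection name followed by a uuid wrapped in `|`.
_LINK = re.compile(
    r"(\w+)\|([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})\|"
)


def extract_links(response_text: str) -> List[str]:
    """Return every collection|uuid| link found in the response text."""
    return [f"{name}|{uid}|" for name, uid in _LINK.findall(response_text)]


response = """
result:ok[
response{
s|data|:ids[
categories|d2e9fecd-8b3d-429d-9a9b-34810120a221|,
],
s|meta|:insert_meta{
s|count|:n|1|,
},
},
];
"""
print(extract_links(response))
# ['categories|d2e9fecd-8b3d-429d-9a9b-34810120a221|']
```

A string value that happens to hold a uuid would also match, so a real client should rely on a proper TySON parser instead.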
Insert container
Let's insert a more complicated object now - a chocolate bar. It will have fields:
- name
- price
- category
For the category, I'll use the already created one.
Request:
collection|products|:insert[
m{
s|name|:s|Tony's|,
s|price|:n|5.95|,
s|category|:categories|d2e9fecd-8b3d-429d-9a9b-34810120a221|,
},
];
The query is similar to the previous one, but the object is a map rather than a primitive. The value of the category field is the link that was received after the previous insert.
Response:
result:ok[
response{
s|data|:ids[
products|17b12780-349c-4091-9bd2-7e08ad509ad0|,
],
s|meta|:insert_meta{
s|count|:n|1|,
},
},
];
The response is nearly the same as before - a link in data and the number of inserted objects in meta.
Get object
Let's retrieve the information about this chocolate bar now. I'll use the get operation to fetch the object by id.
Request:
collection|products|:get[
products|17b12780-349c-4091-9bd2-7e08ad509ad0|,
];
This time I use the get[...] query step. With this step, you can retrieve one or many objects by their links.
Response:
result:ok[
response{
s|data|:objects{
products|17b12780-349c-4091-9bd2-7e08ad509ad0|:m{
s|category|:s|sweets|,
s|price|:n|5.95|,
s|name|:s|Tony's|,
},
},
s|meta|:get_meta{
s|count|:n|1|,
},
},
];
In the response here you can see the objects{...} map, where keys are links to objects and values are the objects themselves. The objects{} map keeps the order - it returns objects in the same order as they were requested in the get step, or as they were sorted by the sort step.
The category was fetched automatically and the value was returned.
Let's insert another chocolate bar there to have more objects in the collection:
collection|products|:insert[
m{
s|name|:s|Mars|,
s|price|:n|2|,
s|category|:categories|d2e9fecd-8b3d-429d-9a9b-34810120a221|,
},
];
I use the same category id for this bar.
Modify primitive
Let's modify the category to make it more accurate.
Request:
collection|categories|:q[
get[
categories|d2e9fecd-8b3d-429d-9a9b-34810120a221|,
],
update[
set{
root:s|chocolate|,
},
],
];
The query here consists of 2 steps: a get step that fetches the object by link, and an update step that modifies it. The update[...] operation is a vector of update operators. Read more about the update operation in the documentation.
Response:
result:ok[
response{
s|data|:ids[
categories|d2e9fecd-8b3d-429d-9a9b-34810120a221|,
],
s|meta|:update_meta{
s|count|:n|1|,
},
},
];
The response of the update operation contains the ids of the updated objects as data and the number of the updated objects as meta.
Let's take a look at how this affected the chocolate objects.
Request:
collection|products|:find[
];
To find objects, I use the find[...] operation. It is a vector of find operators. If it is empty, all the collection objects will be returned.
Response:
result:ok[
response{
s|data|:objects{
products|0b6ddf36-b8ba-487f-acd8-4dfee05d5177|:m{
s|price|:n|2|,
s|name|:s|Mars|,
s|category|:s|chocolate|,
},
products|17b12780-349c-4091-9bd2-7e08ad509ad0|:m{
s|price|:n|5.95|,
s|name|:s|Tony's|,
s|category|:s|chocolate|,
},
},
s|meta|:find_meta{
s|count|:n|2|,
},
},
];
The category was changed for both products, as the category object was linked with these objects.
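The effect of links can be mimicked with a toy in-memory model in Python. This is only a sketch of the idea, not how AnnaDB is implemented: ToyStore and Link are hypothetical names, and link cycles are not handled.

```python
import uuid
from typing import Any, Dict, NamedTuple


class Link(NamedTuple):
    collection: str
    id: str


class ToyStore:
    """Stores values by link and resolves nested links on read."""

    def __init__(self) -> None:
        self._objects: Dict[Link, Any] = {}

    def insert(self, collection: str, value: Any) -> Link:
        link = Link(collection, str(uuid.uuid4()))
        self._objects[link] = value
        return link

    def set(self, link: Link, value: Any) -> None:
        self._objects[link] = value

    def get(self, link: Link) -> Any:
        return self._resolve(self._objects[link])

    def _resolve(self, value: Any) -> Any:
        # Links are replaced by the current value of their target.
        if isinstance(value, Link):
            return self._resolve(self._objects[value])
        if isinstance(value, dict):
            return {key: self._resolve(item) for key, item in value.items()}
        if isinstance(value, list):
            return [self._resolve(item) for item in value]
        return value


store = ToyStore()
category = store.insert("categories", "sweets")
tony = store.insert("products", {"name": "Tony's", "price": 5.95, "category": category})
mars = store.insert("products", {"name": "Mars", "price": 2, "category": category})

store.set(category, "chocolate")
print(store.get(tony)["category"], store.get(mars)["category"])  # chocolate chocolate
```

Because both products hold a link rather than a copy of the string, a single change to the category shows up in both of them.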
Modify container
Now I'll increase the price of the bars where it is less than 3.
Request:
collection|products|:q[
find[
lt{
value|price|:n|3|,
},
],
update[
inc{
value|price|:n|2|,
},
],
];
The find step can precede the update step as well. All the found objects will be updated. Read more about the find operation and its operators in the documentation.
Response:
result:ok[
response{
s|data|:ids[
products|0b6ddf36-b8ba-487f-acd8-4dfee05d5177|,
],
s|meta|:update_meta{
s|count|:n|1|,
},
},
];
The response is similar to the previous one.
Here is what all the products look like after the update:
result:ok[
response{
s|data|:objects{
products|0b6ddf36-b8ba-487f-acd8-4dfee05d5177|:m{
s|category|:s|chocolate|,
s|name|:s|Mars|,
s|price|:n|4|,
},
products|17b12780-349c-4091-9bd2-7e08ad509ad0|:m{
s|name|:s|Tony's|,
s|price|:n|5.95|,
s|category|:s|chocolate|,
},
},
s|meta|:find_meta{
s|count|:n|2|,
},
},
];
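The find-then-update pipeline above can be mirrored in plain Python to check the arithmetic. The find and inc functions below are hypothetical stand-ins for the lt and inc operators. They work on ordinary dicts and do not involve AnnaDB at all.

```python
from typing import Any, Callable, Dict, List

Product = Dict[str, Any]


def find(objects: List[Product], predicate: Callable[[Product], bool]) -> List[Product]:
    """Keep only the objects that satisfy the predicate."""
    return [obj for obj in objects if predicate(obj)]


def inc(objects: List[Product], field: str, amount: float) -> None:
    """Increase the given field of every object in place."""
    for obj in objects:
        obj[field] += amount


products = [
    {"name": "Mars", "price": 2},
    {"name": "Tony's", "price": 5.95},
]

cheap = find(products, lambda product: product["price"] < 3)  # only Mars matches
inc(cheap, "price", 2)

print([(product["name"], product["price"]) for product in products])
# [('Mars', 4), ("Tony's", 5.95)]
```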
Sort objects
To sort objects, I'll use the sort operation on the price field.
Request:
collection|products|:q[
find[
],
sort[
asc(value|price|),
],
];
The sort[...] operation is a vector of sort operators - asc and desc. Sort operators are modifiers that contain paths to the sorting value. The sort operation is not an independent step; it can only follow find-like operations that return objects. You can read more about sort in the documentation.
Response:
result:ok[
response{
s|data|:objects{
products|0b6ddf36-b8ba-487f-acd8-4dfee05d5177|:m{
s|name|:s|Mars|,
s|price|:n|4|,
s|category|:s|chocolate|,
},
products|17b12780-349c-4091-9bd2-7e08ad509ad0|:m{
s|category|:s|chocolate|,
s|price|:n|5.95|,
s|name|:s|Tony's|,
},
},
s|meta|:find_meta{
s|count|:n|2|,
},
},
];
Objects in the response are sorted by price now.
It is useful to combine the limit and offset operations with sort. You can read about them in the documentation.
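For intuition, sorting combined with offset and limit behaves like slicing a sorted list. The Python sketch below assumes the usual meaning of these operations (skip offset objects, then return at most limit of them). The sort_page helper is hypothetical, and AnnaDB's exact semantics should be checked in the documentation.

```python
from typing import Any, Dict, List, Optional


def sort_page(
    objects: List[Dict[str, Any]],
    field: str,
    descending: bool = False,
    offset: int = 0,
    limit: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Sort by a field, then skip `offset` items and keep at most `limit`."""
    ordered = sorted(objects, key=lambda obj: obj[field], reverse=descending)
    end = None if limit is None else offset + limit
    return ordered[offset:end]


products = [
    {"name": "Tony's", "price": 5.95},
    {"name": "Mars", "price": 4},
]

print([p["name"] for p in sort_page(products, "price")])
# ['Mars', "Tony's"]
print([p["name"] for p in sort_page(products, "price", offset=1, limit=1)])
# ["Tony's"]
```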
Delete objects
After any find-like step, you can use the delete operation to delete all the found objects. It can also be used independently to delete the whole collection.
Request:
collection|products|:q[
find[
gt{
value|price|:n|5|,
},
],
delete,
];
The delete operation is a primitive without a value.
Response:
result:ok[
response{
s|data|:ids[
products|17b12780-349c-4091-9bd2-7e08ad509ad0|,
],
s|meta|:update_meta{
s|count|:n|1|,
},
},
];
The response contains the affected ids in data and the number of deleted objects in meta.
Using from your app
AnnaDB has a Python driver. It has an internal query builder, so you don't need to learn the AnnaDB query syntax to work with it, but it supports raw querying too.
I'll add drivers for other languages soon. If you can help me with it, I'll be more than happy :)
Plans
This is a very early version of the database. It can already do things, and I use it in a few of my projects. But there are still many features to work on.
Drivers
I plan to add drivers to support the most popular languages, like JS, Rust, Go, and others. If you can help with this - please get in touch with me.
Rights management
This is probably the most important feature to implement: authentication, authorization, roles, etc.
Performance increase
There are many performance-related things to improve now.
Query features
- Projections
- More find and update operators
- Developer experience improvements
Data Types
I plan to add more data types, like geo points and graph vertices, to make AnnaDB more comfortable to work with for different kinds of data.
Managed service
My big goal is to make a managed data store service. Hey, AWS, Google Cloud, MS Azure, I'm ready for collaborations! ;)
Links
- Documentation - https://annadb.dev
- GitHub Repo - https://github.com/Anna-Team/AnnaDB
- Python Driver - https://pypi.org/project/annadb/
- My Twitter - https://twitter.com/roman_the_right
If you run into any bug or weird behavior, please let me know.