Distributed storage in 30 minutes

Author: Igor Zolotarev

Hello, my name is Igor, and I am a part of the Tarantool DB team. When developing, I often need rapid prototypes of database applications, for example, to test code or to create an MVP. Of course, I would like such a prototype to require minimal effort to refine, in case it is decided to use it in production.

I don't like wasting my time configuring an SQL database, thinking about how to manage data sharding, or spending even more time studying connector interfaces. I prefer just to write a few lines of code, run it, and have everything working out of the box. To develop distributed applications rapidly, I use Cartridge, a framework for managing cluster applications based on Tarantool, a NoSQL database.

Today I will show how to quickly write a Cartridge-based application, cover it with tests, and run it. The article will be of interest to anyone tired of spending a lot of time prototyping applications, as well as those who want to try a new NoSQL technology.

From this article, you will learn what Cartridge is and what principles to have in mind when writing cluster business logic in it.

We will write a cluster application for storing data about employees of a company. The steps to accomplish it are:

Creating an application from a template with cartridge-cli
Describing your business logic in Lua in terms of Cartridge cluster roles
- Data Storage
- Custom HTTP API
Writing tests
Launching and configuring a small cluster locally
- Downloading configuration
- Configuring failover

Cartridge framework

Cartridge is a framework for developing cluster applications. It manages several instances of the Tarantool NoSQL database and shards data using the vshard module. Tarantool is a persistent in-memory database. It is very fast because it stores data in RAM, yet reliable, since Tarantool dumps all data to the hard disk and allows you to set up replication. Cartridge takes care of configuring Tarantool nodes and sharding the data across them, which leaves a developer with only writing the business logic of the application and configuring the failover.

Advantages of Cartridge

Sharding and replication out of the box
Built-in failover support
CRUD, a NoSQL cluster query language
Integration testing of the entire cluster
Ansible-based cluster management
Cluster administration utility
Monitoring tools

Creating the first application

To do this, we will need cartridge-cli. It is a utility for working with Cartridge applications. It allows you to create an application from a template, manage a locally running cluster, and connect to Tarantool instances.

Installing Tarantool and cartridge-cli

On Debian or Ubuntu:

curl -L https://tarantool.io/fJPRtan/release/2.8/installer.sh | bash

sudo apt install cartridge-cli

On CentOS, Fedora, or ALT Linux:

curl -L https://tarantool.io/fJPRtan/release/2.8/installer.sh | bash

sudo yum install cartridge-cli

On macOS:

brew install tarantool

brew install cartridge-cli

Let's create a template application named myapp:

cartridge create --name myapp

cd myapp

tree .

Now we get a project structure similar to this:

myapp
├── app
│ └── roles
│ └── custom.lua
├── test
├── init.lua
├── myapp-scm-1.rockspec

The init.lua file is the entry point of the Cartridge application. It defines the cluster's configuration and calls the functions required at the start of each application node.
The app/roles/ directory contains "roles" describing the application's business logic.
The myapp-scm-1.rockspec file specifies the application's dependencies.

By now, we already have a working "Hello, world!" application. It can be started with the following commands:

cartridge build

cartridge start -d

cartridge replicasets setup --bootstrap-vshard

After that, opening localhost:8081/hello will show "Hello world!".

Let's now create a small template-based application, a sharded storage with an HTTP API for storing and receiving data. To do this, we need to understand how to implement cluster business logic in Cartridge.

Writing business logic in Cartridge

Each cluster application is based on roles, Lua modules describing the application's business logic. For example, they can be modules that store data, provide an HTTP API, or cache data from an Oracle database. A role is assigned to a set of instances joined by replication (a replica set), and it is enabled on each instance. Different replica sets can have different sets of roles.

In Cartridge, each cluster node holds the cluster configuration. It describes the cluster's topology and, optionally, the configuration that your role would use. This configuration can be changed at runtime to affect the role's behavior.

Each role has a structure similar to this:

return {
    role_name = 'your_role_name',

    init = init,
    validate_config = validate_config,
    apply_config = apply_config,
    stop = stop,

    rpc_function = rpc_function,
    dependencies = {
        'another_role_name',
    },
}

Role lifecycle

An instance is starting.
The role named role_name waits for the start of all its dependent roles specified in dependencies.
The validate_config function is called to check whether the role's configuration is valid.
The role initialization function init is called. This function performs the actions that need to be done once, when the role is started for the first time.
The apply_config function is called to apply the configuration (if it is specified). The validate_config and apply_config functions are also called whenever the role's configuration changes.
The role is saved in the registry. From there it will be available to other roles on the same node via cartridge.service_get('your_role_name').
The functions declared in a role will be available from other nodes via cartridge.rpc_call('your_role_name', 'rpc_function').
Before a role is stopped or restarted, the stop function is launched. It terminates the role, for example, removing the fibers created by the role.
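As a rough sketch of how these callbacks fit together, the role below reads a hypothetical employees section of the cluster configuration. The section name, the max_salary option, its default value, and the role name are assumptions made for illustration. Only the callback names and the order in which they are called come from the lifecycle above. The last lines imitate that order by hand so that the sketch can be run without a cluster.

```lua
-- Current settings of the role; max_salary is a made-up option for this sketch
local settings = {max_salary = 1000000}

-- Called before init and on every configuration change.
-- Raising an error here rejects the new configuration.
local function validate_config(conf_new, conf_old)
    local section = conf_new['employees'] or {}
    if section.max_salary ~= nil then
        assert(type(section.max_salary) == 'number',
            'employees.max_salary must be a number')
        assert(section.max_salary > 0,
            'employees.max_salary must be positive')
    end
    return true
end

-- Called once when the role is started on the instance
local function init(opts)
    return true
end

-- Called after init and whenever the configuration changes
local function apply_config(conf, opts)
    local section = conf['employees'] or {}
    if section.max_salary ~= nil then
        settings.max_salary = section.max_salary
    end
    return true
end

-- Called before the role is stopped; nothing to release in this sketch
local function stop()
    return true
end

local role = {
    role_name = 'app.roles.limits', -- hypothetical role name
    init = init,
    validate_config = validate_config,
    apply_config = apply_config,
    stop = stop,
    -- a function that other roles could call on this role
    get_max_salary = function() return settings.max_salary end,
}
-- In a real role file, the last line would be `return role`.

-- Imitating the order in which the callbacks are called:
local conf = {employees = {max_salary = 50000}}
assert(role.validate_config(conf, {}))
assert(role.init({is_master = true}))
assert(role.apply_config(conf, {is_master = true}))
print(role.get_max_salary())
```

Keeping all checks in validate_config means that apply_config can assume the configuration it receives has already been accepted.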

Cluster NoSQL queries

There are several ways to write cluster queries in Cartridge:

Calling functions via the vshard API (this is a complicated but flexible way):

vshard.router.callrw(bucket_id, 'app.roles.myrole.my_rpc_func', {...})

Tarantool CRUD
- Simple function calls: crud.insert / get / replace / ...
- Support for calculating bucket_id is limited
- Roles must depend on crud-router / crud-storage

Application structure

Suppose we want a cluster with one router and two groups of storages with two instances each. This topology is typical for both Redis Cluster and MongoDB Cluster. The cluster will also include one more instance, the stateboard, which the stateful failover uses to save the state of the current masters. When increased reliability is required, it is better to use an etcd cluster instead of the stateboard.

The router will distribute requests across the cluster and manage the failover.
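The same topology can be written down as a plain Lua table, which makes it easy to count the instances we are going to run. This is only an illustration of the structure, not a Cartridge API. The replica set and instance names are taken from the replicasets.yml file shown in the "Running locally" section, and the table layout itself is an assumption of this sketch.

```lua
-- Replica sets with their roles and instances (layout is made up for this sketch)
local topology = {
    {name = 'router', roles = {'failover-coordinator', 'app.roles.api'},
        instances = {'router'}},
    {name = 's-1', roles = {'app.roles.storage'},
        instances = {'s1-master', 's1-replica'}},
    {name = 's-2', roles = {'app.roles.storage'},
        instances = {'s2-master', 's2-replica'}},
}

local instance_count = 0
for _, replicaset in ipairs(topology) do
    instance_count = instance_count + #replicaset.instances
    print(replicaset.name,
        table.concat(replicaset.instances, ', '),
        table.concat(replicaset.roles, ', '))
end

-- The stateboard runs as a separate instance outside of the replica sets
print('cluster instances: ' .. instance_count .. ', plus one stateboard')
```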

Writing custom roles

We will need to write two roles: one for data storage, and one for HTTP API.

In the app/roles directory, we create two new files: app/roles/storage.lua and app/roles/api.lua

Data Storage

Let's describe the role for data storage. In the init function, we will create a table and indexes for it, then add crud-storage to its dependencies.

The Lua code in the init function is equivalent to the following pseudo-SQL code:

CREATE TABLE employee(
    bucket_id unsigned,
    employee_id string,
    name string,
    department string,
    position string,
    salary unsigned
);
CREATE UNIQUE INDEX primary ON employee(employee_id);
CREATE INDEX bucket_id ON employee(bucket_id);

Add the following code to the app/roles/storage.lua file:

local function init(opts)
    -- opts has an attribute indicating whether the function is called on the master or on a replica;
    -- we create tables only on the master instance, and they appear on the replicas automatically
    if opts.is_master then
        -- Creating a table with employees
        local employee = box.schema.space.create('employee', {if_not_exists = true})

        -- setting the format
        employee:format({
            {name = 'bucket_id', type = 'unsigned'},
            {name = 'employee_id', type = 'string', comment = 'Employee ID'},
            {name = 'name', type = 'string', comment = 'Employee full name'},
            {name = 'department', type = 'string', comment = 'Department'},
            {name = 'position', type = 'string', comment = 'Position'},
            {name = 'salary', type = 'unsigned', comment = 'Salary'}
        })

        -- Create the primary index
        employee:create_index('primary', {parts = {{field = 'employee_id'}},
            if_not_exists = true })

        -- Indexing by bucket_id, it is necessary for sharding
        employee:create_index('bucket_id', {parts = {{field = 'bucket_id'}},
            unique = false,
            if_not_exists = true })
    end

    return true
end

return {
    init = init,
    -- <<< remembering the crud-storage dependency
    dependencies = {'cartridge.roles.crud-storage'},
}

We will not need the rest of the functions from the role's API, since our role has no configuration and it does not allocate resources to be cleaned after the role's work is complete.

HTTP API

We will need the second role to fill the tables with data and retrieve this data on request. The role will use Cartridge's built-in HTTP server. It depends on crud-router.

Let's define a function to handle POST requests. The request body will contain the object to be saved to the database.

local function post_employee(request)
    -- getting an object from the request body
    local employee = request:json()

    -- writing it to the database
    local _, err = crud.insert_object('employee', employee)

    -- if an error occurs, writing it to the log and returning 500
    if err ~= nil then
        log.error(err)
        return {status = 500}
    end
    return {status = 200}
end
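The handler above answers 500 to any failure, including a malformed request body. One possible refinement, sketched here as an assumption rather than as part of the original application, is to check the decoded object before calling crud.insert_object and to answer 400 when a field is missing or has the wrong type. The helper name validate_employee and the required_fields table are hypothetical. The field names and types follow the space format defined in the storage role.

```lua
-- Field names and Lua types expected in the request body.
-- bucket_id is left out: the test requests in this article do not send it.
local required_fields = {
    employee_id = 'string',
    name = 'string',
    department = 'string',
    position = 'string',
    salary = 'number',
}

-- Returns true if the object is acceptable, or false and a message otherwise
local function validate_employee(employee)
    if type(employee) ~= 'table' then
        return false, 'request body must be a JSON object'
    end
    for field, expected_type in pairs(required_fields) do
        if type(employee[field]) ~= expected_type then
            return false, ('field %q must be a %s'):format(field, expected_type)
        end
    end
    -- the space stores salary as unsigned, so reject negative and fractional values
    if employee.salary < 0 or employee.salary % 1 ~= 0 then
        return false, 'field "salary" must be a non-negative integer'
    end
    return true
end

print(validate_employee({
    employee_id = 'john_doe', name = 'John Doe',
    department = 'Delivery', position = 'Developer', salary = 10000,
}))
print(validate_employee({employee_id = 'jane_doe', salary = -5}))
```

In post_employee, a failed check could be turned into return {status = 400} before the insert, so that 500 is reserved for errors on the cluster side.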

The GET handler will take a salary value as a query parameter. The expected response is a JSON list of employees whose salary is greater than or equal to the one specified in the request. In pseudo-SQL, the query looks like this:

SELECT employee_id, name, department, position, salary
FROM employee
WHERE salary >= @salary

local function get_employees_by_salary(request)
    -- get the salary parameter from the query
    local salary = tonumber(request:query_param('salary') or 0)

    -- selecting the employee data
    local employees, err = crud.select('employee', {{'>=', 'salary', salary}})

    -- if an error occurs, writing it to the log and returning 500
    if err ~= nil then
        log.error(err)
        return { status = 500 }
    end

    -- the employees table stores the list of rows that meet the condition and the space format
    -- the unflatten_rows function converts a table row to a key-value table
    employees = crud.unflatten_rows(employees.rows, employees.metadata)

    employees = fun.iter(employees):map(function(x) 
        return {
            employee_id = x.employee_id, 
            name = x.name, 
            department = x.department, 
            position = x.position, 
            salary = x.salary,
        }
    end):totable()
    return request:render({json = employees})
end
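To see what the crud.unflatten_rows step does, it may help to look at a plain-Lua imitation of it. This is not the crud implementation, only a sketch that assumes the result has the shape described in the comments above: rows is a list of tuples, and metadata is a list of field descriptions with a name key. The function name unflatten and the sample data, including the bucket_id value, are made up.

```lua
-- Converts positional rows into key-value tables using field names from metadata
local function unflatten(rows, metadata)
    local objects = {}
    for i, row in ipairs(rows) do
        local object = {}
        for field_no, field in ipairs(metadata) do
            object[field.name] = row[field_no]
        end
        objects[i] = object
    end
    return objects
end

-- Sample data in the assumed shape of a crud.select result
local result = {
    metadata = {
        {name = 'bucket_id', type = 'unsigned'},
        {name = 'employee_id', type = 'string'},
        {name = 'name', type = 'string'},
        {name = 'department', type = 'string'},
        {name = 'position', type = 'string'},
        {name = 'salary', type = 'unsigned'},
    },
    rows = {
        {477, 'jane_doe', 'Jane Doe', 'Delivery', 'Developer', 20000},
    },
}

local employees = unflatten(result.rows, result.metadata)
print(employees[1].name, employees[1].salary)
```

The handler then copies only the business fields into the response, which leaves the service field bucket_id out of the JSON.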

Now let's write the init function for the role. Here we will use Cartridge's registry to get an HTTP server and register the application's HTTP endpoints on it.

local function init()
    -- getting an HTTP-server from the Cartridge's registry
    local httpd = assert(cartridge.service_get('httpd'), "Failed to get httpd service")

    -- setting the routes
    httpd:route({method = 'GET', path = '/employees'}, get_employees_by_salary)
    httpd:route({method = 'POST', path = '/employee'}, post_employee)

    return true
end

Putting everything together:

app/roles/api.lua

local cartridge = require('cartridge')
local crud = require('crud')
local log = require('log')
local fun = require('fun')

-- the 'GET /employees' method returns a list of employees with salaries greater than or equal to the one specified in the request
local function get_employees_by_salary(request)
    -- getting the salary parameter from the query
    local salary = tonumber(request:query_param('salary') or 0)

    -- selecting the employee data
    local employees, err = crud.select('employee', {{'>=', 'salary', salary}})

    -- if an error occurs, writing it to the log and returning 500
    if err ~= nil then
        log.error(err)
        return { status = 500 }
    end

    -- the employees table stores the list of rows that meet the condition and the space format
    -- the unflatten_rows function converts a table row to a key-value table
    employees = crud.unflatten_rows(employees.rows, employees.metadata)
    employees = fun.iter(employees):map(function(x) 
        return {
            employee_id = x.employee_id, 
            name = x.name, 
            department = x.department, 
            position = x.position, 
            salary = x.salary,
        }
    end):totable()
    return request:render({json = employees})
end

local function post_employee(request)
    -- getting an object from the request body
    local employee = request:json()

    -- writing it to the database
    local _, err = crud.insert_object('employee', employee)

    -- if an error occurs, writing it to the log and returning 500
    if err ~= nil then
        log.error(err)
        return {status = 500}
    end
    return {status = 200}
end

local function init()
    -- getting an HTTP-server from the Cartridge's registry
    local httpd = assert(cartridge.service_get('httpd'), "Failed to get httpd service")

    -- setting the routes
    httpd:route({method = 'GET', path = '/employees'}, get_employees_by_salary)
    httpd:route({method = 'POST', path = '/employee'}, post_employee)

    return true
end

return {
    init = init,
    -- adding the crud-router dependency
    dependencies = {'cartridge.roles.crud-router'},
}

init.lua

Let's describe the init.lua file. It is the entry point of a Cartridge application. To configure a cluster instance, call the cartridge.cfg() function in this file.

cartridge.cfg(<opts>, <box_opts>)

<opts>, the default cluster parameters
- the list of available roles (all roles must be specified, even the permanent ones, to have them appear in the cluster)
- sharding parameters
- WebUI configuration
- etc
<box_opts>, the Tarantool default parameters (passed to the instance's box.cfg{})

#!/usr/bin/env tarantool

require('strict').on()

-- specifying the path to search for modules
if package.setsearchroot ~= nil then
    package.setsearchroot()
end

-- configuring Cartridge
local cartridge = require('cartridge')

local ok, err = cartridge.cfg({
    roles = {
        'cartridge.roles.vshard-storage',
        'cartridge.roles.vshard-router',
        'cartridge.roles.metrics',
        -- <<< Adding crud roles
        'cartridge.roles.crud-storage',
        'cartridge.roles.crud-router',
        -- <<< Adding custom roles
        'app.roles.storage',
        'app.roles.api',
    },
    cluster_cookie = 'myapp-cluster-cookie',
})

assert(ok, tostring(err))

The final step is to describe the dependencies of the application in the myapp-scm-1.rockspec file.

package = 'myapp'
version = 'scm-1'
source  = {
    url = '/dev/null',
}
-- Adding the dependencies
dependencies = {
    'tarantool',
    'lua >= 5.1',
    'checks == 3.1.0-1',
    'cartridge == 2.7.3-1',
    'metrics == 0.11.0-1',
    'crud == 0.8.0-1',
}
build = {
    type = 'none';
}

The application's code is ready to run, but let's write some tests to make sure it works as expected.

Writing tests

Every application needs testing. The usual luatest is sufficient for unit tests. But to write an integration test, you may want to use the cartridge.test-helpers module. It is shipped with Cartridge and can be used to run a cluster of any structure for the tests.

local cartridge_helpers = require('cartridge.test-helpers')
-- creating a test cluster
local cluster = cartridge_helpers.Cluster:new({
    server_command = './init.lua', -- test application entrypoint
    datadir = './tmp', -- directory for xlog, snap, and other files
    use_vshard = true, -- enable cluster sharding
    -- list of replica sets:
    replicasets = {
        {
            alias = 'api',
            uuid = cartridge_helpers.uuid('a'),
            roles = {'app.roles.custom'}, -- list of roles assigned to the replicaset
            -- list of instances in the replicaset:
            servers = {
                { instance_uuid = cartridge_helpers.uuid('a', 1), alias = 'api' },
                ...
            },
        },
        ...
    }
})

Let's write an auxiliary module to use in the integration tests. In this module, a test cluster with two replica sets is created. Each replica set contains one instance.

The auxiliary module code:

test/helper.lua

local fio = require('fio')
local t = require('luatest')
local cartridge_helpers = require('cartridge.test-helpers')

local helper = {}

helper.root = fio.dirname(fio.abspath(package.search('init')))
helper.datadir = fio.pathjoin(helper.root, 'tmp', 'db_test')
helper.server_command = fio.pathjoin(helper.root, 'init.lua')

helper.cluster = cartridge_helpers.Cluster:new({
    server_command = helper.server_command,
    datadir = helper.datadir,
    use_vshard = true,
    replicasets = {
        {
            alias = 'api',
            uuid = cartridge_helpers.uuid('a'),
            roles = {'app.roles.api'},
            servers = {
                { instance_uuid = cartridge_helpers.uuid('a', 1), alias = 'api' },
            },
        },
        {
            alias = 'storage',
            uuid = cartridge_helpers.uuid('b'),
            roles = {'app.roles.storage'},
            servers = {
                { instance_uuid = cartridge_helpers.uuid('b', 1), alias = 'storage' },
            },
        },
    }
})

function helper.truncate_space_on_cluster(cluster, space_name)
    assert(cluster ~= nil)
    for _, server in ipairs(cluster.servers) do
        server.net_box:eval([[
            local space_name = ...
            local space = box.space[space_name]
            if space ~= nil and not box.cfg.read_only then
                space:truncate()
            end
        ]], {space_name})
    end
end

function helper.stop_cluster(cluster)
    assert(cluster ~= nil)
    cluster:stop()
    fio.rmtree(cluster.datadir)
end

t.before_suite(function()
    fio.rmtree(helper.datadir)
    fio.mktree(helper.datadir)
    box.cfg({work_dir = helper.datadir})
end)

return helper

The integration test code:

test/integration/api_test.lua

local t = require('luatest')
local g = t.group('integration_api')

local helper = require('test.helper')
local cluster = helper.cluster

g.before_all = function()
    g.cluster = helper.cluster
    g.cluster:start()
end

g.after_all = function()
    helper.stop_cluster(g.cluster)
end

g.before_each = function()
    helper.truncate_space_on_cluster(g.cluster, 'employee')
end

g.test_get_employee = function()
    local server = cluster.main_server

    -- filling the storage with data via HTTP API:
    local response = server:http_request('post', '/employee',
        {json = {name = 'John Doe', department = 'Delivery', position = 'Developer',
        salary = 10000, employee_id = 'john_doe'}})
    t.assert_equals(response.status, 200)

    response = server:http_request('post', '/employee',
        {json = {name = 'Jane Doe', department = 'Delivery', position = 'Developer',
        salary = 20000, employee_id = 'jane_doe'}})
    t.assert_equals(response.status, 200)

    -- Making a GET request and checking if the output data is correct
    response = server:http_request('get', '/employees?salary=15000.0')
    t.assert_equals(response.status, 200)
    t.assert_equals(response.json[1], {name = 'Jane Doe', department = 'Delivery', employee_id = 'jane_doe',
        position = 'Developer', salary = 20000
    })
end

Running the tests

If you have launched the application before, stop it:

cartridge stop

Removing the directory containing the data:

rm -rf tmp/

Building the application and installing the dependencies:

cartridge build

./deps.sh

Running the linter:

.rocks/bin/luacheck .

Running the tests to record the coverage:

.rocks/bin/luatest --coverage

Generating the coverage reports and looking at the result:

.rocks/bin/luacov .
grep -A999 '^Summary' tmp/luacov.report.out

Running locally

To run the application locally, you can use cartridge-cli, but first the roles we have written must be added to replicasets.yml:

router:
  instances:
  - router
  roles:
  - failover-coordinator
  - app.roles.api
  all_rw: false
s-1:
  instances:
  - s1-master
  - s1-replica
  roles:
  - app.roles.storage
  weight: 1
  all_rw: false
  vshard_group: default
s-2:
  instances:
  - s2-master
  - s2-replica
  roles:
  - app.roles.storage
  weight: 1
  all_rw: false
  vshard_group: default

To see the parameters of the configured instances, take a look at the instances.yml file.

Running the cluster locally:

cartridge build
cartridge start -d
cartridge replicasets setup --bootstrap-vshard

Now we can enter WebUI to load the roles' configuration and to configure the failover. To configure a stateful failover, do the following:

click the Failover button
choose stateful
specify the address and the password:
- localhost:4401
- passwd

Let's see how it works. Now the leader in the s-1 replica set is s1-master.

Let's stop it:

cartridge stop s1-master

Now s1-replica becomes the leader.

Let's restore s1-master:

cartridge start -d s1-master

s1-master is up again, but s1-replica is still the leader because of the stateful failover.
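The behavior seen in these steps can be modeled with a few lines of plain Lua. The sketch is a toy model, not Cartridge's failover code: it assumes that an external state provider remembers the appointed leader, and that a coordinator appoints a new leader only when the current one is unhealthy. All names (priority, state, healthy, coordinate) are invented for the example.

```lua
-- Failover priority inside the replica set
local priority = {'s1-master', 's1-replica'}
-- What the state provider remembers
local state = {leader = 's1-master'}
-- Which instances are currently alive
local healthy = {['s1-master'] = true, ['s1-replica'] = true}

-- Appoints a new leader only if the remembered one is down
local function coordinate()
    if healthy[state.leader] then
        return state.leader
    end
    for _, name in ipairs(priority) do
        if healthy[name] then
            state.leader = name
            break
        end
    end
    return state.leader
end

print(coordinate())              -- s1-master
healthy['s1-master'] = false     -- cartridge stop s1-master
print(coordinate())              -- s1-replica
healthy['s1-master'] = true      -- cartridge start -d s1-master
print(coordinate())              -- s1-replica: the appointment is kept
```

Because the appointment is stored outside the replica set, the returning instance does not reclaim leadership on its own, which avoids an extra leader switch.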

Let's load the configuration for the cartridge.roles.metrics role. To do this, switch to the Code tab and create the metrics.yml file with the following contents:

export:
  - path: '/metrics'
    format: prometheus
  - path: '/health'
    format: health

After we click the Apply button, the metrics become available on each node of the application, for example at localhost:8081/metrics. The health-check page also appears at localhost:8081/health.

This completes the basic setup of a small application: the cluster is ready to run and now we can write an application to communicate with the cluster using the HTTP API or via a connector. We can also expand the functionality of the cluster.

Conclusion

Many developers hate wasting time configuring a database. We prefer simply writing code and leaving cluster management to a framework. To solve this problem, I use Cartridge, a framework that manages a cluster containing several instances of a Tarantool database.

Now you know:

how to build a reliable cluster application based on Cartridge and Tarantool,
how to write the code for a small application to store information about employees,
how to add tests,
how to configure a cluster.

I hope my story was helpful and you will start using Cartridge to create applications. I would be glad to hear whether you managed to write a Cartridge application quickly and easily, and to answer questions about its use.

What's next?

Check out the documentation on the official website
Try Cartridge in the sandbox
Ask your questions to the community in the Telegram chat
