Using Liquibase with CockroachDB
BobFerris
Posted on July 28, 2022
CockroachDB is the world’s most evolved cloud SQL database — giving all of your apps effortless scale, bulletproof resilience and low latency performance for users anywhere. Like any SQL database data is stored in tables created using Data Definition Language (DDL). One of the challenges associated with creating and maintaining a SQL database is how to implement tables and then change those table definitions in a consistent, versionable and easy to use manner.
This is where a schema migration tool such as Liquibase can really simplify the process. For a more in depth discussion about the benefits of schema migration tools refer to this blog post on the subject.
Installing Liquibase
Installation of Liquibase is straightforward. The basic steps consist of --
- Download and install the Liquibase binary
- Download the Postgres driver (CockroachDB is Postgres write protocol compatible and Liquibase uses the Postgres driver to communicate with CockroachDB)
- Configure the Liquibase properties file with your CockroachDB information
- Create a Liquibase schema change log and add one or more changesets to the change log.
- Run Liquibase and you are on your way.
Of course this is a bit of an oversimplification. Detailed installation instructions can be found on the official Cockroach Labs documentation site here.
Data Types
SQL data types can roughly be categorized into 5 different groups.
- Numeric (decimal, int, etc)
- Character or String
- Datetime
- Boolean
- Miscellaneous (blob, clob, xml, jsonb, etc)
Of course, each database can implement these basic data types in slightly different ways. The database ecosystem can’t always agree on things! Since a schema migration tool like Liquibase is database agnostic, handling these different data type implementations is one of the challenges that Liquibase has to solve. Liquibase allows users to define tables with Liquibase defined data types and then it converts those data types to the corresponding data type in the target database. I have had several Cockroach users ask for the details of exactly how Liquibase converts its data types to CockroachDB data types. Providing this mapping is the main reason for writing this blog.
Before we get into that mapping it is also important to realize that Liquibase allows flexibility in how a user can define their changesets for table creation. Liquibase supports writing changesets in SQL, XML, YAML and JSON. I will focus on SQL and XML.
When using a SQL changeset such as the one shown here -
<changeSet id="12" author="max" runInTransaction="false">
<validCheckSum>ANY</validCheckSum>
<sqlFile path="create_next.sql"/>
</changeSet>
Where the create_next.sql
file contains the following -
create table balance
(
id int not null primary key,
balance numeric(19, 2) not null,
name varchar(128) not null,
type varchar(25) not null
);
Liquibase will interpret this as straight DDL and create the table as one would expect using the data types specified. This means only Cockroach supported data types can be specified in SQL changesets. CockroachDB supported data types can be found here. The CockroachDB database table that results from running liquibase update
against this changeset is -
root@localhost:26257/liquibase> show create table balance;
table_name | create_statement
-------------+-------------------------------------------------
balance | CREATE TABLE public.balance (
| id INT8 NOT NULL,
| balance DECIMAL(19,2) NOT NULL,
| name VARCHAR(128) NOT NULL,
| type VARCHAR(25) NOT NULL,
| CONSTRAINT "primary" PRIMARY KEY (id ASC),
| FAMILY "primary" (id, balance, name, type)
| )
When using XML changesets, Liquibase supports other “non-Cockroach standard” data types and will convert them for you when the changeset is executed. The table below shows the Liquibase data type to specify, the second column shows the data type created in Cockroach when using a SQL changeset and the third column shows the resulting Cockroach data type when the Liquibase data type is used in an XML changeset. If you are really interested in how Liquibase maps data types you can view their code, but I hope you find this table easier to digest.
Liquibase Data Type | SQL Changeset | XML Changeset |
---|---|---|
bigint | int8 | int8 |
blob | bytes | oid (bytea in postgres) |
boolean | bool | bool |
char | char(x) | char(x) |
clob | n/a | string |
currency | n/a | decimal |
datetime | n/a | timestamp |
date | date | date |
decimal | decimal(x,x) | decimal(x,x) |
double | n/a | float8 |
float | float8 | float8 |
int | int8 | int8 |
mediumint | n/a | n/a |
nchar | n/a | n/a |
ncharvar | n/a | n/a |
number | n/a | decimal |
smallint | int2 | int2 |
time | time | time |
timestamp | timestamp | timestamp |
tinyint | n/a | int2 |
uuid | uuid | uuid |
varchar | varchar() | varchar |
jsonb | jsonb | jsonb |
serial | int8 unique_rowid() | int8 unique_rowid() |
For example, notice that the currency data type is not a supported CockroachDB data type so it can’t be used in a SQL changeset. However, if currency is used in a XML changeset, Liquibase will create the column as a decimal data type. Since CockroachDB is Postgres wire compatible, you will notice that the data type conversions generally line up with what you would expect from Postgres.
Primary Keys
In any distributed database, obtaining an even distribution of data across all the nodes in the cluster is an important consideration for a balanced, performant cluster. CockroachDB stores data in its key/value layer ordered lexicographically by key. This means sequential keys have the potential to hot spot or overload the particular node that is the leaseholder for that range of data. Cockroach recommends defining primary keys using a meaningful non-sequential business key or a UUID.
Creating a UUID primary key with Liquibase using a SQL changeset is straightforward. The changeset would look like -
create table uuid_key
(
id UUID not null primary key DEFAULT gen_random_uuid(),
field1 varchar(25) not null
);
And the generated table looks like -
root@localhost:26257/liquibase> show create table uuid_key;
table_name | create_statement
-------------+--------------------------------------------------
uuid_key | CREATE TABLE public.uuid_key (
| id UUID NOT NULL DEFAULT gen_random_uuid(),
| field1 VARCHAR(25) NOT NULL,
| CONSTRAINT "primary" PRIMARY KEY (id ASC),
| FAMILY "primary" (id, field1)
| )
However, creating this with an XML changeset isn’t quite as intuitive. Using the following XML changeset will create the exact same table -
<changeSet id="15" author="max">
<createTable tableName="uuid_key">
<column name="id" type="uuid" defaultValueComputed="gen_random_uuid()">
<constraints primaryKey="true" nullable="false"/>
</column>
<column name="field1" type="varchar(25)"/>
</createTable>
</changeSet>
In situations where you don’t want to use a UUID the next best option for a randomly distributed key is to use the serial data type in Cockroach. The serial data type is shorthand to declare an INT8 field with the default value assigned by the unique_rowid() function. A SQL changeset would simply look like the following -
create table serial_key
(
id serial not null primary key,
type varchar(25) not null
);
And the resulting table would be -
root@localhost:26257/liquibase> show create table serial_key;
table_name | create_statement
-------------+-------------------------------------------------
serial_key | CREATE TABLE public.serial_key (
| id INT8 NOT NULL DEFAULT unique_rowid(),
| type VARCHAR(25) NOT NULL,
| CONSTRAINT "primary" PRIMARY KEY (id ASC),
| FAMILY "primary" (id, type)
| )
Using the following XML changeset will create the exact same table -
<changeSet id="17" author="max">
<createTable tableName="serial_key2">
<column name="id" type="serial">
<constraints primaryKey="true" nullable="false"/>
</column>
<column name="type" type="varchar(25)"/>
</createTable>
</changeSet>
Final Thoughts
One last issue that I’ve seen Cockroach users bump into with Liquibase deals with running multiple changesets against the same table in quick succession. One of CockroachDB’s unique features is the ability to run schema changes online, requiring no downtime. To accomplish this, CockroachDB asynchronously replicates the schema change to all nodes in the cluster. However, all nodes need to finish the schema change before the table can be changed again. On a local test cluster this probably won’t be a problem, but in a larger multi-region deployment this can take a little longer and cause an error if multiple changes are trying to be made back to back, for example, in a CI/CD pipeline. Try to keep all changes to a table in a single changeset if possible!
The goal of this post was to highlight and answer some of the common questions I’ve seen asked when using Liquibase with CockroachDB to manage your database schema. I hope you found it interesting and useful.
Posted on July 28, 2022
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.