Version Control for Distributed SQL Databases with Flyway
Jimmy Guerrero
Posted on August 13, 2020
Flyway is an open source database version control and migration tool that stresses simplicity and convention over configuration. Changes to the database can be written in SQL (and in some database-specific dialects like PL/SQL and T-SQL) or Java. You interact with Flyway using a command-line client, however there are a variety of plugins that can be leveraged, including Maven, Gradle, Spring Boot, and more.
Supported databases include Oracle, SQL Server, DB2, MySQL, PostgreSQL, and others. Because YugabyteDB is PostgreSQL compatible, most third-party tools and apps will work “out-of-the-box.” Flyway is no exception here. This allows developers to deploy and roll-back schema changes to YugabyteDB using Flyway by making use of the PostgreSQL JDBC driver.
Flyway relies on seven commands to manage database version control.
- Migrate: Migrates the schema to the latest version. Flyway will create the schema history table automatically if it doesn’t exist.
- Clean: Drops all objects in the configured schemas. Info: Prints the details and status information about all the migrations.
- Validate: Validates the applied migrations against the available ones.
- Undo: Undoes the most recently applied versioned migration.
- Baseline: Baselines an existing database, excluding all migrations up to and including baselineVersion.
- Repair: Repairs the schema history table.
In this blog post we’ll walk you though the following steps:
- Install a 3 node YugabyteDB cluster on Google Kubernetes Platform
- Install and configure Flyway locally
- Run some test migrations
- Rollback a migration
New to distributed SQL or YugabyteDB? Read on.
What is Distributed SQL?
Distributed SQL databases are becoming popular with organizations interested in moving data infrastructure to the cloud or to cloud native environments. This is often motivated by the desire to reduce TCO or move away from the scaling limitations of monolithic RDBMS like Oracle, MySQL, and SQL Server. The basic characteristics of Distributed SQL are:
- A SQL API for querying and modeling data, with support for traditional RDBMS features like primary keys, foreign keys, indexes, stored procedures, and triggers.
- Automatic distributed query execution so that no single node becomes a bottleneck.
- A distributed SQL database should support automatically distributed data storage. This includes indexes which should be automatically distributed (aka sharded) across multiple nodes of the cluster so that no single node becomes a bottleneck for ensuring high performance and high availability.
- Distributed SQL systems should also provide for strongly consistent replication and distributed ACID transactions. For a deeper discussion about what Distributed SQL is, check out, “What is Distributed SQL?”
What is YugabyteDB?
YugabyteDB is an open source, high-performance distributed SQL database built on a scalable and fault-tolerant design inspired by Google Spanner. YugabyteDB is PostgreSQL wire compatible with support for advanced RDBMS features like stored procedures, triggers, and UDFs.
Ok, on with the demo…
Step 1: Install YugabyteDB on a GKE Cluster using Helm 3
In this section we are going to install YugabyteDB on the cluster. The complete steps are documented here. We’ll assume you already have a GKE cluster up and running as a starting point.
The first thing to do is to add the charts repository.
$ helm repo add yugabytedb https://charts.yugabyte.com
Now, fetch the updates.
$ helm repo update
Create a namespace. In this case we’ll call it yb-demo
.
$ kubectl create namespace yb-demo
Expected output:
namespace/yb-demo created
We are now ready to install YugabyteDB. In the command below we’ll be specifying values for a resource constrained environment.
$ helm install yb-demo yugabytedb/yugabyte \
--set resource.master.requests.cpu=1,resource.master.requests.memory=1Gi,\
resource.tserver.requests.cpu=1,resource.tserver.requests.memory=1Gi,\
enableLoadBalancer=True --namespace yb-demo --wait
To check the status of the cluster, execute the below command:
$ kubectl get services --namespace yb-demo
Note the external-IP for yb-tserver-service
which we are going to use to establish a connection between YugabyteDB and Flyway. From the screenshot above we can see that the IP is 35.224.XX.XX
and the YSQL port is 5433
.
Step 2: Create the flyway_test database and set a password
Let’s create a dedicated database called flyway-test
. To connect to the YSQL service run the following command:
$ kubectl exec -n yb-demo -it yb-tserver-0 -- ysqlsh -h yb-tserver-0.yb-tservers.yb-demo
Next, create the flyway-test
database and connect to it using the following commands:
yugabyte=# CREATE DATABASE flyway_test;
yugabyte=# \c flyway_test;
By default, a YugabyteDB installation doesn’t have a password setup for the default yugabyte
user. Specifying one is done the same way you’d do it in PostgreSQL.
flyway_test=# ALTER ROLE yugabyte WITH PASSWORD 'password';
Step 3: Install and configure Flyway
For the purposes of this demo we’ll be installing and configuring Flyway Community Edition. After downloading Flyway, navigate to the flyway-6.5.2
directory. We are going to want to make a few configuration changes to get Flyway connected to the YugabyteDB database we just deployed.
In the conf directory locate the flyway.conf
file. Uncomment or modify the following lines:
flyway.url=jdbc:postgresql://35.224.6.55:5433/flyway_test
flyway.user=yugabyte
flyway.password=password
At this point Flyway will be able to connect to YugabyteDB running on GKE.
Step 4: Run database migrations against YugabyteDB
We are now ready to create our first migration in the /sql
directory. Let’s name it V1__Create_motorcycle_manufacturers_table.sql
:
CREATE TABLE motorcycle_manufacturers (
manufacturer_id SERIAL PRIMARY KEY,
manufacturer_name VARCHAR(50) NOT NULL
);
We can use the following command in the flyway-6.5.2 directory to run the migration.
$ flyway migrate
Flyway Community Edition 6.5.2 by Redgate
Database: jdbc:postgresql://35.224.XX.XX:5433/flyway_test (PostgreSQL 11.2)
Successfully validated 1 migration (execution time 00:00.282s)
Creating Schema History table "public"."flyway_schema_history" ...
Current version of schema "public": << Empty Schema >>
Migrating schema "public" to version 1 - Create motorcycle manufacturers table
Successfully applied 1 migration to schema "public" (execution time 00:02.941s)
Using the YSQL command line or your favorite database administration tool, we can verify that Flyway has indeed created the motorcycle_manufacturers
table with the two specified columns.
You’ll notice that an additional flyway_schema_history
table was created as well. Flyway uses this table to store the attributes of any changes we’ve made to the database.
Next, let’s insert some data into the table we just created. In the /sql
directory, let’s create a SQL file with the contents below and name it V2__Insert_into_motorcycle_manufacturers
.
INSERT INTO motorcycle_manufacturers
(manufacturer_id, manufacturer_name)
VALUES
(default, 'Harley-Davidson'),
(default, 'Yamaha');
Run this new migration.
$ flyway migrate
Flyway Community Edition 6.5.2 by Redgate
Database: jdbc:postgresql://35.224.6.55:5433/flyway_test (PostgreSQL 11.2)
Successfully validated 2 migrations (execution time 00:00.390s)
Current version of schema "public": 1
Migrating schema "public" to version 2 - Insert into motorcycle manufacturers
Successfully applied 1 migration to schema "public" (execution time 00:01.605s)
Let’s verify that the data was successfully inserted.
With a basic understanding of how migrations work in Flyway, you can use a similar process to issue other operations like undo
, validate
, baseline
, etc.
Step 5: Rollback a database migration
Let’s go ahead and clean up all the objects and data we created using the clean
command.
$ flyway clean
Flyway Community Edition 6.5.2 by Redgate
Database: jdbc:postgresql://35.224.6.55:5433/flyway_test (PostgreSQL 11.2)
Successfully dropped pre-schema database level objects (execution time 00:00.054s)
Successfully cleaned schema "public" (execution time 00:05.657s)
Successfully dropped post-schema database level objects (execution time 00:00.052s)
The output above confirms that the table was successfully dropped along with the Flyway versioning table.
Conclusion
That’s it! You now have a 3 node YugabyteDB cluster on GKE, with versioning control managed by a local install of Flyway. For more information about how to perform various operations in Flyway using the Java API, Maven or Gradle, check out the Flyway documentation.
Posted on August 13, 2020
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.