TIL: about Entity Relationship Schemas
NDREAN
Posted on August 14, 2022
When you design a relational database, you code two entities, migration files and data mapping files. You may also want to use ER diagrams for modelling a database as you may find it easier or faster to reason or want to communicate with them. Even if experienced teams might not integrate such diagrams in the design pipeline, almost every framework lets you take an ERD snapshot of your data structure. We gathered notes on a few tools to interact with diagrams as code and show where they can be used.
Diagram-as-code, or not
You have three main formats to draw ER diagrams: DOT, UML and DBML. While DOT and PLANTUML formats are multi-purpose drawing utilities, DBML is an easy-to-read DSL focused on tables, to design the data structure in code and draw an ER diagram from it. An example of DBML code where a table is built, with foreign keys, composite indexes and relations 1-n (<
).
Diagram-as-code with DBML is interesting because you can convert this code into a raw SQL migration script. This means you can run a migration with the framework or run it directly against the database with the command line.
However, this is a "first" design tool only. You can only run the first full migration once.
Furthermore, it remains to produce the code base in the framework for it to communicate with the database.
Conversely, we can convert a raw SQL migration script into DBML and produce an ERD in code from it, but for the same reasons, this is only meaningful if you have the first full migration script.
In conclusion, DBML is not a CD tool but rather a first useful "diagram-as-code" design tool.
Most frameworks have libraries to generate an ERD snapshot from the code base, whether from the migration files or the model/schema. Almost all of them will produce DOT files. The Graphviz utility will generate an SVG or PNG. For example, NodeJS/Sequelize with sequelize-erd
, or Ruby/Rails with rails-erd
, or Elixir/Ecto with ecto_erd
. The code base remains the source of truth. An example is further below.
Finally, an ERD snapshot can also be generated directly from the database, with in-build tool or database tools like dbeaver or utilities like Planter and pg-to-dbml. More on it below.
Set up and process
We will use Elixir/Ecto and PostgreSQL (MySQL works as well). One reason to use Ecto is that the code used for the migration is independent of the code used by the mapper to talk and manage the database.
We start in the "standard" way: code the playground example with Elixir/Ecto, run the migration and produce tests. From this point, you can generate PUML or DOT or DBML code and visualise all of them!
Then, we will convert what can be the starting point - the DBML code - back to SQL, and run it with the database command line client psql
(resp. mysql
) and test the database.
Tools
the DBML CLI converts between DBML and SQL. This can also be done programmatically with the node package
@dbml/core
.the online free service dbdiagram.io makes this DBML interesting: you type the code in DBML and a synced diagram is drawn for you. You can generate a migration script (SQL code) for the selected database - Postgres or MySQL - and you can print/save the ERD.
VSCode has an extension vscode-dbml based on @dbml/core to convert back and forth between SQL and DBML: in the command palette, (
>DBML: To SQL
) and vice-versa (>DBML:From SQL
).
- the node package pg-to-dbml connects to your Postgres database and produces DBML code from it. Just pass the database url (the db is set by a flag):
pg-to-dbml --c=postgresql://USER:PASSWORD@HOST:PORT -o=pathToOutput --db=DB_NAME
- You also have PlantUML. This is a powerful drawing tool. The Golang program Planter generates diagrams in PUML format from PostgreSQL tables. Just clone the repo, install it (Go needed of course), and pass your database url to it:
./planter "postgres://postgres@localhost/my_app_repo?sslmode=disable" -o my-db.uml
The "plantuml" code for ERD is easily understandable:
To visualise the diagram, you can use directly plantuml
or use an IDE extension (such as PlantUML in VScode, press opt-D
) or use the official on-line drawer.
- the build-in ER tool for PostgreSQL included in pgAdmin. You can design a database with it, and conversely, generate an ERD snapshot from an existing database. You can learn how to use it with this video.
MySQL has also it's build-in Workbench. Also to mention the free Sequel Ace for MySQL/MariaDB.
- the excellent database tool dbeaver:
- projects tailored for the framework such as ecto_erd. It is Ecto's swiss army knife for ERD. You can generate the default DOT format or PUML code or DBML code. More on this below.
The diagram below shows where all these tools can be used.
Playground example
We take four models/schemas talking about movies, actors, producers and characters. The database should capture the following relations:
- a Movie has one Producer, and a Producer can have many movies,
- a Character has one Movie,
- an Actor has a unique Character per Movie through a unique Contract for a certain amount,
- a Character can have many Actors per Movie through contracts,
From Ecto code base to DBML and back
We set up the PostgreSQL database, code the migrations with Ecto, run them and save the logs to get the SQL script. We need to clean the logs to extract only the SQL commands. You can do this programmatically with for example a small cleaning task. You then use the DBML cli to convert the SQL script into DBML code:
mix ecto.gen.repo -r MyApp.Repo
[... some config...]
mix ecto.create
mix ecto.gen.migration create_movies
[...code each migration...]
# ---------------------------
mix ecto.migrate --log-migrations-sql > migration-ecto.sql
mix clean.sql migration-ecto.sql
sql2dbml --postgres migration-ecto.sql -o migration-ecto.dbml
We use the handy dbdiagram.io to visualise the diagram.
Since we have the code base written in Elixir/Ecto, you can also use the mix task ecto_erd
. The command below will generate a DOT file that you convert into PNG with the Graphviz utility. Note that you can also display a DOT file into the browser
mix ecto.gen.erd --output-path=my_app-db.dot
dot -Tsvg my_app-db.dot -o my_app-db.svg
From DBML code to the migration
Suppose you designed the database in DBML. You can convert this code to SQL with the @dbml/cli:
dbml2sql --postgres migration-ecto.dbml -o new-migration.sql
There are different ways to run a migration transaction from this generated SQL script; it can be done via the framework or directly in the database.
run a migration transaction with Ecto. The migration file will use the
up
function fromEcto.Migration
. It is a bit tedious: you wrap each raw SQL command with anexecute("CREATE...")
. You could do this with a custom mix task. You can place the generated file into a subdirectory, say "priv/repo/migrations/from-diagram" for example and use the--migrations-path
flag to run it.a better way is to run the migration-ecto.sql script directly with the command line client
psql
. Prior to this, we need to wrap the whole script with aBEGIN; ...;COMMIT;
for a transaction. You can use for example a simple custom mix taskmix sql.prepare migration-ecto.sql
mix ecto.drop && mix ecto.create
mix sql.prepare migration-ecto.sql
psql -d my_app_repo -f migration-ecto.sql
mix test
- lastly, you can also use the pgAdmin interface. This is a "manual" operation and thus can't be used when you release your Docker image. Once you run the script, you can generate an ERS snapshot. Conversely, you can draw an ER diagram with it and generate the migration script. To use it, navigate to your database folder in the integrated browser, then the public folder, select "Tools" (top menu), then "Query Tool", then select the SQL script (first icon as shown below).
Then execute the loaded script with F5 (or the triangle),
and finally, check the tables (maybe Refresh with a right click). If you go back to the database folder, click right and select "Generate ERD" as below
and you have an ER snapshot of the database:
pgadmin snap
Conclusion
ERD is a valuable tool to communicate and visualizing the database.
"Diagram-as-code" with DBML is interesting especially with the handy dbdiagram.io but not a CD tool, rather limited to being a first design tool in dev mode; you can't manage subsequent migrations and the code will not be kept synced automatically with the project.
You can generate an ERD snapshot from:
- the (Postgres) database with Planter to get a nice PUML,
- the code base with ecto_erd to get a nice DOT.
You note that the DOT file generated by ecto_erd
is richer than the DBML code since it adds the associations created by unique composite indexes as we see immediately below when comparing both diagrams.
Code
DBML code
Enum "actor_status" {
"celeb"
"blist"
}
Table "producers" {
"id" uuid [pk]
"name" varchar(255) [not null]
"status" "actor_status"
"inserted_at" timestamp(0) [not null]
"updated_at" timestamp(0) [not null]
Indexes {
name [unique, name: "producers_name_index"]
}
}
Table "actors" {
"id" uuid [pk]
"name" varchar(255) [not null]
"status" actor_status
"inserted_at" timestamp(0) [not null]
"updated_at" timestamp(0) [not null]
Indexes {
name [unique, name: "actors_name_index"]
}
}
Table "movies" {
"id" bigserial [pk]
"title" varchar(255) [not null]
"producer_id" uuid [not null]
"inserted_at" timestamp(0) [not null]
"updated_at" timestamp(0) [not null]
Indexes {
title [unique, name: "movies_title_index"]
}
}
Table "characters" {
"id" bigserial [pk]
"name" varchar(255) [not null]
"movie_id" bigint [not null]
"inserted_at" timestamp(0) [not null]
"updated_at" timestamp(0) [not null]
Indexes {
name [unique, name: "characters_name_index"]
}
}
Table "contracts" {
"id" bigserial [pk]
"salary" decimal(7, 2) [not null]
"movie_id" bigint [not null]
"actor_id" uuid [not null]
"character_id" bigint [not null]
"inserted_at" timestamp(0) [not null]
"updated_at" timestamp(0) [not null]
Indexes {
(movie_id, actor_id, character_id) [unique, name: "contract"]
}
}
Ref:"producers"."id" < "movies"."producer_id"
Ref:"movies"."id" < "characters"."movie_id"
Ref:"movies"."id" < "contracts"."movie_id"
Ref:"actors"."id" < "contracts"."actor_id"
Ref:"characters"."id" < "contracts"."character_id"
Mix task helper
# lib/mix/tasks/sql_helper.ex
defmodule Mix.Tasks.Sql.Clean do
@shortdoc "clean the migration file to raw SQL"
require Logger
use Mix.Task
@impl Mix.Task
def run(arg) do
case arg do
[] ->
Logger.debug("Please enter a filename")
[filename] ->
with {:ok, txt} <- File.read(filename) do
txt
|> String.split("\n")
|> Enum.filter(&(String.contains?(&1, "CREATE") or String.contains?(&1, "ALTER")))
|> Enum.map(&String.replace(&1, "[]", ";"))
|> Enum.filter(&(not String.contains?(&1, "execute")))
|> to_string()
|> then(fn t -> File.write(filename, t) end)
end
System.cmd("echo", ["\e[32m \u2714 \e[0m", "File ready for conversion to dbml"], into: IO.stream())
end
end
defmodule Mix.Tasks.Sql.Prepare do
@shortdoc "prepare SQL script for psql migration"
require Logger
use Mix.Task
@impl Mix.Task
def run(arg) do
case arg do
[] ->
Logger.debug("Please enter a filename")
[filename] ->
filename
|> File.write("COMMIT;", [:append])
|> then(fn _ ->
{:ok, txt} = File.read(filename)
File.write(filename, "BEGIN;" <> txt)
end)
System.cmd("echo", ["\e[32m \u2714 \e[0m", "Ready for transaction"], into: IO.stream())
end
end
end
Ecto Migration, Schema, Seeds, Tests
⏩ check github repo for Ecto code.
Posted on August 14, 2022
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.