Matthew Cale
Posted on November 17, 2021
What is this?
This is a technical guide for doing common management tasks with a Postgres database running in Docker.
Why make this?
As a quick reference for taking database backups and restoring them.
Steps
Run The Postgres Image
Docker makes it incredibly easy to get going with a Postgres database with persistent storage. Let's try it out.
- Let's just run the Postgres container pointing its volume location to a spot on disk we like. For me that's: ~/docker/volumes/postgres.
# In your terminal
docker run \
--name pg-db \
-e POSTGRES_PASSWORD=shiny_metal_pass \
-d \
-p 6543:5432 \
-v $HOME/docker/volumes/postgres:/var/lib/postgresql/data \
postgres
-
Let's dissect this line by line:
-
docker run
-- Runs a specified image (in our case it'spostgres
, see last line of command). Source -
--name pg-db
-- Sets the name of the container running the postgres image. Source -
-e POSTGRES_PASSWORD=shiny_metal_pass
-- Sets an environment variable which will be the default password for the superuser. Note: This should be a secret π€«. Source -
-d
-- Detach from the terminal instance. i.e. run in the background. -
-p 6543:5432
-- Map the local machine's port6543
to the container's port5432
. So from our machine to connect we will use 6543 which in turn will hit the port in the container running on 5432. -
-v $HOME/docker/volumes/postgres:/var/lib/postgresql/data
-- Map any data created in the container at/var/lib/postgresql/data
to a desired location on local disk:$HOME/docker/volumes/postgres
. Source -
postgres
-- The name of the image in Docker hub we want to create a container from.
-
When I run it I see the following
- When I check our volume location (note: for me this was an empty location prior to invoking the command above) I see this:
- Finally, let's see if we can connect to our database. I am using TablePlus but the same concepts apply with any general purpose database IDE or
psql
(found here). For me the connection looks like this:
- And when we connect we can run the following to get a lay of the land.
-- From a SQL Editor (PSQL / TablePlus Session)
SELECT datname FROM pg_database;
-- RESULTS:
-- postgres
-- template1
-- template0
Create Some Persistent Data
So, we've got a database! What now? Well let's store some data in it!
- Start by creating a new database called
todo_it_well
. You guessed it this, YATA (yet another todo app). This time we are doing it in SQL later we will also show you how to use thecreatedb
tool.
-- From a SQL Editor
CREATE DATABASE todo_it_well;
-- Query 1 OK: CREATE DATABASE
- Switch to our new database. For me, in TablePlus, this just means editing my connection to use
todo_it_well
instead ofpostgres
for the chosen database.
- Next, create some tables. We will, for this demonstration, create a
users
table andtasks
table so that users can track their tasks or "todos". The only almost complex bit of this is declaring a relationship between tasks and users. This post is not about SQL so don't think too hard on these commands for now just know that they make a viable todo application.
-- From a SQL Editor
CREATE TABLE "user" (
id SERIAL PRIMARY KEY NOT NULL,
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
email TEXT NOT NULL,
"password" TEXT NOT NULL,
display_name TEXT
);
CREATE TABLE "task" (
id SERIAL PRIMARY KEY NOT NULL,
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
owner_id INT NOT NULL,
description TEXT NOT NULL,
label TEXT,
completed_on TIMESTAMP,
removed_on TIMESTAMP,
CONSTRAINT fk_owner
FOREIGN KEY(owner_id)
REFERENCES "user"(id)
);
-- Query 1 OK: CREATE TABLE
-- Query 2 OK: CREATE TABLE
- Before we finish this step let's create some real data in here! Let's insert some users first and then some tasks for them!
-- From a SQL Editor
INSERT INTO "user"
("email", "password", "display_name")
VALUES
('matt@email.fun', 'password-gasworld', 'matt-the-magic-monkey'),
('jennie@yahoo.fun', 'password-sassworld', 'dat-girl-jen'),
('timothy24@email.fun', 'pw-vw', 'yo-boi-tom-o-tee');
-- users created with ids: 1, 2 and 3
- Next, the tasks:
-- From a SQL Editor
INSERT INTO "task"
("owner_id", "description", "label")
VALUES
(1, 'read this awesome post', 'tech'),
(1, 'do dishes', NULL),
(2, 'read chapter of lotr', 'lit'),
(2, 'sign up to volunteer', NULL),
(3, 'call brother', NULL),
(3, 'make grocery list', 'health'),
(3, 'rake leaves', 'house');
- Validate that our data looks as expected:
-- From a SQL Editor
SELECT
u.display_name,
t.description
FROM
"user" u
INNER JOIN
"task" t ON u.id = t.owner_id
WHERE
u.id = 2
AND t.completed_on IS NULL;
-- RESULTS:
-- display_name description
-- dat-girl-jen read chapter of lotr
-- dat-girl-jen sign up to volunteer
Create A Backup
So here we are with some valuable customer data that we really do not wanna lose. A good practice for any developer is to take regular (hopefully automated) database backups. Doing this in Postgres isn't too hard so let's get to it. I should state there are several ways to do this. This is just one of many ways to get this job done. I think it is the most straightforward way; though, there are definitely faster ways.
This approach consists of issuing a set of commands to the container that could likely be CRON job-ified without too much effort.
- First let's create a backups directory at a spot that we share with our container's filesystem and our host machine:
# In your terminal
docker exec -t pg-db bash -c 'mkdir /var/lib/postgresql/data/backups'
- This command (run from the host machine) executes a command (
-c
)mkdir
on the container and creates a directory called backups which is located at the spot on the container that is shared with the host machine. You can verify this on the host by running:
# In your terminal
ls $HOME/docker/volumes/postgres | grep backups
# my machine outputs: backups
- Next, let's create our backup
# In your terminal
docker exec -t pg-db bash -c 'pg_dump todo_it_well -U postgres --file=/var/lib/postgresql/data/backups/todo_it_well_db-$(date +%Y-%m-%d).sql'
Let's breakdown this command:
-
docker exec -t pg-db bash -c
- We tell Docker we want to run a command on the
pg-db
container
- We tell Docker we want to run a command on the
-
pg_dump todo_it_well -U postgres
- We invoke the
pg_dump
tool on the container telling it to target thetodo_it_well
db and run commands as thepostgres
user.
- We invoke the
-
--file=/var/lib/postgresql/data/backups/todo_it_well_db-$(date +%Y-%m-%d).sql
- This is the spot on the container we want to place our back up at and as you can see it is the shared backups directory we just made. There is a little
date
logic that interpolates the current date as YEAR-MONTH-DAY.
- This is the spot on the container we want to place our back up at and as you can see it is the shared backups directory we just made. There is a little
If this went as planned we should have made our database backup and we can verify it on our host machine by running:
# In your terminal
ls $HOME/docker/volumes/postgres/backups
# my machine outputs: todo_it_well_db-2021-11-07.sql
cat $HOME/docker/volumes/postgres/backups/todo_it_well_db-2021-11-07.sql
# output includes: -- PostgreSQL database dump
- If you see some SQL in your output then congratulations! You have successfully created a database backup and your data strategy for your awesome app is just the tiniest bit more durable. π
Drop Data And Restore It
Great, so our data has been backed up! This means that you can restore folks back to a particular point in time when things were as they should be. So long as you have an application that can reliably serve this data (ruminates on articles using Goose to accomplish database versioning) you can restore functionality and value to your users.
BUT, backing up your data is only half the battle. Actually restoring it is another matter. Let's practice the act of restoring our data.
First, I should confess that this article commits a sin of simplicity in that I wanted to get going as simply as possible. You more than likely should not place your backups in the exact same spot as the rest of your mounted volume data, but it was useful in getting us going faster.
- So to start, we are going to put our backup temporarily out of this spot so that we can pretend we had mounted two volumes:
# In your terminal
# Let's move our data to our home directory and pretend this is where we safely mounted our backups when starting the container, ok?
cp \
$HOME/docker/volumes/postgres/backups/todo_it_well_db-2021-11-07.sql \
$HOME/todo_it_well_db-2021-11-07.sql
- Next, verify that our data is where we expect it to be:
# In your terminal
ls $HOME | grep todo_it_well_db-2021-11-07.sql
# my machine outputs todo_it_well_db-2021-11-07.sql
- Ok cool, we are pretending the we have a spot outside the normal Postgres files where we keep our data safely backed up. Now, the fun part. OMG! We just got attacked π°. All records were destroyed by the evil hacker! Let's stop our container and delete all of our persistent data.
# In your terminal
# Stop the container forcibly
docker rm -f pg-db
# Delete our persistent data
rm -rf $HOME/docker/volumes/postgres
- Verify the data is indeed gone.
# In your terminal
ls $HOME/docker/volumes/postgres
# my machine outputs: No such file or directory
- Now we have discovered the attack and want to bring our application back online. So we start up our container as before:
# In your terminal
docker run \
--name pg-db \
-e POSTGRES_PASSWORD=shiny_metal_pass \
-d \
-p 6543:5432 \
-v $HOME/docker/volumes/postgres:/var/lib/postgresql/data \
postgres
- However, sadly, our data, including our database, is gone and we cannot connect to the todo_it_well database.
- Fear not, π¦ΈββοΈ we have a backup! Simply copy the backup into the container:
# In your terminal
docker cp $HOME/todo_it_well_db-2021-11-07.sql pg-db:/var
- Next, you will need to recreate the database (so your backup data has a place to restore to):
# In your terminal
docker exec -t pg-db bash -c "createdb \
-h localhost \
-p 5432 \
-U postgres \
todo_it_well"
- Finally, run the
psql
SQL file command and take a deep breath, you are recovering from a very stressful attack π.
docker exec -t pg-db bash -c \
"psql \
-U postgres \
-d todo_it_well \
-f /var/todo_it_well_db-2021-11-07.sql"
- Lastly, let's verify that our data was properly restored by running our query from earlier:
-- From a SQL Editor
SELECT
u.display_name,
t.description
FROM
"user" u
INNER JOIN
"task" t ON u.id = t.owner_id
WHERE
u.id = 2
AND t.completed_on IS NULL;
-- RESULTS
-- display_name description
-- dat-girl-jen read chapter of lotr
-- dat-girl-jen sign up to volunteer
There you have it! You have done a ton! You set up a dockerized Postgres DB, you created your schemas and planted some data in your database, you took a back up of the data and then restored the data from back up after losing it. You rock πΈ.
Closing Thoughts
There are some issues with our set up and certainly there are some ease of use and durability improvements we could make, but this is a fully working solution. Some of the areas we should checkout:
- Custom Dockerfile that includes a backups location.
- A run command that references the previously mentioned backups location mount.
- A cron job run on the host or on the container that creates backups.
- A secure, but accessible spot for these backups to stored. Maybe DigitalOcean Spaces, why not π¦?
Resources
- https://graspingtech.com/docker-compose-postgresql/
- https://simkimsia.com/how-to-restore-database-dumps-for-postgres-in-docker-container/
- https://stackoverflow.com/questions/40632228/input-file-appears-to-be-a-text-format-dump-please-use-psql
- https://www.tutorialspoint.com/postgresql/postgresql_create_database.htm
Posted on November 17, 2021
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.