Farley Knight
Posted on March 24, 2020
What is Docker?
Docker implements a software concept called a container. The idea is that when you ship code from development to production, you ship it inside a container.
Because applications ship in containers, developers and/or devops engineers are responsible for creating a custom container for their application. Some have coined the term "dockerize" as a verb, meaning "to create a Docker container for a specific application or service". One of the first tasks when learning Docker is to "dockerize" an application.
Why Use Containers?
In the early days of the web, it was common for developers to have a set of very delicate steps for deploying. These steps might include, in some logical order:
- Creating a ZIP (.zip) or tarball (.tar.gz) file with the source code.
- Sending that file to the host server. If you're behind a corporate firewall, you might have to send it through multiple machines.
- Decompressing the file into a new directory, then symlinking other directories (like log directories and temp file directories) to subdirectories of this new directory.
- Restarting the web server with the new source code.
- Deleting the artifact and cleaning up old versions of the source code.
This entire model of deployments has many potential problems.
- Files must be put in the correct directory for the web server to read them.
- If the deployment process is very manual, then the deployment coordinator must be sure they do not make a mistake during this process.
- This is especially common in the case of sites that are not updated frequently.
- Files must be readable by the web server.
- If files were marked as only readable by the user (per Unix file permissions) then the web server will not be able to access them.
- Managing security around which users will be doing deployments adds further complexity.
- Do certain commands require `sudo`? Who has `sudo` rights on your team?
- Do they require a user be added to a security group?
- What if you created one user account for deployments?
- Then team members would all need to know those plaintext credentials, which is rife with potential security mistakes.
- Old versions of the code must be kept around, in case of the need to roll back.
- If we accumulate old versions of the code, it may start filling up the file system, causing hard disk space issues.
- Log files also grow very quickly on the hard disk.
- Log rotation tools must be implemented to save disk space.
- If processes are known to be unresponsive or crash, then we need process monitoring tools to ensure they continue to be available, or perform necessary restarts.
- Processes with memory leaks or ones that consume many CPU cycles can interfere with other processes.
- This can make services unavailable. It could even crash the host system entirely.
- There might be essential libraries that must be installed on the operating system level for an application to run correctly.
- If you always keep the same physical machine, and the same libraries, you can install these libraries once, and not worry.
- Installing new libraries (and verifying library versions) can be an error-prone process.
- What if you must move to a new physical server? It becomes necessary to install all libraries on the new machine.
Can Docker perfectly solve all of these problems? No.
Can it solve most of them, and make the rest routine? Definitely.
Let's go over some of the benefits of using Docker. Each container can:
- Limit the amount of CPU the application is using.
- Limit the amount of memory the application is using (see the example after this list).
- Limit the networking resources of the application.
- Keep track of its dependencies via a `Dockerfile`, which describes the process to build a specific container.
- Track the health of an application or service via health checks using `docker-compose`.
- Define networking configurations between multiple containers, much like networking between physical machines.
- Use the file system only temporarily.
- Containers are not meant to be permanent, which makes for better reproducibility in application environments.
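To give the first few points some shape, here is a peek at how resource limits look in practice with `docker run` (a command introduced later in this post). The image name `my-app` is only a placeholder:

# Cap the container at one and a half CPUs and 512 MB of memory
$ docker run --cpus="1.5" --memory="512m" my-app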
It is important to emphasize that, by default, Docker containers do not have a permanent file system. This means that when your container is shut down, any files created or modified will be reset on the next container deployment. Be sure you are aware of this when you create your Docker container. If your application needs to store data, it should do so on some external system, or it must attach something called a Docker volume (a brief example follows).
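For example, a named volume survives container shutdowns and redeployments. This is a minimal sketch; `app-data` and `my-app` are placeholder names:

# Create a named volume managed by Docker
$ docker volume create app-data
# Mount it at /data inside the container; files written there persist across deployments
$ docker run --volume app-data:/data my-app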
The value of reproducible environments
Reproducibility is a fundamental tenet of science. In computer science, and its implementation via software engineering, reproducibility can be found in unit and integration tests. Docker brings reproducibility into deployments and devops. There are many benefits to this new paradigm:
- Writing and testing your code in the same environment as you deploy your code means there's less of a chance of production-only bugs.
- All dependencies are tracked via the container image.
- New talent on your team can get up to speed quickly by running and working on a container.
- Docker images can be tracked by version, so you can roll back to previous images when a deployment is botched (see the example below).
- Scaling the number of application instances, databases, load balancers, or job queues up or down on a cloud such as AWS or Google Cloud can be easily automated with tools like `docker-compose` and Kubernetes.
All of this reproducibility is possible because of container images.
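To make the rollback point above concrete, here is a hedged sketch (the image name and version tags are hypothetical): build each release with an explicit version tag, and rolling back is just running the previous tag.

# Tag each release of the image with a version number
$ docker build --tag my-app:1.0.1 .
# If 1.0.1 misbehaves, start a container from the previous, known-good image
$ docker run my-app:1.0.0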
What is a Container Image?
If you are familiar with the concepts behind virtual machines (VM), you may have heard of a VM image. It is a template for creating new virtual machines. There are some similarities, but also important differences.
Container images are made up of layers. Each layer represents a Docker instruction. All layers except the last one are read-only. This allows Docker to reduce the size of images by sharing common layers between running containers. Because each layer is read-only, layers can be shared amongst several containers without the risk of data corruption: you might, for example, deploy several instances of the same image as different containers. Only the last layer is writable, and this layer is usually kept as thin as possible.
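You can see these layers for yourself with `docker history`, which lists one line per layer for any image you have locally (we'll pull `node:10` later in this post, so it makes a good example):

# Show every layer of the node:10 image, newest layer first
$ docker history node:10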
In the next section, we're going to Dockerize a simple Node.js app.
Let's make an app.js to Dockerize
For this tutorial, we will be using Express, since it is the most popular Node.js web framework. For a future project, perhaps we can use Meteor, which is also popular.
To create an Express app, all you need is a single JavaScript file. The official documentation shows a simple "Hello, World" tutorial.
const express = require('express')
const app = express()

// Respond to requests at the root path with a greeting
app.get('/', function (req, res) {
  res.send('Hello World')
})

// Start the server and log a message once it is listening
app.listen(3000, function () {
  console.log('Example app listening on port 3000!')
})
To run this simple web server, we need to put it in a folder of its own. Create one somewhere and save the code above as `app.js`. For this tutorial, I'm creating the directory `the-greatest-node-js-app-ever`. In that folder, we're going to install Express and start the app:
$ cd the-greatest-node-js-app-ever
$ npm install express --save
$ node app.js
Example app listening on port 3000!
NOTE: If you are on a Mac, you might see an alert asking whether to allow incoming network connections. You can click "Allow" in this case.
Now switch to your web browser and go to `http://localhost:3000`. You should see the "Hello World" response.
Adding a package.json
If we want our app to be self-contained and deployable, we should probably keep track of what dependencies we're using. In Node.js, that is handled by a file called `package.json`.
{
"name": "the-greatest-node-js-app-ever",
"version": "1.0.0",
"description": "The Greatest Node.js app ever! On Docker",
"author": "Farley Knight <farleyknight@gmail.com>",
"main": "app.js",
"scripts": {
"start": "node app.js"
},
"dependencies": {
"express": "^4.17.1"
}
}
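A quick aside on the `scripts` block above: with that entry in place, `npm start` becomes a shortcut for `node app.js`, which many hosting setups expect.

# Runs the command defined under scripts.start, i.e. node app.js
$ npm start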
The version number of the `express` package might have changed since the time of this writing. Make sure to use the latest version instead of `^4.17.1`.
After creating `package.json`, we should install all of the necessary packages via `npm install`. This step is important because it will generate `package-lock.json`, which locks the version numbers for all of our package dependencies. Locking the version numbers prevents unintended upgrades or downgrades.
$ npm install
npm notice created a lockfile as package-lock.json. You should commit this file.
npm WARN the-greatest-node-js-app-ever@1.0.0 No repository field.
npm WARN the-greatest-node-js-app-ever@1.0.0 No license field.
added 50 packages from 37 contributors and audited 126 packages in 2.307s
found 0 vulnerabilities
Once we have our `package-lock.json` file, we can create the `Dockerfile`.
Creating a Dockerfile
We're going to use the following content for our `Dockerfile`.
FROM node:10

# Create a directory called `/workdir` and make that the working directory
ENV APP_HOME /workdir
RUN mkdir ${APP_HOME}
WORKDIR ${APP_HOME}

# Copy the dependency manifests first, then install all of the packages
# mentioned in `package.json`
COPY package.json package-lock.json ${APP_HOME}/
RUN npm install

# Copy the rest of the project over
COPY . ${APP_HOME}

# We'll access the app via port 3000
EXPOSE 3000

# Run this command when the container is ready
ENTRYPOINT ["node", "app.js"]
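One optional improvement, not strictly required for this tutorial: a `.dockerignore` file placed next to the `Dockerfile` keeps the locally installed `node_modules` (and other noise) out of the build context, so the image relies only on the `npm install` that runs inside the container. A minimal version might look like this:

node_modules
npm-debug.log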
In a future tutorial, we will go over the details of each of these Dockerfile instructions. For now, we'll continue with the process of building a Docker container.
Building the Docker Container
Docker containers are based on Docker images. You can think of an image like an installation package. It contains all the necessary data to run the container. During the deployment process, a Docker image will be sent to the host machine. The host will then use that image to create the container.
To build the image, make sure you're in the project's directory and run `docker build .`.
$ docker build .
Sending build context to Docker daemon 3.584kB
Step 1/9 : FROM node:10
10: Pulling from library/node
3192219afd04: Extracting [===========================================> ] 39.45MB/45.38MB
...
...
This can take a little while, but you should see a lot of activity from that one single command. At the end of the process, there will be a line saying `Successfully built c132a227961b` (although yours will have a different image ID than mine).
$ docker build .
...
...
Step 9/9 : ENTRYPOINT ["node", "app.js"]
---> Running in a812b758efa8
Removing intermediate container a812b758efa8
---> c132a227961b
Successfully built c132a227961b
By the way, don't forget the `.` at the end; it's necessary. It tells Docker to build the image using the `Dockerfile` in the current directory.
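If your `Dockerfile` ever lives somewhere else or has a different name, the `--file` (or `-f`) flag tells `docker build` which file to use. The path below is just a hypothetical example:

$ docker build --file docker/Dockerfile.prod .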
We can see a list of all our Docker images by running `docker images`.
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
<none> <none> c132a227961b About a minute ago 910MB
The list of images is sorted by newest first, so you should see your image listed here with its image ID (in my case `c132a227961b`). However, under the `REPOSITORY` and `TAG` columns, it only shows `<none>`. It's not critical that those be filled in with values; your container can run just fine without them. But trying to remember an image ID is an error-prone process. Thankfully, Docker gives us the ability to name and tag our images.
Giving Your Image a Name
It is much easier if we give our images human-readable names. Let's rebuild the image, but this time with the `--tag` flag.
$ docker build --tag the-greatest-node-js-app-ever-on-docker .
Sending build context to Docker daemon 2.006MB
Step 1/9 : FROM node:10
Running `docker images` again gives us an image with a name:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
the-greatest-node-js-app-ever-on-docker latest c132a227961b About a minute ago 910MB
Turning an Image into a Container
Now that we have an image, we can tell Docker to run it, which will create our container.
$ docker run --detach --publish 3000:3000 the-greatest-node-js-app-ever-on-docker
03665844b45a03e88a813c815c8d02d72712b27faa2332975778e0a847fad41d
The command `docker run` needs a few command line arguments:
- `--detach` - This flag tells Docker to run the container and immediately detach from the shell. In other words, the Docker container should now run in the background.
- `--publish 3000:3000` - The `publish` flag makes a port available to the outside world. In this case, we're mapping the internal port 3000 to the external port 3000, so we can access our Express app via `http://localhost:3000`. If we wanted, we could have set this to `--publish 80:3000`, which would mean that `http://localhost` would be the link to access our app (see the example after this list).
- Be mindful of the ordering. The syntax `80:3000` means that the outside world will see port 80, but inside the Docker container, we're using port 3000. Lots of Unix commands use the ordering source first, target second, but Docker's port mapping is reversed: target first, source second.
- `the-greatest-node-js-app-ever-on-docker` - The name of the image we want to use should be the last argument.
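As promised in the list above, here is what mapping a different host port looks like. Port 8080 is an arbitrary choice for the host side:

# Host port 8080 now forwards to port 3000 inside the container
$ docker run --detach --publish 8080:3000 the-greatest-node-js-app-ever-on-docker
# The app responds on the host's port 8080
$ curl http://localhost:8080
Hello World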
To verify that everything works correctly, go to your web browser and double-check that `http://localhost:3000` still shows the "Hello World" response.
Now that our container is running, let's discuss how to manage it.
Docker Container Management
Just as we manage processes on a machine with `ps aux` (where `ps` is short for processes), Docker has a similar command: `docker ps`. This is what mine looks like while writing this tutorial:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
db032070bda8 the-greatest-node-js-app-ever-on-docker "docker-entrypoint.s…" 6 seconds ago Up 5 seconds 0.0.0.0:3000->3000/tcp jovial_carson
Like most processes, this Docker container is running in the background. To gracefully shut down or stop this container, we can run `docker stop <CONTAINER-ID>`. In our case, the container ID is `db032070bda8`.
$ docker stop db032070bda8
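Under the hood, `docker stop` sends the container's main process a SIGTERM and, if it hasn't exited after a grace period (10 seconds by default), follows up with a SIGKILL. The `--time` flag adjusts that grace period:

# Give the container up to 30 seconds to shut down cleanly before it is killed
$ docker stop --time 30 db032070bda8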
Also like most processes, containers can sometimes become unresponsive during a graceful shutdown and need to be forced to shut down. For ordinary processes, we would use a `kill -9` command. For Docker, the command is `docker kill <CONTAINER-ID>`.
$ docker kill db032070bda8
Interacting with Your Container
A Docker container is meant to act as an isolated environment, almost like a separate host machine. This means you can "log in" and run a Bash shell inside your container. Once you're inside the container, you can look around and verify your application is working properly. The command for this is `docker exec -it <CONTAINER-ID> /bin/bash`. The flag `-i` stands for interactive, and the flag `-t` is used to allocate a TTY session, much like an SSH session.
$ docker exec -it db032070bda8 /bin/bash
root@db032070bda8:/workdir# pwd
/workdir
root@db032070bda8:/workdir# ls
Dockerfile app.js node_modules package-lock.json package.json
root@db032070bda8:/workdir#
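Another handy way to check on a running container, without opening a shell inside it, is `docker logs`, which prints everything the container has written to stdout and stderr. Since our app prints a message once it starts listening, that message shows up here:

$ docker logs db032070bda8
Example app listening on port 3000!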
Removing Your Stopped Container and Image
Docker management means creating and maintaining a collection of containers and images, and running them as needed. It also includes removing them when they are no longer needed. Just as the `rm <FILE-PATH>` command deletes a file in most Unix-like environments, Docker has analogous commands for containers and images.
Steps to delete old containers and images:
- First, run the command `docker rm <CONTAINER-ID>` to delete the container.
- Finally, run the command `docker rmi <IMAGE-ID>` to delete the image.
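If you've already stopped the container and can't remember its ID, note that plain `docker ps` only shows running containers; the `--all` flag lists stopped ones too:

# List all containers, including ones that have exited
$ docker ps --all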
Note that even when you stop a container, it is still being managed by Docker. Since containers rely on images, you must remove the stopped container first, then you may remove the image. If you do not run those two steps in order, you'll get an error message like this:
$ docker rmi c132a227961b
Error response from daemon: conflict: unable to delete c132a227961b (must be forced) - image is being used by stopped container db032070bda8
If you run the commands in the correct order, it should look something like this:
$ docker rm db032070bda8
db032070bda8
$ docker rmi c132a227961b
Untagged: the-greatest-node-js-app-ever-on-docker:latest
Deleted: sha256:c132a227961bf42ac0664e7ab470931ae440661a4eae98b286016cd5a20c3c46
Deleted: sha256:ca7c95922974a846620e0ce42fbc65b585b58457ca30a9910687d2a701f598fa
Deleted: sha256:3e2c92e96f06d4282152faf9f81c9fb5bd138f57786112775afed57ba12a1f1b
Deleted: sha256:ac7b17970c321c61a620b284f81825e2867b7477a552a485ce2226ac2b06004d
Deleted: sha256:9ca2186b2dfe59cc5eed7b6ff743da708d35d5c14445d49048cf8924d6017767
Deleted: sha256:ed667d696e50cb479043af9725dbd5f40e300e923192c4e337f40ce95a1dfa1a
Deleted: sha256:9f49958e02bd156c2ba0a0cef23736dfcab645a4f40f6590a48df9674c723c0a
Deleted: sha256:bf5333fd26a86ab238b781f2012e0c47d09b978ae39372e2fb441adce07e1c05
Conclusion
In this post, we've covered the basics of Docker: what containers and images are, how images produce containers, and why they are so useful in the world of software development. We also walked through dockerizing a very simple Node.js application. In future posts, I hope to discuss the Dockerfile in more detail, as well as Docker volumes and Docker networking.