What is Docker? Creating a container for a Node.js application
Guilherme Dugaich de Oliveira
Posted on July 17, 2022
Context and Motivation
Software is, basically, a set of files that are read, interpreted and executed in some way by the computer. This basic definition leaves room for a question: what about running the same program on different machines? They must share a similar environment, with the same resources needed to run that software.
This is an age-old problem in the computing world. The famous “it works on my machine” meme shows that just because some code runs locally on a developer's machine, there is no guarantee that the same program will run correctly on another machine, or on a server in a production environment.
Before talking about Docker, it's important to talk about the problem it solves and the tools that were used before it. The challenge is to be able to run the same program in different environments, on different machines. Any software has dependencies, which are libraries of code that it needs to function, and it also needs executable binaries to run. In order for your program to run successfully on a given machine, you need to make sure that its dependencies and binaries are installed.
If a developer writes Python code on their machine and pushes it to GitHub, making it public on the internet, and someone else clones that project and tries to run it, will it work? Only if the dependencies are installed and the correct version of Python is available. And what if the project was developed on a Windows computer, and the other person tries to run it on a Linux machine? Some adaptation will also be required.
In the example of just two developers, this doesn't seem to be a big problem, but on larger projects, with hundreds of people working and multiple development, staging, and production environments, this can become a nightmare. This article intends to give an overview of one way to solve this problem, which is with Docker. To be able to follow the example tutorial that will be done below, you need a basic knowledge of Node.js, Linux systems and REST APIs.
Virtual Machines
As a rule, computers have a single operating system, at least that's how they come from the factory. To try to use more than one operating system without having to buy another computer, there are some alternatives. You can install another system on the same machine, sharing the same hardware, and make a dual boot setup, where the user chooses between two systems when starting the machine.
This is a good solution, but it does not allow both systems to run simultaneously. For that, another type of solution emerged: virtualization. A single machine can have its resources (memory, storage, CPU, etc.) divided between virtual machines, which are simulations of other computers. This division of resources is done by a special type of software called a hypervisor. Even with virtualization, the machine still has a default operating system, called the host system (host OS), and the hypervisor is installed on it.
A hypervisor is able to do the following division: allocate 2GB of memory, 100GB of disk storage and 2 CPU cores for a Linux (Ubuntu) system, and 4GB of memory, 200GB of disk storage and 4 CPU cores for a Windows system, all on the same hardware. Obviously, the hardware in question has to have enough resources to run the virtual machines. Virtualized systems, running on top of the hypervisor, are called guest operating systems.
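To make this a little more concrete, this is roughly how such an allocation could be expressed with VirtualBox's VBoxManage command line (just a sketch; VirtualBox is only one of several hypervisors, and the VM name and sizes below are arbitrary examples matching the Ubuntu case above):

$ VBoxManage createvm --name "ubuntu-vm" --ostype Ubuntu_64 --register
$ VBoxManage modifyvm "ubuntu-vm" --memory 2048 --cpus 2
$ VBoxManage createmedium disk --filename ubuntu-vm.vdi --size 102400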
The user can, while using the host OS, open a hypervisor window and use another system, as if it were running natively. This opens up the possibility of running multiple machines simultaneously, as many as the hardware can handle, which is a very powerful utility. However, it is still an expensive option in terms of hardware and processing, as each virtual machine builds its own operating system from scratch.
This is a very basic explanation of virtual machines, but it is enough to understand how this solution, which came along long before Docker, works and why it is still widely used. Virtual machines virtualize the hardware, booting an entirely new operating system from scratch. Docker, on the other hand, virtualizes the operating system.
Docker
According to the official documentation, Docker is an open platform for developing, shipping and running applications. It allows you to separate the application from the infrastructure for faster software delivery. With Docker it is possible to manage the infrastructure in the same way that you manage the code.
For a more practical definition, Docker is an application that you install on your machine, like any other, and it has both a command line interface (CLI) and a graphical interface on the desktop. It allows you to package your applications in isolated environments called containers. The properly configured container has everything needed to run an application, including the previously mentioned binaries and libraries.
Unlike virtual machines, Docker is not virtualizing hardware resources, but simulating an isolated environment to run an application. This concept will become clearer with examples.
The container can be thought of as a microcomputer running on top of the Docker execution engine, and that microcomputer is isolated from the rest of the machine. An application running in the container does not know about the machine's resources, or how it is being used by other applications. Containers are fast and lightweight, allowing for a great software development and deployment experience.
A detail that differentiates containers from virtual machines is the fact that they can be easily shared through their images, which are files that contain all the information about a given container; Docker uses them as a starting point to create a new one. Anyone can send and receive container images and have them running on the Docker engine on their local machines or in cloud environments.
Docker sets out to do three things: build, push, and run images. That is, it can create a container from an image, send that image to other developers, cloud environments and other remote container repositories, and, of course, run those images, as long as Docker is properly installed.
The idea is really a little abstract, but it is important to understand that the container behaves as if it were an isolated machine, like a normal computer, where there is a file system, folders, executable programs and everything else. This concept will be important when explaining Docker commands.
Creating a container for an application
Now, let's build a container for a Node.js application with Express and see in practice how it all works. To keep the focus on Docker, the application will be very simple, a single endpoint that returns a message. Make sure you have Node and the npm package manager installed on the machine. To create the application, start a new directory with a name of your choice and inside it execute the following commands.
$ npm init -y
$ npm install express
The first command creates a Node.js project in the current directory, starting a package.json file. The second installs Express, the framework we use to create the REST endpoint. Then create an index.js file in the project root with the following code:
// Create the Express application
const express = require('express');
const app = express();

// Use the PORT environment variable if set, otherwise default to 3000
const PORT = process.env.PORT || 3000;

// Single GET endpoint that returns a message
app.get('/', (req, res) => {
  res.send('I S2 Containers');
});

app.listen(PORT, () => {
  console.log(`Node app running on port ${PORT}`);
});
Here is our Node.js application! A single GET endpoint that returns the message “I S2 Containers” to the client. To start the server and make the endpoint available, run the command node index.js from the project root. It is now possible to call http://localhost:3000/ directly from the browser or any HTTP client to see the magic happening.
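For example, with the server running, you can hit the endpoint from a terminal (assuming curl is available):

$ node index.js
Node app running on port 3000

# in another terminal
$ curl http://localhost:3000/
I S2 Containers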
Okay, we already have an application, but what if we want another developer to run it on their machine before we deploy it? We would have to upload the application to GitHub, or any other open platform, and the person would have to download the project, install Node, install the dependencies and only then run it. Docker makes this process simpler. To turn the application into a container, we need to have Docker installed locally. If you don't already have it, follow the instructions in the official documentation and install it.
First, we need to create a file called Dockerfile at the root of the project. This is where the instructions for building and running the application will live. It works as a sequence of steps, or commands, that Docker will follow to build and run the image of the application. After creating this file, your project should look something like this:
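If you are following along in plain text, the project structure at this point looks roughly like this:

.
├── node_modules/
├── Dockerfile
├── index.js
├── package-lock.json
└── package.json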
Now, let's write the Dockerfile and check what each command means:
FROM node:17
WORKDIR /app
ENV PORT 3000
COPY package.json /app/package.json
RUN npm install
COPY . /app
CMD ["node", "index.js"]
FROM node:17 - This command tells Docker which base image we are using for our application. Here it is important to mention Docker Hub, which is Docker's remote repository on the internet, where users can download pre-made images. In our example, we are using the image called node, which is the image of a container that already has all the Node.js dependencies we need installed, and we also pass the tag 17, which is the version of Node used. With this command, Docker understands that it will start creating the container from an image that already exists. From here, every command in the file will be run on top of that base image. Every Dockerfile must start with a FROM command.
WORKDIR /app - Defines the main directory of the application inside the container. This is where the subsequent commands will be applied. The container has its own file system, and the /app directory will be at the root of that file system.
ENV PORT 3000 - Sets the PORT environment variable to the value 3000.
COPY package.json /app/package.json - Copies the package.json file to our previously defined working directory.
RUN npm install - Runs the Node dependency installation command. It is worth remembering that this command is executed inside the /app directory, which contains the package.json file.
COPY . /app - Copies the entire contents of the local project directory into our application's directory inside the container.
CMD ["node", "index.js"] - Defines the default command to be executed when the container starts. When we tell Docker to run our image as a container, it will look at this command and understand that, when starting the container, it should run node index.js, which is the command that spins up the HTTP server we built.
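A small aside that is not part of the Dockerfile above: since COPY . /app copies everything in the project folder, including the locally installed node_modules, it is common to also create a .dockerignore file next to the Dockerfile so that only the dependencies installed by RUN npm install inside the container end up in the image. A minimal sketch of its contents:

node_modules
npm-debug.log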
Ok, now that we have our Dockerfile ready, we can create our image.
$ docker build --tag i-love-containers .
With this command, Docker understands that it has to build the image. The --tag option defines a name for the image, i-love-containers, and the period at the end of the command defines the path where the Dockerfile is located, which is the project root.
After executing the command, the logs of everything Docker has done will be shown in the terminal, and you can see that it executes the commands specified in the Dockerfile. Now that we have our image built, just use the docker images command in your terminal to see the images available on the machine. With the image ready, let's run it as a container.
$ docker run -p 5000:3000 -d i-love-containers
The -p 5000:3000 parameter indicates that port 3000 of the container must be mapped to port 5000 of the machine where Docker is running. That is, to access our endpoint on the local machine we use http://localhost:5000/. This is evidence of the container's isolation from the rest of the computer: it needs to be told explicitly which port to map. The -d parameter runs the container in detached mode, which means the process starts in the background.
Now we can run docker ps to see which containers are running. Notice that Docker gave your container a random name, shown in the NAMES column. This command only shows currently running containers; to show all containers, including stopped ones, use docker ps -a.
Calling the endpoint on port 5000, we see that it returns the expected message: our application is running inside the container. It is important to note that the Node installed locally on our machine is not running, only the one inside the container.
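For example, from a terminal on the host machine:

$ curl http://localhost:5000/
I S2 Containers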
You can stop the container with the docker stop <container name> command and, similarly, get it running again with the docker start <container name> command.
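You can also choose the container's name yourself with the --name flag of docker run, which makes stopping and starting it less error-prone (node-api below is just an illustrative name, and the earlier container should be stopped first, since host port 5000 can only be mapped once):

$ docker run -p 5000:3000 -d --name node-api i-love-containers
$ docker stop node-api
$ docker start node-api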
Deploy
We have a few options to make our application available to the world. First, we can upload our image to the aforementioned Docker Hub, a central repository of images on the internet where anyone can download images they have access to. Docker Hub is a very complete tool and has several features. If you're interested in how it works, and how you can easily make your image available there, study the tool's documentation.
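As a rough sketch of what that flow looks like (your-username is a placeholder for a real Docker Hub account):

$ docker login
$ docker tag i-love-containers your-username/i-love-containers
$ docker push your-username/i-love-containers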
With a Docker image, it is possible to deploy the same container on several cloud platforms such as Heroku, AWS, Google Cloud, and others. The subject of deploying containers is quite extensive and deserves a post dedicated just to that. For now, it is interesting to know that all major cloud platforms have container deployment mechanisms, which makes your application very adaptable from one platform to another.
Why Docker?
First, containers are much lighter in terms of memory and processing when compared to a virtual machine that needs to spin up an entire operating system, since containers share the same host OS, used by the Docker engine. To be even more specific, they share the same kernel, unlike virtual machines that each have their own.
For those unfamiliar with the term, the kernel is the brain of an operating system; it is the part of the software that communicates with the hardware. When we talk about a Linux system, we are actually talking about a system that uses the Linux kernel, and there are several operating systems that use it. A system that uses the Linux kernel is commonly called a Linux distribution, like Ubuntu, CentOS, Kali and others. When booting a virtual machine, an entire kernel has to be started from scratch, which is much more cumbersome than simply starting a Docker container, which reuses the host's kernel.
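A quick way to see this sharing in practice, assuming Docker is installed on a Linux host, is to compare the kernel version reported by the host with the one reported inside a container:

# on the host
$ uname -r

# inside a container based on the official ubuntu image; prints the same kernel version as the host
$ docker run --rm ubuntu uname -r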
Here it's worth mentioning a small disadvantage of Docker. Since containers share the host's kernel, it is only possible to run containers based on images of the same kind of operating system as the host. So we can only run Linux-based containers natively on Linux machines, and Windows-based containers on Windows; a container built from a Windows image would not work on Docker running on a Linux kernel, and vice versa.
In practice, this is not such a big problem, since it is possible to run Docker inside WSL 2 on Windows, and there are several other mechanisms to work around it. Besides, one of the biggest use cases for Docker is deploying applications to cloud environments, where Linux is most often used.
Currently, many companies use containers for microservices architectures, where parts of the system are separated into smaller applications with well-defined responsibilities. This makes maintenance, testing and understanding of complex systems easier. We can have a container running Node.js, another running PostgreSQL or another database, another running a front-end application with React, all within the same business logic, but divided into independent containers, each with its own deployment strategies and details.
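Just to illustrate, such a setup could be started with plain docker run commands like these (my-node-api and my-react-app are hypothetical image names; postgres is the official PostgreSQL image and POSTGRES_PASSWORD is the environment variable it uses for the superuser password):

$ docker run -d --name api -p 3000:3000 my-node-api
$ docker run -d --name db -e POSTGRES_PASSWORD=example postgres:14
$ docker run -d --name web -p 8080:80 my-react-app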
I hope this article has been useful for those of you who didn't know Docker, or knew and had some doubts about how it works. Knowing Docker today is a fundamental skill for developers, to increase the power of their applications, making them scalable and easy to deploy.
To give credit where credit is due, this article was inspired by NetworkChuck's YouTube video.