Mastering Docker Multistage Builds
Raunak Gurud
Posted on January 19, 2024
This blog post explores the process of dockerization a TypeScript Express app while taking into consideration best practices. It showcases the usage of multistage Docker files to optimize the Dockerization process.
This is not on basic of docker it will deep dive into optimize way to write Dockerfile to improve performance security speed
let's start by building a simple express app using typescript
mkdir ts-docker
cd ts-docker
# Initialize project
npm init
# Add dependencies
yarn add express
# Add dev dependencies
yarn add -D typescript @types/express @types/node
# Preinstalled tsc globally
tsc --init
# Change tsconfig.json
"rootDir": "./src",
"outDir": "./dist",
# start the server
npm run build
npm run start
src/index.ts
The provided TypeScript code sets up a basic Express.js server. It creates an Express app, defines a health check endpoint (/health), and starts the server on a specified port.
npm run build
Build scripts simply compile the TypeScript code into JavaScript and output it into the "dist" folder. You will now observe a "/dist" folder created.
npm run start
The server is currently running on PORT 8090 in your local system. You can check the health of the system by using the following route: http://localhost:8090/health
If you see this message, it indicates that everything is functioning properly āØāØ.
Now, let's move to the next phase and dockerize it. Kindly close the server by using the command Ctrl + C
.
Dockerization app
create a filename Dockerfile
please ensure that you use the exact filename provided
# Base image
FROM node:18
# Setup workdir
WORKDIR /usr/app
COPY package*.json .
# Install Dependencies
RUN npm install
# Copy Application Code
COPY . .
# Build Application
RUN npm run build
# Run Command
EXPOSE 8090
CMD node dist/index.js
1.Base Image:* Specifies that this image is based on the official Node.js 18 image, which includes Node.js and npm.
FROM node:18
2.Working Directory: Sets the working directory inside the container to /usr/app
.
WORKDIR /usr/app
3.Copy Package Files: Copies the package.json and package-lock.json files into the working directory.
COPY package*.json .
4.Install Dependencies: Runs the npm install
command to install the dependencies specified in the package.json file.
RUN npm install
5.Copy Application Code: Copies the rest of the application code into the working directory.
COPY . .
6.Build Application: Executes the npm run build
command to build the application. This assumes there is a build script specified in the package.json file.
RUN npm run build
7.Expose Port: Informs Docker that the application inside the container will use port 8090.
EXPOSE 8090
8.Run Command: Specifies the default command to run when the container starts. It runs the Node.js application from the dist
directory (assuming the build output is there).
CMD node dist/index.js
To ignore files in your Docker build, you can add them to the .dockerignore file.
touch .dockerignore
Here are the files and folders that should be ignored after the build process:
- /dist
- /node_modules
Make sure you have Docker installed and running on your system.
Now let's build the image
What is a Image ??
Images are typically built from a set of instructions provided in a special file called a Dockerfile. This file defines the base image, sets up the working directory, copies files into the image, installs dependencies, and specifies the commands to be run.
To build an image, use the following command
docker build -t [tag-name] .
In this command, the , -t, flag is used to provide a tag name for the image.
Go to console and type
docker build -t ts-docker/single .
Building an image can be time-consuming if you do not have the base image already installed. Be patient š§āāļø
To execute the build image within a container, utilize the subsequent Docker command
docker run -it -p [external port]:[internal port] --name [container name] [image ID or tag name]
- ( -it) : This option enables an interactive mode for the container and allocates a pseudo-TTY. It allows you to interact with the container's command-line interface.
- (-p) : [external port]:[internal port] : This option maps an external port to an internal port of the container. Replace , [external port], with the desired port on your host machine and , [internal port], with the corresponding port inside the container.,
- (--name) [container name] : This option assigns a specific name to the container. Replace , [container name], with your desired name.
- [image ID or tag name] : This is the ID or tag name of the Docker image you want to use for creating the container.
To run the build image of your application in a container, use the following Docker command:
docker run -it -p 8090:8090 --name single ts/single
This Docker command runs a container named "single" from the "ts/single" image. It allocates a terminal (-it), maps port 8090 on the host to port 8090 in the container (-p 8090:8090), allowing interaction with the containerized TypeScript application.
If you can see this message, congratulations!
You have successfully created a Docker image with a TypeScript Express app. To verify its functionality, please visit the "/health" route of the server.
The issues with the normal container.
The current issue lies in the excessive size of the Image. Despite avoiding the utilization of numerous packages, the Docker image size reaches a staggering 1.14GB. This contradicts the fundamental objective of containerizing the application into a compact file, as it essentially transforms into a virtual machine (VM).
Let's explore the steps to resolve these issues.
1. Using && instead of 2 RUN Commands
When building Docker containers, aim for smaller images. Shared layers and smaller sizes lead to faster transfer and deployment. But how do you manage size when each RUN statement creates a new layer and intermediate artifacts are needed? Many Dockerfile use unconventional techniques to address this issue.
FROM ubuntu
RUN apt-get update && apt-get install vim
Why the &&
? Why not running two RUN
statements like this?
FROM ubuntu
RUN apt-get update
RUN apt-get install vim
Docker layers store the difference between image versions. Like git commits, they are useful for sharing with other repositories or images. When you request an image, you only download the layers you don't already have, making sharing more efficient. However, layers use space and increase the size of the final image. Combining multiple RUN statements on a single line was a common practice.
We have already implemented this point, so there is no need to worry. Now, let us explore other areas for improvement.
2. Multistage docker builds
When creating Docker images for applications, developers often need additional tools and dependencies during the build process that are not required for the final runtime image. Including these unnecessary elements in the final image can result in larger image sizes, which may impact performance and security.
Traditional Docker app.
What is docker multistage?
Docker multistage builds are a feature that allows you to use multiple FROM
statements in a single Dockerfile. This enables you to build and compile your application in one stage and then copy only the necessary artifacts into a smaller and more efficient image for runtime in the final stage. The primary goal is to create smaller Docker images by eliminating unnecessary build dependencies and files.
Using multistage build
This Dockerfile utilizes multistage builds to separate the build and runtime environments effectively. In the first stage, the application is compiled and built, while the second stage focuses on creating a smaller and more efficient image specifically for running the application. By adopting this approach, the final image size is reduced, unnecessary build artifacts are excluded from the runtime environment, and both security and efficiency are enhanced.
Below is a simplified example of a Dockerfile that demonstrates the use of multistage builds for your application
Now we understand that Typescript is essential during the development process only. We can utilize Typescript to build the JavaScript file during the builder stage and then use the generated JavaScript file in the next phase, known as the runner stage.
Stage 1: Build Stage
# Use Node.js 18 as the base image for the builder stage
FROM node:18 as builder
WORKDIR /build
COPY package*.json .
RUN npm install
COPY . .
RUN npm run build
- The first stage involves setting up the build environment, also known as the builder. This builder can be utilized in subsequent stages for further development.
- To generate the corresponding JS code in the builder phase, I copied the
package.json
,package-lock.json
, andsrc
files. - We have executed ,npm run build, which will generate a ,/dist, folder containing all the required JS files.
Stage 2: Runtime Stage
# Use Node.js 18 as the base image for the runtime stage
FROM node:18 as runner
# Set the working directory within the container to /app
WORKDIR /app
# Copy package.json from the build stage to the working directory
COPY --from=builder build/package*.json .
COPY --from=builder build/dist dist/
COPY --from=builder build/.env .
# Install only production dependencies
RUN npm install --only=prod
# Define the command to run the application
CMD [ "npm", "start" ]
- The second stage (
runner
) also uses Node.js version 18 as the base image but is intended for the runtime environment. - It sets the working directory within the container to
/app
. -
package.json
and package-lock.json is copied from thebuild
stage to the working directory to ensure the correct production dependencies are available. - The
dist
, and.env
files are copied from thebuilder
stage to the working directory /app. - The RUN command(--only=prod) is used to install only the production dependencies, effectively reducing the size of the image.
Now let's create the image again using this multistage statergy in your application
Please update your Dockerfile with the code provided below.with code given below
Build The image with new [tag-name]. Here you can give any name as your wish we are gonna use t*s-docker/multistage*
Go to console and type:
docker build -t ts-docker/multistage .
Use the updated image to run the container.
docker run -it -p 8090:8090 --name multistage ts-docker/multistage
If you see this, you have successfully created a Docker image using a multistage build. You can check by going to the /health route of the server.
Now, let's take a look at the size of the image
In this case we have only removed the src/ directory and the devDependency packages containing all the code realated to typescript
The size is reduces but we still have the same problem the image is too big
(1.14 GB) --> (1.1 GM)
Using multistage builds is a highly effective approach to decreasing the final image size and eliminating unnecessary components from the runtime image. By doing so, it contributes to enhancing security and efficiency in your workflow.
What else can we do to reduce the size of image. Let see
3. Using slim version of runtime
By analyzing the code, it is evident that there is no further content that can be eliminated. However, even after removing excess content, the image size remains excessively large. Therefore, based on first principles, it can be deduced that the root issue lies in the initial size of the node base image.
The current image includes Node.js, yarn, npm, bash, and other binaries. It is based on Ubuntu, providing a fully fledged operating system. However, when running a container, only Node.js is necessary. Docker containers should only contain the essential components to run a single process, without needing an operating system. In this case, everything except Node.js could be removed.
By using smaller base images with Alpine
In other words, a Linux distribution that is smaller in size and more secure.
You shouldn't take their words for granted. Let's check if the image is smaller.
You should tweak the Dockerfile
and use node:alpine
Create a image for this Dockerfile config.
Go to console and type:
docker build -t ts-docker/alpine .
Now, let's take a look at the size of the image
Yaayy!!! š„³š„³ as you can see the size of the image is now 147 MB from 1.1 GB
Transitioning from a larger Node.js image in the build stage to a more streamlined Alpine-based image in the runtime stage significantly trims the overall image by almost 1 GB. The Alpine base contains only the essential commands for execution, enhancing efficiency and minimizing the container's footprint, resulting in a more optimized and lightweight deployment.
What have we achieved:
- Smaller Image Size: image is lightweight, resulting in significantly reduced image sizes, promoting faster downloads, and efficient resource utilization.
- Enhanced Security: Alpine is designed with security in mind, offering a minimal and secure environment, reducing the attack surface, and including security-focused components like Musl libc and BusyBox.
- Container Best Practices: Aligns well with containerization best practices, emphasizing simplicity, minimizing unnecessary components, and ensuring cleaner and more efficient Dockerfile with multistage approach
- Faster Builds and Deployment: Smaller images lead to faster build times and quicker deployments, especially beneficial in CI/CD pipelines and scenarios requiring rapid scaling.
In conclusion, whether optimizing Docker images, implementing multistage builds, or choosing a specific Node.js image, the overarching goal is efficiency, security, and adherence to best practices.
š If you had a blast exploring. Stay tuned for the next blog where we'll explore more such topics. Don't forget to share this blog with your tech-savvy pals! Together, we're coding our way to greatness. šš»
Follow me @Raunak Gurud ā¤ļøā”
Posted on January 19, 2024
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.