Building a Jupyter Notebook Environment in Docker for Data Analysis on AWS EC2

zahraajawad

Zahraa Jawad

Posted on September 30, 2024

Outline

  • What is Jupyter Notebook
  • Docker in the AWS environment with the Jupyter Notebook
  • Install Jupyter Notebook using Docker in an AWS environment

What is Jupyter Notebook

Jupyter Notebook is a web-based interactive environment for writing and running code alongside its output and explanatory text. JupyterLab, which Project Jupyter describes as a next-generation notebook interface, is the latest web-based interactive development environment for notebooks, code, and data. Its flexible interface allows users to configure and arrange workflows in data science, scientific computing, computational journalism, and machine learning. A modular design invites extensions to expand and enrich functionality.

Image description

Docker in the AWS environment with the Jupyter Notebook

In this article, we put Docker on AWS to practical use: we build a data analysis environment with Jupyter Notebook inside a Docker container and run it on AWS EC2. This setup offers several benefits, especially for data analysis and data science:

1. Portability and Replication
Docker containers ensure that your work environment is consistent across different systems. You can easily move the container between different machines without worrying about system compatibility.

2. Ease of Setup and Operation
With AWS EC2, you can quickly set up a new instance and launch a Docker container. This reduces the time and effort required to set up a data analysis environment, allowing you to focus on the actual work instead of setting up the infrastructure.

3. Easily Scalable
AWS EC2 provides the flexibility to scale resources as needed. You can increase or decrease the size of the instance based on your analysis requirements, saving operational costs and ensuring optimal performance.

4. Remote Access and Collaboration
Jupyter Notebook provides an interactive web interface that can be reached from any browser with network access to the instance. This facilitates collaboration, as team members can open the same environment and work on the same projects.

5. Integration with Big Data Tools
You can integrate Jupyter Notebook with big data tools like Apache Spark and Hadoop, making it easier to analyze and visualize big data.

6. Data Security
With AWS, you can use advanced security features like Identity and Access Management (IAM), Virtual Private Cloud (VPC), and encryption to help protect your sensitive data.

Install Jupyter Notebook using Docker in an AWS environment

To install Jupyter Notebook using Docker in an AWS environment, follow these steps:

Step 1: "Launch Instance"

After logging in to the AWS account, open the EC2 service from the Services menu or from the search box:

Image description

Click on Launch instance

Image description

Under Name and tags:

Enter a name to identify your instance. For this tutorial, name the instance Jupyter Notebook.

Image description

Under Application and OS Images:

From Quick Start, choose an AMI that meets your needs.
Here we choose Ubuntu, which is free tier eligible.

Image description

Under Instance type:

Choose the instance type. Here we choose t2.micro, which is free tier eligible.

Image description

Under Key pair (login):

Choose an existing key pair

Image description
or create a new key pair:
Give the key pair a name, then click Create key pair:

Image description

Under Network settings, in Firewall (security groups):

Choose Create security group
Allow SSH traffic by selecting its check box

Image description

Leave all other configurations as they are (default settings)

In the Summary panel, review your instance configuration and then choose Launch instance.

Image description

The launch is initiated successfully. To see the instance, click its ID:

Image description

Your instance will first be Pending, and will then go into the Running state.

Image description

Step 2: "Connect to the instance"

To connect to your instance, select the instance and choose Connect.

Image description

There are several ways to connect to an EC2 instance. Here we use an SSH client.
Open the "SSH client" tab, then run the commands it shows in a terminal, following the steps below:

Image description

Open Terminal (here we use Git Bash)

Image description

Use the cd (change directory) command to go to the folder where you downloaded your .pem file (the key pair).

In this article, the .pem file is stored in the Downloads folder:
cd Downloads/

Image description

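Execute the following commands in order:

1. chmod 400 key-pair-name.pem

2. ssh -i /path/key-pair-name.pem instance-user-name@instance-public-dns-name

The first command makes the key file readable only by its owner; ssh rejects private keys that other users can read. For an Ubuntu AMI, the instance user name is ubuntu.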
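The two commands above can be combined into a small script. This is only a sketch: the key file name, the user name ubuntu (the default on Ubuntu AMIs), and the host name are placeholder assumptions, to be replaced with the values shown in the SSH client tab of the EC2 console. The script prints the ssh command rather than running it, so it can be checked first.

```shell
#!/usr/bin/env bash
# Sketch: lock down the key file, then assemble the ssh command.
# All three values are placeholders (assumptions), not values from the article.
KEY_FILE="${KEY_FILE:-./key-pair-name.pem}"
EC2_USER="${EC2_USER:-ubuntu}"   # default user on Ubuntu AMIs
EC2_HOST="${EC2_HOST:-instance-public-dns-name}"

# chmod 400: the owner may read the key; nobody may write or execute it.
if [ -f "$KEY_FILE" ]; then
  chmod 400 "$KEY_FILE"
else
  echo "Key file not found: $KEY_FILE" >&2
fi

# Assemble the connection command and print it for review.
SSH_CMD="ssh -i $KEY_FILE $EC2_USER@$EC2_HOST"
echo "Run this to connect: $SSH_CMD"
```

Each value can be overridden from the environment, for example by setting KEY_FILE before running the script.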

Image description

After the command is executed, you will be prompted to type “yes” to continue with the connection

Image description

And that’s it! Now we’re logged in to our AWS instance.

Image description

Now we get root privileges by executing the sudo -i command

Image description

The "sudo -i" command starts a login shell as the root user. This gives you full administrative (root) privileges, so you can run commands and operations that require them.

We update the package lists and upgrade the installed packages with the command:
sudo apt update && sudo apt upgrade -y

Image description

Image description

Docker installation:
Docker must already be installed on the instance. I used an instance with Docker installed, as explained in this article:

https://dev.to/zahraajawad/docker-basics-with-some-of-its-commands-and-how-to-install-docker-by-aws-7b6

• Create a Dockerfile

Create a Dockerfile that describes how to build the image. This file contains the instructions needed to install and configure the application within the image.
We can create it with the nano editor:
nano Dockerfile

Image description

Write the instructions necessary to install and configure the application:

Image description

Then save and exit the file:
Ctrl+X: exit
Y: confirm saving, then press Enter.
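
The Dockerfile contents are visible only in the screenshot, so the file below is not the author's. It is a minimal sketch of a Dockerfile that should work with the build and run commands in this article. The python:3.11-slim base image, the pip install of the notebook package, and the /work directory are all assumptions. A heredoc writes the file without opening nano.

```shell
# Sketch only: writes an assumed Dockerfile into the current directory.
cat > Dockerfile <<'EOF'
# Assumed base image; any recent official Python image should do.
FROM python:3.11-slim

# Install Jupyter Notebook without keeping pip's download cache.
RUN pip install --no-cache-dir notebook

# Hypothetical working directory for notebooks inside the container.
WORKDIR /work

# Port that the docker run command in this article publishes.
EXPOSE 8888

# Listen on all interfaces so the published port is reachable from outside
# the container; --allow-root is needed because the container runs as root.
CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--port=8888", "--no-browser", "--allow-root"]
EOF

echo "Wrote $(wc -l < Dockerfile) lines to Dockerfile"
```

With this CMD, Jupyter keeps its default token authentication, so the login page asks for a token that is printed in the container logs.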

Image description
Now to build the Jupyter Notebook image we execute the following command:

docker build -t my-jupyter-notebook .

Image description

The image is built successfully.
Image description
To confirm, we list the local images with the command:
docker images

Image description

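Run the container using the following command, where -d runs it in the background and -p 8888:8888 publishes the container's port 8888 on port 8888 of the instance:
docker run -d -p 8888:8888 my-jupyter-notebook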

Image description

Access to Jupyter Notebook
To access Jupyter Notebook, we must open port 8888 in the instance's security group, through the following steps:

  • Go back to the instance and select it by clicking its checkbox, then go to the Security tab

Image description

  • Open the security group by clicking on it

Image description

  • Choose Inbound rules, then Edit inbound rules

Image description

  • Click Add Rule

Image description

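  • Enter the rule for TCP port 8888 (the source used here is 0.0.0.0/0), then save the rules

Image description

Note: In practice, leaving the source as 0.0.0.0/0 is not recommended, because it exposes the port to the entire internet. We use it here only to learn the lab and the building process.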
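
As a more restrictive alternative to 0.0.0.0/0, the same rule can be limited to a single address. The sketch below assembles an AWS CLI call for that. The security group ID and the IP address are placeholders (203.0.113.10 comes from a range reserved for documentation), and it assumes the AWS CLI is installed and configured, which this article does not cover. The command is only printed, so it can be reviewed before running.

```shell
# Sketch: build an ingress rule for port 8888 limited to one address.
# SG_ID and MY_IP are placeholders (assumptions); replace them with the
# security group ID of the instance and your own public IP address.
SG_ID="${SG_ID:-sg-0123456789abcdef0}"
MY_IP="${MY_IP:-203.0.113.10}"

# A /32 CIDR block matches exactly one IPv4 address.
CIDR="$MY_IP/32"

INGRESS_CMD="aws ec2 authorize-security-group-ingress --group-id $SG_ID --protocol tcp --port 8888 --cidr $CIDR"

# Print the command for review instead of executing it.
echo "$INGRESS_CMD"
```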

Now go back to the instance, select it, then open Details and copy the Public IPv4 address

Image description

Paste the public IPv4 address followed by :8888 into the browser (for example, http://public-ipv4-address:8888) and press Enter
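
If the page asks for a token, that is Jupyter's default authentication. Assuming the container was started from the my-jupyter-notebook image with default settings, the token appears in the container logs inside a URL containing token=. The sketch below pulls it out on the instance. The helper name extract_token and the PUBLIC_IP placeholder are hypothetical.

```shell
# Hypothetical helper: read log text on stdin, print the first Jupyter token.
extract_token() {
  grep -o 'token=[0-9a-f]*' | head -n 1 | cut -d= -f2
}

# Placeholder address (assumption); use the instance's Public IPv4 address.
PUBLIC_IP="${PUBLIC_IP:-203.0.113.10}"

# Only query Docker when it is available and a matching container is running.
if command -v docker >/dev/null 2>&1; then
  CONTAINER_ID="$(docker ps -q --filter ancestor=my-jupyter-notebook 2>/dev/null | head -n 1)"
  if [ -n "$CONTAINER_ID" ]; then
    TOKEN="$(docker logs "$CONTAINER_ID" 2>&1 | extract_token)"
    echo "http://$PUBLIC_IP:8888/?token=$TOKEN"
  fi
fi
```

Opening the printed URL in the browser passes the token along, so the login page is skipped.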

Image description

Image description
Jupyter Notebook is now running, and you can work in it
