Spells under the magic of Docker

Hey there, fellow coders! Today we're going to talk about using namespaces and isolation in the Linux kernel to create an environment similar to Docker. We'll show you how to do this using Python and Node.js, and we'll even throw in a few jokes to keep things light!

Before we get started, let's talk about what we mean by namespaces and isolation. In Linux, namespaces allow us to isolate different parts of the system, such as the filesystem, network, and process IDs, so that processes running in different namespaces can't see or interact with each other. This is the same technique that Docker and other containerization tools use to create isolated environments for running applications.
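To make this a little more concrete: on Linux, the namespaces a process belongs to show up as symbolic links under /proc/<pid>/ns. Here is a minimal sketch, assuming a Linux host with /proc mounted, that lists them for the current process. The helper name list_namespaces is made up for illustration.

```python
import os

def list_namespaces(pid="self"):
    """Return {namespace_name: identifier} for a process, or {} if unavailable."""
    ns_dir = f"/proc/{pid}/ns"
    namespaces = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        # Not Linux, /proc not mounted, or no such process
        return namespaces
    for name in names:
        try:
            # Each entry is a symlink whose target looks like "pid:[4026531836]"
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Reading another user's links may need extra privileges
            continue
    return namespaces

if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(f"{name:20} {ident}")
```

Two processes share a namespace when the identifiers match, so running this on the host and again inside a container is a quick way to see which namespaces actually differ.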

Now, let's dive into the code!

First up, we have the Python example:

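import os
import socket
import subprocess

# Run as root on Linux with Python 3.12+ (needed for os.unshare); this directory must already exist
ROOT = '/mnt/container_root'

# Socket pair so the parent and child can take turns during network setup
parent_sock, child_sock = socket.socketpair()

pid = os.fork()
if pid == 0:
    # In the child process
    parent_sock.close()

    # Create new mount and network namespaces; fork() alone does not do this
    os.unshare(os.CLONE_NEWNS | os.CLONE_NEWNET)
    child_sock.send(b'1')  # tell the parent the namespaces now exist

    # Mount a new filesystem for the container, keeping it invisible to the host
    subprocess.run(['mount', '--make-rprivate', '/'], check=True)
    subprocess.run(['mount', '-t', 'tmpfs', 'tmpfs', ROOT], check=True)

    # A fresh tmpfs is empty, so bind-mount the host's system directories into it
    for d in ('bin', 'sbin', 'lib', 'lib64', 'usr'):
        if os.path.isdir('/' + d):
            os.makedirs(f'{ROOT}/{d}')
            subprocess.run(['mount', '--bind', '/' + d, f'{ROOT}/{d}'], check=True)
    os.mkdir(f'{ROOT}/proc')

    # Change the root directory of the process to the container's root
    os.chdir(ROOT)
    os.chroot('.')

    # Set up the container's end of the network once the parent has moved veth1 in here
    child_sock.recv(1)
    subprocess.run(['ip', 'link', 'set', 'lo', 'up'], check=True)
    subprocess.run(['ip', 'addr', 'add', '192.168.1.2/24', 'dev', 'veth1'], check=True)
    subprocess.run(['ip', 'link', 'set', 'veth1', 'up'], check=True)

    # New PID namespace: the next process we start becomes PID 1 inside it
    os.unshare(os.CLONE_NEWPID)

    # Run a command in the container; /proc is mounted from inside the new PID namespace
    result = subprocess.run(['sh', '-c', 'mount -t proc proc /proc && exec bash'])

    # Exit the child process with the shell's status
    os._exit(result.returncode)

# In the parent process: create the veth pair and hand one end to the child
child_sock.close()
parent_sock.recv(1)
subprocess.run(['ip', 'link', 'add', 'veth0', 'type', 'veth', 'peer', 'name', 'veth1'], check=True)
subprocess.run(['ip', 'link', 'set', 'veth1', 'netns', str(pid)], check=True)
subprocess.run(['ip', 'addr', 'add', '192.168.1.1/24', 'dev', 'veth0'], check=True)
subprocess.run(['ip', 'link', 'set', 'veth0', 'up'], check=True)
parent_sock.send(b'1')
os.waitpid(pid, 0)

This Python code spells out each step needed to create a containerized environment. It has to run as root on Linux, needs Python 3.12 or newer for os.unshare(), and expects /mnt/container_root to exist already.

We start by forking a child with os.fork(). The fork by itself only copies the process; the new namespaces come from os.unshare(), which the child calls to leave the host's mount and network namespaces. We then mark the mount tree as private so nothing leaks back to the host, and mount a tmpfs at /mnt/container_root using the subprocess module to run the mount command. Because a fresh tmpfs is empty, we bind-mount the host's system directories into it so that bash, mount, and ip are available inside.

Next, we change the root directory of the process to the container's root using os.chdir() and os.chroot().

We connect the container's network namespace to the host with a virtual ethernet pair. The parent creates the pair with ip link add and moves one end into the child's namespace by PID with ip link set ... netns. Each side then assigns an IP address to its own end and brings it up using ip addr add and ip link set. The socket pair is only there to make the two processes do these steps in the right order.

Finally, a second os.unshare() call creates a PID namespace, and we run a command in the container using subprocess.run(). The shell we start becomes PID 1 in the new namespace, mounts the /proc filesystem on the container's /proc directory, and then replaces itself with the command we want to run (bash). When bash exits, the child process exits with the same status using os._exit().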
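The veth plumbing is the easiest part to get wrong, because the commands have to run in two different places: the pair is created on the host, one end is moved into the container's network namespace by PID, and the container configures its own end. Here is a small sketch that only builds the command lists and executes nothing. The function name and defaults are illustrative, and the addresses and interface names are the ones from the example.

```python
import shlex

def veth_commands(container_pid,
                  host_ip="192.168.1.1/24", container_ip="192.168.1.2/24",
                  host_if="veth0", container_if="veth1"):
    """Return (host_cmds, container_cmds), each a list of argv lists."""
    host_cmds = [
        # Create both ends of the pair in the host's network namespace
        ["ip", "link", "add", host_if, "type", "veth", "peer", "name", container_if],
        # Move one end into the network namespace of the given process
        ["ip", "link", "set", container_if, "netns", str(container_pid)],
        ["ip", "addr", "add", host_ip, "dev", host_if],
        ["ip", "link", "set", host_if, "up"],
    ]
    container_cmds = [
        # These must run inside the container's network namespace
        ["ip", "link", "set", "lo", "up"],
        ["ip", "addr", "add", container_ip, "dev", container_if],
        ["ip", "link", "set", container_if, "up"],
    ]
    return host_cmds, container_cmds

if __name__ == "__main__":
    host, container = veth_commands(container_pid=12345)  # 12345 is a placeholder PID
    print("# on the host:")
    for cmd in host:
        print(shlex.join(cmd))
    print("# inside the container:")
    for cmd in container:
        print(shlex.join(cmd))
```

Printing the plan before running it as root is a cheap way to check that each command lands on the intended side.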

Now, let's move on to the Node.js example:

const { spawnSync } = require('child_process');

// Create new PID and mount namespaces for the container with the unshare command (run as root)
const result = spawnSync('unshare', ['--pid', '--fork', '--mount-proc', 'bash'], { stdio: 'inherit' });

// Pass the shell's exit status back to the caller
process.exit(result.status ?? 1);

This Node.js code uses the child_process module to run the unshare command, which creates the new namespaces and then starts the command we give it (bash). The --pid and --fork flags put bash in its own PID namespace as PID 1, and the --mount-proc flag mounts a fresh /proc filesystem in a new mount namespace so that the container only sees its own processes.
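The same trick works from Python, since all the real work is done by the unshare command. Here is a hedged sketch of a helper that assembles the argument list. The function name and its keyword arguments are made up for illustration, while the flags themselves are standard options of unshare from util-linux.

```python
import shlex

def unshare_argv(command, pid=True, mount_proc=True, net=False, uts=False):
    """Build an argv list for the unshare(1) command."""
    argv = ["unshare"]
    if pid:
        # --fork makes unshare fork first, so the command runs as PID 1 of the new namespace
        argv += ["--pid", "--fork"]
    if mount_proc:
        # Mounts a fresh /proc in a new mount namespace
        argv.append("--mount-proc")
    if net:
        argv.append("--net")
    if uts:
        argv.append("--uts")
    return argv + list(command)

if __name__ == "__main__":
    print(shlex.join(unshare_argv(["bash"])))
```

Passing the resulting list to subprocess.run() as root would start the shell; building the list separately keeps the flag logic easy to inspect.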

And that's it! With just a few lines of code, we've created a containerized environment using namespaces and isolation in the Linux kernel.

I hope this article has helped you understand how to use namespaces and isolation in the Linux kernel to create containerized environments. Remember, you can use these techniques to create isolated environments for running applications, just like Docker and other containerization tools do.

Bonus track: why can't Windows do it?

Let's talk about something that's been bugging us for a while - why can't Windows use namespaces and isolation like Linux for containers? It's like watching a horse trying to dance to techno music - it just doesn't work!

So, what's the deal? Well, it all comes down to how Windows handles processes and resources. You know how Windows processes are all tightly integrated with the operating system? Yeah, that makes it hard to isolate them from each other. It's like trying to separate peanut butter and jelly once they're already mixed in.

On the other hand, Linux processes are designed to be more independent, so it's easier to put them in their own namespaces. It's like having a bunch of roommates who never talk to each other, but they still manage to live together without any problems.

And don't even get me started on how Windows handles system calls. It's like they're speaking a different language than Linux. Linux creates namespaces with the clone and unshare system calls (and joins existing ones with setns), while Windows has no direct equivalent and builds its containers on different kernel objects, such as job objects. It's like trying to teach a dog to meow.
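On the Linux side, the namespaces a new process gets are chosen by a bitmask of CLONE_NEW* flags passed to clone (or unshare). Here is a tiny sketch using the flag values from the Linux headers. The dictionary keys and the function name are just illustrative labels.

```python
# Flag values as defined in the Linux kernel headers (linux/sched.h)
CLONE_FLAGS = {
    "mnt":    0x00020000,  # CLONE_NEWNS
    "cgroup": 0x02000000,  # CLONE_NEWCGROUP
    "uts":    0x04000000,  # CLONE_NEWUTS
    "ipc":    0x08000000,  # CLONE_NEWIPC
    "user":   0x10000000,  # CLONE_NEWUSER
    "pid":    0x20000000,  # CLONE_NEWPID
    "net":    0x40000000,  # CLONE_NEWNET
}

def namespace_flags(*kinds):
    """Combine namespace kinds into the bitmask clone/unshare expects."""
    flags = 0
    for kind in kinds:
        flags |= CLONE_FLAGS[kind]
    return flags

if __name__ == "__main__":
    # Mount + PID + network namespaces
    print(hex(namespace_flags("mnt", "pid", "net")))  # 0x60020000
```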

To make things even more complicated, Windows has a different security model than Linux. So even if you manage to get your containers up and running, it's hard to keep them safe from outside threats. It's like trying to keep your cheese safe from a hungry cat - it's not going to happen.

But hey, let's not be too hard on Windows. They're trying their best, and they've actually made some progress. They now have support for Docker containers, and Windows containers can run on Kubernetes. It's like watching a horse learn how to salsa dance - it's not perfect, but it's still impressive.

Hope you enjoyed this. Thanks for reading!

krlz
