Foundation Technologies of Containerization
Arief Warazuhudien
Posted on March 6, 2023
Containerization is a powerful method for deploying and managing applications in a cloud environment. One of its key advantages is portability: a container bundles an application with its dependencies, so it can be moved between environments without any changes to the application code. This is made possible by operating system-level virtualization, which isolates processes from one another while they share the same kernel.
The two main technologies that support containerization are cgroups and namespaces. Cgroups enable the control and management of system resources, such as CPU, memory, and network bandwidth, by allowing the allocation of resources to specific processes or groups of processes. This ensures that each container has access to the resources it needs to operate effectively, without interfering with other containers on the same system.
Namespaces, on the other hand, provide process-level isolation by creating a virtualized environment for each container, which is separate from the host system and other containers. Namespaces allow containers to have their own network interfaces, file systems, and process trees, while sharing the same kernel as the host system.
Together, cgroups and namespaces provide the foundation for containerization, by enabling the creation of isolated and portable containers that can be easily deployed and managed at scale. The use of containerization has become increasingly popular in recent years, due to its many benefits, such as improved scalability, efficiency, and reliability. As the technology continues to evolve and improve, containerization is expected to become even more prevalent in the cloud computing landscape.
The background
Cgroups, short for control groups, grew out of work begun at Google in 2006 by Paul Menage, Rohit Seth, and several other contributors, originally under the name "process containers". The feature was renamed control groups in 2007 and merged into the mainline Linux kernel in version 2.6.24, released in January 2008. Cgroups were designed to enable the control and management of system resources, such as CPU, memory, and network bandwidth, by allowing the allocation of resources to specific processes or groups of processes. Cgroups provide a hierarchical structure for resource control, with each group having a set of resource limits that can be adjusted dynamically based on the needs of the system.
The original motivation behind cgroups was to improve the resource management capabilities of the Linux kernel, particularly in multi-tenant environments such as hosting providers and cloud computing platforms. By allowing the allocation of resources to specific processes or groups of processes, cgroups enable better resource utilization and more efficient sharing of resources across different applications and users.
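Namespaces, on the other hand, entered the Linux kernel in stages rather than all at once. The mount namespace came first, in Linux 2.4.19 (2002); UTS and IPC namespaces followed in 2.6.19 (2006), PID namespaces in 2.6.24 (2008), network namespaces between 2.6.24 and 2.6.29, and user namespaces were completed in 3.8 (2013). Eric W. Biederman was one of the principal contributors to this work. Namespaces were designed as a lightweight alternative to virtual machines, providing process-level isolation while sharing the same kernel. Namespaces allow containers to have their own network interfaces, file systems, and process trees, while sharing the same kernel as the host system.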
The original motivation behind namespaces was to enable the creation of isolated environments for different applications or users, without the need for full-blown virtual machines. Namespaces were designed to provide a more lightweight and efficient alternative to virtualization, with faster startup times and lower resource usage.
Combined, the two mechanisms cover both halves of what a container needs: cgroups bound how much of the machine a container may use, and namespaces bound what it can see. This is what makes it possible to create isolated and portable containers that can be deployed and managed at scale.
Example: using these technologies directly
Let's say we want to create a container that runs a web server, using cgroups to limit the amount of CPU and memory resources it can use, and namespaces to isolate it from the host system and other containers.
First, we would create a cgroup hierarchy for the container, which can be done using the cgcreate command. This would allow us to specify the CPU and memory limits for the container, and ensure that it has access to only the resources it needs.
Next, we would create the namespaces for the container using the unshare command, which starts a program in new namespaces selected by flags: --net for a network namespace, --mount for a mount (file system) namespace, and --pid for a process ID namespace. The result is an environment that is separate from the host system and other containers. Named network namespaces can also be created ahead of time with ip netns add.
Finally, we would start the web server inside that cgroup and those namespaces, for example by wrapping the unshare call in cgexec. This is, in essence, what the docker run command automates: it launches the container with the requested cgroup and namespace settings, ensuring that it has access to only the resources it needs and is isolated from the host system and other containers.
#!/bin/bash
# Requires root, the libcgroup tools (cgcreate, cgset, cgexec, cgdelete)
# and a cgroup v1 cpu controller. ./myapp is a placeholder for the application.
set -euo pipefail
# Create a new cgroup named "myapp" under the cpu controller
cgcreate -g cpu:myapp
# Delete the cgroup when the script exits, even if a later step fails
trap 'cgdelete -g cpu:myapp' EXIT
# Set the CPU shares for the cgroup (relative weight; the default is 1024)
cgset -r cpu.shares=512 myapp
# Start the application inside the cgroup and in new PID and mount namespaces.
# --mount-proc mounts a fresh /proc that matches the new PID namespace.
cgexec -g cpu:myapp unshare --fork --pid --mount --mount-proc ./myapp
This Bash script creates a new cgroup named "myapp" and sets its CPU shares, a relative weight that determines how much CPU time the group receives when the CPU is contended. It then uses cgexec to place the application in that cgroup and unshare to start it in new PID and mount namespaces, with a /proc that reflects the new PID namespace. When the application exits, the trap deletes the cgroup. The script must run as root and assumes the libcgroup tools and a cgroup v1 cpu controller; ./myapp stands in for the real application.
Cgroups
Cgroups, short for control groups, are a feature of the Linux kernel that enable the control and management of system resources, such as CPU, memory, and network bandwidth, by allowing the allocation of resources to specific processes or groups of processes.
At a high level, cgroups work by creating a hierarchical structure of resource limits, with each cgroup having a set of resource limits that can be adjusted dynamically based on the needs of the system. Cgroups are implemented as a filesystem in the Linux kernel, with each cgroup represented by a directory in the filesystem. The cgroup filesystem can be mounted at /sys/fs/cgroup, and cgroups can be created and managed using standard file system commands like mkdir, rmdir, and chown.
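As a rough sketch of the file-based interface described above, the helpers below create a cgroup with mkdir, move a process into it by writing its PID to cgroup.procs, and remove the cgroup with rmdir. The function names are illustrative assumptions, not part of any tool. On a real system the parent directory would sit under /sys/fs/cgroup and the calls would need root.

```shell
#!/bin/bash
# Hypothetical helpers illustrating the cgroup file interface.
# On a real system "parent" is a directory under /sys/fs/cgroup (root required).

# Create a child cgroup. On a real cgroup mount the kernel fills the
# new directory with control files automatically.
cgroup_create() {
    local parent="$1"
    local name="$2"
    mkdir "$parent/$name"
}

# Move a process into a cgroup by writing its PID to cgroup.procs.
cgroup_add_pid() {
    local cgroup_dir="$1"
    local pid="$2"
    echo "$pid" > "$cgroup_dir/cgroup.procs"
}

# Remove a cgroup. The kernel refuses while the cgroup still
# contains processes or child cgroups.
cgroup_remove() {
    rmdir "$1"
}
```

A typical sequence would be cgroup_create /sys/fs/cgroup/cpu demo, then cgroup_add_pid /sys/fs/cgroup/cpu/demo "$$", where the mount point shown assumes a cgroup v1 layout.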
Cgroups work by associating a set of resource limits with each cgroup. For example, a cgroup may have a CPU limit of 50%, which means that processes running in that cgroup are allowed to use up to 50% of the available CPU resources. Cgroups can also be used to limit other system resources, such as memory, network bandwidth, and I/O operations.
A new process starts in the same cgroup as its parent, and it can be moved to another cgroup by writing its PID to that cgroup's cgroup.procs file. User-space tools can automate this placement, for example by applying rules based on user ID or group ID. The limits are enforced by the kernel subsystem responsible for each resource. The scheduler assigns CPU time according to each cgroup's shares and quota, and throttles a cgroup that has used up its CPU quota for the current period. The memory controller reclaims memory from a cgroup that reaches its limit and, if that is not enough, invokes the OOM killer on one of its processes.
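To see which cgroups a process currently belongs to, the kernel exposes /proc/[pid]/cgroup, with one line per hierarchy in the form hierarchy-ID:controller-list:cgroup-path. The following is a minimal sketch of a reader for that file; the function name cgroup_membership and the label "unified" for the cgroup v2 entry are assumptions made for this example.

```shell
#!/bin/bash
# Hypothetical helper: print "controllers path" for each hierarchy
# listed in a /proc/[pid]/cgroup file (default: the current process).
cgroup_membership() {
    local file="${1:-/proc/self/cgroup}"
    local id controllers path
    while IFS=: read -r id controllers path; do
        # An empty controller list marks the unified cgroup v2 hierarchy.
        echo "${controllers:-unified} $path"
    done < "$file"
}
```

Calling cgroup_membership "/proc/$$/cgroup" in a shell, and again for a child started from that shell, should print the same paths, since a child starts in its parent's cgroup.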
Cgroups provide a powerful tool for managing system resources in a Linux environment, and are used extensively in containerization technologies like Docker to provide fine-grained resource control for containers. By using cgroups, it is possible to allocate resources to specific processes or groups of processes, and ensure that system resources are used efficiently and fairly.
The cgroup file system is implemented using the Virtual File System (VFS) layer in the Linux kernel, which provides a unified interface for file systems. Each cgroup is represented by a directory in the cgroup file system, with a set of control files that allow the management of cgroup attributes and resource limits.
The cgroup code in the Linux kernel is responsible for managing cgroup creation and deletion, as well as enforcing resource limits and managing the cgroup hierarchy. The cgroup code is implemented using kernel data structures like task_struct, which represents a process, and cgroup_subsys_state, which represents the state that one subsystem keeps for one cgroup.
The cgroup code in the Linux kernel is organized into subsystems, each of which is responsible for managing a specific set of resources, such as CPU, memory, or network bandwidth. Each subsystem provides a set of control files that allow the management of cgroup attributes and resource limits for that subsystem.
For example, in cgroup v1 the CPU subsystem provides control files like cpu.cfs_period_us, which sets the length of the scheduling period in microseconds, and cpu.cfs_quota_us, which sets the amount of CPU time that a cgroup is allowed to consume during each period. The memory subsystem provides control files like memory.limit_in_bytes, which sets the maximum amount of memory that a cgroup is allowed to consume. In cgroup v2 the equivalent files are cpu.max and memory.max.
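The relationship between the two CPU files is simple arithmetic: the quota is the fraction of each period the cgroup may run, so a 50% limit with a 100000 microsecond period gives a quota of 50000. The sketch below shows that calculation and how the values would be written. The function names are assumptions for illustration, and the cgroup directory is passed in as an argument rather than hard-coded.

```shell
#!/bin/bash
# Hypothetical helpers for the cgroup v1 CFS bandwidth files.

# Convert a CPU percentage into a quota in microseconds.
# Values above 100 allow more than one CPU (200 = two full CPUs).
cpu_quota_us() {
    local percent="$1"
    local period_us="${2:-100000}"   # 100 ms, the kernel default period
    echo $(( percent * period_us / 100 ))
}

# Write the period and the matching quota into a cpu controller directory.
set_cpu_limit() {
    local cgroup_dir="$1"
    local percent="$2"
    local period_us="${3:-100000}"
    echo "$period_us" > "$cgroup_dir/cpu.cfs_period_us"
    cpu_quota_us "$percent" "$period_us" > "$cgroup_dir/cpu.cfs_quota_us"
}
```

For instance, set_cpu_limit /sys/fs/cgroup/cpu/myapp 50 would cap the group at half of one CPU, assuming a cgroup v1 mount at that path and root privileges. Under cgroup v2 the same pair of numbers is written as "quota period" to the single file cpu.max.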
The cgroup code in the Linux kernel is highly configurable and can be customized to meet the needs of different environments and use cases. Containerization technologies like Docker rely heavily on cgroups for resource management and provide a user-friendly interface for managing cgroups and their associated attributes and resource limits.
Namespaces
Namespaces are a feature of the Linux kernel that provide process-level isolation by creating separate environments, or namespaces, for system resources like process IDs, network interfaces, and file systems. Each namespace provides a separate view of the resource it manages, isolating processes within the namespace from processes in other namespaces.
At a high level, namespaces work by creating a new namespace and associating it with a set of processes. Once a process is associated with a namespace, it can only access the resources within that namespace, and not those in other namespaces. Namespaces are implemented as a set of kernel data structures, which are used to keep track of the state of each namespace and enforce isolation between namespaces.
For example, the PID namespace provides process-level isolation by creating a separate view of the process ID space for each namespace. When a new namespace is created, a new PID table is created for the namespace, and all processes created within the namespace are given process IDs that are unique within the namespace. This allows processes within the namespace to refer to other processes by their local PID, without conflicting with process IDs in other namespaces.
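One way to observe these per-namespace PIDs is the NSpid field in /proc/[pid]/status (available since Linux 4.1), which lists the PID a process has in each nested PID namespace, starting with the outermost one visible from that /proc mount. The helper below is a small sketch that extracts the field; the name nspid_chain is an assumption for this example.

```shell
#!/bin/bash
# Hypothetical helper: print the PIDs of a process in each nested
# PID namespace, outermost first, separated by spaces.
nspid_chain() {
    local status_file="${1:-/proc/self/status}"
    awk '$1 == "NSpid:" {
        for (i = 2; i <= NF; i++)
            printf "%s%s", $i, (i < NF ? " " : "\n")
    }' "$status_file"
}
```

For a process running in a container, nspid_chain /proc/PID/status on the host would typically print two numbers: the host-side PID followed by the PID the process sees inside the container.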
Similarly, the network namespace provides network-level isolation by creating a separate view of the network interfaces and routing tables for each namespace. When a new namespace is created, a new set of network interfaces and routing tables is created for the namespace, and processes within the namespace can only access the network resources within the namespace.
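The ip tool from iproute2 can be used to try this out. The sketch below walks through creating a named network namespace, listing the interfaces visible inside it, and deleting it again. Because the real commands need root, the example prints the commands by default and only executes them when DRY_RUN=0 is set; the run wrapper and the function name netns_demo are assumptions made for this illustration.

```shell
#!/bin/bash
# Print commands instead of running them unless DRY_RUN=0.
# Running for real requires root and the iproute2 "ip" tool.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical walkthrough of a network namespace's life cycle.
netns_demo() {
    local ns="$1"
    run ip netns add "$ns"                 # create an empty network namespace
    run ip netns exec "$ns" ip link show   # list interfaces visible inside it
    run ip netns delete "$ns"              # remove the namespace again
}
```

Run for real, the middle command would be expected to show only a loopback device, since a new network namespace starts without any of the host's interfaces.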
Namespaces provide a powerful tool for managing system resources and isolating processes within a Linux environment. Containerization technologies like Docker rely heavily on namespaces to provide process-level isolation for containers, allowing multiple containers to run on the same system without interfering with each other. By using namespaces, it is possible to create separate environments for different processes and applications, and enforce strict isolation between them to improve security and reliability.
Namespaces are implemented in the Linux kernel using a combination of kernel data structures and system calls. Here are some of the key components in the Linux kernel that are used to implement namespaces:
- Namespace identifiers: The process descriptor (task_struct) of each process holds a pointer to a data structure called "nsproxy", which in turn points to most of the namespaces the process belongs to. From user space, each namespace is identified by an inode number, visible as a symbolic link under /proc/[pid]/ns/.
- Namespace objects: Each namespace has an associated kernel data structure that represents the state of the namespace. For example, the PID namespace has a "pid_namespace" data structure that stores the mapping between process IDs and processes within the namespace.
- Namespace operations: Namespaces are created and joined through a small set of system calls. The clone() system call can create a new process in new namespaces, unshare() creates new namespaces for the calling process, and setns() moves the calling process into an existing namespace. There is no call to delete a namespace: the kernel destroys it once no process or other reference keeps it alive.
- Namespace file systems: Some pseudo-file systems present a view that depends on the namespace. For example, the /proc file system shows only the processes of the PID namespace in which it was mounted, which is why a new PID namespace usually gets its own /proc mount, and the /sys file system provides a view of the system hardware.
- Namespace hierarchy: PID and user namespaces are organized into a hierarchy, with each namespace having a parent namespace. This allows nesting: a process is visible in its own PID namespace and in every ancestor namespace, under a different PID in each. Other namespace types, such as network and mount namespaces, are not nested in this way.
These components work together to provide process-level isolation and resource management in the Linux kernel. By creating separate namespaces for different resources like process IDs and network interfaces, the Linux kernel is able to provide a high level of isolation between processes and applications, improving security and reliability in complex environments.
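The namespace membership of a process can be inspected from user space through the symbolic links in /proc/[pid]/ns/, whose targets look like pid:[4026531836]. Two processes are in the same namespace when the targets match. The helpers below are a minimal sketch of that comparison; the function names and the PROC_ROOT variable (which defaults to /proc) are assumptions for this example.

```shell
#!/bin/bash
# Location of the proc file system; overridable for experimentation.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Print the identifier of one namespace of a process,
# for example "pid:[4026531836]".
ns_id() {
    local pid="$1"
    local kind="$2"   # one of: pid, net, mnt, uts, ipc, user, cgroup
    readlink "$PROC_ROOT/$pid/ns/$kind"
}

# Succeed if two processes share the given namespace type.
# Returns 2 if either link cannot be read (for example, no permission).
same_namespace() {
    local kind="$1"
    local id_a id_b
    id_a="$(ns_id "$2" "$kind")" || return 2
    id_b="$(ns_id "$3" "$kind")" || return 2
    [ "$id_a" = "$id_b" ]
}
```

For example, same_namespace net 1 "$$" would report whether the current shell shares a network namespace with PID 1. Reading another user's links generally requires root.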
Conclusion
In conclusion, cgroups and namespaces are two powerful features of the Linux kernel that have enabled containerization to become the standard for modern application deployment. Cgroups allow for fine-grained control over resource allocation and management, making it possible to limit the amount of CPU, memory, and other resources used by a process or group of processes. Namespaces provide process-level isolation, allowing multiple applications to run on the same system without interfering with each other.
Together, cgroups and namespaces have revolutionized the way that applications are developed, deployed, and managed, providing a powerful toolset for managing system resources and isolating processes within a Linux environment. Containerization technologies like Docker rely heavily on cgroups and namespaces to provide process-level isolation for containers, allowing multiple containers to run on the same system without interfering with each other. As containerization continues to grow in popularity, it is clear that cgroups and namespaces will continue to play a critical role in the future of application deployment and management.