Getting Started with Systems Programming with Rust (Part 1)
Beka Modebadze
Posted on August 13, 2021
You can find the original post on my personal blog.
A modern computer is a very complex creation that evolved into the current state through decades of research and development. Sometimes it appears to be like black magic. There’s no magic in it, just science. However, some of the minds like Alan Turing, Charles Babbage, Ada Lovelace, John von Neumann, and many others are magical, as they made this possible.
Ok, that’s enough of introductions and let us dive into the fundamentals of systems programming. In this part we’ll learn:
- What is a process?
- How are processes created and executed?
- Some code examples in Rust, compared to C.
Before diving into code we’ll start to build up from the lowest level of the main components of the operating systems. As shown in Figure 1-a the lowest level of any computer is Hardware, next comes the Kernel mode which runs on bare metal. This is where the operating system, like Linux, is located.
Figure 1-a.
On top of kernel mode we have user mode. For a user to interact with the kernel and use higher-level software, like a web browser or an e-mail reader, a user interface program is required. This can be a window-based Graphical User Interface, or it can be a shell, which is a command interpreter that reads commands from a terminal and executes them.
Processes: Parent and Child
The main concept in all operating systems is a process. A process is basically a running program. You can think of it as a drawer that contains all the information about that particular program. Some processes start when the computer starts, some run in the background, and some are started and interacted with by the user, through the shell, for example. Every process has an ID. The very first process is started when the system boots. This process has an ID of 1 and is called init. After that, init starts other processes, and so on. When we type a command in a shell, for example to compile a program, the system creates a new process that runs the compiler. When that process has finished compiling, it makes a system call to terminate itself.
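To make process IDs concrete, here is a minimal sketch of a Rust program printing its own ID and the ID of the process that started it. It assumes a Unix-like system, since `parent_id` lives in the Unix-specific part of the standard library.

```rust
use std::os::unix::process::parent_id;
use std::process;

fn main() {
    // ID of this running program.
    println!("my PID: {}", process::id());
    // ID of the process that started us, for example the shell.
    println!("parent PID: {}", parent_id());
}
```

Running it twice from the same shell should print a different PID each time but the same parent PID, because each run is a new process started by the same shell.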
In UNIX systems every new process is a child process of some parent process. Process creation is done by cloning a parent process, which is referred to as forking (Figure 1-b). Each process has one parent but can have multiple child processes. The structure of the processes resembles a tree, where init is the root, meaning it’s at the top of the hierarchy.
Right after creation, the parent and the child processes are identical copies, except for the value returned by the fork call: the parent receives the child’s process ID, and the child receives 0. Next, the system substitutes the child process’s execution with a new program. When the process is done fulfilling its purpose, it terminates and exits normally (voluntary). A process can also exit due to an error or be killed by another process (involuntary).
Figure 1-b.
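The parent and child relationship described above can be checked from Rust. The sketch below assumes a POSIX `sh` is available on the PATH; the helper name `parent_and_reported_ppid` is made up for this illustration. It starts a shell as a child and asks it to print its parent's ID, which should be our own.

```rust
use std::process::{self, Command};

/// Spawns a shell child that prints its parent's PID and returns
/// (our own PID, the parent PID the child reported).
fn parent_and_reported_ppid() -> (u32, u32) {
    let output = Command::new("sh")
        .arg("-c")
        .arg("echo $PPID") // $PPID is expanded by the child shell, not by Rust
        .output()
        .expect("failed to run sh");
    let reported: u32 = String::from_utf8_lossy(&output.stdout)
        .trim()
        .parse()
        .expect("child did not print a number");
    (process::id(), reported)
}

fn main() {
    let (me, reported) = parent_and_reported_ppid();
    println!("my PID: {}, parent PID seen by the child: {}", me, reported);
}
```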
The system also keeps track of all the processes, maintaining their data in what’s called a process table. It holds information like the process ID, process owner, process priority, environment variables, and the parent process of each process. In addition to that, it also records which state a particular process is in. Each process can be in one of the following four states:
- RUNNABLE: The process is running on the CPU or is ready to run as soon as the scheduler picks it.
- SLEEPING: The process is waiting for an event, such as input, a timer, or another process finishing, and does not use the CPU until then.
- STOPPED: The process has been suspended. It can be resumed by a signal.
- ZOMBIE: The process has terminated, either by calling ‘system exit’ or by being killed, but its entry has not yet been removed from the process table because the parent has not collected its exit status.
Processes often have to interact with each other, and in doing so they change state, for example going from running to sleeping and back to running (Figure 1-c). A process can also be moved to the stopped state by a signal, usually SIGTSTP, the terminal’s variant of SIGSTOP, which is sent when you press Ctrl + Z (we’ll review signals in depth in upcoming parts). A stopped process can be resumed later. The exception is the zombie state: a process that has terminated can’t be restarted or continued.
Figure 1-c.
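The ZOMBIE state from the list above can be observed directly. The following is a Linux-only sketch: it assumes the `/proc` filesystem is mounted and that a `true` command exists on the PATH, and the helper names are made up for this illustration. A child that has exited stays a zombie until the parent calls `wait()`.

```rust
use std::fs;
use std::process::Command;
use std::thread::sleep;
use std::time::Duration;

/// Reads the one-letter state of process `pid` from /proc/<pid>/stat.
/// The file looks like "1234 (name) S ...", so the state follows the last ')'.
fn process_state(pid: u32) -> Option<char> {
    let stat = fs::read_to_string(format!("/proc/{}/stat", pid)).ok()?;
    let after_name = &stat[stat.rfind(')')? + 1..];
    after_name.trim_start().chars().next()
}

/// Spawns `true`, which exits immediately, and polls for up to about one
/// second until the kernel reports the child as a zombie ('Z').
fn observe_zombie() -> Option<char> {
    let mut child = Command::new("true").spawn().ok()?;
    let mut state = None;
    for _ in 0..100 {
        state = process_state(child.id());
        if state == Some('Z') {
            break;
        }
        sleep(Duration::from_millis(10));
    }
    // Collecting the exit status removes the entry from the process table.
    child.wait().ok()?;
    state
}

fn main() {
    match observe_zombie() {
        Some(state) => println!("child state before wait(): {}", state),
        None => println!("could not read the child's state (is /proc available?)"),
    }
}
```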
C vs Rust
In C, the language most of the Linux kernel is written in, a new process is created by first forking the current process and then explicitly asking the system to execute a new program in the child. If we don’t do that, both parent and child will keep executing the same code. Here is an example of executing the ls command, which lists the files of a given directory:
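#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid;

    switch (pid = fork()) {
    case -1:
        /* fork failed: no child was created */
        perror("fork failed");
        exit(EXIT_FAILURE);
    case 0: {
        /* fork returned 0, so this is the child */
        printf("I'm child process and I will execute ls command\n");
        fflush(stdout); /* buffered output would be lost when exec replaces the program */
        char *argv_list[] = {"ls", NULL};
        /* execvp searches PATH for "ls" and only returns on failure */
        execvp("ls", argv_list);
        perror("Error in execvp");
        exit(EXIT_FAILURE);
    }
    default:
        /* fork returned the child's PID, so this is the parent */
        printf("I'm parent process and I'll just print this\n");
        wait(NULL); /* collect the child's exit status so it doesn't stay a zombie */
    }
    return 0;
}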
As you can see, we have to manage the processes manually and monitor whether the execution was successful. Also, we have to handle errors. If we want a command to be executed only by the child, we have to manually check if the current process is the child, which is done here by case 0. In Rust, the same can be achieved with the standard library’s process module:
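use std::process::Command;

fn main() {
    // Run `ls` as a child process and wait for it to finish.
    let output = Command::new("ls")
        .env("PATH", "/bin")
        .output()
        .expect("failed to execute process");
    // if no error, program will continue..
    println!("{}", String::from_utf8_lossy(&output.stdout));
}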
Here Command::new() creates a process builder, which is responsible for spawning and handling a child process. Just like in the C code, we supply the command we want to execute, optionally add environment variables and command arguments, and then call the output method on it. output executes the command as a child process, waits for it to finish, and returns the collected output.
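Command arguments are passed with `arg` (or `args` for several at once). Here is a minimal sketch that assumes an `echo` program is available on the PATH; the `run_echo` helper is only for illustration.

```rust
use std::process::Command;

/// Runs `echo hello world` as a child process and returns what it printed.
fn run_echo() -> String {
    let output = Command::new("echo")
        .arg("hello")
        .arg("world")
        .output()
        .expect("failed to execute echo");
    // `stdout` holds the raw bytes the child wrote to its standard output.
    String::from_utf8_lossy(&output.stdout).into_owned()
}

fn main() {
    // echo already appends a newline, so print! is enough here.
    print!("{}", run_echo());
}
```

Because `output()` captures the child's standard output instead of letting it go to the terminal, nothing appears on screen until the parent prints it.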
Instead of output() we can also use either status() or spawn(). Each of these methods starts a child process, with subtle differences:
- output(): Runs the program and returns a Result holding an Output, only after the child process has finished running.
- status(): Runs the program, waits for it to finish, and returns a Result holding an ExitStatus. This allows checking how the program exited.
- spawn(): Runs the program and returns a Result holding a Child process handle. This doesn’t wait for the program to finish. This option allows for wait and kill calls, and we can get the ID of that process.
Here, env() is optional, because the child inherits the parent’s environment, including PATH, unless we change it. Finally, error handling is done by expect(). It returns the wrapped value if the result is Ok, meaning the process was started successfully, and calls panic! with the given message if the result is Err. If you don’t want your program to terminate when an Err is encountered, you can do something like this:
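use std::io;
use std::process::Command;

// Custom function (one possible version): read a line from stdin and trim it.
fn get_user_input() -> String {
    let mut line = String::new();
    io::stdin().read_line(&mut line).expect("failed to read input");
    line.trim().to_string()
}

fn main() {
    let user_input = get_user_input();
    // status() returns Err when the command could not be started at all.
    if let Err(_) = Command::new(&user_input).env("PATH", "/bin").status() {
        println!("{}: command not found!", user_input);
    }
    // the rest of the program...
}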
Here status() is handier: it returns Ok if the command supplied by the user could be started, and waits for it to finish. We are only interested in the case where an unavailable command was supplied. That’s why we only check whether Err was returned, and if so print “command not found” to the terminal and continue the current program’s execution instead of terminating.
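Note that Ok from status() only means the command was started; the command itself may still have failed. The sketch below separates the three outcomes. It assumes a POSIX `sh` on the PATH, and the `describe` helper and the command name `definitely-not-a-real-command` are made up for this illustration.

```rust
use std::io;
use std::process::Command;

/// Runs `program` with `args` and describes how it went.
fn describe(program: &str, args: &[&str]) -> String {
    match Command::new(program).args(args).status() {
        // The child ran and exited with code 0.
        Ok(status) if status.success() => "succeeded".to_string(),
        // The child ran but reported a failure.
        Ok(status) => match status.code() {
            Some(code) => format!("ran, but exited with code {}", code),
            None => "ran, but was terminated by a signal".to_string(),
        },
        // The child could not be started because the program does not exist.
        Err(err) if err.kind() == io::ErrorKind::NotFound => "command not found".to_string(),
        // Any other start-up failure, such as missing permissions.
        Err(err) => format!("could not be started: {}", err),
    }
}

fn main() {
    println!("{}", describe("sh", &["-c", "exit 0"]));
    println!("{}", describe("sh", &["-c", "exit 3"]));
    println!("{}", describe("definitely-not-a-real-command", &[]));
}
```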
Finally, spawn() is used to manage the order of execution between several child processes and the parent. The Child it returns contains stdin, stdout and stderr fields and has wait(), kill() and id() methods, which will look familiar to C programmers. We’ll look at this side of processes in the next part, and we’ll also see how Rust takes care of race conditions, where two or more threads can access shared data and try to change it at the same time.
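As a small preview, here is a minimal sketch of the Child handle in use. It assumes a Unix-like system with a `sleep` command on the PATH; the `spawn_then_kill` helper is only for illustration.

```rust
use std::io;
use std::process::{Command, ExitStatus};

/// Starts `sleep 30` without waiting for it, then kills it and
/// collects its exit status.
fn spawn_then_kill() -> io::Result<(u32, ExitStatus)> {
    let mut child = Command::new("sleep").arg("30").spawn()?;
    let pid = child.id(); // ID of the running child
    child.kill()?; // on Unix this sends SIGKILL
    let status = child.wait()?; // collects the status so no zombie is left behind
    Ok((pid, status))
}

fn main() -> io::Result<()> {
    let (pid, status) = spawn_then_kill()?;
    println!("child {} finished: {}", pid, status);
    Ok(())
}
```

Since the child is killed by a signal rather than exiting on its own, `status.success()` is false and `status.code()` is `None` on Unix.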
Summary
In this introductory part, we reviewed what processes are and how they are created, and compared Rust’s approach to process creation and command execution with C’s. We saw that the Rust code is not only less prone to human error but also more concise. In the next parts, we’ll take a look at managing process execution time and states, and handling system signals.