Single syscall "Hello, world" - in Rust - part 2

szymongib

Szymon Gibała

Posted on March 1, 2023

... or There and Back Again

Returning to Rust

As in every hero's journey, after gaining the wisdom of the gods, the hero shall bring it back to his roots. If we squint our eyes a bit, that is (kind of) what we are going to do.

Our exploration in the previous post revealed some suspects that blow up the syscall count of a simple "Hello, world" Rust application:

  • Rust runtime
  • libc

Let's try getting rid of them.

No std

As the first step of our journey back to Rust, we need to cut some fat out of the Rust runtime. We can do that by dropping the standard library with the #![no_std] attribute. Our minimal "Hello, world" program will look like this:

#![no_std]
#![no_main]

use libc;

// printf expects a NUL-terminated C string, so we add the terminator ourselves.
const MSG: &'static str = "Hello, world!\n\0";

#[no_mangle]
pub extern "C" fn main(_argc: isize, _argv: *const *const u8) -> isize {
    unsafe {
        libc::printf(MSG.as_ptr() as *const _);
    }
    0
}

#[panic_handler]
fn panic(_: &core::panic::PanicInfo) -> ! {
    loop {}
}

Since there are plenty of good resources about no_std Rust, let's just get a quick overview of what is happening here:

  • The #![no_std] attribute tells Rust not to link the standard library.
  • Since it is the standard library that defines the panic_handler for Rust programs, without it we need to define our own using the #[panic_handler] attribute on a function with the proper signature.
  • On the same note, we cannot use Rust's default main function as the "entry point" to our program, so we need to use the #![no_main] attribute and provide our own main function. The #[no_mangle] attribute tells the compiler not to change the name of our function, so it can be found (and called) by libc.
  • And finally, we do not have access to the println! macro, so we use libc::printf instead.
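
One detail about handing a Rust string to a C function like printf is easy to miss: C code finds the end of a string by looking for a NUL byte, while Rust string literals carry their length and have no terminator. Here is a minimal sketch of the difference. It runs as an ordinary std program for convenience (core::ffi::CStr is the no_std counterpart), and the helper name as_c_string is made up for illustration:

```rust
use std::ffi::CStr;

/// Hypothetical helper: returns the bytes as a C string only if they end
/// with a NUL terminator and contain no NUL byte anywhere else.
fn as_c_string(bytes: &[u8]) -> Option<&CStr> {
    CStr::from_bytes_with_nul(bytes).ok()
}

fn main() {
    // A plain Rust literal has no terminator, so it is not a valid C string.
    assert!(as_c_string(b"Hello, world!\n").is_none());

    // With an explicit trailing NUL it can be handed to C code.
    let msg = as_c_string(b"Hello, world!\n\0").unwrap();
    println!("{} bytes before the terminator", msg.to_bytes().len());
}
```

The pointer returned by msg.as_ptr() is what a function such as libc::printf expects to receive.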

For our Cargo.toml we specify libc as a dependency and tell the compiler that we want our program to abort on panic. This second piece is necessary again because we do not use std and we are building a program for a target where eh_personality (eh stands for "exception handling") is defined in the standard library. Since eh_personality is necessary for stack unwinding when a panic occurs, aborting absolves us from the need to provide it:

...
[profile.dev]
panic = "abort"

[profile.release]
panic = "abort"

[dependencies]
libc = { version = "0.2.139", default-features = false }

You can try it for yourself by removing the panic = "abort" lines. The Rust compiler's error messages will point you in the right direction.

And voilà, it works:

cargo run
Hello, world!

However... Out of respect for your screen real estate, I am not even going to bother pasting the strace output here, as it is bloated like a dead whale (ok, maybe not that bad, just 35 syscalls...). This is because we still use the printf function and libc, and on top of that we link it dynamically...

Statically linking libc (especially musl) into a no_std program turns out not to be a trivial task, and since we need to get rid of libc anyway, let's not go down this rabbit hole. Let's instead drop it altogether.

No libc

Okay, we can remove libc from our dependencies, remove calls to printf that depend on it, and we are good. Right?

#![no_std]
#![no_main]

const MSG: &'static str = "Hello, world!\n";

#[no_mangle]
pub extern "C" fn main(_argc: isize, _argv: *const *const u8) -> isize {
    0
}

#[panic_handler]
fn panic(_: &core::panic::PanicInfo) -> ! {
    loop {}
}

Yes... but well, no...

cargo run
...
/usr/bin/ld: /usr/lib/gcc/x86_64-pc-linux-gnu/12.2.0/../../../../lib/Scrt1.o: in function `_start':
/build/glibc/src/glibc/csu/../sysdeps/x86_64/start.S:115: undefined reference to `__libc_start_main'
collect2: error: ld returned 1 exit status
...

(cut down to only relevant parts)

We have at least two problems here. The first one is what we see on the screen -- a beautiful linker error -- and the second one is what we do not see on the screen -- our "Hello, world!" message -- because we removed the printf function call.

Since tackling the first one is a prerequisite for the second, let's start with it.

As I mentioned earlier, with our #![no_std] binary we had to provide a custom main function that is called by libc. But now that we do not have libc, there is nothing to call our main function...

However, as we can see in the error message above, something still refers to the __libc_start_main function, which understandably cannot be found. The error originates in the _start function of the Scrt1.o file. Scrt1.o is part of the C runtime startup code, so we can conclude that Rust still tries to link it into our binary.

Since this is not really a problem with our code but with the build (linking) process, we need to tell the compiler not to link those files. We can do that by passing the -nostartfiles flag to the linker.

RUSTFLAGS="-C link-arg=-nostartfiles" ...

In gcc docs we can read:

-nostartfiles
Do not use the standard system startup files when linking. The standard system libraries are used normally, unless -nostdlib, -nolibc, or -nodefaultlibs is used.

To avoid specifying RUSTFLAGS every time, we can move them to .cargo/config.toml:

[target.'cfg(target_os = "linux")']
rustflags = ["-C", "link-args=-nostartfiles"]

And we are good to go!

cargo run
[1]    1434 segmentation fault (core dumped)  cargo run

I will take that as a no...

Remember the _start function? Well, since we no longer link the startup files, it is no longer there (surprise!), and apparently it is "kind of" needed.

When the OS loads the program, it looks up the entry point address in the ELF file header to know where to start execution. However, if the entry point function was not found during linking, the address is set to 0x0 (NULL), which usually is a protected memory area.

readelf -h ./target/debug/hello-world | grep Entry
  Entry point address:               0x0
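
To make it less magical where readelf gets this value from, here is a rough sketch that reads the same field straight out of the file. It assumes a 64-bit little-endian ELF file (which is what we build here), where the entry point (e_entry) is the 8-byte field at offset 0x18 of the header. The function name and the /proc/self/exe fallback are my own choices for illustration:

```rust
/// Reads the entry point address (e_entry) from the header of a 64-bit
/// little-endian ELF file. Returns None if the bytes do not look like one.
fn elf64_entry_point(bytes: &[u8]) -> Option<u64> {
    // Magic bytes, then EI_CLASS = 2 (64-bit) and EI_DATA = 1 (little-endian).
    if bytes.len() < 0x20 || &bytes[..4] != b"\x7fELF" || bytes[4] != 2 || bytes[5] != 1 {
        return None;
    }
    // In an ELF64 header, e_entry occupies bytes 0x18..0x20.
    let mut raw = [0u8; 8];
    raw.copy_from_slice(&bytes[0x18..0x20]);
    Some(u64::from_le_bytes(raw))
}

fn main() {
    // Assumption: the file to inspect is passed as the first argument;
    // without one we inspect the running binary itself (Linux only).
    let path = std::env::args()
        .nth(1)
        .unwrap_or_else(|| "/proc/self/exe".to_string());
    let bytes = std::fs::read(&path).expect("failed to read file");
    match elf64_entry_point(&bytes) {
        Some(entry) => println!("Entry point address: {:#x}", entry),
        None => println!("{}: not a 64-bit little-endian ELF file", path),
    }
}
```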

The _start function is the default entry point expected by the linker. This means we could simply rename our main function to _start... Or we can convince the linker that our function is better! We can use the same trick as before -- providing a linker configuration -- this time by passing the --entry flag as a link-arg:

RUSTFLAGS="-C link-arg=--entry=main" cargo run
[1]    2022 segmentation fault (core dumped)  RUSTFLAGS="-C link-arg=--entry=main" cargo run

😑 ...

Is the entry point set?

readelf -h ./target/debug/hello-world | grep Entry
  Entry point address:               0x1020

Looks like it is, but... Remember our Assembly program? On top of write we also used the exit syscall, and without it, the program would segfault too. We might be facing a similar issue here.

We can take a closer look at this by checking the Assembly code generated by the Rust compiler. To do that we need to extend our RUSTFLAGS, this time with the --emit=asm flag:

RUSTFLAGS="-C link-arg=--entry=main --emit=asm" cargo build --release

We are building in release mode this time to cut out debug symbols noise from the Assembly file. With that, we can find a concise .s file in the target/release/deps directory, and see our main function:

    .text
    .file   "hello_world.57a4ccb5-cgu.0"
    .section    .text.main,"ax",@progbits
    .globl  main
    .p2align    4, 0x90
    .type   main,@function
main:
    retq
.Lfunc_end0:
    .size   main, .Lfunc_end0-main
...

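The function contains only a single retq instruction, no syscalls, and no exit codes. Since nothing called our entry point, there is no return address on the stack for retq to jump back to. To get even closer to the binary we can disassemble main directly, and confirm what we have already seen: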

objdump --disassemble=main ./target/release/hello-world
./target/release/hello-world:     file format elf64-x86-64


Disassembly of section .text:

0000000000001000 <main>:
    1000:   c3                      ret

Since we are yet to research how to print to the screen without libc, let's confirm the hypothesis by doing something that we can empirically detect. We can do it by calling panic!, or simply by adding an endless loop. Since looping forever is exactly what our panic_handler does, the result is effectively the same:

...
#[no_mangle]
pub extern "C" fn main(_argc: isize, _argv: *const *const u8) {
    loop {}
}
...
RUSTFLAGS="-C link-arg=--entry=main --emit=asm" cargo run

And now we are stuck, which is what endless loops usually do. We can update .cargo/config.toml with our new link-arg and move on:

[target.'cfg(target_os = "linux")']
rustflags = ["-C", "link-args=-nostartfiles --entry=main"]

We know the drill from our previous adventures: we just write(2) and exit(2), and we are done! It might be time to reach out to some old "friends"...

Assembly. Again...

Remember the "wisdom of the gods" part? Yeah, that was not (entirely) a joke.

Since we already have tremendous experience with assembly after writing our "Hello, world" program, it would be a shame not to use it again... You might be asking, "Am I cheating once more?". Maybe. Or no, because it is me who made up those rules [evil laugh or something...].

Regardless, this time we are going to use Assembly from Rust (see, it is not cheating!). Fortunately, we can do that fairly easily; all we need is the asm! macro:

use core::arch::asm;
...
#[no_mangle]
pub extern "C" fn main(_argc: isize, _argv: *const *const u8) {
    unsafe {
        // Execute write syscall
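        asm!(
            "syscall",
            // write syscall number in; the kernel puts its return value in rax
            inlateout("rax") 1usize => _,
            in("rdi") 1usize, // stdout file descriptor
            in("rsi") MSG.as_ptr(),
            in("rdx") MSG.len(),
            // the syscall instruction overwrites rcx and r11
            out("rcx") _,
            out("r11") _,
        );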
        // Execute exit syscall
        let exit_code = 0;
        asm!(
            "syscall",
            in("rax") 60,
            in("rdi") exit_code,
            options(noreturn)
        );
    }
}
...

And run it:

cargo run
Hello, world!
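
If the raw asm! calls feel a bit too raw, they can be wrapped into small functions. The sketch below is one way to do it, assuming x86-64 Linux as in the rest of this post. The names sys_write and sys_exit are made up, and it uses std::arch::asm so that it runs as an ordinary program (in the no_std binary that would be core::arch::asm). It also tells the compiler which registers the syscall instruction overwrites (rax for the return value, plus rcx and r11):

```rust
use std::arch::asm;

/// Raw write(2) on x86-64 Linux. Returns the number of bytes written,
/// or a negative errno value on failure.
fn sys_write(fd: usize, buf: &[u8]) -> isize {
    let ret: isize;
    unsafe {
        asm!(
            "syscall",
            inlateout("rax") 1isize => ret, // syscall number in, result out
            in("rdi") fd,
            in("rsi") buf.as_ptr(),
            in("rdx") buf.len(),
            out("rcx") _, // overwritten by the syscall instruction
            out("r11") _, // overwritten by the syscall instruction
            options(nostack)
        );
    }
    ret
}

/// Raw exit(2) on x86-64 Linux. Never returns.
fn sys_exit(code: i32) -> ! {
    unsafe {
        asm!(
            "syscall",
            in("rax") 60usize, // exit syscall number
            in("rdi") code as usize,
            options(noreturn)
        )
    }
}

fn main() {
    let msg = b"Hello, world!\n";
    let written = sys_write(1, msg);
    // Exit with 0 only if the whole message was written.
    sys_exit(if written == msg.len() as isize { 0 } else { 1 });
}
```

Returning the raw value from sys_write keeps the wrapper honest: a negative result is the kernel's error code, and it is up to the caller to decide what to do with it.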

Amazing! Let's just confirm with strace, and we are done...

strace -c ./target/debug/hello-world
Hello, world!
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0.00    0.000000           0         1           write
  0.00    0.000000           0         1           mmap
  0.00    0.000000           0         1           mprotect
  0.00    0.000000           0         1           brk
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         2         1 arch_prctl
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           set_robust_list
  0.00    0.000000           0         1           rseq
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000           0        11         2 total

Ah, yes, of course... A quick look at the output of:

strace ./target/debug/hello-world

Can refresh some memory pages in my head...

...
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
...

ld.so.preload again, isn't it? We never actually got to build our no_std binary statically, so it is still dynamically linked.

file ./target/debug/hello-world
./target/debug/hello-world: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=ca87b904fa8cd9cb232c819143edd5abb16cdaa7, with debug_info, not stripped
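
The "interpreter /lib64/ld-linux-x86-64.so.2" part of that output corresponds to a PT_INTERP entry in the program headers of the binary, which is how the kernel knows it should hand control to the dynamic loader first. As a rough sketch (again assuming a 64-bit little-endian ELF file, with a function name invented for illustration), we can look for that entry ourselves:

```rust
/// Returns Some(true) if a 64-bit little-endian ELF image contains a
/// PT_INTERP program header (it asks for a dynamic loader), Some(false) if
/// it has none, and None if the bytes cannot be parsed as such a file.
fn elf64_requests_interpreter(bytes: &[u8]) -> Option<bool> {
    const PT_INTERP: u32 = 3;
    if bytes.len() < 64 || &bytes[..4] != b"\x7fELF" || bytes[4] != 2 || bytes[5] != 1 {
        return None;
    }
    let u16_at = |off: usize| u16::from_le_bytes([bytes[off], bytes[off + 1]]) as usize;
    let mut raw = [0u8; 8];
    raw.copy_from_slice(&bytes[0x20..0x28]);
    let phoff = u64::from_le_bytes(raw) as usize; // e_phoff: start of program headers
    let phentsize = u16_at(0x36); // e_phentsize: size of one program header
    let phnum = u16_at(0x38); // e_phnum: number of program headers
    for i in 0..phnum {
        // p_type is the first 4-byte field of every program header.
        let start = phoff.checked_add(i.checked_mul(phentsize)?)?;
        let field = bytes.get(start..start.checked_add(4)?)?;
        let p_type = u32::from_le_bytes([field[0], field[1], field[2], field[3]]);
        if p_type == PT_INTERP {
            return Some(true);
        }
    }
    Some(false)
}

fn main() {
    // Assumption: the file to inspect is passed as the first argument;
    // without one we inspect the running binary itself (Linux only).
    let path = std::env::args()
        .nth(1)
        .unwrap_or_else(|| "/proc/self/exe".to_string());
    let bytes = std::fs::read(&path).expect("failed to read file");
    match elf64_requests_interpreter(&bytes) {
        Some(true) => println!("{}: requests a dynamic loader", path),
        Some(false) => println!("{}: no PT_INTERP header", path),
        None => println!("{}: not a 64-bit little-endian ELF file", path),
    }
}
```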

Been there, done that. We know what to do, and after a quick facepalm we can again set target-feature=+crt-static, this time in .cargo/config.toml:

[target.'cfg(target_os = "linux")']
rustflags = ["-C", "link-args=-nostartfiles --entry=main", "-C", "target-feature=+crt-static"]

And run it again:

cargo build && strace -c ./target/debug/hello-world
Hello, world!
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0.00    0.000000           0         1           write
  0.00    0.000000           0         1           execve
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000           0         2           total

There we have it, mission accomplished! "Hello, world!" program with just a single syscall, written in Rust (kind of).

It still hurts to look at tho. Perhaps we could use some library to have nice and smooth Rust functions and let someone more fluent speak Assembly for us... You know, sweep it under the rug type of thing...

Rust is not JavaScript, but there actually is a crate for that. Let's add it to our Cargo.toml:

[dependencies]
sc = "0.2.7"

Our main function parameters are useless anyway, and we no longer need pub extern "C", so we clean it up as well with our last refactor. The final program looks much nicer:

#![no_std]
#![no_main]

#[macro_use]
extern crate sc;

const MSG: &'static str = "Hello, world!\n";

#[no_mangle]
fn main() {
    unsafe {
        syscall!(WRITE, 1, MSG.as_ptr(), MSG.len());
        syscall!(EXIT, 0);
    }
}

#[panic_handler]
fn panic(_: &core::panic::PanicInfo) -> ! {
    loop {}
}

Conclusion

Ah, what a journey it was! All that hassle for writing a "Hello, world!" program...

We achieved our goal of cutting it down to a single syscall, but there are still a lot of areas where we have only scratched the surface. There is also this kernel thingy that actually performs the action requested by a system call and so on, but that is a story for another day (or maybe a few years' worth of stories).

In any case, I hope you enjoyed this little exploration, and maybe even learned a thing or two. If you have any questions, comments, or suggestions, feel free to reach out.
