Beyond Pointers: How Rust outshines C++ with its Borrow Checker

vikram2784

Vikram Fugro

Posted on July 3, 2023

As we all know, memory management has always been a notorious challenge in languages that don't have a garbage collector, with issues such as null pointers, dangling references, and memory leaks.

Rust's borrow checker is a powerful static analysis tool that enforces the memory safety of your program at compile time, before it ever runs.

Let's explore some simple examples of how Rust's borrow checker shines at compile time, covering issues that C++ simply cannot contend with.

1. Reference Invalidation

Here's a C++ example that creates a vector, pushes an element into it, takes a pointer to the first element, adds some more elements, and then prints the value behind the pointer.

// clang++ -std=c++20  -Wall -Wextra -Wpedantic -Weverything -Wno-c++98-compat

#include <iostream>
#include <vector>

using namespace std;

int main() {
    vector<int> v;

    v.push_back(100);

    int* ptr = &v[0];
    std::cout << "Before: *ptr (v[0]) = " << *ptr << std::endl;

    // add some more elements
    v.push_back(1);
    v.push_back(2);
    v.push_back(3);
    v.push_back(4);
    v.push_back(5);

    std::cout << "After:  *ptr (v[0]) = " << *ptr << std::endl;

    return 0;
}

The code compiles fine with no warnings/errors and here's the output at runtime:

Before: *ptr (v[0]) = 100
After:  *ptr (v[0]) = -1813810918

What happened to the After value? The pointer ptr was invalidated by a reallocation (to accommodate the current and possible future elements) triggered by an insertion. Note that the C++ compiler didn't give any warning during compilation.

Now let's see what Rust's borrow-checker finds here. Here's the equivalent Rust code snippet...

fn main() {
    let mut v = vec![];

    v.push(100);

    let ptr = &v[0];

    // add some more elements
    v.push(1);
    v.push(2);
    v.push(3);
    v.push(4);
    v.push(5);

    println!("{}", *ptr);
}

... and trying to compile it results in the following error:

 error[E0502]: cannot borrow `v` as mutable because it is also borrowed as immutable
  --> test.rs:9:5
   |
6  |     let ptr = &v[0];
   |                - immutable borrow occurs here
...
9  |     v.push(1);
   |     ^^^^^^^^^ mutable borrow occurs here
...
15 |     println!("{}", *ptr);
   |                    ---- immutable borrow later used here

error: aborting due to previous error

For more information about this error, try `rustc --explain E0502`.

The error message above emphasizes that the reference acquired at line 6 must stay valid until line 15. As a consequence, the compiler prohibits any mutable borrow of v between line 6 and line 15. Essentially, the code holds an immutable reference obtained before the mutable borrow occurs and uses that same reference afterwards, when it may already have been invalidated by the preceding mutation.

[Image: the immutable borrow (green region) and the mutation (red region)]
In the picture, the green region indicates how long the immutable reference is valid, and the red region is where the mutation takes place. As you can see, there is a clear overlap.

The compiler upholds the principle that mutable access is exclusive and cannot coexist with other references within the specified range. In essence, it saves your program from crashing due to a dangling reference (ptr can turn dangling, since v.push(), like its C++ counterpart, also triggers a reallocation whenever required).
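
As a minimal sketch of how the snippet could be restructured to satisfy the borrow checker: either copy the element out before mutating, or take the reference only after the last push. The function name first_after_growth is made up for illustration and is not part of the original example.

```rust
// Hypothetical helper: grows the vector and returns the value of v[0].
fn first_after_growth() -> i32 {
    let mut v = vec![100];

    // Option 1: copy the element out. No borrow of `v` stays alive.
    let before = v[0];

    // The vector may reallocate while growing, which is harmless here
    // because nothing is borrowing from it at this point.
    for i in 1..=5 {
        v.push(i);
    }

    // Option 2: borrow only after the last mutation.
    let ptr = &v[0];
    assert_eq!(before, *ptr);
    *ptr
}
```

Calling first_after_growth() from main() compiles cleanly, because the immutable borrow no longer overlaps with any call to v.push().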

While it is true that C++ offers smart pointers like unique_ptr, shared_ptr, etc, similar to Rust's Box, Rc, etc, it is important to note that the focus of our discussion here is specifically on the borrow checker in Rust. The concept of smart pointers, although related, is distinct from the borrow checker.

2. Out-of-Scope Variable Access

By default, C++ compilers such as clang++ and g++ do warn when a function returns a reference to a local variable. However, they may not issue warnings for more advanced scenarios, unless I have inadvertently overlooked some details here. I am no C++ expert.

Here's an example:

// clang++ -std=c++20  -Wall -Wextra -Wpedantic -Weverything -Wno-c++98-compat

#include <iostream>

using namespace std;

class Rectangle {
  size_t& width;
  size_t& height;

public:
  Rectangle (size_t& w, size_t& h): width(w), height(h) {}

  size_t area () const {
    return width * height;
  }
};

static Rectangle
make_rectangle(size_t width, size_t height) {
  return Rectangle(width, height);
}

int main() {
  Rectangle rec = make_rectangle(52, 10);
  std::cout << rec.area() << std::endl;
}

In the provided code snippet, make_rectangle() returns an object with its fields aliasing the local variables, and surprisingly, the compiler does not generate any warnings/errors. However, it's worth noting that static analysis tools like cppcheck are capable of detecting this issue. Therefore, relying on static analysis tools becomes necessary in C++ to catch such occurrences.

Let's look at the equivalent example in Rust.

struct Rectangle<'a> {
    width: &'a usize,
    height: &'a usize,
}

impl<'a> Rectangle<'a> {
    fn area(&self) -> usize {
        self.width * self.height
    }
}

fn make_rectangle(width: usize, height: usize) -> Rectangle<'_> {
    Rectangle {
        width: &width,
        height: &height,
    }
}

fn main() {
    let rec = make_rectangle(52, 10);
    println!("{}", rec.area());
}

Compiling the code produces the following error:

error[E0106]: missing lifetime specifier
  --> test.rs:12:61
   |
12 | fn make_rectangle(width: usize, height: usize) -> Rectangle<'_> {
   |                                                             ^^ expected named lifetime parameter
   |
   = help: this function's return type contains a borrowed value, but there is no value for it to be borrowed from
help: consider using the `'static` lifetime

For more information about this error, try `rustc --explain E0106`.

The compiler talks about lifetimes, which, together with ownership, are one of the primary reasons why Rust doesn't need a garbage collector. It's a somewhat advanced but very powerful concept. Coming back to the error, it basically says: "I see that Rectangle here is borrowing certain things, but you need to tell me for how long. I am sorry, I can't proceed without knowing that!"

A lifetime is an attribute of a reference in Rust that tells the compiler how long the reference is supposed to be valid. It helps the compiler statically analyze whether references are used safely. If the compiler sees that a reference is meant to live longer than the data it points to, it cries foul! Mapping this to our error message above, the data that the returned object would point to (local variables living on the stack) is invalidated once the function call is over.
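
To make the rule concrete, here is a small sketch (the function name read_through_reference is hypothetical). The borrow stays inside the scope of the data it points to, so it compiles; the commented-out variant lets the reference outlive its data and is rejected.

```rust
// Hypothetical helper: a borrow that never outlives the data it points to.
fn read_through_reference() -> i32 {
    let data = 42;       // `data` lives until the end of the function
    let r: &i32 = &data; // the borrow fits inside that scope
    *r                   // copying the value out is always fine

    // By contrast, the compiler rejects the following with E0597
    // (`x` does not live long enough), because `r` would outlive `x`:
    //
    // let r;
    // {
    //     let x = 5;
    //     r = &x;
    // }
    // println!("{}", r);
}
```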

Lifetime annotations are prefixed with a ' and are declared in the same way as generic types. Let's specify a lifetime.


fn make_rectangle<'a>(width: usize, height: usize) -> 
    Rectangle<'a> {
    Rectangle {
        width: &width,
        height: &height,
    }
}

and here's the compiler output:


error[E0515]: cannot return value referencing function parameter `width`
  --> a1.rs:13:5
   |
13 | /     Rectangle {
14 | |         width: &width,
   | |                ------ `width` is borrowed here
15 | |         height: &height,
16 | |     }
   | |_____^ returns a value referencing data owned by the current function

error[E0515]: cannot return value referencing function parameter `height`
  --> a1.rs:13:5
   |
13 | /     Rectangle {
14 | |         width: &width,
15 | |         height: &height,
   | |                 ------- `height` is borrowed here
16 | |     }
   | |_____^ returns a value referencing data owned by the current function

error: aborting due to 2 previous errors

For more information about this error, try `rustc --explain E0515`.

We have now defined a lifetime 'a, but a borrow of local variables with lifetime 'a would have to outlive make_rectangle()'s execution, which is impossible, hence the above error.

But what does <'a> on Rectangle really mean here? It's a part of Rectangle's definition, and it means that a Rectangle borrows its data for "some" lifetime 'a.

If we make make_rectangle() accept references instead of values, the compiler will be happy as far as the definition of make_rectangle() goes.


fn make_rectangle<'a>(width: &'a usize, height: &'a usize) -> 
    Rectangle<'a> {
    Rectangle {
        width: width,
        height: height,
    }
}

We assure the compiler here that make_rectangle() will receive references that live for "some" lifetime 'a, which is guaranteed to last at least as long as make_rectangle()'s execution. Note that this notation is similar to how we define generic types, e.g. Rectangle<T> or make_rectangle<T>(), where T could be "some" concrete type such as an integer, string, vector, etc.
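
For completeness, here is a sketch of how a caller might use this version. The helper area_of is hypothetical and simply stands in for a main() that owns the two values; everything else follows the definitions above.

```rust
struct Rectangle<'a> {
    width: &'a usize,
    height: &'a usize,
}

impl<'a> Rectangle<'a> {
    fn area(&self) -> usize {
        self.width * self.height
    }
}

fn make_rectangle<'a>(width: &'a usize, height: &'a usize) -> Rectangle<'a> {
    Rectangle { width, height }
}

// Hypothetical caller: `w` and `h` are owned here, so they outlive `rec`.
fn area_of(w: usize, h: usize) -> usize {
    let rec = make_rectangle(&w, &h);
    rec.area()
}
```

The values now live in the caller's frame, so the borrowed Rectangle can never outlive them.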

Understanding lifetimes might be a bit daunting to begin with, but as you write and test different code it will grow on you :). The Rust compiler messages have become very friendly and easy to understand.

Let's deepen our understanding of lifetimes. Here's another C++ example that demonstrates a scenario where the compiler fails to identify the issue, but cppcheck successfully detects it.


// clang++ -std=c++20  -Wall -Wextra -Wpedantic -Weverything -Wno-c++98-compat

#include <iostream>

using namespace std;

static int& 
bar(int &x, int &y) { 
  if (x > 10) { 
    return x;
  } 

  return y;
}

static int& 
foo(int &x) { 
  int y = 10;

  return bar(x, y);
}

int main() { 
  int val = 1;
  std::cout << foo(val) << std::endl;
}

As observed in the above example, foo() conditionally returns a reference to its local variable. While a tool like cppcheck can identify this issue, if foo() belonged to a library whose implementation we did not control, we would have limited options to address the problem directly.

Rust tackles this issue by providing the ability to express the constraint directly within the function definition using lifetimes. Here's what I mean:


fn bar(x: &i32, y: &i32) -> &i32 {
    if *x > 10 {
        return x;
    }
    y
}

The compiler won't compile this function. It throws the following error:


error[E0106]: missing lifetime specifier
 --> src/lib.rs:1:29
  |
1 | fn bar(x: &i32, y: &i32) -> &i32 {
  |           ----     ----     ^ expected named lifetime parameter
  |
  = help: this function's return type contains a borrowed value, but the signature does not say whether it is borrowed from `x` or `y`
help: consider introducing a named lifetime parameter
  |
1 | fn bar<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
  |       ++++     ++          ++          ++

For more information about this error, try `rustc --explain E0106`.

The error message is largely self-explanatory. Following the compiler's suggestion of adding lifetimes to bar() resolves any ambiguity regarding its return value. As a consumer of bar(), it is reassuring to know that there will be no dangling references after calling it, and that any error message will point squarely at the issue at my call site, i.e. inside foo():


error[E0515]: cannot return value referencing local variable `y`
   | fn foo(x:&i32) -> &i32 {
   |
   |     let y = 2;
   |
   |     bar(&x, &y)
   |     ^^^^^^^^--^
   |     |       |
   |     |       `y` is borrowed here
   |     returns a value referencing data owned by the current function

For more information about this error, try `rustc --explain E0515`.
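
For contrast, here is a minimal sketch of a call site that the annotated bar() does accept. The caller pick is hypothetical: both inputs outlive the returned reference, and the value is copied out before either goes out of scope.

```rust
// bar() with the lifetimes suggested by the compiler:
// the result may borrow from either `x` or `y`.
fn bar<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    if *x > 10 {
        return x;
    }
    y
}

// Hypothetical caller: returns the chosen value, not a reference to it.
fn pick(x: i32) -> i32 {
    let y = 10;
    let chosen = bar(&x, &y);
    *chosen
}
```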

It is worth noting that if bar() always returned the same input, specifying the same lifetime for both arguments would serve little purpose and would be unnecessarily constraining for the caller. Let's look at a full example:


fn bar<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    if *x < 10 {
        //some_func(x, y);
        println!("{:?}", y);
    }

    x
}

fn foo(x: &i32) -> &i32 {
    let y = 2;
    bar(&x, &y)
}

fn main() {
    let x = 1;
    println!("{}", foo(&x));
}

Since bar()'s return value carries the lifetime 'a and y is tied to that same lifetime as x, the compiler expects y to live beyond foo(), which is not possible.


error[E0515]: cannot return value referencing local variable `y`
  --> src/main.rs:12:5
   |
12 |     bar(&x, &y)
   |     ^^^^^^^^--^
   |     |       |
   |     |       `y` is borrowed here
   |     returns a value referencing data owned by the current function

For more information about this error, try `rustc --explain E0515`.

The fix would be to have bar() receive disjoint input lifetimes.


fn bar<'a, 'b>(x: &'a i32, y: &'b i32) -> &'a i32 {
    if *x < 10 {
        //some_func(x, y);
        println!("{:?}", y);
    }

    x
}
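
Putting the pieces together, the following sketch shows the whole program with disjoint lifetimes. The helper foo_demo is hypothetical and plays the role of main(); bar() and foo() follow the snippets above.

```rust
// The returned reference is tied only to `x`; `y` has its own lifetime.
fn bar<'a, 'b>(x: &'a i32, y: &'b i32) -> &'a i32 {
    if *x < 10 {
        println!("{:?}", y);
    }
    x
}

// Accepted now: the result borrows from `x`, never from the local `y`.
fn foo(x: &i32) -> &i32 {
    let y = 2;
    bar(x, &y)
}

// Hypothetical stand-in for main(): `x` outlives the reference from foo().
fn foo_demo() -> i32 {
    let x = 1;
    *foo(&x)
}
```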

3. Out-of-Scope Variable Access - Multithreading

Another situation where accessing out-of-scope data can occur is in the context of multithreading. Here's an example in C++:


// clang++ -std=c++20  -Wall -Wextra -Wpedantic -Weverything -Wno-c++98-compat 

#include <iostream>
#include <thread>
#include <unistd.h>

using namespace std;

static
std::thread fire_a_job(int id) {
    std::thread t([&]() {
        // here there's no guarantee that `id` will be valid 
        std::cout << "job running " << id << std::endl;
    });

    return t;
}

int main() {
    std::thread t = fire_a_job(10);
    t.join();
}

In this scenario, fire_a_job() initiates a thread t and fire_a_job() promptly returns. However, it's important to note that thread t may continue running for an extended period or possibly even start executing at a later time after the completion of fire_a_job(). This uncertainty arises because there is no guarantee or control over when the thread t will be scheduled to run by the underlying scheduler.

As a consequence, there is a risk of accessing the variable id after it has been destroyed. The variable id lives on the stack and is captured by reference by thread t, as indicated by the [&] syntax. Again, the compiler doesn't catch this, and neither does cppcheck (as of the time of writing). Hence your code becomes concurrency-unsafe, as it may access invalidated data.

Rust's borrow-checker tackles such data access invalidation issues between the threads too. Let's look at the Rust equivalent of fire_a_job():


use std::thread::*;

fn fire_a_job(id: i32) -> JoinHandle<()> {
    spawn(|| {
        println!("job running {}", id);
    })
}

The compiler fails with the following message:


error[E0373]: closure may outlive the current function, but it borrows `id`, which is owned by the current function
 --> src/lib.rs:4:11
  |
4 |     spawn(|| {
  |           ^^ may outlive borrowed value `id`
5 |         println!("job running {}", id);
  |                                    -- `id` is borrowed here
  |
note: function requires argument type to outlive `'static`
 --> src/lib.rs:4:5
  |
4 | /     spawn(|| {
5 | |         println!("job running {}", id);
6 | |     })
  | |______^
help: to force the closure to take ownership of `id` (and any other referenced variables), use the `move` keyword
  |
4 |     spawn(move || {

For more information about this error, try `rustc --explain E0373`.

The message indicates that the spawned thread may outlive id. To address this, we must either ensure that id remains valid for the entire program execution (referred to as the static lifetime and denoted by 'static), or "move" id (copy it or transfer its ownership) into the thread, similar to the [=] syntax in C++. The first option is not possible since id is a stack variable, but the second one solves the problem: once id has been moved into the thread, the data remains accessible and valid throughout the thread's execution.


use std::thread::*;

fn fire_a_job(id: i32) -> JoinHandle<()> {
    spawn(move || {
        println!("job running {}", id);
    })
}
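
A sketch of how this might be exercised end to end. Two things here are assumptions made for illustration and are not in the snippet above: the closure returns id (so the handle is a JoinHandle<i32>), and the driver run_job is a hypothetical name.

```rust
use std::thread::{spawn, JoinHandle};

// Same shape as fire_a_job() above, except that the closure returns `id`
// so that the caller has a result to check (an assumption of this sketch).
fn fire_a_job(id: i32) -> JoinHandle<i32> {
    // `move` copies `id` into the closure, so the thread owns its own copy.
    spawn(move || {
        println!("job running {}", id);
        id
    })
}

// Hypothetical driver: start the job and wait for it to finish.
fn run_job(id: i32) -> i32 {
    fire_a_job(id).join().expect("job thread panicked")
}
```

Because the thread owns its copy of id, it does not matter whether fire_a_job() has already returned by the time the closure runs.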

4. Shared Mutability

Unlike Rust, C++ does not have native support for preventing shared mutability at compile time. Here's an example in C++:


// clang++ -std=c++20  -Wall -Wextra -Wpedantic -Weverything -Wno-c++98-compat 

#include <iostream>
#include <thread>
#include <vector>

using namespace std;

int main() {
  vector<int> v;
  v.push_back(0);

  std::thread t1([&](){
    v.push_back(1);
    std::cout << "t1 running " << std::endl;
  });

  std::thread t2([&](){
    v.push_back(2);
    std::cout << "t2 running " << std::endl;
  });

  t1.join();
  t2.join();

  for (auto element : v) {
    std::cout << element << " ";
  }
}

In the above example, v is captured by reference and is mutated simultaneously from two different threads. The code compiles fine but, as expected, yields inconsistent results. Here's what happens at runtime:

[Image: 4 different runs - 2 passed, 2 failed!]

Certainly, the appropriate solution in this scenario is to utilize synchronization primitives or thread-safe data structures, which applies not only to C++ but also to Rust. However, Rust's borrow checker detects such issues at compile-time, preventing your application from crashing at runtime.

Here's the Rust equivalent example:

use std::thread::*;

fn main() {
    let mut v = vec![];
    v.push(0);

    scope(|s| {
        s.spawn(|| {
            v.push(1);
            println!("t1 running");
        });

        s.spawn(|| {
            v.push(2);
            println!("t2 running");
        });
    });
}

and here's the compiler's error message:

error[E0499]: cannot borrow `v` as mutable more than once at a time
  --> src/main.rs:13:17
   |
7  |       scope(|s| {
   |              - has type `&'1 Scope<'1, '_>`
8  |           s.spawn(|| {
   |           -       -- first mutable borrow occurs here
   |  _________|
   | |
9  | |             v.push(1);
   | |             - first borrow occurs due to use of `v` in closure
10 | |             println!("t1 running");
11 | |         });
   | |__________- argument requires that `v` is borrowed for `'1`
12 |           
13 |           s.spawn(|| {
   |                   ^^ second mutable borrow occurs here
14 |               v.push(2);
   |               - second borrow occurs due to use of `v` in closure

For more information about this error, try `rustc --explain E0499`.
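
One way to make this compile, sketched here with a Mutex from the standard library as the synchronization primitive (the function name push_from_two_threads is hypothetical). Each closure now captures only a shared reference to the Mutex, and the lock grants exclusive access to the vector one thread at a time.

```rust
use std::sync::Mutex;
use std::thread;

// Hypothetical helper: two scoped threads push into one shared vector.
fn push_from_two_threads() -> Vec<i32> {
    let v = Mutex::new(vec![0]);

    thread::scope(|s| {
        s.spawn(|| {
            // lock() hands out exclusive access until the guard is dropped.
            v.lock().unwrap().push(1);
            println!("t1 running");
        });

        s.spawn(|| {
            v.lock().unwrap().push(2);
            println!("t2 running");
        });
    });

    // Both threads have been joined by now, so the Mutex can be unwrapped.
    let mut result = v.into_inner().unwrap();
    // Push order depends on scheduling; sort for a deterministic result.
    result.sort();
    result
}
```

The order in which the two threads push is still up to the scheduler, which is why the sketch sorts the vector before returning it.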

Closing Points

In contrast to C++, where it falls upon the programmer to implement adequate synchronization and prevent dangling references, Rust takes a different approach. Rust's borrow checker thoroughly examines the code, verifying that shared references and mutable access are handled safely. This not only prevents access to invalidated data but also eliminates data races, a major class of concurrency bugs. By detecting these concerns during compilation, Rust enhances the overall safety of the codebase, effectively mitigating certain categories of bugs associated with memory access.

While it may require some additional annotations like lifetimes or code restructuring at times, the borrow-checker's thorough analysis provides a valuable layer of protection and contributes to the overall reliability and efficiency of programs.
