Rust: smart pointers
Nicky Meuleman
Posted on February 25, 2021
Pointers
A pointer is a variable that contains an address in memory.
It points to, or refers to some other data.
You can think of it like an arrow to that value.
Rust has two regular types of pointers called references.
They're recognized by the ampersand in front of a variable name.
- & for an immutable reference (e.g. &my_variable).
- &mut for a mutable reference (e.g. &mut my_variable).
References to a value don't own that value; they borrow it.
In other words, the reference can disappear and the value it pointed at will still exist.
The Rust programming language book has a great chapter on ownership in Rust.
Rust's data race prevention rules dictate that for a given piece of data in any particular scope:
You can have only one mutable reference to that data OR you can have multiple immutable references to that data.
Never both at the same time.
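Here is a minimal sketch of that rule in action. The helper functions sum and bump are names I made up for illustration; they are not from any library.

```rust
// Hypothetical helper: takes two immutable (shared) references.
fn sum(a: &i32, b: &i32) -> i32 {
    *a + *b
}

// Hypothetical helper: takes one mutable (exclusive) reference.
fn bump(counter: &mut i32) {
    *counter += 1;
}

fn main() {
    let mut value = 10;

    // Multiple immutable references to the same data can coexist.
    let first = &value;
    let second = &value;
    println!("sum = {}", sum(first, second));

    // `first` and `second` are not used after this point,
    // so a single mutable reference is now allowed.
    bump(&mut value);

    // Uncommenting the next line keeps `first` alive across the mutable borrow,
    // and the compiler rejects the program.
    // println!("{}", first);

    println!("value = {}", value);
}
```

The compiler decides how long a reference is "in use" by looking at where it is last used, which is why the mutable borrow compiles here even though first and second are still in scope.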
You might recognize the term "shared reference" from compiler errors.
A while back, I got a compiler error that said something like: blablabla, because it is behind a shared reference.
This confused me, because the reference the message pointed at was the only one in the entire program.
How could it be "shared" then?
Turns out that in many cases, "shared reference" is another way to say "immutable reference".
- Another name for an immutable reference is a shared reference.
- Another name for a mutable reference is an exclusive reference.
Smart pointers
Those references are regular pointers that only point to some data; they don't have any other capabilities.
Smart pointers can have extra capabilities.
They are data structures that not only act like a pointer, but have additional metadata.
They use that extra data to enable behavior regular pointers could not have.
You could say that those pointers are ... smart.
Smart pointers are usually implemented using a struct.
Another difference from regular references: smart pointers usually own the data they point to.
In other words: when the smart pointer gets dropped, the data it points to gets dropped too.
Most smart pointers implement the Deref and Drop traits.
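To make the borrowing versus owning difference concrete, here is a small sketch. The functions length_of and consume are hypothetical names for this example, and Box<T> stands in for "a smart pointer that owns its data".

```rust
// Hypothetical helper: borrows the text, the caller keeps ownership.
fn length_of(text: &str) -> usize {
    text.len()
}

// Hypothetical helper: takes ownership of the box.
// The box and the String inside it are dropped when this function returns.
fn consume(boxed: Box<String>) -> usize {
    boxed.len()
}

fn main() {
    let borrowed = String::from("borrowed");
    println!("{}", length_of(&borrowed));
    // The reference is gone, the value is still here.
    println!("{}", borrowed);

    let owned = Box::new(String::from("owned"));
    println!("{}", consume(owned));
    // `owned` was moved into `consume` and dropped there.
    // Using it here would be a compile error.
    // println!("{}", owned);
}
```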
String
If you've programmed in Rust before, chances are you've already used a smart pointer, even if you didn't know it.
String is a smart pointer.
let s1 = String::from("hello");
On the left of the following image is the data that is stored on the stack.
On the right, the data that is stored on the heap.
Psst, I wrote about the stack and the heap in Rust
On the stack is our String, named s1.
It's a struct that holds not only a pointer to a specific location on the heap (in ptr), but also additional metadata.
That metadata is the length of the string in bytes (in len), and the number of bytes allocated for it on the heap (in capacity).
The distinction between those last two fields is not important right now; what is important is that the string has additional metadata associated with it.
The heap stores the contents of that string in consecutive memory addresses.
In this case the letters h, e, l, l, o.
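You can peek at that metadata yourself. This is a rough sketch: describe is a hypothetical helper, and I use String::with_capacity (instead of String::from) only so that the capacity is guaranteed to be at least 16.

```rust
// Hypothetical helper: returns (len, capacity) of a String.
fn describe(s: &String) -> (usize, usize) {
    (s.len(), s.capacity())
}

fn main() {
    // Ask for room for at least 16 bytes on the heap, then use 5 of them.
    let mut s1 = String::with_capacity(16);
    s1.push_str("hello");

    let (len, capacity) = describe(&s1);

    // as_ptr gives the address of the first byte on the heap.
    println!("ptr = {:p}, len = {}, capacity = {}", s1.as_ptr(), len, capacity);
}
```

The printed address differs from run to run. len is the number of bytes in use, while capacity is the number of bytes the allocation can hold before it has to grow.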
Deref
The Deref trait allows a struct that implements it to behave like a pointer, instead of as a regular struct that holds a pointer in a field.
That way you can write code that works for references, and smart pointers will work with it.
The dereference operator, *, follows a pointer to the value it is pointing to.
Using it on a regular struct wouldn't work, but a struct that implements Deref knows what to do when that happens.
To implement the Deref trait, you have to specify an associated type named Target and implement a method named deref.
It takes an immutable reference to self, and returns an immutable reference to another type (the Target).
In my opinion, deref is an incredibly confusing name for that method.
deref doesn't dereference at all, it returns a reference.
The compiler knows how to dereference that reference.
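Here is a sketch of what implementing Deref could look like. Wrapper and shout are hypothetical names I picked for this example; the only real API used is std::ops::Deref.

```rust
use std::ops::Deref;

// Hypothetical smart pointer: a tuple struct that wraps a single value.
struct Wrapper<T>(T);

impl<T> Deref for Wrapper<T> {
    type Target = T;

    // Note: this returns a reference, it does not dereference anything.
    fn deref(&self) -> &T {
        &self.0
    }
}

// Hypothetical function written for plain references.
fn shout(text: &str) -> String {
    text.to_uppercase()
}

fn main() {
    let wrapped = Wrapper(String::from("hello"));

    // The * operator works because Wrapper implements Deref.
    assert_eq!(*wrapped, "hello");

    // Methods of the inner String are reachable through the wrapper.
    assert_eq!(wrapped.len(), 5);

    // A function that expects &str accepts &Wrapper<String>:
    // the compiler inserts the deref calls for us.
    assert_eq!(shout(&wrapped), "HELLO");
}
```

That last call is the "code that works for references also works for smart pointers" part: shout knows nothing about Wrapper.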
The Box<T> type is a smart pointer that implements Deref.
When you use the dereference operator on a Box<T>, under the hood, a call to the deref method happens first.
deref returns a reference.
For Box<T>, that's a reference to the inner type, the T in Box<T>.
The compiler then dereferences it by following that reference.
let num = 5;
let boxed_num = Box::new(num);
assert_eq!(5, num);
assert_eq!(5, *boxed_num);
In the first assert_eq, we directly compare 5 with num.
In the second assert_eq, we compare 5 with the result of using the dereference operator on a boxed value.
*boxed_num is equivalent to writing *(boxed_num.deref()).
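To see that equivalence side by side, here is a small sketch. read_both is a hypothetical helper; note that the Deref trait has to be imported before the deref method can be called by name.

```rust
use std::ops::Deref; // needed to call `.deref()` explicitly

// Hypothetical helper: reads the boxed number both ways.
fn read_both(boxed: Box<i32>) -> (i32, i32) {
    let via_operator = *boxed;
    let via_method = *(boxed.deref());
    (via_operator, via_method)
}

fn main() {
    let (via_operator, via_method) = read_both(Box::new(5));
    assert_eq!(via_operator, via_method);
}
```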
Drop
The Drop trait allows you to customize the code that runs when an instance of a struct that implements it goes out of scope.
It is used to release resources like network connections, files, and used memory.
An example usage: when the owner of a Box<T> goes out of scope, not only is the Box popped off the stack, the T that uses memory on the heap is deallocated as well.
To implement the Drop trait, you have to implement a method named drop.
It takes a mutable reference to self, and doesn't return anything (well, it returns the unit type, the empty tuple, ()).
That drop is called automatically when the owner of a value goes out of scope.
In other words: if a variable leaves the curly bois {} that denote a scope, drop is called on that variable.
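A sketch of a custom Drop implementation, to watch that happen. Guard and count_drops are hypothetical names, and the Cell counter is only there so the example can observe how many times drop ran.

```rust
use std::cell::Cell;

// Hypothetical type that counts how often it gets dropped.
struct Guard<'a> {
    name: &'static str,
    drops: &'a Cell<u32>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        println!("dropping {}", self.name);
        self.drops.set(self.drops.get() + 1);
    }
}

// Hypothetical helper: returns how many guards were dropped.
fn count_drops() -> u32 {
    let drops = Cell::new(0);
    {
        let _guard = Guard { name: "inner guard", drops: &drops };
        // Still inside the scope: nothing has been dropped yet.
        assert_eq!(drops.get(), 0);
    } // `_guard` leaves the curly bois here, so `drop` runs automatically.
    drops.get()
}

fn main() {
    println!("drops: {}", count_drops());
}
```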
It's not allowed to call the drop method in the Drop trait manually.
At least not directly.
Doing so during a scope would cause the method to be called again, automatically, at the end of that scope.
That could lead to unwanted situations, like the infamous double free error, where you try to deallocate a piece of memory twice.
If you want to call drop before the end of the scope, call it via std::mem::drop.
That will make sure drop is only called once.
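A sketch of dropping early. Connection and close_early are hypothetical names, and the Cell counter only exists to count the drop calls. std::mem::drop takes ownership of the value it is given, so nothing is left to drop again at the end of the scope.

```rust
use std::cell::Cell;

// Hypothetical resource that records every time it is closed (dropped).
struct Connection<'a> {
    closed: &'a Cell<u32>,
}

impl Drop for Connection<'_> {
    fn drop(&mut self) {
        self.closed.set(self.closed.get() + 1);
    }
}

// Hypothetical helper: returns the drop count (before, after) the end of the scope.
fn close_early() -> (u32, u32) {
    let closed = Cell::new(0);
    let before_scope_end;
    {
        let conn = Connection { closed: &closed };

        // Calling the trait method directly does not compile:
        // conn.drop();

        // This moves `conn` into the function, which drops it right away.
        std::mem::drop(conn);
        before_scope_end = closed.get();
    } // `conn` was already moved, so nothing is dropped here.
    (before_scope_end, closed.get())
}

fn main() {
    let (before, after) = close_early();
    println!("before end of scope: {}, after: {}", before, after);
}
```

std::mem::drop is also part of the prelude, so writing drop(conn) works without the full path.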