Damien Cosset
Posted on June 13, 2024
Introduction
All programs have to manage the way they use the computer's memory while they run. Rust is obviously no different. In this article, I'll try to understand how it works.
Rust uses a system called ownership. Ownership is a set of rules that define how a Rust program manages memory. If you violate one of these rules, your code won't compile. The rules are as follows:
- Each value in Rust has an owner.
- There can only be one owner at a time.
- When the owner goes out of scope, the value will be dropped.
Let's dive into it.
Stack and Heap
First, we need to touch on the stack and the heap. The stack and the heap are two parts of memory available to you at runtime, but they are not structured in the same way.
The stack is organized like a stack of newspapers. Last in, first out. If you put a newspaper at the top of the pile, that newspaper is the first one that gets picked. It's fast and it's efficient. All the variables we store in the stack have a known, fixed size.
The heap is less organized. This is where we store data whose size is not known in advance or might change. The memory allocator finds an empty spot that is big enough and returns a pointer to it. Because the pointer has a fixed size, it can be stored on the stack. As long as we only need the address of the data, the pointer on the stack is enough. But to access the value itself, we have to follow the pointer into the heap. As you can imagine, storing data on the heap takes more time (the allocator has to look for an empty space instead of always storing at the top), and retrieving data also takes more time because you have to follow a pointer to get there.
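To make the "fixed-size pointer on the stack" idea a bit more concrete, here is a minimal sketch. The helper name stack_size_of is hypothetical (it is not part of the standard library); it only relies on std::mem::size_of_val, which reports how many bytes the stack-side part of a value occupies.

```rust
use std::mem::size_of_val;

// Hypothetical helper for illustration: builds a String from `text` and
// returns the size in bytes of its stack-side part (pointer, length, capacity).
fn stack_size_of(text: &str) -> usize {
    let s = String::from(text); // the characters themselves live on the heap
    size_of_val(&s) // only measures what sits on the stack
}

fn main() {
    let short = stack_size_of("hi");
    let long = stack_size_of("a much, much longer piece of text");

    // Both numbers are the same: the stack part does not grow with the text.
    println!("short string: {short} bytes on the stack");
    println!("long string:  {long} bytes on the stack");
}
```

Whatever the length of the text, the part stored on the stack keeps the same size. Only the heap allocation grows.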
Variable scope
Let's consider this code:
{ // The scope begins, greeting is not valid yet because not declared
    let mut greeting = String::from("Good "); // greeting becomes valid here
    greeting.push_str("morning"); // we do stuff with greeting
    println!("{greeting}");
} // Our scope ends, greeting is no longer valid
So, a variable becomes valid when it comes into scope (not when the scope begins). When the scope ends, the variable goes out of scope and is no longer valid. But what happens behind the scenes?
Allocating memory
Two things need to happen:
- We request the memory from the memory allocator at runtime
- We return the memory to the allocator when we are done with our variable
The first part is done when we write let mut greeting = String::from("Good "); and it's quite universal across programming languages.
But how do we return the memory to the allocator when we are done with the variable?
There are 3 ways:
- Languages with a garbage collector (like Java) do that for you. The garbage collector keeps track of memory that isn't used anymore and cleans it up.
- If there is no garbage collector? In most cases, it's the developer's responsibility to identify when memory is no longer being used and explicitly free it (just like we explicitly requested it earlier). This is difficult to do correctly, because you need to do it at the right time, and exactly once per memory allocation.
- Rust does this in a third way: it automatically returns the memory once the variable that owns it is out of scope.
If we take a look at our code again:
{ // The scope begins, greeting is not valid yet because not declared
    let mut greeting = String::from("Good "); // greeting becomes valid here
    greeting.push_str("morning"); // we do stuff with greeting
    println!("{greeting}");
} // We are done with greeting => Rust frees the memory associated with greeting
Rust will automatically return the memory when the scope ends, because greeting is no longer valid. The variable greeting owns a chunk of memory, so that memory is returned to the memory allocator.
To return the memory to the allocator, Rust calls a special function called drop(). Rust does it automatically for us at the closing curly bracket.
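To watch this happen, here is a small sketch that implements the Drop trait on a made-up type. The Greeting struct, its drops counter and the scope_demo function are all hypothetical names invented for this illustration. The counter uses Rc and Cell from the standard library only so that we can observe the moment the value is dropped.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical type for illustration: it counts how many times it was dropped.
struct Greeting {
    text: String,
    drops: Rc<Cell<usize>>, // shared counter, incremented when the value is dropped
}

impl Drop for Greeting {
    // Rust runs this code automatically when a Greeting goes out of scope.
    fn drop(&mut self) {
        println!("dropping \"{}\"", self.text);
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns the number of drops seen (inside the inner scope, after the inner scope).
fn scope_demo() -> (usize, usize) {
    let drops = Rc::new(Cell::new(0));
    let inside;
    {
        let greeting = Greeting {
            text: String::from("Good morning"),
            drops: Rc::clone(&drops),
        };
        println!("inside the scope: {}", greeting.text);
        inside = drops.get(); // nothing has been dropped yet
    } // greeting goes out of scope here, so its drop code runs
    (inside, drops.get())
}

fn main() {
    let (inside, after) = scope_demo();
    println!("drops inside the scope: {inside}, after the scope: {after}");
}
```

We never call drop ourselves in this sketch. The closing curly bracket of the inner scope is what triggers it.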
Special Case:
Consider the following code:
let hello = String::from("Hello");
let hello_again = hello;
println!("{hello}, world!");
Now, if we keep the same logic as before, you could say: both hello and hello_again are valid variables at the same time after we declare hello_again. But that's not the case: this won't compile. Why?
Remember earlier when we said that the pointer is stored on the stack and the variable's value is stored on the heap? Well, when we do hello_again = hello, we copy the pointer (and the other things stored on the stack, such as the length and the capacity). But that's all we copy: Rust does not copy the values stored on the heap. Rust does this because copying the heap data could be very expensive at runtime if the data is large.
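So, what you might expect is two independent copies: each variable with its own pointer and its own copy of the data on the heap.
But what actually happens is that hello and hello_again each hold a pointer to the same data on the heap.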
So, we have two pointers that refer to the same value on the heap. What happens when both variables go out of scope? Rust would call drop() twice and try to free the same memory twice. This is called a double free error, and it can lead to memory corruption.
To prevent this, Rust considers the hello variable as no longer valid after let hello_again = hello;. So, when hello goes out of scope, Rust doesn't need to free any memory.
This is not a shallow copy, because Rust also invalidates the first variable. We refer to this as a move: we moved hello into hello_again.
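Here is a short sketch of a move, assuming we compare the address of the heap data before and after the assignment. The function name move_keeps_heap_address is hypothetical. as_ptr() is a standard String method that returns the address of the string's heap data.

```rust
// Hypothetical helper for illustration: returns true if the heap data
// sits at the same address before and after the move.
fn move_keeps_heap_address() -> bool {
    let hello = String::from("Hello");
    let before = hello.as_ptr(); // address of the heap data owned by hello

    let hello_again = hello; // move: only the stack part is copied over

    // println!("{hello}"); // would not compile: hello was moved

    let after = hello_again.as_ptr(); // address of the heap data owned by hello_again
    before == after
}

fn main() {
    println!(
        "same heap data after the move: {}",
        move_keeps_heap_address()
    );
}
```

The heap data never moved or got duplicated. Only its owner changed, which is why a single drop at the end of the scope is enough.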
This means that Rust will never automatically create "deep" copies of your data.
So what if I truly want to deeply copy?
We have a clone() method that also copies the heap data, making it a real deep copy.
let hello = String::from("Hello");
let hello_again = hello.clone();
println!("{hello}, world!");
And this is valid Rust code, because this time hello and hello_again each own their own copy of the data on the heap.
Remember that clone() is a more expensive operation, though.
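As a rough check that clone() really produces independent data, the sketch below modifies the clone and looks at the original afterwards. The function name clone_is_independent is hypothetical, and as_ptr() is only used to compare the addresses of the two heap allocations.

```rust
// Hypothetical helper for illustration: clones a String, modifies the clone,
// and reports whether both strings share the same heap data.
fn clone_is_independent() -> (String, String, bool) {
    let hello = String::from("Hello");
    let mut hello_again = hello.clone(); // deep copy: new heap allocation

    hello_again.push_str(", again"); // only the clone changes

    let same_heap_data = hello.as_ptr() == hello_again.as_ptr();
    (hello, hello_again, same_heap_data)
}

fn main() {
    let (hello, hello_again, same_heap_data) = clone_is_independent();
    println!("original: {hello}");
    println!("clone:    {hello_again}");
    println!("same heap data: {same_heap_data}");
}
```

Both variables stay valid, and changing one does not affect the other.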
Note that:
let x = 5;
let y = x;
works fine because we have integer variables here. Integers have a known, fixed size, meaning that we do not store anything on the heap: everything is on the stack. So, in this case, there is no difference between shallow and deep copying, and calling clone() wouldn't do anything different. Rust therefore has no reason to consider the x variable invalid when we do let y = x.
Rust has a special Copy trait that can be implemented on types that are stored entirely on the stack. When a type implements Copy, its variables don't move: they are just copied (like integers), and the original stays valid. Rust has another trait called Drop, which lets a type run custom code when a value goes out of scope. A type cannot implement Copy if it, or any of its parts, implements Drop. If you try to add the Copy trait to such a type (like a struct that contains a String), you'll get a compile-time error.
Here are some of the types that implement Copy:
- All the integer types, such as u32.
- The Boolean type, bool, with values true and false.
- All the floating-point types, such as f64.
- The character type, char.
- Tuples, if they only contain types that also implement Copy. For example, (i32, i32) implements Copy, but (i32, String) does not.
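The same rule applies to your own types. Below is a minimal sketch with a hypothetical Point struct that only contains integers, so it can derive Copy. The copy_demo function is also an invented name for this example.

```rust
// Hypothetical type for illustration: every field is Copy, so the struct can be too.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// This would NOT compile, because String does not implement Copy:
// #[derive(Clone, Copy)]
// struct Named {
//     name: String,
// }

// Returns the original and its copy to show that both remain usable.
fn copy_demo() -> (Point, Point) {
    let a = Point { x: 1, y: 2 };
    let b = a; // a is copied, not moved
    (a, b) // a is still valid here
}

fn main() {
    let (a, b) = copy_demo();
    println!("a = {a:?}, b = {b:?}, a.x = {}, a.y = {}", a.x, a.y);

    let pair = (1, 2); // (i32, i32) implements Copy
    let pair_copy = pair;
    println!("pair = {pair:?}, pair_copy = {pair_copy:?}");
}
```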
Functions and returning values
When it comes to passing values to functions, the mechanics are the same as when we assign values to variables. For example:
fn main() {
    let greeting = String::from("Good morning"); // greeting comes into scope
    has_new_owner(greeting); // greeting moves into the function
    // greeting is no longer valid here
    let i32_int = 32; // i32_int comes into scope
    just_copy(i32_int); // i32_int is copied into the function, because i32 implements Copy,
    // so it's still okay to use it afterwards
} // i32_int goes out of scope, then greeting. Nothing happens for greeting because its value was moved earlier

fn has_new_owner(a_string: String) { // a_string comes into scope
    println!("{a_string}");
} // a_string goes out of scope, drop is called and the memory is returned

fn just_copy(an_int: i32) { // an_int comes into scope
    println!("{an_int}");
} // an_int goes out of scope, nothing special happens
Returning values also transfers ownership. Take the following code:
fn main() {
    let give_me = i_give_you_ownership(); // i_give_you_ownership moves its return value to give_me
    let greeting = String::from("Hello World!"); // greeting comes into scope
    let i_received = i_take_and_i_give_back(greeting); // greeting is moved into
    // i_take_and_i_give_back, which moves its return value to
    // i_received
} // i_received goes out of scope and is dropped. greeting was moved, so nothing happens.
// give_me goes out of scope and is dropped.

fn i_give_you_ownership() -> String {
    let a_string = String::from("What's up?"); // a_string comes into scope
    a_string // a_string is returned and is moved to the calling function
}

fn i_take_and_i_give_back(some_string: String) -> String { // some_string comes into scope
    some_string // some_string is returned and is moved to the calling function
}
Once you understand the principle, the same pattern repeats all the time. But it's a bit tedious to pass ownership around like this. If I give ownership away, I need to get it back to be able to use the value again. Does it mean that if I pass a variable to a function, that function has to return it every time I want to use it again later?
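To show how tedious this gets, here is a sketch where a function needs to return both its result and the value it was given, packed in a tuple. The function name calculate_length is just an illustrative choice.

```rust
// Takes ownership of the String, then hands it back together with the result.
fn calculate_length(s: String) -> (String, usize) {
    let length = s.len(); // len() returns the length in bytes
    (s, length) // s is moved back to the caller along with the length
}

fn main() {
    let greeting = String::from("Good morning");

    // greeting moves into the function, and we have to catch it again on the way out
    let (greeting, length) = calculate_length(greeting);

    println!("'{greeting}' is {length} bytes long");
}
```

It works, but that is a lot of ceremony just to read the length of a string.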
Fear not, there is a way for us to use values without transferring ownership. We will see that in another article about References and Borrowing. Hope it was useful!
Have fun ❤️