Using Types To Prevent Unit Conversion Errors

stevepryde

Steve Pryde

Posted on March 9, 2020

What if we could use static types to prevent passing a value in Centimetres to a function that expected Inches? Or what if we could prevent passing a Person id to a function that expected a Book id?

In this article I will show you a technique I use all the time that can accomplish just that. I did not come up with this technique. I just find it incredibly useful.

I will give examples in Rust (and a few in TypeScript) but you can adapt this technique to other statically typed languages as well.

Wrapping Ids

Suppose you have a function like this:

fn give_book_to_person(person_id: i32, book_id: i32) {
    let person = get_person(person_id);
    person.books.push(book_id); 
}

What would happen if you accidentally called it with the ids in the wrong order? Like this:

give_book_to_person(book_id, person_id);

This could result in a runtime error (could not find book? could not find person?) or, worse, it could silently give the wrong book to the wrong person! We used static types, but both ids have the same type, so the compiler won't complain. This kind of bug can be hard to track down and cause headaches for you later.

So how do we avoid it?

We wrap the ids in their own types. This lets the type system work for us, and can all but eliminate this class of bugs.

Let's see how it works.

struct PersonId(i32);
struct BookId(i32);

fn give_book_to_person(person_id: PersonId, book_id: BookId) {
    let person = get_person(person_id);
    person.books.push(book_id);
}

Now if we call this with the ids the wrong way around, we will get a compile-time error. Congratulations, you have just saved yourself a headache and a lot of time!
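To see the check in action, here is a minimal self-contained sketch. The `Person` struct and the body of `get_person` are assumptions made up so the example compiles on its own (the snippets above do not define them), and `give_book_to_person` returns the `Person` here purely so the result can be inspected.

```rust
#[derive(Debug, PartialEq)]
struct PersonId(i32);

#[derive(Debug, PartialEq)]
struct BookId(i32);

// Hypothetical Person type, assumed for illustration only.
struct Person {
    id: PersonId,
    books: Vec<BookId>,
}

// Assumed helper: a real program would look the person up somewhere.
fn get_person(person_id: PersonId) -> Person {
    Person { id: person_id, books: Vec::new() }
}

fn give_book_to_person(person_id: PersonId, book_id: BookId) -> Person {
    let mut person = get_person(person_id);
    person.books.push(book_id);
    person
}

fn main() {
    let person = give_book_to_person(PersonId(1), BookId(42));
    assert_eq!(person.id, PersonId(1));
    assert_eq!(person.books, vec![BookId(42)]);

    // Swapping the arguments is rejected by the compiler
    // (error[E0308]: mismatched types), so this line stays commented out:
    // give_book_to_person(BookId(42), PersonId(1));
}
```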

But wait, how do we access the wrapped value?

In Rust, you can do it like this:

let id: i32 = person_id.0;

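But it is much better to keep the internal value private and provide an accessor method instead. In Rust, the field of a tuple struct is already private outside the module that defines it, so other modules have to go through the accessor.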

struct PersonId(i32);

impl PersonId {
    pub fn id(&self) -> i32 {
        self.0
    }
}

fn main() {
    let person_id = PersonId(1);
    let id = person_id.id();
    assert_eq!(id, 1);
}

Another pattern is to use the dereference operator to access the internal value.

use std::ops::Deref;

struct PersonId(i32);

impl Deref for PersonId {
    type Target = i32;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

fn main() {
    let person_id = PersonId(1);
    let id: i32 = *person_id;
    assert_eq!(id, 1);
}

In other languages you might use an object and access the member variable by name. For example, in TypeScript you might do this:

class PersonId {
    constructor(private readonly id: number) {}
    // A plain method (not a `get` accessor), so it can be called as getId().
    getId(): number { return this.id; }
}

let personId = new PersonId(1);

// We can access the actual id like this:
let myId = personId.getId();

Typically the internal value should only be accessed when outputting the value, or when calling an external function from another library. By keeping all access to the internal value within the same file, we reduce the potential for mistakes.
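One way to enforce this in Rust is to put the wrapper in its own module, because the inner field is then unreachable from anywhere else. The sketch below is only one possible layout: the module name `ids`, the `new` constructor and the `person#` output format are all hypothetical. It implements `Display` so the id can be printed without exposing the raw `i32`.

```rust
mod ids {
    use std::fmt;

    // The inner i32 is private: code outside `ids` cannot write `person_id.0`.
    pub struct PersonId(i32);

    impl PersonId {
        // Hypothetical constructor, needed because the field is private.
        pub fn new(raw: i32) -> Self {
            PersonId(raw)
        }
    }

    // Outputting the value is one of the few places that needs the raw id.
    impl fmt::Display for PersonId {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(f, "person#{}", self.0)
        }
    }
}

use ids::PersonId;

fn main() {
    let person_id = PersonId::new(1);
    println!("{}", person_id);
    assert_eq!(person_id.to_string(), "person#1");

    // let raw = person_id.0; // does not compile: the field is private here
}
```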

Equality comparison

Sometimes you will want to compare one id against another to see if they match. Remember, we want to limit access to the internal value to as few places as possible, so the equality check is best done via operator overloading or via an instance method on the type itself.

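In Rust we can simply derive the PartialEq and Eq (i.e. equality) traits. PartialEq is the one that provides the == operator; Eq additionally marks the comparison as a full equivalence relation.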

#[derive(Eq, PartialEq)]
struct PersonId(i32);

fn main() {
    let person_a = PersonId(1);
    let person_b = PersonId(1);
    if person_a == person_b {
        println!("It's a match!");
    }
}

If your language supports operator overloading, you can use a similar approach. However, a language like TypeScript does not support operator overloading, so we would instead do something like this:

class PersonId {
    // `private` still lets equals() read other.id, since both are PersonId.
    constructor(private readonly id: number) {}

    equals(other: PersonId): boolean {
        return this.id === other.id;
    }
}

let personA = new PersonId(1);
let personB = new PersonId(1);
if (personA.equals(personB)) {
    console.log("It's a match!");
}

Feel free to add any other comparison methods you find useful. By only adding these methods to the class / object itself, we limit the places that access the internal value, and thus significantly reduce the possibility of making a mistake.

Everywhere else, we use the PersonId object, and the compiler will be able to check that we always supplied an instance of the correct type.
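In Rust, further comparison behaviour can usually be derived rather than written by hand. As a sketch (the extra `Debug`, `Clone`, `Copy`, `Hash`, `PartialOrd` and `Ord` derives and the sample names are additions for illustration, not part of the examples above), the following lets a `PersonId` be sorted and used as a `HashMap` key without any outside code touching the inner `i32`:

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
struct PersonId(i32);

fn main() {
    // Hash + Eq allow the wrapper to be used directly as a map key.
    let mut names: HashMap<PersonId, &str> = HashMap::new();
    names.insert(PersonId(2), "Bob");
    names.insert(PersonId(1), "Alice");
    assert_eq!(names.get(&PersonId(1)), Some(&"Alice"));

    // Ord allows sorting a list of ids.
    let mut ids = vec![PersonId(3), PersonId(1), PersonId(2)];
    ids.sort();
    assert_eq!(ids, vec![PersonId(1), PersonId(2), PersonId(3)]);
}
```

Because the derives live on the type itself, the comparison logic stays in one place, in keeping with the goal of limiting access to the internal value.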

Unit Conversion

By now you should be able to see other uses for this technique. My other favourite is in unit conversion.

Suppose we want to convert between Centimetres and Inches. We might provide a function like this:

fn convert_inches_to_cm(inches: f32) -> f32 {
    inches * 2.54
}

fn get_rect(width: f32, height: f32) -> (f32, f32) {
    // How big is the rect going to be?

    let width_cm = convert_inches_to_cm(width);
    let height_cm = convert_inches_to_cm(height);
    (width_cm, height_cm)
}

Do you see the problem?

We (or someone else working on the same code) might call get_rect() with values that are already in cm, and get unexpected results.
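A small sketch of how that mistake plays out, reusing the two functions above in slightly condensed form (the sample values of 10 and 20 are made up for illustration):

```rust
fn convert_inches_to_cm(inches: f32) -> f32 {
    inches * 2.54
}

fn get_rect(width: f32, height: f32) -> (f32, f32) {
    (convert_inches_to_cm(width), convert_inches_to_cm(height))
}

fn main() {
    // The caller believes get_rect() takes centimetres...
    let width_cm = 10.0;
    let height_cm = 20.0;

    // ...so the values are scaled by 2.54 anyway, and nothing complains.
    let (w, h) = get_rect(width_cm, height_cm);
    assert!((w - 25.4).abs() < 1e-4);
    assert!((h - 50.8).abs() < 1e-4);
}
```

The code compiles and runs, but the rectangle is 2.54 times too large, because an `f32` carries no information about its unit.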

How do we avoid this? We wrap each unit in its own type!

struct Centimetres(f32);

impl From<Inches> for Centimetres {
    fn from(value: Inches) -> Self {
        Centimetres(value.0 * 2.54)
    }
}

struct Inches(f32);

impl From<Centimetres> for Inches {
    fn from(value: Centimetres) -> Self {
        Inches(value.0 / 2.54)
    }
}

// Might as well make the Rect a type too.
struct RectInCM {
    width: Centimetres,
    height: Centimetres
}

fn get_rect_cm(width: Inches, height: Inches) -> RectInCM {
    RectInCM {
        width: width.into(),
        height: height.into(),
    }
}

Great! Now it is very unlikely you will get the units wrong again. More headaches and debugging time avoided!

But we can do even better. Rust supports generics, which means we can write the get_rect_cm() function in a way that supports either unit.

fn get_rect_cm<T>(width: T, height: T) -> RectInCM 
where
    T: Into<Centimetres>
{
    RectInCM {
        width: width.into(),
        height: height.into(),
    }
}

Generics are beyond the scope of this article, but basically what this is saying is that the width and height parameters must both be the same type, and that type can be anything that implements the trait Into<Centimetres>.

Now you can call get_rect_cm() and pass in either Centimetres or Inches and it will always produce the correct RectInCM struct.
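Here is a self-contained usage sketch of the generic version. The `Debug`, `Clone` and `Copy` derives and the sample measurements are additions for illustration. Note that passing `Centimetres` works without writing any extra conversion, because the standard library provides a blanket `impl<T> From<T> for T`.

```rust
#[derive(Debug, Clone, Copy)]
struct Centimetres(f32);

#[derive(Debug, Clone, Copy)]
struct Inches(f32);

impl From<Inches> for Centimetres {
    fn from(value: Inches) -> Self {
        Centimetres(value.0 * 2.54)
    }
}

struct RectInCM {
    width: Centimetres,
    height: Centimetres,
}

fn get_rect_cm<T>(width: T, height: T) -> RectInCM
where
    T: Into<Centimetres>,
{
    RectInCM {
        width: width.into(),
        height: height.into(),
    }
}

fn main() {
    // Inches are converted via the From<Inches> impl.
    let from_inches = get_rect_cm(Inches(1.0), Inches(2.0));
    assert!((from_inches.width.0 - 2.54).abs() < 1e-5);
    assert!((from_inches.height.0 - 5.08).abs() < 1e-5);

    // Centimetres pass through unchanged via the reflexive conversion.
    let from_cm = get_rect_cm(Centimetres(2.54), Centimetres(5.08));
    assert!((from_cm.width.0 - 2.54).abs() < 1e-5);
    assert!((from_cm.height.0 - 5.08).abs() < 1e-5);

    // Mixing units in one call does not compile, because both
    // parameters share the single type parameter T:
    // get_rect_cm(Inches(1.0), Centimetres(5.08));
}
```

If mixed units are wanted, one option would be to give each parameter its own type parameter, at the cost of a slightly longer signature.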

Pretty neat, hey?

I hope you find this technique useful.
