Abstraction & Type-Safety using Singleton Variants

rawtoast

RawToast

Posted on February 15, 2020

Abstraction & Type-Safety using Singleton Variants

Variants are sold as one of Reason's more powerful features, often demonstrated to show polymorphic pattern matching; however, they have another interesting use case by boxing datatypes as singleton variants to create something similar to a Value Class or newtype.

From previous with Scala I am used to creating Value Classes, which can be compared to Haskell's newtype. These constructs allow the developer to express greater levels of type information and abstraction in their code, with little or no runtime performance penalty. It's possible to achieve the same effect in ReasonML with singleton variants.

What is a Value Class?

A value class is a simple wrapper around a primitive type that gives you more control over the input and output of functions. This has a number of benefits, such as restricting construction to validated values or simply aiding with passing around many parameters into a function.

These are very easy to construct in Scala by extending AnyVal

case class Name(value: String) extends AnyVal
Enter fullscreen mode Exit fullscreen mode

Whilst it looks like there is an additional overhead here; after all, the String has been boxed inside a class which you would expect would need to be instantiated each time – in the JVM the wrapping class is removed after compilation. So there should not be a performance cost when wrapping types in this manner. There is just one minor issue, if you wish to access the underlying String then you have to manually access it:

val name = Name("Cat")

println("The name is: " + name.value)
Enter fullscreen mode Exit fullscreen mode

You can achieve something similar in ReasonML by boxing the type within a single-argument variant, which I'll demonstrate later.

Why would you want to do this?

Essentially to make your code more descriptive and to prevent mistakes. This is probably best illustrated with examples. So let us imagine you have a function type signature for a simple function to create a person:

let makePerson: (string, string, string, int) => unit;
Enter fullscreen mode Exit fullscreen mode

As simple as the definition is, it may leave you wondering about a number of things: how do you distinguish between the meaning of these fields? Which holds the first name and which the surname? Just what exactly is that integer? Why are there three string parameters?

Sure, you could probably work those questions out by looking at the output type, and yes, I deliberately left it as unit to make life hard. Still, this function could be storing its output in a database or mutable dictionary somewhere and unit could be an acceptable output type.

So in order to answer that question, you may wish to use named parameters instead. And that's a reasonable solution:

let makePerson: (
  ~firstName: string,
  ~surname: string, 
  ~hometown: string, 
  ~age: int
) => unit 
Enter fullscreen mode Exit fullscreen mode

Now at least you can identify what goes where and it would be acceptable to finish here. Still, this has some minor issues that can be addressed. For example, you could accidentally pass in a name into the hometown field.

Another alternative would be to use type aliases for the fields, which would make the method more descriptive without the overhead of typing the labels each time:

type firstName = string;
type surname = string;
type hometown = string;
type age = int;

let makePerson: (
  firstName,
  surname, 
  hometown, 
  age) => unit
Enter fullscreen mode Exit fullscreen mode

Whilst very readable, this code is no safer than the original implementation. Aliases don't provide any protection and you can pass any string as any of the function's parameters.

In both solutions, the string type is still being used for three different things; however, in Scala is possible to abstract the string away by using Value Classes. Let's quickly demonstrate that:

case class FirstName(value: String) extends AnyVal
case class Surname(value: String) extends AnyVal
case class Hometown(value: String) extends AnyVal
case class Age(value: String) extends AnyVal

abstract def makePerson(
  firstName: FirstName,
  surname: Surname, 
  hometown: Hometown,
  age: Age): Person

// Or if you simply wanted to use a constructor
case class Person(
  firstName: FirstName,
  surname: Surname, 
  hometown: Hometown,
  age: Age)

Enter fullscreen mode Exit fullscreen mode

In the above example, unlike simple type aliases, you cannot pass a FirstName into say the Hometown field. Each of those types is independent of the primitive type it wraps.

So how do we do this in Reason?

So how do we do this in Reason? Well, we can box the primitive types within single-argument variants.

type firstName = FirstName(string);
type surname = Surname(string);
type hometown = Hometown(string);
type age = Age(int);

let makePerson: (
  firstName,
  surname, 
  hometown, 
  age) => unit = (a, b, c, d) => ();
Enter fullscreen mode Exit fullscreen mode

Now it isn't possible to accidentally pass in a hometown as a surname, any such mistake would cause the program to not compile. Whilst this is only a simple example, this becomes more useful the bigger your solution gets. Anywhere else in the codebase it would no longer be possible to mistake a surname for a string or an age for an int.

A common situation for this in a larger application is for id fields. You may end up with int being used for a user id, post id, account id, payment id, group id, and so on. If these types are abstracted within singleton variants, then we can differentiate between the types.

Now, at some point, you will need to unbox the values from these singleton variants. You could use a switch, but that's a little long-winded. Instead, try using fun instead:

let name = FirstName("Dave");

let nameString = name |> fun | FirstName(str) => str;
Enter fullscreen mode Exit fullscreen mode

Isn't there a performance cost?

Unlike Scala, the above example can come with a penalty. In older versions of Reason, it will construct the variant as a single argument array. Accessing the value in the code above is like accessing an array using myArray[0]. For example, if you use one of the online Reason editors the above name construction may compile to:

var name = /* FirstName */["Dave"];
Enter fullscreen mode Exit fullscreen mode

However, since Bucklescript release 7.1.0 we are able to use unboxed to get around this! What is this? Let's look at the OCaml manual:

unboxed can be used on a type definition if the type is a single-field record or a concrete type with a single constructor that has a single argument. It tells the compiler to optimize the representation of the type by removing the block that represents the record or the constructor (i.e. a value of this type is physically equal to its argument).

This now means a singleton variant is not compiled as an array but is instead unboxed to the underlying type. Essentially, like with Scala, the OCaml compiler will erase the singleton variant in a later stage of compilation as it's not required at runtime. To use this mark the type as [@unboxed] like so:

[@unboxed]
type hometown = Hometown(string);
let tokyo = Hometown("tokyo");
Enter fullscreen mode Exit fullscreen mode

This will then be unboxed from the array during compilation:

var tokyo = "tokyo";
Enter fullscreen mode Exit fullscreen mode

So no more performance penalties! According to the release notes, this can also be used to unbox singleton records. Note, that whilst the release notes are for development version, this feature was released with bs-platform@7.1.0.

Whether you prefer to use singleton variants or records for this is a personal choice. I've included a small demonstration of using singleton records for this at Itazura.io.

💖 💪 🙅 🚩
rawtoast
RawToast

Posted on February 15, 2020

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related