Harness the Power of the Union Type (part 1)

marshallformula

Nate Gibbons

Posted on November 14, 2017

Originally posted at marshallformula.codes


I've been using Elm at work for about a year and a half at this point. Elm was my first experience with a purely functional language and I'm still deeply smitten. I'm only beginning to scratch the surface of what functional programming has to offer, and I keep discovering new gems of insight and making new cognitive connections. One such recent discovery was the inherent power of the Union Type.

A point of Clarification

Technically speaking - the construct that Elm uses is a Tagged Union Type:

a data structure used to hold a value that could take on several different, but fixed, types. Only one of the types can be in use at any one time, and a tag field explicitly indicates which one is in use

Additionally - other languages call it an Algebraic Data Type (or ADT).

How to use it

Before we get to some concrete examples - I'd like to clear up some misconceptions that I started with when first learning about Union Types in Elm. Here's what a Union Type looks like:

type Shape
    = Circle
    | Rectangle
    | Triangle

Based on the definition above and this example - you might be assuming (as I was) that this is basically just an enum. This is not a bad place to start. You can definitely use a Union Type in the same manner as an enum, but you would be missing the real power that it provides.

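It may help first to discuss some of the terminology when talking about Union Types. In our example above, the type is Shape, and each of the values listed under it (Circle, Rectangle, Triangle) is a constructor for that type. Strictly speaking these are data constructors (also known as tags or variants); this post calls them type constructors.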

type Shape     -- Type definition
    = Circle   -- Circle Shape constructor
    | Rectangle   -- Rectangle Shape constructor
    | Triangle -- Triangle Shape constructor

Coming from an Object-Oriented background - this delineation of terms is what really made things start to click for me. The constructors are used to create a Shape value - just like a class (which can have multiple constructors) would use constructors to instantiate a class object. Now that's just about where the similarities end so don't take that concept too far, but perhaps that association might trigger some further understanding as it did for me.

So let's start using our new Shape type:

makeCircle : Shape
makeCircle =
    Circle

makeRectangle : Shape
makeRectangle =
    Rectangle

-- you get it

Ok. Admittedly that's pretty underwhelming. But it does demonstrate how to use the Shape's constructors to create new Shape values. Let's make it a little more interesting.

Tag Payload

So far - our Shape is not much more than an enum. If you actually continued reading the Wikipedia entry referenced above you might have seen this snippet:

An enumerated type can be seen as a degenerate case: a tagged union of unit types. It corresponds to a set of nullary constructors and may be implemented as a simple tag variable, since it holds no additional data besides the value of the tag.

Did that last sentence imply that unlike an enum type - the tag of a union type can hold data? Why yes it did. Nice catch.

This is where the real power of the union type comes into play. A union type's constructors can specify additional data (think arguments to a constructor function in OO) that is required to construct that version of the union type. Now we can make our Shape a little more useful.

type Shape
    = Circle Int
    | Rectangle Int Int
    | Triangle Int

Neat! Now the Circle can store its radius, the Rectangle can store its height and width, and the Triangle can store the length of its sides (we'll keep things simple for now and only allow equilateral triangles).
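
It can help to notice that once a constructor takes a payload, it behaves like an ordinary function: Circle has the type Int -> Shape, so it can be passed to other functions or partially applied. Here's a minimal sketch, assuming the Shape definition above. The module name and the names circles and tenHigh are made up for illustration.

```elm
module ConstructorSketch exposing (Shape(..), circles, tenHigh)


type Shape
    = Circle Int
    | Rectangle Int Int
    | Triangle Int


-- Circle : Int -> Shape, so it can be handed to List.map
-- like any other function.
-- Result: [ Circle 1, Circle 2, Circle 3 ]
circles : List Shape
circles =
    List.map Circle [ 1, 2, 3 ]


-- Rectangle : Int -> Int -> Shape, so supplying only the first
-- argument leaves a function that still waits for the second.
-- (tenHigh is a hypothetical helper, not part of the original post.)
tenHigh : Int -> Shape
tenHigh =
    Rectangle 10
```

A constructor without a payload, like the earlier plain Circle, is just a value of type Shape and takes no arguments.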

Now - you might be arguing that it's not completely obvious that the Rectangle's two constructor arguments are meant to be the width and height - since all we see is that it takes two Int values. Here's a tip... alias is your friend:

type alias Radius = Int
type alias Height = Int
type alias Width = Int
type alias Side = Int

type Shape
    = Circle Radius
    | Rectangle Height Width
    | Triangle Side

That might be a little overkill for our simple example - but it illustrates how constructor definitions can be very explicit. I use this technique any time my constructor arguments might be ambiguous or confusing. I've found it incredibly valuable.
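
One caveat worth keeping in mind: a type alias is only another name for the same type, so Height and Width are both still Int to the compiler, and it will not complain if they are passed in the wrong order. The sketch below shows that, along with one possible alternative (a record payload) that makes the labels part of the value. LabelledRectangle, someHeight, someWidth, swapped and labelled are hypothetical names, not from the original code.

```elm
module AliasSketch exposing (Shape(..), labelled, swapped)


type alias Height =
    Int


type alias Width =
    Int


type Shape
    = Rectangle Height Width
    | LabelledRectangle { height : Int, width : Int } -- hypothetical variant


someHeight : Height
someHeight =
    2


someWidth : Width
someWidth =
    10


-- This compiles even though the arguments are in the wrong order,
-- because Height and Width are both just Int.
swapped : Shape
swapped =
    Rectangle someWidth someHeight


-- With a record payload the field names travel with the value,
-- so the order in which they are written no longer matters.
labelled : Shape
labelled =
    LabelledRectangle { width = 10, height = 2 }
```

The aliases are still useful as documentation; the record form is simply one option when mixing up arguments would be costly.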

We'd better fix our creator functions to reflect our new type constructors:

makeCircle : Radius -> Shape
makeCircle radius =
    Circle radius

makeRectangle : Height -> Width -> Shape
makeRectangle height width =
    Rectangle height width

makeTriangle : Side -> Shape
makeTriangle side =
    Triangle side

Ok - on to the really good stuff. Now that our type constructors carry data - we need a way to retrieve the data we're storing in the type. In order to do that we need to use pattern matching.

So let's create a function that will find the perimeter of a given shape:

pi = 3 -- rounding for simplicity

findPerimeter : Shape -> Int
findPerimeter shape =
    case shape of 
        Circle radius -> 
            2 * pi * radius

        Rectangle height width ->
            (height * 2) + (width * 2)

        Triangle side ->
            (side * 3) -- again equilateral triangle for simplicity


-- let's try it out

findPerimeter (makeCircle 5)        -- 30

findPerimeter (makeRectangle 2 10)  -- 24

findPerimeter (makeTriangle 6)      -- 18

Pretty sweet right? But what happens when we get a new requirement to support a new shape... say a Hexagon. Well, let's just add a new constructor then:

type Shape
    = Circle Radius
    | Rectangle Height Width
    | Triangle Side
    | Hexagon Side -- equal sides for simplicity

Easy Peasy. Now, perhaps you thought we had already exposed all of the awesomeness that union types had to offer? Good news...

Yep! More awesome headed your way. We've just added a new type constructor, but we forgot to account for it in the findPerimeter function. Unlike in many other languages, or with a plain enum - our friendly Elm compiler simply won't let this stand.

-- MISSING PATTERNS ---------------------------------------------

This `case` does not have branches for all possibilities.

|>    case shape of
|>        Circle radius ->
|>            2 * pi * radius
|>
|>        Rectangle height width ->
|>            (height * 2) + (width * 2)
|>
|>        Triangle side ->
|>            (side * 3)

You need to account for the following values:

    Hexagon _

Add a branch to cover this pattern!

What an incredibly helpful and informative compiler! It just saved us from our own folly. Whenever you match on a union type in a case expression, the Elm compiler checks that you've covered all of the possibilities, leading to fewer footguns and therefore - fewer bugs.
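
For completeness, here's roughly what the fix could look like. This is a sketch that assumes a regular hexagon (six equal sides), in line with the simplifications used so far. roughPi stands in for the pi = 3 used earlier, renamed here only so the sketch does not depend on redefining the pi that Elm's Basics module already provides. The module name is also made up.

```elm
module PerimeterSketch exposing (Shape(..), findPerimeter)


type alias Radius =
    Int


type alias Height =
    Int


type alias Width =
    Int


type alias Side =
    Int


type Shape
    = Circle Radius
    | Rectangle Height Width
    | Triangle Side
    | Hexagon Side


-- Rounded to a whole number for simplicity (hypothetical name).
roughPi : Int
roughPi =
    3


findPerimeter : Shape -> Int
findPerimeter shape =
    case shape of
        Circle radius ->
            2 * roughPi * radius

        Rectangle height width ->
            (height * 2) + (width * 2)

        Triangle side ->
            side * 3

        -- The new branch that the compiler asked for.
        Hexagon side ->
            side * 6
```

With that branch in place, findPerimeter (Hexagon 4) evaluates to 24. A catch-all `_ ->` branch would also satisfy the compiler, but it would silently swallow any constructor added later, which gives up the very check described above.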


In part 2, I'll be extracting even more magic from the union type. I'll be exposing some of the unique benefits of using union types when paired with concepts like recursion, encapsulation, and polymorphism.

All of the code discussed here is available in this ellie.
