Discriminated unions in 2024

asik

André Slupik

Posted on September 18, 2024

Here are some slightly edited notes I took while researching a solution for discriminated unions in C#. I assume the reader is familiar with discriminated unions, wants to use them, has summarily searched Google and StackOverflow already, and is looking for more information beyond the basics.

What about OneOf?

OneOf is implemented as a struct with one non-overlapping field per case, so:

  • ❌ It can grow large in size.
  • ❌ It introduces value semantics complications: default instances, mutations, copying and passing by reference.
  • ❌ Plus, it's not serializable; an unrelated library does that, but forces me to take a dependency on Newtonsoft.Json.
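
To make the size and default-instance points concrete, here is a minimal sketch of a struct-based union. `StructUnion<T0, T1>` and `StructUnionDemo` are hypothetical names invented for illustration; this is not OneOf's actual source, only the general shape of a struct with one field per case.

```csharp
using System;

// Hypothetical sketch of a struct-based union; not OneOf's actual source.
public readonly struct StructUnion<T0, T1>
{
    // One field per case: the fields do not overlap, so the struct's size
    // grows with the number (and size) of the cases.
    private readonly T0 _value0;
    private readonly T1 _value1;
    private readonly int _index; // 0 => T0 is active, 1 => T1 is active

    private StructUnion(int index, T0 value0, T1 value1)
    {
        _index = index;
        _value0 = value0;
        _value1 = value1;
    }

    public static StructUnion<T0, T1> FromT0(T0 value) => new(0, value, default!);
    public static StructUnion<T0, T1> FromT1(T1 value) => new(1, default!, value);

    public T Match<T>(Func<T0, T> func0, Func<T1, T> func1) =>
        _index == 0 ? func0(_value0) : func1(_value1);
}

public static class StructUnionDemo
{
    // default(...) bypasses both factory methods: the index is 0 and the
    // T0 field is null, so the "string" case is active without a string.
    public static string DescribeDefault()
    {
        StructUnion<string, int> unset = default;
        return unset.Match(
            s => s is null ? "string case, but null" : s,
            i => i.ToString());
    }
}
```

Calling `StructUnionDemo.DescribeDefault()` shows the problem: a value nobody constructed still claims to hold case 0. A class-based design cannot get into that state, because an unset reference is simply `null`.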

Can we do better?

I experimented with a modernized version of what I was doing circa 2014: a generic base class with subclasses for each case, where Match is a virtual method, e.g.

abstract record Union<T0, T1>
{
    public abstract T Match<T>(Func<T0, T> func0, Func<T1, T> func1);

    public record Case0(T0 Value) : Union<T0, T1>
    {
        public override T Match<T>(Func<T0, T> func0, Func<T1, T> func1) => 
            func0(Value);
    }

    public record Case1(T1 Value) : Union<T0, T1>
    {
        public override T Match<T>(Func<T0, T> func0, Func<T1, T> func1) => 
            func1(Value);
    }
}

While this approach eliminates value type concerns, some issues remain:

  • ❌ It's annoying to have to alias these, and a global using alias is only visible inside the project that declares it, so it has to be repeated in every assembly:
  global using EntityState = Union<MyGame.StartingState, MyGame.EndingState>;
  • ❌ Serializing this requires a JsonConverterFactory, which means the user must remember to pass a JsonSerializerOptions object referencing said factory to every serialization call.
  • ❌ More importantly, the Match function takes N delegates, meaning every call to Match with capturing lambdas causes N (typically 2-5) Func allocations.
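
As a rough illustration of the allocation point, here is a sketch of a typical call site. The `Union<T0, T1>` record is repeated (with `sealed` added) only so the block compiles on its own; `MatchDemo.Describe` and its `prefix` parameter are hypothetical and exist just to force the lambdas to capture something.

```csharp
using System;

// Same shape as the Union<T0, T1> record above, repeated so this compiles alone.
public abstract record Union<T0, T1>
{
    public abstract T Match<T>(Func<T0, T> func0, Func<T1, T> func1);

    public sealed record Case0(T0 Value) : Union<T0, T1>
    {
        public override T Match<T>(Func<T0, T> func0, Func<T1, T> func1) => func0(Value);
    }

    public sealed record Case1(T1 Value) : Union<T0, T1>
    {
        public override T Match<T>(Func<T0, T> func0, Func<T1, T> func1) => func1(Value);
    }
}

public static class MatchDemo
{
    // Hypothetical helper: both lambdas capture `prefix`, so each call builds
    // a closure object plus one delegate per lambda before Match even runs,
    // although only one of the delegates ends up being invoked.
    public static string Describe(Union<int, string> value, string prefix) =>
        value.Match(
            number => $"{prefix}: number {number}",
            text => $"{prefix}: text {text}");
}
```

Lambdas that capture nothing are cached by the compiler, so the cost mostly shows up in code like the above, which in my experience is the common case.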

Can we do better?

Records offer a relatively clean syntax for what approximates a real union type:

abstract record EntityState
{
    public record StaticState(Vector2 Position) : EntityState;
    public record MovingState(Vector2 Position, Vector2 Velocity) : EntityState;
}
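
As a sketch of how such a hierarchy gets consumed, the block below repeats the records (with `sealed` added) and adds a hypothetical helper, `EntityStateDemo.PositionAfter`, that is not part of the original notes. It relies only on the `Deconstruct` methods that positional records generate.

```csharp
using System;
using System.Numerics;

public abstract record EntityState
{
    public sealed record StaticState(Vector2 Position) : EntityState;
    public sealed record MovingState(Vector2 Position, Vector2 Velocity) : EntityState;
}

public static class EntityStateDemo
{
    // Hypothetical helper: where is the entity after `dt` seconds?
    public static Vector2 PositionAfter(EntityState state, float dt) => state switch
    {
        // Positional patterns deconstruct the record right in the arm.
        EntityState.StaticState(var position) => position,
        EntityState.MovingState(var position, var velocity) => position + velocity * dt,

        // The compiler cannot see that the two arms above cover every case.
        _ => throw new InvalidOperationException($"Unhandled case: {state.GetType().Name}")
    };
}
```

For example, `EntityStateDemo.PositionAfter(new EntityState.MovingState(new Vector2(1, 2), new Vector2(3, 4)), 2f)` yields the position (7, 10).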

This means we rely on switch expressions rather than a Match function with lambdas for each case. Unfortunately, unlike lambdas, switch expressions don't allow blocks of code. However, we can emulate them like this:

static T Invoke<T>(Func<T> func) => func();

var result = entityState switch
{
    StaticState staticState => Invoke(() =>
    {
        // Do whatever here, you got a block of code
        return 0;
    }),

    MovingState movingState => 1, // etc.

    _ => throw new Exception()
};

If you squint, the Invoke(() => noise practically disappears. This still incurs allocations, but it's only 1 per match rather than N.
We also leverage the full power of pattern matching here (deconstruction etc.), which isn't accessible in lambda parameters.
The thing that bugs me the most is the non-exhaustiveness and the unnecessary yet mandatory default clause, e.g.

_ => throw new Exception()
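
Here is a small sketch of why that default clause is a maintenance hazard. `FallingState` and `ExhaustivenessDemo.Describe` are hypothetical additions for illustration: the new case compiles cleanly and only fails once it reaches the switch at runtime. (Leaving the default arm out is no better: the compiler then reports warning CS8509 whether or not every case is actually covered.)

```csharp
using System;
using System.Numerics;

public abstract record EntityState
{
    public sealed record StaticState(Vector2 Position) : EntityState;
    public sealed record MovingState(Vector2 Position, Vector2 Velocity) : EntityState;

    // Hypothetical case added later, after Describe below was written.
    public sealed record FallingState(Vector2 Position) : EntityState;
}

public static class ExhaustivenessDemo
{
    public static string Describe(EntityState state) => state switch
    {
        EntityState.StaticState => "static",
        EntityState.MovingState => "moving",

        // Still compiles after FallingState was added: the discard arm hides the gap.
        _ => throw new InvalidOperationException($"Unhandled case: {state.GetType().Name}")
    };
}
```

Nothing flags `Describe` when `FallingState` is introduced; the first hint is an exception when a falling entity is described.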

I tried ExhaustiveMatching, and found that:

  • I still have to add a default clause to every switch
  • Every time I add a default clause, I have to remember to use their special exception to trigger the analyzer
  • When I add a case to the type I have to remember to add it to their special Closed attribute type list
    • And no, this can't be automated with a source generator, as it will not run before the analyzer and there is no way to control that.

That's still a lot of potential for error. As a result, I do not feel compelled to use this analyzer.

We can make these record hierarchies serializable without forcing specific JsonSerializerOptions everywhere, by adding a JsonDerivedTypeAttribute to the base type for each case.
This can be automated with a source generator, which requires marking the records partial so the source generator can add an identically named partial record with the appropriate attributes.

To ensure the developer does not forget the partial keyword, I had the generator emit a warning for every non-partial abstract record.
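
A minimal sketch of the result, assuming .NET 7 or later (where System.Text.Json gained `JsonDerivedTypeAttribute`). The second partial declaration stands in for what a generator could emit; its exact output and the discriminator strings "static" and "moving" are assumptions, as is the `JsonDemo` helper. The cases use plain floats instead of Vector2 because Vector2 exposes X and Y as fields, which System.Text.Json skips by default.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hand-written part: just the cases, with the base record marked partial.
public abstract partial record EntityState
{
    public sealed record StaticState(float X, float Y) : EntityState;
    public sealed record MovingState(float X, float Y, float Dx, float Dy) : EntityState;
}

// Hypothetical generator output: one attribute per case, on a second partial part.
[JsonDerivedType(typeof(EntityState.StaticState), "static")]
[JsonDerivedType(typeof(EntityState.MovingState), "moving")]
public abstract partial record EntityState
{
}

public static class JsonDemo
{
    // Serializing through the base type makes the serializer write the discriminator.
    public static string Serialize(EntityState state) =>
        JsonSerializer.Serialize<EntityState>(state);

    // No custom JsonSerializerOptions needed: the attributes carry the mapping.
    public static EntityState? Deserialize(string json) =>
        JsonSerializer.Deserialize<EntityState>(json);
}
```

Round-tripping `new EntityState.StaticState(1, 2)` through `JsonDemo.Serialize` and `JsonDemo.Deserialize` should give back an equal `StaticState`, with the case recorded in a `$type` property of the JSON.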

In summary, this final solution:

  • ✔️ leverages modern language features: records, switch expressions, pattern matching
  • ✔️ does not require usage of a specific library except for an optional source generator
  • ✔️ does not incur excessive allocations (use of the Invoke trick above is optional)
  • ✔️ does not suffer from value semantics complications
  • ✔️ is serializable with minimal noise and no added maintenance burden (just a partial keyword)
  • ✔️ approximates the proposed syntax for "Union Classes" in C#
  • ❌ fails to achieve exhaustiveness
  • ❌ is somewhat slow (switch on types results in several casts and branches)

This may be the best compromise until C# adds support for some kind of closed type hierarchies (e.g. Type Union Proposal).
