Adding Union to Swift with Metaprogramming
Ivan Goremykin
Posted on February 2, 2023
TL;DR
We generate a set of enums (Union2, Union3, etc.) that act like a disjoint union. For every UnionX, we also provide a selection of helper methods: conformance to the standard Swift protocols, higher-order functions, etc. The code is generated using a Sourcery template written in Swift. All the source code, including a Swift playground, Sourcery templates, configuration files, and scripts, is available on GitHub. You can also run the code on Repl.it.
Motivation
The essence of many mobile applications is manipulating lists of elements. Sometimes, we need to process lists where elements are not of the same type. Unfortunately, Swift doesn’t provide an out-of-the-box type-safe solution that doesn’t require developers to write boilerplate code.
For example, if we are developing a music streaming app, we might want to display a list of all items that the user has previously viewed: [AlbumA, AlbumB, ArtistA, PlaylistA, ArtistB]. In that case, we will need to store objects of the Album, Artist, and Playlist types in the same array.
An obvious type-safe solution would be introducing an enum with associated values:
enum RecentlyViewedItem {
    case album(Album)
    case artist(Artist)
    case playlist(Playlist)
}
Next time we want to display a list of all downloaded items — which are only albums and playlists. In that case, we will have to introduce another enum, consisting of only 2 elements:
enum DownloadedItem {
    case album(Album)
    case playlist(Playlist)
}
It seems that introducing a new type that only carries one of the other possible types is a bit excessive. It would be great to avoid introducing new types and to have a generic approach that preserves type safety, i.e., we don’t want something like [AnyObject] to carry objects of different types.
What Swift already has
When we need to glue together a bunch of unrelated types without introducing a new type, we use a tuple. It works like the AND operator for multiple types: (TypeA, TypeB) represents TypeA AND TypeB. A tuple is a product type, one of the common classes of algebraic data types.
However, if we want to unite multiple types together but have only one of them instantiated (TypeA XOR TypeB), Swift doesn’t provide us with any built-in means of doing that without introducing a new type. This scenario represents another typical class of algebraic data types: a disjoint union.
One can say that a tuple works like an anonymous struct. Our goal is to do for enums what tuples do for structs.
What other languages have
There are a couple of languages that feature first-class support of disjoint unions. In these languages, an expression representing TypeA XOR TypeB can be called a union, a tagged union, etc.
For instance, TypeScript’s type system allows building new types out of existing ones using a large variety of operators, including unions:
function printId(id: number | string) {
    console.log("Your ID is: " + id);
}
printId(101);
printId("202");
Scala provides first-class support of union types:
case class Username(name: String)
case class Password(hash: Hash)
def help(id: Username | Password) =
  val user = id match
    case Username(name) => lookupName(name)
    case Password(hash) => lookupPassword(hash)
  …
Bosque’s type system also supports unions. As in the previous examples, the TypeA | TypeB notation specifies a type that may be either TypeA or TypeB. Side note: Bosque is quite an interesting language to study on its own, be sure to check it out.
Previous work
The Swift community came up with a couple of implementations of Either, a generic enum that carries one of 2 types associated with it (here, here, and here). There has been a discussion about adding Either to the Standard Library. Still, it seems we’re not going to have it in the foreseeable future because adding disjunctions (logical ORs) in type constraints is a commonly rejected evolution proposal. Funny enough, Apple has an internal implementation of Either in their standard library, but they are not sharing it with us ¯\_(ツ)_/¯
Anyway, Either is a handy data structure, and many teams have adopted it in their codebase. However, it limits the number of types to 2 and often lacks some useful helper functions.
What we’re going to do
We’re going to introduce a set of enums Union2, Union3, etc. and add a bunch of helper methods that cover Swift standard protocols, data transformations, and some other popular use cases. To avoid writing these UnionX types by hand, we will generate them using metaprogramming.
Metaprogramming in Swift
The most popular tool for generating code in Swift is Sourcery — a tool developed by Krzysztof Zabłocki. It works like this: you provide a template file, Sourcery parses your source code and generates code based on your template and the parsed source code. Sourcery can be used as a standalone executable or embedded right into the Xcode building process as a Run Script phase. It automatically regenerates code on any changes in your template file or in the project source files.
Generating Unions
We’re going to implement a Sourcery template for generating enum UnionX, where X is the size of the enum. X is going to be an argument of the Sourcery template. We will go through various aspects of UnionX using Union2 as an example.
Type Definition
We’re going to start with defining a type. This is what our Unions are going to look like:
enum Union2<Item0, Item1> {
    case item0(Item0)
    case item1(Item1)
}
The meta code for generating a definition of UnionX can be found here.
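To give a feel for what such meta code has to produce, here is a minimal sketch in plain Swift that renders the definition of UnionX for a given size. The function makeUnionDefinition and its parameters are hypothetical and are not taken from the actual Sourcery template; the sketch only mirrors the shape of the generated code shown above.

```swift
// Hypothetical helper, not part of the actual template: renders the source
// code of `enum UnionX` for a given size as a string.
func makeUnionDefinition(size: Int, accessLevel: String = "internal") -> String {
    precondition(size >= 2, "A union needs at least two wrapped types")

    // "Item0, Item1, ..." used as the generic parameter list.
    let genericParameters = (0..<size)
        .map { "Item\($0)" }
        .joined(separator: ", ")

    // One case per wrapped type, indented with 4 spaces.
    let cases = (0..<size)
        .map { "    case item\($0)(Item\($0))" }
        .joined(separator: "\n")

    return """
    \(accessLevel) enum Union\(size)<\(genericParameters)> {
    \(cases)
    }
    """
}

print(makeUnionDefinition(size: 2))
```

Calling the function with a larger size yields Union3, Union4, and so on, which is the reason the size is worth treating as a template argument.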
Initializers
We want to be able to initialize our Unions simply by providing a value for one of the wrapped values:
let unionFromArtist: Union2<Artist, Album> = .init(Artist.sample())
let unionFromAlbum: Union2<Artist, Album> = .init(Album.sample())
The initializers themselves are going to look like this:
extension Union2 {
    init(_ item0: Item0) {
        self = .item0(item0)
    }
    init(_ item1: Item1) {
        self = .item1(item1)
    }
}
Here is the meta code for generating an initializer.
Getters
Now that we can define and instantiate our union, we want to access its wrapped values:
let unionFromArtist: Union2<Artist, Album> = .init(Artist.sample())
let artist = unionFromArtist.item0 // Artist(…)
let album = unionFromArtist.item1 // nil
The getters themselves look unsurprisingly boring:
extension Union2 {
    var item0: Item0? {
        if case let .item0(item0) = self {
            return item0
        }
        return nil
    }
    var item1: Item1? {
        if case let .item1(item1) = self {
            return item1
        }
        return nil
    }
}
It’s great that we didn’t have to write them manually, thanks to the respective meta code. Note that we omit type names in the if-case-let expressions to improve readability.
Setters
If you have a stateful variable, you will need to mutate it. Since we can’t overload the assignment operator in Swift, we will introduce a bunch of setter methods:
var selectedItem: Union2<Artist, Album> = …
selectedItem.set(artist)
selectedItem.set(album)
The setters themselves look like this:
extension Union2 {
    mutating func set(_ item0: Item0) {
        self = .init(item0)
    }
    mutating func set(_ item1: Item1) {
        self = .init(item1)
    }
}
Note that in the setters’ code we’re using initializers that we have generated in the previous step. The corresponding meta code is here.
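The initializers, getters, and setters can be exercised together in one small, self-contained sketch. It substitutes String and Int for the Artist and Album types, which are not defined in this text, so the sample values are purely illustrative.

```swift
enum Union2<Item0, Item1> {
    case item0(Item0)
    case item1(Item1)
}

extension Union2 {
    init(_ item0: Item0) { self = .item0(item0) }
    init(_ item1: Item1) { self = .item1(item1) }

    var item0: Item0? {
        if case let .item0(item0) = self { return item0 }
        return nil
    }

    var item1: Item1? {
        if case let .item1(item1) = self { return item1 }
        return nil
    }

    mutating func set(_ item0: Item0) { self = .init(item0) }
    mutating func set(_ item1: Item1) { self = .init(item1) }
}

// String stands in for Artist, Int stands in for Album.
var selectedItem: Union2<String, Int> = .init("Artist A")
print(selectedItem.item0 as Any) // the wrapped String
print(selectedItem.item1 as Any) // nil, because the union holds a String

// The argument type picks the overload, so this switches the union to its second case.
selectedItem.set(42)
```

Overload resolution relies on the wrapped types being distinct; with two identical wrapped types the calls to init and set become ambiguous.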
Transforming a Union
Many data operations boil down to transforming the wrapped values. Like many of Swift’s native types, our Unions will have map and flatMap higher-order functions.
map (generated as map0, map1, etc., one per wrapped type) returns a new UnionX, mapping its wrapped value using the given transformation:
extension Union2 {
    func map0<Transformed0>(_ transform: (Item0) throws -> Transformed0) rethrows -> Union2<Transformed0, Item1> {
        switch self {
        case .item0(let item0):
            return .init(try transform(item0))
        case .item1(let item1):
            return .init(item1)
        }
    }
    func map1<Transformed1>(_ transform: (Item1) throws -> Transformed1) rethrows -> Union2<Item0, Transformed1> {
        switch self {
        case .item0(let item0):
            return .init(item0)
        case .item1(let item1):
            return .init(try transform(item1))
        }
    }
}
flatMap returns a new UnionX, mapping its wrapped value using the given transformation and unwrapping the produced result:
extension Union2 {
    func flatMap0<Transformed0>(_ transform: (Item0) throws -> Union2<Transformed0, Item1>) rethrows -> Union2<Transformed0, Item1> {
        switch self {
        case .item0(let item0):
            return try transform(item0)
        case .item1(let item1):
            return .init(item1)
        }
    }
    func flatMap1<Transformed1>(_ transform: (Item1) throws -> Union2<Item0, Transformed1>) rethrows -> Union2<Item0, Transformed1> {
        switch self {
        case .item0(let item0):
            return .init(item0)
        case .item1(let item1):
            return try transform(item1)
        }
    }
}
Note that we’re trying to stay close to Apple’s naming and signatures when defining our higher-order functions. The respective meta code can be found here and here.
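A short usage sketch may help to see the difference between the two functions. It is self-contained, uses Int and String instead of real model types, and constructs cases directly rather than through the generated initializers; the sample values are made up for illustration.

```swift
enum Union2<Item0, Item1> {
    case item0(Item0)
    case item1(Item1)
}

extension Union2 {
    func map0<Transformed0>(
        _ transform: (Item0) throws -> Transformed0
    ) rethrows -> Union2<Transformed0, Item1> {
        switch self {
        case .item0(let item0):
            return .item0(try transform(item0))
        case .item1(let item1):
            return .item1(item1)
        }
    }

    func flatMap0<Transformed0>(
        _ transform: (Item0) throws -> Union2<Transformed0, Item1>
    ) rethrows -> Union2<Transformed0, Item1> {
        switch self {
        case .item0(let item0):
            return try transform(item0)
        case .item1(let item1):
            return .item1(item1)
        }
    }
}

// Int stands in for a numeric ID, String for a textual one.
let id: Union2<Int, String> = .item0(101)

// map0 changes only the first wrapped type: Union2<Int, String> becomes Union2<Double, String>.
let scaled: Union2<Double, String> = id.map0 { Double($0) / 2 }

// flatMap0 lets the transformation itself decide which case the result ends up in.
let validated: Union2<Int, String> = id.flatMap0 { value in
    value > 0 ? .item0(value) : .item1("invalid id \(value)")
}
```

With map0 the result always stays in the same case as the input, while flatMap0 can move a value from the first case into the second one.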
Transforming a sequence of Unions
In case we have a sequence of Unions, we might want to leave only those of a particular type:
let unions: [Union2<Artist, Album>] = […]
let artists: [Artist] = unions.compactMap0()
let albums: [Album] = unions.compactMap1()
To achieve that, we will define an extension for Sequence where we add a set of functions similar to compactMap(_:): compactMap0, compactMap1, etc.
Due to the limitations of Swift’s generics, we can’t simply define an extension for Sequence where Element == UnionX, because UnionX is generic over its wrapped types and an extension cannot declare generic parameters of its own. We can overcome this limitation by taking three steps:
1. Introduce a protocol:
protocol Union2Protocol {
    associatedtype Item0
    associatedtype Item1
    var item0: Item0? { get }
    var item1: Item1? { get }
}
2. Make an extension for Sequence:
extension Sequence where Element: Union2Protocol {
    func compactMap0() -> [Element.Item0] {
        return compactMap { $0.item0 }
    }
    func compactMap1() -> [Element.Item1] {
        return compactMap { $0.item1 }
    }
}
3. Auto-conform UnionX to UnionXProtocol:
extension Union2: Union2Protocol {}
Note that we are again using getters that we have previously generated to make the code of compactMapX more compact. You can find the respective meta code here.
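Put together, the three steps can be tried out in a self-contained sketch. String and Int stand in for Artist and Album here, and the getters are declared directly in the conforming extension to keep the example short.

```swift
enum Union2<Item0, Item1> {
    case item0(Item0)
    case item1(Item1)
}

protocol Union2Protocol {
    associatedtype Item0
    associatedtype Item1
    var item0: Item0? { get }
    var item1: Item1? { get }
}

// The generic parameters Item0 and Item1 satisfy the associated types of the same name.
extension Union2: Union2Protocol {
    var item0: Item0? {
        if case let .item0(item0) = self { return item0 }
        return nil
    }
    var item1: Item1? {
        if case let .item1(item1) = self { return item1 }
        return nil
    }
}

extension Sequence where Element: Union2Protocol {
    func compactMap0() -> [Element.Item0] { compactMap { $0.item0 } }
    func compactMap1() -> [Element.Item1] { compactMap { $0.item1 } }
}

// String stands in for Artist, Int stands in for Album.
let recentlyViewed: [Union2<String, Int>] = [.item0("a"), .item1(1), .item0("b")]
let strings = recentlyViewed.compactMap0() // ["a", "b"]
let numbers = recentlyViewed.compactMap1() // [1]
```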
Conforming to standard Swift protocols
There is a set of standard Swift protocols that we use daily, including Basic Behaviors (Equatable, Hashable, etc.), Error, Codable, and others. It would be great if our UnionX conformed to each of these protocols whenever all of its wrapped types also conform to it.
Equatable
The Swift compiler can automatically synthesize conformance to Equatable if all associated values also conform to Equatable. So we will just generate a line like this:
extension Union2: Equatable where Item0: Equatable, Item1: Equatable {}
It would be convenient if we could compare a union directly with a value of one of its wrapped types. That is particularly useful when writing asserts in a Unit-test:
var actuallySelected: Union2<Artist, Album> = …
let expectedToBeSelected: Album = …
// XCTAssertEqual expects both arguments to have the same type, so we compare with == directly.
XCTAssertTrue(actuallySelected == expectedToBeSelected)
To achieve this, we will need to generate a set of equality functions:
extension Union2: Equatable where Item0: Equatable, Item1: Equatable {
    static func ==(_ union: Self, _ item0: Item0) -> Bool {
        return union.item0 == item0
    }
    static func ==(_ union: Self, _ item1: Item1) -> Bool {
        return union.item1 == item1
    }
    static func ==(_ item0: Item0, _ union: Self) -> Bool {
        return union.item0 == item0
    }
    static func ==(_ item1: Item1, _ union: Self) -> Bool {
        return union.item1 == item1
    }
}
Note that we’re using Self instead of Union2<Item0, Item1>. That will improve readability, especially for Union8<Item0, Item1, Item2, Item3, Item4, Item5, Item6, Item7> and the like. Also, we are again using the code that we had generated in the previous step: the getters union.item0 and union.item1. This helps us reduce the number of lines of code and improve readability. Here is the meta code for this extension.
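The mixed comparisons can be seen in action in the following self-contained sketch, which uses String and Int as stand-ins for Artist and Album and only includes the operators that take the union on the left-hand side.

```swift
enum Union2<Item0, Item1> {
    case item0(Item0)
    case item1(Item1)
}

extension Union2 {
    var item0: Item0? {
        if case let .item0(item0) = self { return item0 }
        return nil
    }
    var item1: Item1? {
        if case let .item1(item1) = self { return item1 }
        return nil
    }
}

// The compiler synthesizes ==(Self, Self); the operators below add the mixed comparisons.
extension Union2: Equatable where Item0: Equatable, Item1: Equatable {
    static func ==(_ union: Self, _ item0: Item0) -> Bool { union.item0 == item0 }
    static func ==(_ union: Self, _ item1: Item1) -> Bool { union.item1 == item1 }
}

// String stands in for Artist, Int stands in for Album.
let selected: Union2<String, Int> = .item1(7)

let matchesAlbum = selected == 7           // compares against the wrapped Int
let matchesArtist = selected == "Artist A" // false: the union holds an Int
let matchesUnion = selected == .item1(7)   // uses the synthesized ==(Self, Self)
```

Comparing against a value of the "wrong" wrapped type is not a compile-time error; it simply evaluates to false because the corresponding getter returns nil.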
Hashable
This one is simple because the Swift compiler can automatically synthesize conformance to Hashable if all of the enum’s associated values also conform to Hashable:
extension Union2: Hashable where Item0: Hashable, Item1: Hashable {
}
Here’s how this line has been generated using metaprogramming.
Sendable
For those using concurrency features available in Swift 5.7, it will be convenient if UnionX conforms to Sendable whenever all of its wrapped types also conform to Sendable. The code looks largely unremarkable:
extension Union2: Sendable where Item0: Sendable, Item1: Sendable {
}
You can find the relevant meta code here.
String representation
Adding conformance to CustomStringConvertible and CustomDebugStringConvertible is shockingly trivial:
extension Union2: CustomStringConvertible {
    var description: String {
        switch self {
        case .item0(let item0):
            return "Union2.item0(\(item0))"
        case .item1(let item1):
            return "Union2.item1(\(item1))"
        }
    }
}
extension Union2: CustomDebugStringConvertible {
    var debugDescription: String {
        switch self {
        case .item0(let item0):
            return "Union2.item0(\(item0))"
        case .item1(let item1):
            return "Union2.item1(\(item1))"
        }
    }
}
We use the same textual representation in both cases, which can be easily altered in the respective meta code.
Error
We can make UnionX conform to Error if all wrapped types of UnionX also conform to Error. Foundation gives every Error a localizedDescription property. Since every type wrapped by the Union conforms to Error, each of them provides localizedDescription. To make our code neater, we will first define a computed property that returns the inner error and then use it to implement localizedDescription:
extension Union2: Error where Item0: Error, Item1: Error {
    var innerError: Error {
        switch self {
        case .item0(let item0):
            return item0
        case .item1(let item1):
            return item1
        }
    }
    var localizedDescription: String {
        return innerError.localizedDescription
    }
}
We will also leave innerError accessible to UnionX's clients since this property is useful on its own. The respective meta code can be found here.
CaseIterable
The CaseIterable protocol defines a type that provides a collection of all possible values. In case all of UnionX’s wrapped types conform to CaseIterable, we can make UnionX conform to CaseIterable as well. This is useful when writing unit tests.
Consider an example where we have a DataProvider that can return either NetworkError or StorageError:
protocol DataProviderProtocol {
    func downloadFile(
        fileDescriptor: FileDescriptor,
        completion: @escaping (Result<FileMetadata, DataProvideError>) -> Void
    )
}
typealias DataProvideError = Union2<NetworkError, StorageError>
enum NetworkError: Error, Equatable, CaseIterable {
    case noInternetConnection
    case authenticationFailed
    case decodingFailed
}
enum StorageError: Error, Equatable, CaseIterable {
    case diskFull
    case fileAlreadyExists
}
… and there is another system that we want to test, which depends on DataProvider:
final class AudioProvider: AudioProviderProtocol {
    init(
        dataProvider: DataProviderProtocol,
        userPlanProvider: UserPlanProviderProtocol
    ) { … }
    func downloadAudioFile(
        audioFileDescriptor: AudioFileDescriptor,
        completion: @escaping (Result<AudioFileMetadata, AudioProviderError>) -> Void
    ) { … }
}
enum AudioProviderError: Error, Equatable {
    case userNotEligibleForAudioDownloads
    case userNotEligibleForAudioDownloadsOfQuality(AudioQuality)
    case dataProvider(DataProvideError)
}
We can write a unit test that asserts that if there is an error returned by a sub-system (DataProvider), the system under test (AudioProvider) will return the correct error:
func test_systemReturnsExpectedError_ifErrorInDataProvider() {
    let dataProviderMock = DataProviderMock()
    let userPlanProviderMock = UserPlanProviderMock()
    let sut = AudioProvider(
        dataProvider: dataProviderMock,
        userPlanProvider: userPlanProviderMock
    )
    for dataProvideError in DataProvideError.allCases {
        dataProviderMock.downloadFileMockFunc.returns(.failure(dataProvideError))
        var actualResult: Result<AudioFileMetadata, AudioProviderError>?
        sut.downloadAudioFile(
            audioFileDescriptor: .sample(),
            completion: { actualResult = $0 }
        )
        XCTAssertEqual(
            actualResult,
            .failure(.dataProvider(dataProvideError))
        )
    }
}
Since static stored properties are not supported in generic types, UnionX’s allCases is going to be a computed property:
extension Union2: CaseIterable where Item0: CaseIterable, Item1: CaseIterable {
    static var allCases: [Self] {
        return Item0.allCases.map { .init($0) } + Item1.allCases.map { .init($0) }
    }
}
Note that we’re using initializers that we have previously generated for UnionX. Also, DataProvideError, which is a Union2, conforms to Error because both of its wrapped types do, thanks to the code that we have generated before. The meta code for providing conformance to CaseIterable can be found here.
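As a quick, self-contained check of how allCases composes, the sketch below redeclares the two error enums with only the CaseIterable conformance. Unlike the generated code, it refers to the enum cases directly instead of the generated initializers, which is a simplification made for this example.

```swift
enum Union2<Item0, Item1> {
    case item0(Item0)
    case item1(Item1)
}

extension Union2: CaseIterable where Item0: CaseIterable, Item1: CaseIterable {
    static var allCases: [Self] {
        // All cases of the first wrapped type, followed by all cases of the second one.
        return Item0.allCases.map(Self.item0) + Item1.allCases.map(Self.item1)
    }
}

enum NetworkError: CaseIterable {
    case noInternetConnection
    case authenticationFailed
    case decodingFailed
}

enum StorageError: CaseIterable {
    case diskFull
    case fileAlreadyExists
}

let allErrors = Union2<NetworkError, StorageError>.allCases
print(allErrors.count) // 3 network errors + 2 storage errors = 5
```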
Codable
This one is not straightforward. Encoding a wrapped type that conforms to Encodable is trivial. However, decoding encoded data into UnionX is not, since we don’t know in advance which type to decode. Making UnionX conform to Decodable and Encodable is a massive topic on its own. For that reason, it has been covered in a separate article.
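To illustrate why decoding is the hard part, here is one naive strategy: try each wrapped type in order and keep the first one that decodes. This is only a sketch under that assumption and is not necessarily the approach taken in the separate article.

```swift
import Foundation

enum Union2<Item0, Item1> {
    case item0(Item0)
    case item1(Item1)
}

// Assumption for this sketch: the first wrapped type that decodes successfully wins.
extension Union2: Decodable where Item0: Decodable, Item1: Decodable {
    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        if let item0 = try? container.decode(Item0.self) {
            self = .item0(item0)
        } else {
            // If this fails as well, the error of the last attempt is thrown.
            self = .item1(try container.decode(Item1.self))
        }
    }
}

// Encoding simply forwards to whichever wrapped value is present.
extension Union2: Encodable where Item0: Encodable, Item1: Encodable {
    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        switch self {
        case .item0(let item0):
            try container.encode(item0)
        case .item1(let item1):
            try container.encode(item1)
        }
    }
}

let json = Data("[1, \"two\", 3]".utf8)
let decoded = (try? JSONDecoder().decode([Union2<Int, String>].self, from: json)) ?? []
```

The weakness of this strategy shows up as soon as the encoded forms of the wrapped types overlap: for U2<Int, Double>, for example, a whole number in the payload would always be decoded as the first type.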
Final touches
Our set of unions is ready to be used, but let’s add some final polishing.
We can make our client code more compact if we use a shortened name for every UnionX: Union2 → U2, Union3 → U3, etc. For every UnionX, we will generate a type alias UX:
typealias U2 = Union2
To simplify navigation in the generated code, we add // MARK: comments for every section that we described above, for every UnionX.
We also provide control over the access level of the generated code. It’s up to your team to decide whether you want your UnionX and its extensions to be public or internal. You can see how the access level parameter is defined and used in the meta code.
The Sourcery template that generates UnionX can be found here. It accepts the following Sourcery arguments:
- max Union size to be generated,
- access level of generated code.
The Sourcery configuration file is available here.
Discussion
We have written meta code for generating a set of UnionX types.
You can check both meta-code and generated code
- on GitHub, where it is available as a Swift playground and a set of scripts and configuration files for running Sourcery
- or on Repl.it, where you can run the whole thing in a browser
I chose 9 as the maximum size of a UnionX to be generated. Why 9? Because that was the number of members of the Fellowship of The Ring. So, in the end, we did it for Frodo.
More things to consider
We have generated some code, but we didn’t take any measures to prevent misusing it.
For instance, it is possible to write and compile code where all or some wrapped types are the same: U2<TypeA, TypeA>. It’s a valid type from the compiler’s perspective but not from a logic perspective. Ideally, the compiler should generate a compile-time warning for such cases. Unfortunately, we don’t have static assertions in Swift, though there is a proposal that, as of the moment this article is being written, is “awaiting implementation”.
There is an option to add run-time asserts for every UnionX's initializer, function, and computed property:
init(_ item0: Item0) {
    assert(Item0.self != Item1.self)
    self = .item0(item0)
}
init(_ item1: Item1) {
    assert(Item0.self != Item1.self)
    self = .item1(item1)
}
… which will probably bring more clutter than value. We will leave it as an exercise for the reader.
Another thing that we need to keep in mind is that U2<TypeA, TypeB> and U2<TypeB, TypeA> are different types from the compiler’s perspective but are identical from the logic perspective since they both represent the same intention. Ideally, the client should be able to seamlessly work with any permutation of wrapped types carried by a UnionX, so that it would be possible to compare values of types U2<TypeA, TypeB> and U2<TypeB, TypeA> directly, store them in the same strictly typed array, etc. To keep our reader occupied, we’ll leave generating permutations as another exercise.
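As a starting point for that exercise, a manual conversion between the two permutations of a Union2 can be sketched as follows. The swapped property is a hypothetical helper and is not part of the generated code; a full solution would have to cover every permutation of every UnionX.

```swift
enum Union2<Item0, Item1> {
    case item0(Item0)
    case item1(Item1)
}

extension Union2 {
    // Hypothetical helper: converts U2<A, B> into the permuted U2<B, A>.
    var swapped: Union2<Item1, Item0> {
        switch self {
        case .item0(let item0):
            return .item1(item0)
        case .item1(let item1):
            return .item0(item1)
        }
    }
}

let original: Union2<Int, String> = .item0(1)
let permuted: Union2<String, Int> = original.swapped // the Int now lives in the second case
```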
How about writing U2<TypeA, Void>? Sure, why not. As we know, Swift’s Void is just an empty tuple, so it’s another expression valid from a compiler point of view, but not from the underlying logic of a union. U2<TypeA, Void> is just TypeA?, so the compiler should probably generate warnings for such cases.
Type U2<TypeA, TypeA?> can be represented as an optional type TypeA?. The type U2<TypeA?, TypeB?> can be simplified as U2<TypeA, TypeB>?. Another set of compiler warnings for these cases would be nice.
And what about nesting unions, e.g. U2<TypeA, U2<TypeB, TypeC>>? There are plenty of cases where nested unions can be flattened:
- U2<TypeA, U2<TypeB, TypeC>> → U3<TypeA, TypeB, TypeC>
- U2<TypeA, U2<TypeA, TypeB>> → U2<TypeA, TypeB>
- U2<U2<TypeA, TypeB>, U2<TypeB, TypeC>> → U3<TypeA, TypeB, TypeC>
- etc.
Taking care of permutations, optionals, and nesting simultaneously is quite challenging, which probably explains why we don’t have first-class disjoint unions in Swift.
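The simplest of these flattening rules, U2<A, U2<B, C>> → U3<A, B, C>, can be written by hand as a free function. The function name flatten is hypothetical, and the sketch deliberately ignores the harder rules where wrapped types repeat.

```swift
enum Union2<Item0, Item1> {
    case item0(Item0)
    case item1(Item1)
}

enum Union3<Item0, Item1, Item2> {
    case item0(Item0)
    case item1(Item1)
    case item2(Item2)
}

// Hypothetical helper: flattens U2<A, U2<B, C>> into U3<A, B, C>.
func flatten<A, B, C>(_ nested: Union2<A, Union2<B, C>>) -> Union3<A, B, C> {
    switch nested {
    case .item0(let a):
        return .item0(a)
    case .item1(.item0(let b)):
        return .item1(b)
    case .item1(.item1(let c)):
        return .item2(c)
    }
}

let nested: Union2<Int, Union2<String, Bool>> = .item1(.item1(true))
let flat = flatten(nested) // Union3<Int, String, Bool> holding the Bool in its third case
```

Each additional nesting shape needs its own overload, which hints at how quickly the number of combinations grows.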
Writing metaprogramming code in Swift
Guidelines
- Make generated code readable by reducing LOC
  - use Self instead of Union5<T0, T1, T2, T3, T4>
  - completely skip the type where possible, e.g. in if case let .item0(item0) = self expressions
- Reuse generated code inside other parts of your generated code
  - e.g. we use generated accessors in Equatable operators and compactMapX
- Apply the same rules to generated code that are considered good practice for hand-written code
  - stay close to Apple’s naming and signatures
  - follow Apple’s API Design Guidelines, etc.
Metaprogramming pipeline
- we have written our meta-programming code using a set of internal utilities for generating enums, functions, etc.
- we used Sourcery only for rendering the final result into a generated .swift file
- the generated code features minimum formatting: we hardcoded the indentation to be 4 spaces and use Egyptian brackets everywhere
- the idea is to separate code generation from code formatting
- if you want your code to be formatted with a specific set of rules in mind, you can forward the generated code to a formatting utility, e.g. swift-format
Acknowledgments
I would like to thank Vyacheslav Shakaev for his constructive criticism and valuable comments on the draft version of this text and Nadezhda Zhubreva for making an editorial illustration for the article U3<🦊, 🦉, 🦌>.