Anatomy of an algebra
Menestret Martin
Posted on February 21, 2019
[Check my articles on my blog here]
I will try to group here, in an anatomy atlas, basic notions of functional programming that I find myself explaining often lately into a series of articles.
The idea here is to have a place to point people needing explanations and to increase my own understanding of these subjects by trying to explain them the best I can.
I'll try to focus more on making the reader feel an intuition, a feeling about the concepts rather than on the perfect, strict correctness of my explanations.
- Part 1: Anatomy of functional programming
- Part 2: Anatomy of an algebra
- Part 3: Anatomy of a type class
- Part 4: Anatomy of semi groups and monoids
- Part 5: Anatomy of functors and category theory
- Part 6: Anatomy of the tagless final encoding - Yet to come !
What is an algebra ?
Functional programmers tend to talk a lot about algebras, so a good starting point would be to understand what an algebra is.
A simple definition would be (from Wikipedia):
In its most general form, an algebra is the study of mathematical symbols and the rules for manipulating these symbols
Then to rephrase it in simpler terms, it is a system formed by:
- Symbols, items or "things"
- Operations on thoses "things"
- Properties and rules about those operations
A simple algebra example would be:
- Symbols: the integer numbers
- Operations: addition and multiplication
-
Properties and rules:
- Closure: the result of the two operations is itself in integer number
- Associativity:
a + (b + c) = (a + b) + c
anda × (b × c) = (a × b) × c
- Commutativity:
a + b = b + a
anda × b = b × a
- Identity elements:
a + 0 = a
anda * 1 = a
- And so on (check the link if you want to see the rest)
How does it relates to FP ?
Domain modeling
When modeling a business domain, in functional programming, we separate strongly data from behaviors (see the Data/behavior relationship section).
To do that:
- We create types on one side
- Pure functions manipulating those types
- And these functions have to respect some strict business logic
Doesn't that ring any bell ?
We manipulate algebras !
- Symbols: our business types
- Operations: our business functions acting on those types
-
Properties and rules: our business logic implemented in these functions, and our property based tests
- We won't cover those here, but to give you an idea, these are not unit tests, but tests based on function properties that they must always verify (such as: "A combination of several sales cannot result in a negative price") while tested against a wide range of more or less random values generated by a test framework
A concrete example
To make it clearer, let's take a dummy example.
Say we have to produce sales reports, we would probably model our domain like:
type Amount = Int
type Price = Int
final class ID(val value: Int) extends AnyVal
final case class Item(id: ID, price: Price)
final case class Sales(items: List[(ID, Amount)], totalPrice: Price)
trait SalesOperations {
def combineAll(sales: List[Sales]): Sales
def generateReport(sales: Sales): String
}
In that case:
-
Symbols:
Amount
,Price
,ID
,Item
,Sales
, etc. (our business types) -
Operations:
combineAll
,generateReport
(operations on those types) - Properties and rules: the business logic and the tests we haven't implemented here
That's pretty much it !
It happens frequently to have shared behaviors that are common to several types.
We want to be able to design functions that can be run on those different types and have the expected behavior depending on their input types. Those functions are said then said to be polymorphic.
That's a problem we solve with type classes.
What is an ADT ?
ADT stands for algebraic data type. Well, we talked a bit about algebras but we didn't talk about types yet.
What is a type ?
A type or a data type, is just a classification of data, a set of values or inhabitants, that we use to tell the compiler how we intend to use data so it can check that we respect the rules (that's the type checking).
-
Boolean
is a type that classifies these two values:true
,false
-
String
is a type that reprents an infinity of inhabitants, all the possible characters combinations -
Option[A]
is a type that represents all the values of typeA
(wrapped in aSome
) + the valueNone
-
Option
has a number of inhabitants equals to the number of inhabitants of the typeA
+ 1 (theNone
value) -
Option[Boolean]
has 3 inhabitants:Some(true)
,Some(false)
,None
-
Unit
is a type that has only 1 inhabitant, the value:()
Noting
is a type that has no member (there is no way to create a value whose type isNothing
)
What is an algebraic data type then ?
Here is Wikipedia's definition of an algebraic data type:
In computer programming, especially functional programming and type theory, an algebraic data type is a kind of composite type, i.e., a type formed by combining other types.
We call these new types ADTs because we create them by combining existing types... well, guess what ? We use an algebra again !
This is algebra is the following:
-
Symbols: existing types: primitive types, other existing ADTs, ... (
Sales
,Boolean
,String
,Int
, ...) - Operations: sum and product (we'll see what they mean)
- Properties and rules: some properties and laws about these sum and product operations (we'll also see what they mean)
Sum
sum is the action of combining by "summing" their respective values together.
You can see it as a way to define that: values of a sum type can only be either a value of this OR that type (the OR is the key intuition here).
It is done, in Scala, usually by using:
- sealed traits and their instances as case classes or case objects
Or less often:
- The
Either[A, B]
type
That way, when you declare:
sealed trait CoinSide
case object Heads extends CoinSide
case object Tails extends CoinSide
You are creating an ADT CoinSide
which is a sum type (created with a sum operation) which has two and only two possible values (or inhabitants), Heads
or Tails
.
The same would be achieved with:
type CoinSide = Either[Heads, Tails]
Whose possible values would then be, Left(Heads)
or Right(Tails)
.
These are just two different encodings for the same thing.
Sum properties
The sum operation also have properties.
I won't go into full details here but (these examples would equally be true with sealed traits + their implementations instead of Either
type):
-
The number of values of a sum type is the sum of the values of its composing types members (just as you would assume for addition on integers)
-
Boolean
has two inhabitants:true
andfalse
-
Either[Boolean, Boolean]
which is a sum type of two typesBoolean
has four inhabitants:Left(true)
Left(false)
Right(true)
Right(false)
-
Either[Boolean, Either[Boolean, Boolean]]]
is a sum type between aBoolean
and another sum type of aBoolean
and aBoolean
- Well guess what ? It has 2 + (2 + 2) values !
-
-
It has an identity element which is the
Nothing
type which has no values at all-
Either[Boolean, Nothing]
is a sum type of aBoolean
with the sum identity. Because there is no way to create a value of typeNothing
, it does not exist, so there is no way to construct aRight
, it has only two values:Left(true)
Left(false)
-
It is associative,
Either[Boolean, Either[Boolean, Boolean]]]
is the same asEither[Either[Boolean, Boolean]],Boolean]
, you can enumerate the values, you'll see (well isomorphic, we have the same "expressive power" with both representations, but let's say they are the same) !And so on (I'll give you more material at the end if you want to keep diving)
Product
product is the action of combining two or more types together by "multiplying" their respective values together.
You can see it as a way to define that values of a product type are the combination of values of this AND that type (the AND is the key intuition here).
It is done, in Scala usually by using:
- case classes
Or less often
- tuples
That way, when you declare:
case class TwoCoinsAndABoolean(fst: CoinSide, snd: CoinSide, b: Boolean)
// or
type TwoCoinsAndABoolean = (CoinSide, CoinSide, Boolean)
These are just two different encodings for the same thing.
You are creating an ADT TwoCoinsAndABoolean
which is a product types (created with a product operation) which has the number of values of its members multiplied.
In our case 8 values (2 * 2 * 2):
-
TwoCoinsAndABoolean(Heads, Heads, true)
or(Heads, Heads, true)
-
TwoCoinsAndABoolean(Heads, Tails, true)
or(Heads, Tails, true)
-
TwoCoinsAndABoolean(Tails, Heads, true)
or(Tails, Heads, true)
-
TwoCoinsAndABoolean(Tails, Tails, true)
or(Tails, Tails, true)
-
TwoCoinsAndABoolean(Heads, Heads, false)
or(Heads, Heads, false)
-
TwoCoinsAndABoolean(Heads, Tails, false)
or(Heads, Tails, false)
-
TwoCoinsAndABoolean(Tails, Heads, false)
or(Tails, Heads, false)
-
TwoCoinsAndABoolean(Tails, Tails, false)
or(Tails, Tails, false)
You can observe that product types values are the cartesian product of their composing types values !
Product properties
The product operation also have properties.
I won't go into full details here but (these example would equally be true with case classes instead of tuples):
-
The number of values of a product types is the product of the number of the values of its combining members (as you would assume for multiplication on integers)
-
Boolean
has two inhabitants:true
andfalse
-
(Boolean, Boolean)
which is a product type of two typesBoolean
has four values:(true, true)
(true, false)
(false, true)
(false, false)
-
(Boolean, (Boolean, Boolean)
is a product type between aBoolean
and another product type of aBoolean
and aBoolean
- Well guess what ? It has 2 * (2 * 2) values !
-
-
It has an identity which is the famous
Unit
type which has only one inhabitant, the value()
-
(Boolean, Unit)
is a product type of aBoolean
with the product identity. It has two values:(true, ())
(false, ())
-
It is associative,
(Boolean, (Boolean, Boolean)
is the same as((Boolean, Boolean),Boolean)
, you can enumerate the values, you'll see (well isomorphic, we have the same "expressive power" with both representations, but let's say they are the same) !And so on (I'll give you more material at the end if you want to keep diving)
Mixed types
Of course, as we saw in some of the examples, ADTs can be combinations of other ADTs, such as sum of products, products of sums, product of products and so on.
More material
If you want to keep diving deeper, some interesting stuff can be found on my FP resources list and in particular:
- Functional and Reactive Domain Modeling (An awesome book about functional domain modeling using algebras)
- Kinds of types in Scala
- Why do Functional Programmers always talk about Algebra(s)?
Conclusion
All that long digression was only meant to show you that, creating new composite data types the way we do in pure FP is done by using an algebra.
That's why these composite types are called Algebraic Data Types :).
To sum up, we learnt:
- What was an algebra
- How algebras were used to model domains in a neat way and how it was adequate to the pure functional programming approach
- What was a type
- How creating new data types was done by using an algebra composed by types and operations on them and thus why the types created that were called algebraic data types
I'll try to keep that blog post updated.
If there are any additions, imprecision or mistakes that I should correct or if you need more explanations, feel free to contact me on Twitter or by mail !
Posted on February 21, 2019
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.
Related
November 25, 2024
November 13, 2024