Purely Functional Python With Static Types
Sune Debel
Posted on July 3, 2020
Programming in functional style has never been popular among Python programmers. Even though libraries such as functools
or toolz
exist, imperative and object-oriented style is still the norm. Being a functional programming convert myself, I'm not sure why that is. I suspect that the lack of a static type system has been a big reason that Python programmers generally haven't recognized the advantages of functional programming, since the marriage between the two is what brings many of the benefits.
With PEP 484 and friends this has changed. The ecosystem around the typing module has matured to the point where a full fledged functional library with static types is practical. As a true functional programming believer this motivated me to author a library called pfun
which brings functional programming with a strong emphasis on static types to Python. pfun
distinguishes itself from other libraries such as toolz
in that
- It's designed to provide the most accurate typing possible within the limitations of the type system
- It provides a unified functional system for managing side-effects in a module called
pfun.effect
This article will explain the pfun
effect system in enough detail that you should be able to use it, and maybe even contribute to it. we'll start with an introduction to functional programming and its benefits. If you're already a true believer and functional programming aficionado, you can probably skip ahead to A Functional Effect System for Python.
What is Functional Programming And Should You Care?
In functional programming, we build our programs by combining functions.
But we only use special functions: We only allow functions that compute new results based on their arguments. Functions that do other things besides computing results, such as mutating program state or writing to a file, are not allowed.
We call these "other things" side effects. The reason we don't like functions that perform side effects is that they are harder to re-use since the programmer needs to keep track of which parts of the program (or external state that the program depends on) are affected by the side effect when invoking the function. We prefer functions without side effects to the point that we have named them pure functions, and dislike functions that perform side effects to the extent that we have named them impure functions.
Look at these two functions that both add an element to a list for example:
from typing import List
def add_with_side_effect(l: List[int], e: int) -> List[int]:
l.append(e)
return l
def add_without_side_effect(l: List[int], e: int) -> List[int]:
return l + [e]
Calling add_with_side_effect
mutates the list passed as an argument. You need to consider which other parts of the program has a reference to l
when calling it. In other words you need to think about your program globally in order to figure out if calling add_with_side_effect
is safe.
Calling add_without_side_effect
returns a new list. The list the caller has and passes as the argument l
is left intact. It doesn't matter if any other part of your program has a reference to l
since it's left unchanged. In other words you can reason about your program locally, which is much, much easier most of the time.
Functions that are side-effect-free have a number of significant advantages:
- They are truly modular
- They are easy to reason about
- They are easy to test
- They are more robust
- And more
Limiting ourselves to using pure functions also presents a challenge, since most programs in the real world need to read and write files, raise exceptions and use the network. Given that all of these things are inherently impure, it would seem that functional programming is close to useless.
In the next section we'll see how to do all of those things in a purely functional context.
A Functional Effect System for Python
How do we write functions that can handle side-effects and are still pure? To achieve this seemingly impossible feat we're going to use a clever trick. The trick is to change our impure functions such that instead of performing a side effect, they return a description of a side effect that can be executed at a later time. The most straight-forward way of doing that is to write functions that return other functions that performs the side effects. To that end we'll use the following Effect
type alias:
from typing import TypeVar, Callable
A = TypeVar('A')
Effect = Callable[[], A]
Lets define a function read_file_pure
that returns a description of the side effect of reading a file:
def read_file_pure(path: str) -> Effect[str]:
def effect() -> str:
with open(path) as f:
return f.read()
return effect
Notice that read_file_pure
really is pure: for the same argument path it always returns the same function regardless of the actual contents of the file. The effect function returned by read_file_pure
is not pure of course.
So far so good, but how do we actually do anything with the contents of a file? We could of course run the function returned by read_file_pure
immediately after calling it to get the content and then perform whatever transformation is necessary - but then our program is no longer pure!
What we'll do instead is to write a new function that takes an Effect
and applies a function to its result in a new effect. Lets call this function map
(because it maps a function over the result of executing an impure function):
from typing import TypeVar
A = TypeVar('A')
B = TypeVar('B')
def map(effect: Effect[A],
f: Callable[[A], B]) -> Effect[B]:
def new_effect() -> B:
return f(effect())
return new_effect
Again, notice that map
is pure because it is lazy: No side-effects are carried out when calling map
. It's only when the function returned by map
is called that the side-effect described by the effect
parameter is carried out.
This means that we can manipulate values returned by read_file_pure
in a pure fashion, for example in:
def lower_case_file(path: str) -> Effect[str]:
return map(impure_read_file(path),
lambda contents: contents.lower())
Most functional programs take the shape:
1) Create an Effect
to get input for the computation
2) Transform the input using map
(or a similar function)
3) Run the Effect
returned by map
Or in code:
def transform(input: str) -> str: ...
if __name__ == '__main__'
program = map(impure_read_file(path), transform)
_ = program()
In other words, program
is an impure function, built by cleverly combining lazy, pure functions. Calling impure_read_file
and map
doesn't actually do any real computation, except whats required to put together the program
function. The real computation, including side-effects, are only executed when program
is called at the edge of our python script (in the __main__
section).
That's great, but what if we want to combine several effects? For example, what if we had a function for writing strings to files without side-effects:
def write_file_pure(path: str, contents: str) -> Effect[None]:
def effect() -> None:
with open(path, 'w') as f:
f.write(contents)
return effect
If we try to use write_file_pure
with map
, we end up with an Effect[Effect[str]]
because map
only takes care of "unwrapping" the effect it takes as a parameter, not the effect returned by the function that is mapped over the result of the effect
parameter. This means that our program now looks like this:
if __name__ == '__main__':
program = map(
read_file_pure('foo.txt'),
lambda content: write_file_pure('bar.txt', content)
)
_ = program()() # program() returns an Effect
Notice that we have to call the Effect
returned by program()
to actually run the effect returned by write_file_pure
. Also realize that it's possible to produce an Effect
that wraps an arbitrary number of nested effects, for example if map
is called in a loop. We might end up in a situation where it's not easy to figure out precisely how many functions we need to call at the edge of our script!
To fix this, we introduce another function that knows how to combine an Effect
with a function that returns an Effect
. Lets call it and_then
since it's chaining together two effects sequentially:
def and_then(effect: Effect[A],
f: Callable[[A], Effect[B]]) -> Effect[B]:
def new_effect() -> B:
return f(effect())()
return new_effect
Now we can rewrite our program from before as:
if __name__ == '__main__':
program = and_then(
read_file_pure('foo.txt'),
lambda content: write_file_pure('bar.txt', content)
)
_ = program()
Notice that we only have to call program
to run all the effects involved in the computation since and_then
knows how to combine two effects. Also realize that if we use and_then
appropriately, we can combine an arbitrary number of effects, even in a loop. In other words, we have a completely general framework for working with side-effects in a pure fashion:
- Functions that need to perform side-effects return
Effect
instances - We transform the result of effects using
map
- We combine effects with functions that return effects using
and_then
Since functional programs are generally composed of chained calls to map
and and_then
, lets refactor the functions as instance methods so that we can use a nicer dot notation:
from dataclasses import dataclass
from typing import Callable, TypeVar, Generic
A = TypeVar('A', covariant=True)
B = TypeVar('B')
@dataclass(frozen=True)
class Effect(Generic[A]):
effect: Callable[[], A]
def __call__(self):
return self.effect()
def map(self, f: Callable[[A], B]) -> Effect[B]:
return Effect(lambda: f(self()))
def and_then(self, f: Callable[[A], Effect[B]]) -> IO[B]:
return Effect(lambda: f(self())())
(We've decorated the class with dataclass
to remove some boilerplate and avoid mutation of the effect
field). Now our program from before becomes:
def read_file_pure(path: str) -> Effect[str]:
def effect() -> str:
with open(path) as f:
return f.read()
return Effect(effect)
def write_file_pure(path: str, content: str) -> Effect[None]
def effect() -> None:
with open(path, 'w') as f:
f.write(content)
return Effect(effect)
if __name__ == '__main__':
program = read_file_pure(
'foo.txt'
).and_then(
lambda content: write_file_pure('bar.txt', content)
)
program()
Which arguably reads a little nicer.
Adding Error Handling To Our Effect System
So far we have only been concerned with one particular side-effect: IO. Lets now turn our attention to error handling that is also normally modeled as a side-effect in imperative programming.
Our requirement for pure functions is that they can only compute values based on their arguments. To comply with this dogma, lets define wrapper types that can be used to 'tag' values returned from functions.
from dataclasses import dataclass
from typing import Union, TypeVar, Generic
R = TypeVar('R', covariant=True)
L = TypeVar('L', covariant=True)
@dataclass(frozen=True)
class Right(Generic[R]):
get: R
@dataclass(frozen=True)
class Left(Generic[L]):
get: L
Either = Union[Left[L], Right[R]]
For convenience, we've also defined a type-alias Either
that we can use to more easily write type signatures for functions that can fail. Since it's a union type, it gives us the added benefit that our type checker can help us figure out if we are handling errors correctly.
With this in hand we can do error handling in a purely functional way:
def divide(nume: float, denom: float) -> Either[str, float]:
try:
return Right(nume / denom)
except ZeroDivisionError:
return Left('Division by zero')
That's great, but what if we need to write a function that needs to model both the IO side-effect and the error handling side-effect? We can of course wrap one in the other, e.g:
def read_file_pure(path: str) -> Effect[Either[IOError, str]]:
def effect() -> Either[IOError, str]:
try:
with open(path) as f:
return Right(f.read())
except IOError as e:
return Left(e)
return Effect(action)
Except now we have to deal with error handling every time we need to use map
or and_then
, because neither of them know how to unpack the value wrapped by an Either
:
def lower_case_file(path: str) -> Effect[Either[IOError, str]]:
read_file_pure(path).map(
lambda either: Right(either.get.lower())
if isinstance(either, Right)
else either
)
Yuck. Thankfully, there are a number of ways to avoid having this error handling code everywhere in our functional program. The solution we use in pfun.effect
is to "bake error handling into our Effect
type", in order that map
and and_then
can take care of the wrapping/unwrapping for us:
from dataclasses import dataclass
from typing import Union, TypeVar, Generic
A = TypeVar('A', covariant=True)
E = TypeVar('E', covariant=True)
B = TypeVar('B')
E2 = TypeVar('E2')
@dataclass(frozen=True)
class Effect(Generic[E, A]):
effect: Callable[[], Either[E, A]]
def __call__(self) -> Either[E, A]:
return self.effect()
def map(self, f: Callable[[A], B]) -> Effect[E, B]:
def effect():
either = self()
if isinstance(either, Left):
return either
else:
return Right(f(either.get))
return Effect(effect)
def and_then(self,
f: Callable[[A], Effect[E2, B]]) -> Effect[Union[E, E2], B]:
def effect():
either = self()
if isinstance(either, Left):
return either
else:
return Right(f(either.get)())
return Effect(effect)
We have added a type parameter E
to represent the error type. The implementations of map
and and_then
are more less identical to our previous versions, except that they expect an Either
returned from the wrapped effect
function and handle them appropriately. Notice also that the return type of and_then
is Effect[Union[E, E2], B]
, because the resulting Effect
can fail if either self()
returns a Left
value or if f(...)()
returns a Left
value.
Now the mess from before simply becomes
def read_file_pure(path: str) -> Effect[IOError, str]:
def effect() -> Either[IOError, str]:
try:
with open(path) as f:
return Right(f.read())
except IOError as e:
return Left(e)
return Effect(effect)
def lower_case_file(path: str) -> Effect[IOError, str]]:
read_file_pure(path).map(lambda content: content.lower())
Neato!
Adding The Reader Effect
The Effect
type we've cooked up so far is pretty close to what you'll find in pfun.effect
. The main difference is that the wrapped effect
function in pfun.effect
takes a generic argument R
:
from dataclasses import dataclass
from typing import TypeVar, Generic, Union
R = TypeVar('R', contravariant=True)
E = TypeVar('E', covariant=True)
E1 = TypeVar('E1')
A = TypeVar('A', covariant=True)
B = TypeVar('B')
@dataclass(frozen=True)
class Effect(Generic[R, E, A]):
effect: Callable[[R], Either[E, A]]
def __call__(self, r: R) -> Either[E, A]:
return self.effect(r)
def map(self, f: Callable[[A], B]) -> Effect[R, E, B]:
def effect(r: R) -> Either[E, B]
either = self(r)
if isinstance(either, Left):
return either
return Right(f(either.get))
return Effect(effect)
def and_then(self,
f: Callable[A, Effect[R, E1, B]]
) -> Effect[R, Union[E, E1], B]:
def effect(r: R) -> Either[E, B]:
either = self(r)
if isinstance(either, Left):
return either
return Right(f(either.get)(r))
return Effect(effect)
Allowing the wrapped effect
function to take a parameter enables dependency injection for increased code re-use and improved test-ability
For historical reasons, this is known as the reader effect. We will go into what pfun.effect
can do with this argument in a lot of detail in future posts, but for now lets just exemplify the dependency injection feature.
Say for example that you want to implement an effect that reads from a database. Using the R
parameter you can pass the connection string to the action, making it possible to re-use the same Effect
instance for reading from different databases:
def read_from_database(query: str) -> Effect[str, IOError, dict]:
def effect(connection_string: str) -> Either[IOError, dict]:
try:
database = connect(connection_string)
return Right(database.execute(query))
except IOError as e:
return Left(e)
return Effect(effect)
This allows us to combine read_from_database
with other functions through map
and and_then
, and then re-use the entire function call chain against different databases in a completely type-safe manner.
if __name__ == '__main__':
program = read_from_database('select * from user').map(lambda users: users['user1'])
test_results = program('test_user@test_database')
production_results = program('prod_user@prod_database')
Take a look at the documentation for more information on how to use the R
parameter to its full potential.
Whats Next?
There is still a significant problem with the Effect
type we have built in this post: long chains of effects combined with map
and and_then
causes RecursionError
because one effect needs to be executed before the next one can be executed.
pfun.effect
solves this problem and more:
-
pfun.effect
can build sequences of effects of arbitrary length without causing aRecursionError
when executed, because effects are run using a mechanism called trampolines -
pfun.effect
is compatible withasyncio
for high performance asynchronous IO -
pfun.effect
provides a number of useful helper functions to help you write code at a very high level of abstraction -
pfun.effect
provides a plugin for mypy that enables fine grained type inference of helper functions
In conclusion, pfun.effect
provides a full fledged purely functional, statically typed effect system for Python. Head over to the documentation to go more in depth, or to the github repo for issues, questions and contributions.
.
Posted on July 3, 2020
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.