A Portrait of the Language Designer as a Young Man: Interview with Louis Pilfold

Serokell

Posted on November 26, 2021


In this edition of our video interviews, my guest is Louis Pilfold, the creator of the Gleam programming language. Gleam is a fast, friendly, and functional language for building type-safe, scalable systems. It runs on BEAM, the same VM that Erlang and Elixir use.

We talk about how humanities can help you become a better software developer, compiler development, Rust, and, of course, Gleam.

Below, you can find a few highlighted answers from the interview that are edited for clarity.

Interview with Louis Pilfold: highlights

It’s actually really interesting to learn that you have been doing acting. For how long were you acting and did it go anywhere?

How long? I think, at one point my mother decided I was too much of a socially awkward nerd that spent all his time on the computer.

So she was like: “you’ve got to go to theater classes,” or something like that. I was maybe 11 or so. And it just turned out to be a really wonderful experience, actually. I did a lot of that all the way through my teenage years. And then I probably stopped around the time I went to university because I wasn’t able to find the correct feeling theater troupe for me. So I wasn’t able to continue, but it was fantastically useful. I’d probably recommend anyone, if they have a slightly nerdy kid, shove them through the door of the theater and get them to learn some other skills.

Right. For many of us, the closest we get to the theater is improvisation at the Dungeons and Dragons table. Which, actually, in my experience helps quite a bit in business when it comes to coming up with something on the spot during some negotiation or something like that.

Yeah. I think you’re right. Weirdly – and I don’t think I ever would have expected this when I was younger – but the skill that I learned from school or from any kind of education that’s been most applicable to working in business has been theater. I haven’t used so much of all that maths and science and English and stuff, but just the ability to get up in front of a collection of people and try and convey something, or to put yourself in some kind of role, in some particular pair of shoes, is really useful, particularly when you’re new to a role or a team or a company or anything. And you might have imposter syndrome, like, well, I obviously can’t do this, but I can pretend to be someone who can do this.

And it turns out that pretending to do something in these situations is kind of the same thing as actually doing it. So, yeah, the subtle art of faking it until you’re making it is super useful.

Do you think that actually there is something about humanities that is important for developers during their day-to-day job?

Yeah, I think so. We’ve sort of already touched on how theater can give you a different set of skills. That’s super applicable in working in a team, particularly when trying to communicate in different ways.

I think that does extend to other humanities subjects. Writing code is just a small part of the act of building software as part of a team. You know, you could be a 10x programmer, but you won’t be able to compete with an entire team of people working at their best.

So if you can do something that improves the team as a whole, that’s going to be super impactful. That’s almost certainly going to be more important than just working on your own individual output.

But then I think there’s also – and I think this gets talked about more and more, particularly on Twitter and social media and stuff – but, you know, there’s also the impact that things have on the world. Software developers say: “Oh, well, I just make the app or something,” but the things we make alter people’s lives, and a really extreme example would be working in gambling or another industry that could be seen as quite predatory, that impacts people’s lives. And in working on those things, I think we are complicit. So I think it’s good to take an outlook that is encouraged by lots of these humanities subjects to look at the bigger picture, you know, what is the thing you’re actually doing?

Yes, we’re shipping code, but like, what does that do? There’s a lot of wonderful technology, like facial recognition, fantastically interesting technology, really powerful, and all these amazing things we can do with machine learning. But if you apply it quite haphazardly, such as in some of the rollouts we’ve seen in law enforcement, you know, and you ship what is effectively a buggy model because you haven’t thought about the fact that your training data doesn’t represent the whole of society – it only works on white people, for example, or something like that – then you’ve accidentally done something that causes harm.

So it’s really important to tap this high-level outlook on the world because, fundamentally, people are the thing that matters really, you know, that’s the thing we’re changing – the people that we interact with via the things we make.

So, yeah, I think it’s really worthwhile and it’s more fun, you know, being a multifaceted person.

Can you say a couple of words about the importance of automated tools to help teams build software, be it like a compiler or a test runner? How do computers actually help us build more stuff on computers?

It’s important to think a lot about how the tools that we work with impact us on a daily basis and how the things we make impact people. And I’m largely of the opinion that programming, particularly, in a professional context, particularly, if you have to operate the software running in production, is actually really stressful.

Like, it’s really challenging. It’s a lot of work and, you know, sometimes you get to the end of the day, like: “gosh, I’m exhausted, I’ve actually had a really bad time because I’ve been trying to find that bug or production’s gone down,” or “oh, I really wanted to get that feature done today, but I just couldn’t quite get all the pieces to line up.”

That really sucks. We’ve built an ecosystem where people get to the end of the day and they feel drained. That’s awful. And so I really want to take away as much of that pain as possible. Lots of people have different ideas about how you do this, but my thesis, the idea that I’m subscribing to, is that we really want the computer to do as much work as possible. If there’s something that we can get the computer to do for you, that’s one less thing you have to worry about. And maybe you’ve got to play the game of the tool or the compiler, or the typechecker, whatever it is, but in exchange that should give you back more time.

In Gleam, that means we’re making very heavy use of static analysis. We want the compiler to tell you: “hey, you’ve made a change,” or “you need to go to this file, this file, this file, and this file, and then everything will be consistent again.” That’s great because, you know, if you’ve ever worked in a very large code base in Java or, worse, Python or Ruby, or JavaScript, just the act of finding things, particularly if you’re new to a team and then maybe you’ve also got that imposter syndrome thing as well – that can be really very stressful. So, you know, the more we can build into these automated tools, the better, I think.

Could you talk about Gleam’s philosophy, and perhaps add a few words about why you’re targeting BEAM in the first place?

Yes. It’s tricky cause I’m so used to talking to Erlangers.

There’s this base level. “We are Erlang people, we both are talking about Erlang, let’s just continue and talk about types and stuff”, but no, it’s really important to talk about Erlang. Because it’s such an unusual ecosystem. They’ve got very different attitudes towards things than a lot of other ecosystems do.

And so you can’t just say, oh, it’s Erlang. It’s good. I would say Erlang to some extent has the same sort of philosophy in that lots of things are difficult and challenging when operating software and we want to reduce that risk, reduce that stress as much as possible.

I think that the motivation for making things more reliable in Erlang was probably not “they want to make the programmers less stressed,” but more “they wanted their business to be more reliable,” but it has the same impact, right? Fewer engineers being paged at three in the morning to fix something.

In Erlang, there was this idea that a system should run forever and never die, and this comes from the fact that Erlang was originally made for firmware for telephony devices. You know, so a box that sits at the top of the telegraph pole and routes phone calls between different handsets and stuff.

And there’s only one box on top of one telegraph pole with all those wires going into it. So if it goes down, you’ve got a huge problem. You really need to make sure that this thing never dies. And so they’ve designed systems not so that errors can’t happen in the same way we see in Rust or Haskell or, you know, languages that lean into static verification, but instead the other way, they said: “No matter what happens, you’re going to have some errors. There could be a bug in the compiler. Even if the compiler says everything’s perfect, there could still be a bug there and something is wrong. Or, what if your computer gets struck by a bolt of lightning, and suddenly a bunch of ones become zeros and zeros become ones in the memory of a hard drive or something, something could go catastrophically wrong.”

How do you deal with that with static analysis? Well, you can’t really, so we need to have a different system. The idea is how can we embrace the fact that things are going to go wrong because the universe is against us and deal with that. And they’ve got a number of techniques, but the core idea, I think, is the idea of process isolation. And it’s kind of like a cruise ship, really. So if you think about it, a modern cruise ship is divided into sections, and if they strike a rock or an iceberg or something, and it punctures the side of the ship, water rushes through the hole and it floods a single compartment, but because they have this isolation between the different parts of the ship, the ship stays afloat.

Rather than having to do something right now to stop the ship from sinking, they can sail to the harbor or the dock or wherever it is that the ship goes, I don’t know. And then they can repair it in a way that is, you know, obviously still challenging. And it’s obviously still a big problem, but it’s much easier than doing it in “we’ve got 45 minutes before the world has ended”. So that’s the idea, building systems so that things can go wrong and that’s okay.
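To make the compartment idea concrete, here is a rough sketch in Rust rather than Erlang. It only imitates the shape of the idea with operating system threads: a crash in a worker is contained, noticed, and followed by a restart. The names `supervise` and `flaky_worker` are made up for this illustration, and this is not how BEAM processes or OTP supervisors are implemented.

```rust
use std::thread;

/// Runs `job` on its own thread and restarts it if it panics, up to
/// `max_restarts` times. Returns how many restarts were needed, or `None`
/// if the job never finished successfully.
fn supervise<F>(max_restarts: usize, job: F) -> Option<usize>
where
    F: Fn(usize) + Send + Clone + 'static,
{
    for attempt in 0..=max_restarts {
        let job = job.clone();
        // A panic inside the spawned thread is contained: `join` reports it
        // as an `Err` instead of taking the whole program down.
        let outcome = thread::spawn(move || job(attempt)).join();
        if outcome.is_ok() {
            return Some(attempt);
        }
    }
    None
}

/// Hypothetical flaky worker: crashes on its first two attempts.
fn flaky_worker(attempt: usize) {
    if attempt < 2 {
        panic!("simulated crash on attempt {}", attempt);
    }
}
```

Calling `supervise(5, flaky_worker)` lets the worker crash twice (the panic messages still show up on stderr) and then succeed, while the caller keeps running throughout.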

And they’ve achieved some really amazing things. You hear rumors of Ericsson systems that have like nine nines of uptime. Is that true? Who knows. That’s like a couple of seconds a year. Do they achieve that? I don’t know, but it’s quite bold that they can claim it and we don’t go “obviously they don’t”, the fact that we go “well, maybe they have,” that says a lot about the system. I’ve spoken to Erlang friends who, for example, had a video streaming service where they had a bug in which they would crash every, I think, 30 frames. So like once a second, they would crash, and they didn’t notice for six months because their system was so good at healing from errors that it just dropped a frame.

It went: “oh, something’s going wrong, rebuild the state and try it again.” And it succeeded. That’s really impressive. Now I would say that they probably should have better monitoring if they were crashing once a second and didn’t notice for six months. But the fact that they could do that says a lot, and they had a much better time than if the system went down like a more naively written thing would do.

And Gleam tries to unify that with the sort of Haskell – you know – the more common way of thinking about errors. So we want to try and eradicate as many errors as we can when you’re writing the code, but we still want to have this thing at runtime where we say we can survive problems.

And then you can dial in the amounts of each one of those two philosophies depending on your business needs. If you’re making a prototype, you might want to write kind of hacky, fast and loose code where you don’t check the error cases and then you rely on the Erlang fault tolerance.

Like that’s a really good use case, or you might want to spend more time on the static analysis. But the important thing is that it’s a really clear and explicit line about, you know, is this error intentional or is it accidental. If it’s accidental – cool. We’re going to fix it. If it’s intentional – excellent. Then we use Erlang’s fault tolerance for that.

Could you tell us which battles did you pick and which battles did you save for later when building Gleam?

People have said for a long time that Erlang isn’t typable because of x, y, and z, and, for a long time, I sort of went: “Oh yeah. Okay. Yeah, that makes sense. I believe that.” And the more I thought about it, the more I didn’t agree. You can actually tackle all of these problems with types, and that’s really exciting, but in Gleam we’ve chosen a subset of those problems to tackle.

And the reason for that is that I really want Gleam to be an Erlang-based language with a type system. That is the kind I want to use. And there’s been lots of other attempts to make statically-typed Erlang languages. And it’s very common for them to get a bit of momentum, get a bit of excitement, and then they encounter some actually really difficult problem about applying types to OTP, the actor framework or distributed computing, or something like that.

And they sort of go: “oh, what are we going to do here?” And then they retreat to the library to do research and then they never come back. And they probably learnt loads of amazing things. But from the point of view of me, someone who wants to write probably quite boring programs in this language, I’m very frustrated because I don’t care that much about distributed computing, maybe not now at least, but I would like to start writing more mundane programs with this language.

And so in Gleam, we’ve sort of taken some of the really difficult ones, for example, upgrading the code inside an already running system. This is something you can do inside Erlang, which is a really powerful tool, but most businesses don’t need it. Most businesses can just use like a load balancer to swap out some new versions of their application, or they can have some downtime, or they’ve got some other way of rolling out new versions because it’s quite rare that a language can do hot upgrades so we can just use the tools that everyone else uses.

Fantastic. So we won’t deal with hot upgrades. We won’t deal with distributed computing in a typed way because that’s a really complicated, active area of research. We don’t want to tackle that right now and you can just use the existing untyped Erlang mechanisms and those will work just fine in Gleam.

You won’t get any extra help, but that’s okay. But we have for 30, 40 years been able to write really good programs that all run inside the single operating system process and can be single threaded or multi-threaded, concurrent or sequential and do loads of wonderful, powerful things that cover, you know, 98% of programs in a really practical, typed fashion.

So let’s take all those new wins and apply them to the Erlang ecosystem. So that’s the goal with Gleam and that’s great because, I think, maybe there’s others I’m not aware of, but I think that Gleam is now the most mature statically typed language on the Erlang VM.

And because we’ve taken this approach to focus on what I see as the kind of pragmatic, practical stuff, we’re in a place where all the stuff that I write for my projects, I’m writing Gleam. That’s really cool. There’s lots of people writing little bits and bobs in Gleam, there’s even been a few companies that have put Gleam into production and one startup that used Gleam as their main language. That’s really exciting.

And we haven’t seen that in some of the other languages before that, I think, since some others have as well, but, yeah, it’s really exciting that we’ve managed to get a lot further. And then in future, once we’ve got all of the common niceties, maybe then we can fight the dragon.

Maybe then we can move on to distributed computing, but let’s get all the 95% down first and then we can move on to the rest later.

Let’s take a very brief tour into the Gleam compiler itself, given that our regular audience is interested in this subject.

Without visual aid, how would you go about explaining Gleam compiler’s pipeline and what would you emphasize? What are the special bits?

Okay, I’ll give it a go. We’ll see how well this works.

This is the second Gleam – well, maybe the third – and it’s written in Rust. The first version was written in Erlang, but it got too difficult. So I ended up switching to Rust. One of the deliberate things about Gleam is that from a sort of implementation and design point of view, none of the bits of technology used are particularly state-of-the-art revolutionary. It’s all about picking things that are tried and tested, and have been established as being good ways of doing things for a long time.

So, we’ve got a fairly – I’ll talk about just the very heart of the compiler that processes a single module, because there’s lots of other build tools and rigmarole around that – but we were interested in the bit that works on the code. So it starts with a fairly old fashioned handwritten lexer and parser that takes your source code file and turns it into a syntax tree. Nothing particularly exciting going on there.

We used to use a parser generator, but we found moving to a handwritten one resulted in like a three times speed increase and dramatically better error messages. So I’m all about the handwritten ones these days. And then, after that, we move into static analysis.

So we have this untyped syntax tree and then we perform type inference on it. So Gleam, while completely type checked, doesn’t actually require any type annotations. The whole thing can be done through inference much in the same way as OCaml or Elm. And we use the Hindley-Milner type system. Specifically, we use Algorithm W, which is a fairly old fashioned, but actually really good type inference algorithm.
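As a rough illustration of what inference without annotations involves, the sketch below infers types for a tiny lambda calculus using fresh type variables and unification, which is the core mechanism Algorithm W is built on. It is deliberately incomplete: there is no `let` generalization, so nothing here is polymorphic, and the names (`Infer`, `infer_type`, the `Expr` and `Type` shapes) are invented for this example and are not taken from the Gleam compiler.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Type {
    Int,
    Var(usize), // a not-yet-known type
    Fun(Box<Type>, Box<Type>),
}

enum Expr {
    Int(i64),
    Var(String),
    Lam(String, Box<Expr>),
    App(Box<Expr>, Box<Expr>),
}

fn fun(arg: Type, ret: Type) -> Type {
    Type::Fun(Box::new(arg), Box::new(ret))
}

#[derive(Default)]
struct Infer {
    subst: HashMap<usize, Type>, // solutions found so far
    next_var: usize,
}

impl Infer {
    fn fresh(&mut self) -> Type {
        self.next_var += 1;
        Type::Var(self.next_var - 1)
    }

    /// Replaces solved type variables with what they stand for.
    fn resolve(&self, ty: &Type) -> Type {
        match ty {
            Type::Int => Type::Int,
            Type::Fun(a, r) => fun(self.resolve(a), self.resolve(r)),
            Type::Var(id) => match self.subst.get(id) {
                Some(solved) => self.resolve(solved),
                None => ty.clone(),
            },
        }
    }

    /// True if variable `id` appears inside the (already resolved) type.
    fn occurs(id: usize, ty: &Type) -> bool {
        match ty {
            Type::Int => false,
            Type::Var(other) => *other == id,
            Type::Fun(a, r) => Self::occurs(id, a) || Self::occurs(id, r),
        }
    }

    fn unify(&mut self, a: &Type, b: &Type) -> Result<(), String> {
        let (a, b) = (self.resolve(a), self.resolve(b));
        match (&a, &b) {
            (Type::Int, Type::Int) => Ok(()),
            (Type::Var(x), Type::Var(y)) if x == y => Ok(()),
            (Type::Var(x), other) | (other, Type::Var(x)) => {
                if Self::occurs(*x, other) {
                    return Err(format!("infinite type: {:?} vs {:?}", a, b));
                }
                self.subst.insert(*x, other.clone());
                Ok(())
            }
            (Type::Fun(a1, r1), Type::Fun(a2, r2)) => {
                self.unify(a1, a2)?;
                self.unify(r1, r2)
            }
            _ => Err(format!("cannot unify {:?} with {:?}", a, b)),
        }
    }

    fn infer(&mut self, env: &HashMap<String, Type>, expr: &Expr) -> Result<Type, String> {
        match expr {
            Expr::Int(_) => Ok(Type::Int),
            Expr::Var(name) => env
                .get(name)
                .cloned()
                .ok_or_else(|| format!("unbound variable {}", name)),
            Expr::Lam(param, body) => {
                let param_ty = self.fresh();
                let mut inner = env.clone();
                inner.insert(param.clone(), param_ty.clone());
                let body_ty = self.infer(&inner, body)?;
                Ok(fun(param_ty, body_ty))
            }
            Expr::App(func, arg) => {
                let func_ty = self.infer(env, func)?;
                let arg_ty = self.infer(env, arg)?;
                let ret_ty = self.fresh();
                self.unify(&func_ty, &fun(arg_ty, ret_ty.clone()))?;
                Ok(ret_ty)
            }
        }
    }
}

/// Infers the type of a closed expression.
fn infer_type(expr: &Expr) -> Result<Type, String> {
    let mut infer = Infer::default();
    let ty = infer.infer(&HashMap::new(), expr)?;
    Ok(infer.resolve(&ty))
}
```

Applying the identity function to an integer comes out as `Int` with no annotation anywhere, while applying the integer `1` to an argument is rejected because `Int` cannot be unified with a function type.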

So we use that to infer the types of every single node in our syntax tree. And then we transform the syntax tree into a typed representation, which is annotated with loads of information. Like what are the types of each node, any constructors, where did they get imported from, which of the multiple different Erlang scopes do the different things live in? ’Cause you need to know whether it’s module scope or local scope. ’Cause Gleam only has one scope. So we need to map our scope [inaudible] to multiple different ones. In the future, it will also contain exhaustiveness checking information. But we sadly lack that at the moment.

So once we’ve got this very rich typed syntax tree, we can then move on to code generation. At this point, you can go one of two ways – you can go the JavaScript route, or you can go into the Erlang route, depending on which target you’re trying to compile to. But in both cases, we output source code.

The first versions of the Gleam compiler outputted an intermediate representation that is used by the Erlang compiler. But now we just use Erlang source code, which is actually really nice because it’s a really stable API. It’s really portable. It works absolutely everywhere. It doesn’t matter if you change the version of the Erlang compiler, it still works. And it’s really easy to test, you know, we can now say: “hey, look, this Gleam code should result in this Erlang code.” You know, you’re just looking at a string input and a string output. That’s really easy to work with.
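The testing benefit is easy to picture with a toy code generator. The following is only a sketch under invented assumptions: the `Expr` type and the `to_erlang` function are hypothetical stand-ins, not the Gleam compiler’s typed tree or its real backend. The point is just that the output is a plain string that can be compared in a test.

```rust
/// A hypothetical, tiny expression tree; not Gleam's real typed AST.
enum Expr {
    Int(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
    Call(String, Vec<Expr>),
}

/// Erlang variables start with an uppercase letter.
fn capitalize(name: &str) -> String {
    let mut chars = name.chars();
    match chars.next() {
        Some(first) => first.to_uppercase().collect::<String>() + chars.as_str(),
        None => String::new(),
    }
}

/// Renders an expression as Erlang source text.
fn to_erlang(expr: &Expr) -> String {
    match expr {
        Expr::Int(n) => n.to_string(),
        Expr::Var(name) => capitalize(name),
        Expr::Add(left, right) => format!("{} + {}", to_erlang(left), to_erlang(right)),
        Expr::Call(function, args) => {
            let args: Vec<String> = args.iter().map(to_erlang).collect();
            format!("{}({})", function, args.join(", "))
        }
    }
}
```

A test for this backend is then a single string comparison: build the tree for `double(x + 1)` and assert that `to_erlang` returns `"double(X + 1)"`.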

It’s marginally more complicated because we pretty print the code. One of the things we try to do in building BEAM [inaudible] we want to play as nicely with Erlang people as possible. So we want them to be able to understand the actual code. We want the output to look like it was written by a human.

And so we have quite an intelligent pretty printer. The Elixir formatter works in exactly the same way. It uses the same algorithm. So instead of outputting an Erlang source file directly, we output a sort of intelligent printing algebra that has all the information of that code, but also says, if this is getting too long for a line, you could indent here, you could wrap here, that sort of thing.
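A heavily simplified sketch of such a document algebra might look like the following. It is an assumption-laden toy, not the algorithm Gleam or Elixir actually use: a group is printed on one line if its own flattened width fits in the remaining space, and otherwise its line breaks become real newlines. Unlike production pretty printers, it does not account for text that follows the group. All names (`Doc`, `pretty`, and so on) are made up for the example.

```rust
enum Doc {
    Text(String),
    /// A space when the enclosing group fits on one line, otherwise a newline.
    Line,
    Concat(Vec<Doc>),
    /// Adds indentation to newlines printed inside.
    Nest(usize, Box<Doc>),
    /// A unit that is either printed entirely flat or broken up.
    Group(Box<Doc>),
}

fn text(s: &str) -> Doc {
    Doc::Text(s.to_string())
}

/// Width of a document if it were printed on a single line.
fn flat_width(doc: &Doc) -> usize {
    match doc {
        Doc::Text(s) => s.chars().count(),
        Doc::Line => 1,
        Doc::Concat(items) => items.iter().map(flat_width).sum(),
        Doc::Nest(_, inner) | Doc::Group(inner) => flat_width(inner),
    }
}

fn render(doc: &Doc, width: usize, indent: usize, flat: bool, col: &mut usize, out: &mut String) {
    match doc {
        Doc::Text(s) => {
            out.push_str(s);
            *col += s.chars().count();
        }
        Doc::Line if flat => {
            out.push(' ');
            *col += 1;
        }
        Doc::Line => {
            out.push('\n');
            out.push_str(&" ".repeat(indent));
            *col = indent;
        }
        Doc::Concat(items) => {
            for item in items {
                render(item, width, indent, flat, col, out);
            }
        }
        Doc::Nest(extra, inner) => render(inner, width, indent + extra, flat, col, out),
        Doc::Group(inner) => {
            // Stay flat if a parent already is, or if this group fits.
            let fits = flat || *col + flat_width(inner) <= width;
            render(inner, width, indent, fits, col, out);
        }
    }
}

/// Lays out `doc`, trying to keep lines within `width` columns.
fn pretty(doc: &Doc, width: usize) -> String {
    let mut out = String::new();
    let mut col = 0;
    render(doc, width, 0, false, &mut col, &mut out);
    out
}
```

The same document then prints differently depending on the available width: a group holding `X =`, a nested `Line`, and `some_long_function(Argument)` stays on one line at 80 columns and wraps with a four-space indent at 20 columns.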

And then we just throw it through the pretty printer, and it will output a string, which we then feed into the Erlang compiler or the JavaScript runtime that you happen to be using. So that sort of thing, and there’s a few points there at which we would also serialize metadata, you know, we’d take all the type information from the tree and write it to a file, and the next time we don’t have to recompile the whole module if there’s a change, we could just read that again, that sort of thing.

And lastly, there’s a third secret backend for the Gleam compiler. So there’s Erlang and there’s JavaScript. There’s also Gleam. We also have a Gleam-to-Gleam compiler and that’s for the formatter. In that case, we do no type checking. We just read in the source file and then we print it again. Because we use a pretty printer, you’ve got a formatted Gleam file. So we can use that in your editor and stuff to make sure everything looks nice.

As far as I imagine writing a compiler of that size, for it to perform somewhat reasonably, it would require a considerable amount of mutable state to actually be efficient with what we’re doing.

In my experience, which is very limited to competitive programming, Rust makes it a huge pain to write non-trivial mutable data structures in an idiomatic fashion. For example, I’ll publicly admit that I don’t know how to write a breadth-first search in Rust without copying the whole tree on every iteration.

What was your experience with Rust?

I think I probably have an advantage in that I don’t know what idiomatic Rust code is, really. I’d been a fan of Rust since a little bit before version one dropped, but in the way where I’d admire it from afar, and then, every 3, 4, 5, 6 months, I’d write a program in it. And I’d go, this is really cool. And then I’d get stuck on something, like, oh, never mind, throw it away. I’d go back to writing Erlang or something like that, or writing Haskell. I always admired it, but then a point came when I felt like I needed to rewrite the Gleam compiler because I was writing in Erlang and I wasn’t very happy with how that was going. Both because I thought the tooling and the language weren’t super suited to writing a compiler, and also just ’cause I’d accrued a lot of tech debt because this was the first prototype of the compiler. So I’d need to start again, and it was an opportunity to reevaluate my tooling choices.

And so that was when I really bought into Rust very hard. The first version of it I just copied almost line for line from Erlang into Rust. And I was like: “once I have that working, I can then refactor it until it is good Rust.” Which worked, but I’m not sure it was a good approach to take, to be honest, ’cause it did lots of odd things. If you try to move everything to recursion in Rust, a language where you’re not guaranteed tail call optimization, you just blow the stack all over the place, which causes all sorts of issues. I think it’s a real shame, Rust is a wonderful language, but there’s a lot more upfront investment in order to become really productive in Rust.
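The recursion problem can be shown with a small, made-up example. The first function below is written the way one might write an accumulator loop in Erlang; the second is the same computation as an explicit loop. Neither is taken from the Gleam compiler.

```rust
/// Tail-recursive in shape, as one might write it in Erlang. Rust does not
/// guarantee that the recursive call reuses the current stack frame, so a
/// very large `n` can overflow the stack.
fn sum_to_recursive(n: u64, acc: u64) -> u64 {
    if n == 0 {
        acc
    } else {
        sum_to_recursive(n - 1, acc + n)
    }
}

/// The same accumulator pattern as an explicit loop, with constant stack use.
fn sum_to_loop(mut n: u64) -> u64 {
    let mut acc = 0;
    while n > 0 {
        acc += n;
        n -= 1;
    }
    acc
}
```

Both give the same answer for small inputs, but only the loop is safe to call with something like `sum_to_loop(10_000_000)` regardless of build settings; whether the recursive version survives that depends on the optimization level and the stack size.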

But I think once you’re productive, you’re very productive, but like you have to work a lot harder to get there. And I don’t think this because Rust has bad documentation or tutorials, or any of these things. I actually think it’s some of the best I’ve ever seen, possibly the best I’ve ever seen, but they’re just teaching something that is very complicated.

If how the memory management works is a part of your language, you have to deal with so many things inside the computer that you just didn’t have to deal with in other languages. And that’s a real challenge. And I think the solution to this is, as you’re starting, don’t try and do the good solution.

As you said, you can’t do a depth-first traversal of a tree or breadth-first traversal of a tree without cloning the entire tree on every iteration. Well, clone the tree in every iteration then. If your program still works, that’s okay. There’s loads of things inside the Gleam compiler where I’m definitely doing the slow thing, I’m definitely cloning it. But even my not very good version of the Gleam compiler in which I cloned loads of things and I made loads of mistakes and it was written in really bad Rust – it was much faster than the Erlang version. And I’m a good Erlang programmer, or at least I think I am. Maybe I’m not, because I was beaten by the rubbish Rust, but it was still faster just cause it is that much more of an optimized language. And since then, I’ve been able to, as I learned things here, you know, I come to [inaudible] code base, and then I go: “Well, why was I doing that, I know how that should be done now, I’m going to fix it.” And it gets faster. So there’s been a few Gleam releases in which I’ve also published the benchmarks, and it’s been really nice to see like, oh wow, it’s taking half the time to typecheck a module as it used to. And things just keep getting better.
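For what it’s worth, a read-only traversal is one of the cases where borrowing tends to work out without any cloning. The sketch below assumes a simple owned tree (the `Node` type is invented for the example) and keeps shared references in the queue. Mutating the tree while walking it is a different and harder problem, where the advice above still applies.

```rust
use std::collections::VecDeque;

struct Node {
    value: i32,
    children: Vec<Node>,
}

/// Visits the tree level by level and returns the values in visit order.
/// The queue holds shared references (`&Node`), so nothing is cloned.
fn breadth_first(root: &Node) -> Vec<i32> {
    let mut order = Vec::new();
    let mut queue = VecDeque::new();
    queue.push_back(root);
    while let Some(node) = queue.pop_front() {
        order.push(node.value);
        // Enqueue borrowed children; they live as long as `root` does.
        queue.extend(node.children.iter());
    }
    order
}
```

For a root `1` with children `2` (which has a child `4`) and `3`, this visits the nodes in the order 1, 2, 3, 4.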

And I’m sure there’s loads and loads of problems in it still, but it’s much faster than it was before. So a program that works inefficiently is still better than one that doesn’t work at all. If you are cloning a lot, there’s kind of a few things that feel bad in Rust that you can do, like swap to a reference-counted pointer or a garbage-collected pointer instead of the normal ones, which when you do that in Rust, you feel like you’ve lost.

You’re like, “oh man, I haven’t done this the proper way,” but then if you think for a moment about what your Erlang or Haskell or whatever program does, every single pointer is a garbage collected or reference counted pointer. So you’re already doing far better. Well, maybe not. It depends exactly on what capacity, but you’re already doing pretty good.

It’s okay to slap on some of these little easy mode buttons. So just go wild. And in future, you may remove them or you may discover that it doesn’t actually make any difference, that reference counting. Like, it’s almost exactly the same performance. You’re not writing something where performance matters that much.
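One such easy mode button is `std::rc::Rc`, the standard library’s reference-counted pointer. In the sketch below, several nodes share one piece of type information instead of each holding its own deep copy. The `TypeInfo` and `TypedNode` types and the `annotate` function are hypothetical and only loosely inspired by the typed tree described earlier.

```rust
use std::rc::Rc;

#[derive(Debug)]
struct TypeInfo {
    name: String,
}

struct TypedNode {
    label: String,
    // Shared, reference-counted handle; cloning it copies a pointer and
    // bumps a counter rather than duplicating the `TypeInfo`.
    ty: Rc<TypeInfo>,
}

/// Attaches the same type information to every label.
fn annotate(labels: &[&str], ty: &Rc<TypeInfo>) -> Vec<TypedNode> {
    labels
        .iter()
        .map(|label| TypedNode {
            label: label.to_string(),
            ty: Rc::clone(ty),
        })
        .collect()
}
```

After annotating three labels with one `Rc<TypeInfo>`, `Rc::strong_count` reports four owners (the original plus three nodes), and every node points at the same allocation.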

In general, some of the algorithms and data structures are pretty difficult. How do you learn all this kind of pretty impressive amount of skills to be able to implement this sort of stuff?

I’m very much driven by – and I’m sure other people have different ways of learning – but I find a way that works tremendously well, which people don’t seem to talk about enough, is having learning driven by projects. So I didn’t sit down – well, I have sat down with books of like typing algorithms and read them but that’s not because I wanted to learn it in isolation, it’s because I was writing a compiler and I was trying to write a type system. If you actually have a need for this algorithm, you can actually start to learn it. And I think there’s a real difference between learning things in a book or a video or whatever medium you like. You can learn a particular kind of familiarity with these algorithms, but you can’t really learn some of these more intuitive things that you learn from actually engineering these things on a daily basis. Like, how do I debug this? So when something strange happens, you look at it and go, that feels to me like something in that area of the code. Learning how to look at the performance of it and all these other things that, actually, on a day-to-day basis are really practical. And if you’re just learning it from a book, you don’t get the same exhilaration. You read the chapter on type systems a few times, and you go: “yeah, I understand this, that feels good.”

But if you’ve written a compiler and you write your program, and then you run the compiler on it, and it says: “this returns int” then you go: “Oh my God, it works.” That feels so good. And that rush is so much more of a motivator than anything else.

And you go: “Well, I’ve got a basic type checker. I’d really love to have generics because then I could run this program.” And that’s a really great motivator to keep you working, as opposed to just “Well, now I’ll read the chapter on generics. It’s a bit dry, but like, that’s the next thing.” Right? So, like, build things. The whole Gleam project started as a learning experience that just sort of became accidentally useful.

Another motivation is “is it being useful?” But the learning around Gleam is motivated by wanting to build a thing. I wanted to build a compiler. So I learned how compilers work. And now I want Gleam to have this thing because it would help people. So I will learn how to do version resolution for package trees, you know, that sort of thing.

So have a goal and then learn around the goal.


Huge, huge thanks to Louis for the interview. If you want to keep updated on what’s happening with Gleam, you can follow him or the official @gleamlang account on Twitter. They have an open Discord that you can join as well.

For more interviews with creators of languages, companies that use functional programming, and people working on other interesting projects, head to our interview section or subscribe to us on YouTube.
