Magic refactoring with Unison: much more than a new programming language

gonzaloruizdevilla

Gonzalo Ruiz de Villa

Posted on December 18, 2019

I admit it: I am obsessed with refactoring. When I program, I keep renaming variables, methods and classes again and again. Hardly a function escapes having its arguments changed over and over until I am finally, slightly, satisfied. I always have the feeling that something more can and should be done.

That is why I was surprised by the refactoring capabilities of a new programming language called Unison. As I said, I rename things obsessively, but this has a cost. When you rename a function, you have to update every reference to it, which means modifying every file that contains a function or method calling the renamed function. Let’s say the project has a thousand files with calls to that function: the commit for the rename will then touch 1001 files, of which only one contains a relevant change, while the changes in the other thousand are just noise. But… could it be otherwise?

And, of course, the answer is yes (otherwise this article would not make sense, right? 🤷🏼‍♀️).

In Unison, to change the name of a function, we use the move command:

 .> move.term base.List.foldl base.List.foldLeft 

If we now view the implementation of a function that used the old foldl, now renamed to foldLeft, we see that Unison’s magic has already done its work:

.> view base.List.reverse
base.List.reverse : [a] -> [a]
base.List.reverse as =
  use base.List +:
  base.List.foldLeft (acc a -> a +: acc) [] as

When you rename a function with Unison, all references are instantly corrected. To achieve this, Unison does not rewrite text on your behalf: it does not update thousands of files, generate a gigantic diff, or break the libraries of users who expect the function to keep its old name. That would be madness, right?

Unison is a typed language largely influenced by Haskell, Erlang and a research language called Frank. It relies on one very simple idea: definitions are identified by their content, not by their name. That is, if we define a factorial function as the product of the natural numbers between 1 and a given number n, it does not matter whether we name the function “factorial”, “lairotcaf” or something else. Similarly, it is completely unimportant whether the parameter is called “n”, “z” or anything else: the function does not change in its essence.
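To make the idea concrete, here is a minimal sketch in Python rather than Unison. It is not how Unison actually computes its hashes: the `Normalizer` class, the `content_hash` helper and the choice of SHA-256 are all assumptions made for illustration. The sketch parses a function, replaces the function name and its parameter names with positional placeholders, and hashes what is left.

```python
import ast
import hashlib


class Normalizer(ast.NodeTransformer):
    """Replace the function name and its parameters with positional placeholders."""

    def __init__(self):
        self.placeholders = {}

    def _bind(self, name):
        # The first bound name becomes "_0", the second "_1", and so on.
        return self.placeholders.setdefault(name, f"_{len(self.placeholders)}")

    def visit_FunctionDef(self, node):
        node.name = self._bind(node.name)
        return self.generic_visit(node)

    def visit_arg(self, node):
        node.arg = self._bind(node.arg)
        return node

    def visit_Name(self, node):
        # Only names bound above are rewritten; `math`, `range`, etc. stay as they are.
        node.id = self.placeholders.get(node.id, node.id)
        return node


def content_hash(source):
    """Hypothetical helper: hash a definition by structure, ignoring chosen names."""
    tree = Normalizer().visit(ast.parse(source))
    return hashlib.sha256(ast.dump(tree).encode()).hexdigest()[:10]


a = content_hash("def factorial(n):\n    return math.prod(range(1, n + 1))\n")
b = content_hash("def lairotcaf(z):\n    return math.prod(range(1, z + 1))\n")
c = content_hash("def factorial(n):\n    return math.prod(range(1, n))\n")

print(a == b)  # same content, different names
print(a == c)  # same name, different content
```

The first two definitions get the same hash even though every name differs, while the third, which shares a name with the first but has a different body, gets a different one.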

Here is how it works. First, Unison computes a hash of the implementation. Second, instead of storing text files, the codebase stores the abstract syntax tree (AST) of each function, in which references to other functions are made through their hashes. Managing the codebase this way allows, among other things, the following:

  • no builds: nothing has to be recompiled
  • trivial renaming
  • cached test results
  • no dependency conflicts
  • simple, typed persistent storage
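The following toy model, written in Python, sketches why referring to dependencies by hash makes test caching safe. Everything in it is an assumption made for illustration: the node kinds (`lit`, `var`, `ref`, `lambda`, `add`, `call`, `eq`), the `add`, `evaluate` and `run_test` helpers, and the use of SHA-256 over JSON. Unison's real AST and hashing are considerably richer.

```python
import hashlib
import json

codebase = {}      # hash -> AST of a definition
test_cache = {}    # hash of a test -> its cached result
evaluations = []   # records which tests were actually evaluated


def add(term):
    """Store a toy AST under the hash of its content and return that hash."""
    digest = hashlib.sha256(json.dumps(term).encode()).hexdigest()[:10]
    codebase[digest] = term
    return digest


def evaluate(term, env=()):
    """Tiny interpreter; variables are positional, dependencies are ["ref", hash]."""
    kind = term[0]
    if kind == "lit":
        return term[1]
    if kind == "var":
        return env[term[1]]
    if kind == "ref":
        return evaluate(codebase[term[1]])
    if kind == "lambda":
        return lambda arg: evaluate(term[1], (arg,) + env)
    if kind == "add":
        return evaluate(term[1], env) + evaluate(term[2], env)
    if kind == "call":
        return evaluate(term[1], env)(evaluate(term[2], env))
    if kind == "eq":
        return evaluate(term[1], env) == evaluate(term[2], env)
    raise ValueError(f"unknown node: {kind}")


def run_test(test_hash):
    """Evaluate a test at most once; its hash covers everything it depends on."""
    if test_hash not in test_cache:
        evaluations.append(test_hash)
        test_cache[test_hash] = evaluate(codebase[test_hash])
    return test_cache[test_hash]


# increment x = x + 1
increment = add(["lambda", ["add", ["var", 0], ["lit", 1]]])
# add_two x = increment (increment x), referring to increment by hash
add_two = add(["lambda", ["call", ["ref", increment],
                          ["call", ["ref", increment], ["var", 0]]]])
# test: add_two 1 == 3
test = add(["eq", ["call", ["ref", add_two], ["lit", 1]], ["lit", 3]])

first = run_test(test)
second = run_test(test)  # answered from the cache, not evaluated again
```

Because the hash of the test includes the hash of `add_two`, which in turn includes the hash of `increment`, a different implementation of either one would produce a different test hash and therefore a cache miss. An unchanged hash means the cached result is still valid.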

The Unison codebase manager is the piece that makes all of this possible: it stores the AST and acts as the single source of truth, rather than relying on textual representations in files. The codebase manager has some very interesting properties:

  • append-only: definitions are never modified or deleted; new ones are only ever added.
  • as a result, it can be versioned and synchronised with Git or similar tools without generating conflicts.
  • because definitions are only ever added, many kinds of information can be cached without worrying about cache expiry.
  • names are stored separately from definitions, so renaming is very fast and 100% accurate. Aliases are also easy to add.
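A rough sketch of the last property, again in Python and under heavy simplification: the `Codebase` class, its methods and the `@name` token convention are hypothetical and have nothing to do with Unison's real storage format. Definitions are stored as tokens in which references have already been resolved to hashes, and the name table is kept apart.

```python
import hashlib


class Codebase:
    """Toy model: an append-only definition store plus a separate name table."""

    def __init__(self):
        self.definitions = {}  # hash -> tuple of tokens; never modified once added
        self.names = {}        # name -> hash; renames only touch this table

    def add(self, name, tokens):
        # A token such as "@foldl" refers to another definition; store its hash.
        stored = tuple(
            ("ref", self.names[tok[1:]]) if tok.startswith("@") else tok
            for tok in tokens
        )
        digest = hashlib.sha256(repr(stored).encode()).hexdigest()[:10]
        self.definitions.setdefault(digest, stored)
        self.names[name] = digest
        return digest

    def move(self, old, new):
        # Renaming changes one entry in the name table and nothing else.
        self.names[new] = self.names.pop(old)

    def view(self, name):
        # Render references using whatever name each hash currently has.
        by_hash = {digest: n for n, digest in self.names.items()}
        return " ".join(
            by_hash[tok[1]] if isinstance(tok, tuple) else tok
            for tok in self.definitions[self.names[name]]
        )


cb = Codebase()
cb.add("foldl", ["<builtin fold>"])
reverse_hash = cb.add("reverse", ["@foldl", "(acc a -> a +: acc)", "[]", "as"])
before = dict(cb.definitions)

cb.move("foldl", "foldLeft")
rendered = cb.view("reverse")
print(rendered)
```

After the move, `reverse` is displayed with the new name even though its stored definition, and therefore its hash, has not changed at all.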

I believe this comment by the creators of the language is very clarifying:

Unison provides scratch files, where you can explore the codebase’s definitions and write and test changes before saving them to the codebase manager.

Another motivation behind Unison’s design that I find especially interesting is the ability to precisely describe programs that can deploy themselves, and to describe elastic distributed systems.

For example, look at the following merge sort implementation:

dsort: (a -> a -> Boolean) -> [a] -> {Remote} [a]
dsort lte as =
  if size as < 2 then as
  else case halve as of (left, right) ->
    resL = at spawn '(dsort lte left)
    resR = at spawn '(dsort lte right)
    merge lte (force resL) (force resR)

At first glance it looks like a typical in-memory merge sort: the list is split into halves, each half is sorted, and the results are merged. What is special about this implementation is that the two recursive calls run in parallel, each on a different, newly provisioned computing resource.
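For comparison, here is a rough single-machine analogue in Python. It is only a sketch of the shape of the algorithm: threads stand in for the newly provisioned nodes, the `depth` parameter is an assumption added to limit how many threads get created, and nothing here models Unison's Remote ability or actual distribution.

```python
from concurrent.futures import ThreadPoolExecutor


def merge(lte, left, right):
    """Merge two sorted lists using the ordering function lte."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if lte(left[i], right[j]):
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def dsort(lte, items, depth=2):
    """Merge sort whose recursive calls run in parallel for the first `depth` levels."""
    if len(items) < 2:
        return list(items)
    mid = len(items) // 2
    left, right = items[:mid], items[mid:]
    if depth <= 0:
        # Below the parallel levels, recurse sequentially.
        return merge(lte, dsort(lte, left, 0), dsort(lte, right, 0))
    # Each call gets its own two-worker pool, so nested calls never wait on a full pool.
    with ThreadPoolExecutor(max_workers=2) as pool:
        res_l = pool.submit(dsort, lte, left, depth - 1)
        res_r = pool.submit(dsort, lte, right, depth - 1)
        return merge(lte, res_l.result(), res_r.result())


result = dsort(lambda a, b: a <= b, [5, 3, 8, 1, 9, 2])
print(result)
```

Even in this small version the pools, futures and depth limit have to be managed by hand, which is exactly the kind of plumbing the Unison example leaves out.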

This is part of the magic that Unison promises: distributed programs without configuration, without JSON files, without network connection management, and so on. Just annotate the code where you want concurrent or distributed execution and Unison will do its magic.

Conclusion

Unison is a very interesting bet that is testing new concepts. I don’t know whether it has much of a future, but it is undoubtedly already helping us rethink how we approach certain aspects of software development with programming languages. It is not just a new syntax, but a far more ambitious effort.
