Starting with Scala 3 macros: a short tutorial
Adam Warski
Posted on April 9, 2021
Scala 3, also known by its development name Dotty, is expected to ship by the end of April 2021. One of its flagship features is principled metaprogramming. This includes macros — compile-time code generation.
Macros have been pioneered through an experimental API since Scala 2.10. Even though the API was experimental, macros have become a useful tool, leveraged by a number of libraries. Scala is a flexible and scalable language; still, macros make it possible to further reduce boilerplate code in a number of use cases.
Given their popularity, it doesn't come as a surprise that the next major revision of the Scala language keeps its metaprogramming capabilities, albeit in an improved form. The new approach is no longer experimental, and draws on the experience gained from the previous macro implementation.
At the code level, Scala 3 macros are unfortunately quite different from the previous version; however, this mostly affects library code, as that's where macros were predominantly used.
The scope of metaprogramming in Dotty / Scala 3 is also different from what we've seen in Scala 2. In some areas it is broader, in some — more constrained. For example, Scala 3 brings extensive inlining support. On the other hand, macro annotations are no longer there.
A simple macro
Let's see how we can start developing a simple macro for Scala 3! I did a similar tutorial for the Scala 2.10 macros, 8 years ago! Back then, we were writing a macro which improves on println-debugging.
Turns out, 8 years later, println-debugging is still one of the main debugging methods that I'm using. Quite often, we want to print some message and labeled values, for example:
println(s"Funds transferred, from = $from, to = $to, amount = $amount")
It would be nice if we didn't have to duplicate the names of the values used. Our goal is to write a macro which will automatically print the value labels. The syntax that we'd like to achieve is as follows:
debug("Funds transferred", from, to, amount)
Why do we need a macro here? We need to access the abstract syntax tree (AST) of our code, so that we can find out what the names are. Let's see how we can implement the macro step-by-step. All of the code is available on GitHub.
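To see the problem concretely, here's a minimal sketch of what a regular method can do. The PlainDebug object and its render method are made up for this illustration and are not part of the tutorial's code; the point is only that a method receives evaluated arguments, so the names are already gone by the time it runs:

```scala
object PlainDebug:
  // A regular method only ever sees the values of its arguments;
  // the names `from`, `to` and `amount` are not available here.
  def render(message: String, values: Any*): String =
    (message +: values.map(_.toString)).mkString(", ")

// Example usage:
//   val from = "alice"; val to = "bob"; val amount = 100
//   PlainDebug.render("Funds transferred", from, to, amount)
//   yields "Funds transferred, alice, bob, 100" (no labels)
```

Whatever we do inside render, the best we can get is the list of values. Recovering the labels needs the compiler's help.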
Project setup
Each project starts with a build; same here. We'll be using sbt (version 1.5.0), but any other tool which supports Scala 3 / Dotty can be used as well.
The only property that we need to specify in build.sbt is the Scala version:
scalaVersion := "3.0.0-RC1"
Once we have that, we can import the project into IntelliJ or Metals.
Hello, world!
We'll be working with two source files. First, Debug.scala is going to be where the debug macro will be implemented. Second, Test.scala will be where we'll be testing the code we've written. We need two separate files, as these need to be compiled separately by the compiler: we can't use code-generating code (the macro) before it has been compiled itself!
Let's start with an even simpler task: writing code which will generate a println("Hello, world!") when invoked. This is quite trivial:
// Debug.scala
object Debug:
  inline def hello(): Unit = println("Hello, world!")

// Test.scala
object Test extends App:
  import Debug._
  hello()
Did we actually write a macro? Not really. Instead, we've taken advantage of a new Scala 3 / Dotty metaprogramming feature: inlining. Notice that the hello method is prefixed with the inline modifier. This instructs the compiler (as a requirement, not merely a suggestion) to inline the method body at the call site upon compilation.
Hence, when we compile the above and inspect the bytecode, we won't see a hello() invocation in our Test application. Instead, the bytecode will directly contain println("Hello, world!").
In a way, inlining as described above is a form of static metaprogramming: the code is generated, but based on statically available information, without any computation. Let's see how we can take it one step further, and do some dynamic metaprogramming.
Single-parameter debug
Our goal will now be to write a debugSingle method, which will expand a debugSingle(x) call into:
println("Value of x is " + x)
Inlining is no longer enough. We need to access the name (or code fragment) that is passed to our method — not its value. We start out similarly as before, with a method that should be inlined:
inline def debugSingle(expr: Any): Unit = ???
That's the method that the user will invoke. However, the implementation will need to operate on a representation of the parameter which gives access to information available at compile-time: the textual representation of the code, to generate the label.
Note that this is quite different from the run-time representation; at compile-time, we manipulate trees corresponding to expressions. At run-time, we manipulate values to which the expressions are evaluated.
The mechanism to convert between the compile-time and run-time representations is called quoting & splicing. When we quote a value (by prepending the expression with '), we get back the abstract syntax tree (AST; a value of type Expr[_]), representing the expression:
'expr: Expr[Any]
In the case of Scala macros, the AST is our code represented as data, which we can inspect. Expr is the root type; each Scala construct corresponds to a subclass of this type. As these can be nested (e.g. an If expression has child expressions), our code can be represented as a tree.
When we splice a value using ${ }, we go back to the run-time land:
${anotherExpr: Expr[Any]}: Any
You can think of quoting as a function T => Expr[T], transforming code into an abstract syntax tree which can be manipulated at compile-time. Dually, splicing is a function Expr[T] => T, transforming an abstract syntax tree into code that will be compiled and evaluated at run-time, into a value of the given type.
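Here's a minimal sketch showing both directions in one place. The QuoteSplice object and its increment macro are hypothetical names used only for this illustration; the file needs to be compiled separately from the code that calls increment:

```scala
import scala.quoted.*

object QuoteSplice:
  // Macro entry point: 'x quotes the argument (Int => Expr[Int]),
  // and the outer ${ ... } splices the resulting AST back into code.
  inline def increment(inline x: Int): Int = ${ incrementImpl('x) }

  // Runs at compile time: takes the AST of the argument, returns a new AST.
  def incrementImpl(x: Expr[Int])(using Quotes): Expr[Int] =
    '{ $x + 1 } // the quote yields an Expr[Int]; $x embeds the argument's tree

// Usage, from a different source file:
//   QuoteSplice.increment(41) // evaluates to 42
```

Note that incrementImpl never learns the value of x; it only rearranges trees, and the addition itself happens at run-time.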
A crucial property that is enforced by the compiler is the phase consistency principle. It makes sure that you can only access the AST during compile time (at run-time, this information is no longer available!), and that you are not trying to access the value of an expression when a macro is being invoked (as the values are only available at run-time).
The implementation of the debugSingle macro will operate on abstract syntax trees. Hence, its signature is:
def debugSingleImpl(expr: Expr[Any])(using Quotes): Expr[Unit]
Notice the using Quotes parameter: it is provided implicitly by the compiler, and allows, for example, reporting error messages that might occur during macro invocation.
The debugSingleImpl method takes an AST representing the expression that's been passed in as a parameter. This can be a simple value reference (x), or anything more complex (e.g. x+y*2). It returns an AST (code represented as data, of type Expr[Unit]) which, when evaluated, returns a unit (a side-effect).
Here's the implementation:
def debugSingleImpl(expr: Expr[Any])(using Quotes) =
  '{ println("Value of " + ${Expr(expr.show)} + " is " + $expr) }
The outer operation is quoting ('{ ... }): converting the code it contains (of any type T) to a value of type Expr[T]; that is, an abstract syntax tree, representing the code as data. For example, '{ println("Hello, world!") } would return a value of type Expr[Unit] (Unit as that's the type returned by println), which represents the AST corresponding to the invocation of println.
However, inside the code for which we are generating the AST, we want to embed some expressions, represented as data: the string literal corresponding to the name of the value (the label), and the expression which computes the value.
The first is done using expr.show. This will be evaluated at compile-time, and converts the AST of expr into a String: the textual representation of the code. We create an AST fragment (an expression which represents a constant string with the given value) using Expr(expr.show), and finally we splice (embed) it into the code that we are generating.
The second is done by splicing the (unchanged) AST of expr into the generated code. Any code that is passed as a parameter to debugSingleImpl will end up unchanged in the generated code. For example, calling debugSingle(x+y) will generate, at compile-time, the following:
println("Value of " + "x.+(y)" + " is " + (x+y))
One final task remains: calling debugSingleImpl from debugSingle. To do that, we need to quote the expr so that the AST is passed to the macro implementation (we can access that since the method is inlined), and splice the result, converting the AST back into code that will be compiled:
inline def debugSingle(expr: Any): Unit = ${debugSingleImpl('expr)}
Are we done? Not quite. We need to require that the compiler inlines any usages of the expr parameter, instead of creating a temporary value holding its result; that would spoil our labels! This is done by adding inline to the parameter as well. Here's the whole implementation:
inline def debugSingle(inline expr: Any): Unit =
  ${debugSingleImpl('expr)}

private def debugSingleImpl(expr: Expr[Any])(using Quotes): Expr[Unit] =
  '{ println("Value of " + ${Expr(expr.show)} + " is " + $expr) }
Multi-parameter debug
We can now improve our implementation so that it works with multiple parameters. Additionally, if a parameter is a string literal, we'd like to simply include it in the output, without the label.
First, let's define the user-facing method. We'll use varargs so that it can be called with multiple parameters:
inline def debug(inline exprs: Any*): Unit = ${debugImpl('exprs)}
The idea is the same: we have an inlined method which splices the result of a computation involving code represented as ASTs.
This follows the definition from the Dotty docs: a macro is an inline function that contains a splice operation outside an enclosing quote.
Varargs are represented as a sequence, hence the signature of the macro implementation is:
def debugImpl(exprs: Expr[Seq[Any]])(using Quotes): Expr[Unit]
In the implementation itself, we first have to inspect the passed exprs tree and verify if it corresponds to multiple parameters passed as varargs. This can be done with pattern matching, using the Varargs extractor provided by the Scala 3 standard library. As a result, we get a sequence of trees (from Expr[Seq[Any]], we get Seq[Expr[Any]]):
val stringExps: Seq[Expr[String]] = exprs match
  case Varargs(es) => ??? // macro implementation called with varargs
  case e => ??? // macro implementation called with other parameters
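To get a feeling for what the Varargs extractor gives us, here's a small standalone sketch: a macro which counts its arguments at compile time. The ArgCount object and countArgs are hypothetical and only serve this illustration; as before, the macro has to live in a different file than its usages:

```scala
import scala.quoted.*

object ArgCount:
  inline def countArgs(inline args: Any*): Int = ${ countArgsImpl('args) }

  private def countArgsImpl(args: Expr[Seq[Any]])(using Quotes): Expr[Int] =
    args match
      // Arguments listed explicitly: es is a Seq[Expr[Any]], one tree per
      // argument, so the count is known now and is emitted as a literal.
      case Varargs(es) => Expr(es.size)
      // Anything else (e.g. a sequence passed with `: _*`): the number of
      // elements is only known at run-time, so we generate code computing it.
      case _ => '{ $args.size }

// Usage, from a different source file:
//   ArgCount.countArgs(1, "a", 2.0) // evaluates to 3
```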
If the extraction is successful, we map each expression corresponding to subsequent parameters, using pattern matching again. This time, we inspect the underlying term tree, to check if it corresponds to a constant value (such as a constant string literal).
If so, we return an expression containing that constant (as a string). Otherwise, we convert the expression to a string containing the label and value:
case Varargs(es) =>
  es.map { e =>
    e.asTerm match {
      case Literal(c: Constant) => Expr(c.value.toString)
      case _ => showWithValue(e)
    }
  }
Why do we match on expressions in one case, and on terms in the other? Varargs is a special construct used to handle this kind of parameter. In all other cases, if we want to inspect the shape of the code that was passed in (the Abstract Syntax Tree), we'll have to match on the expression's term, as above.
And we're almost done; the last step is converting the stringExps: Seq[Expr[String]] to an Expr[String] by generating code which will concatenate all of the strings. Two string expressions can be concatenated by splicing both expressions, combining them as any other two strings, and quoting the result. More generally:
val concatenatedStringsExp = stringExps
  .reduceOption((e1, e2) => '{$e1 + ", " + $e2})
  .getOrElse('{""})
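If the quotes and splices obscure what's going on, it might help to look at the same fold on plain strings. The sketch below is a run-time analogue only (JoinLabels and joinLabels are names made up for this example); the macro version has exactly the same shape, except that each step builds the tree of a concatenation instead of performing it:

```scala
object JoinLabels:
  // Same reduceOption / getOrElse structure as the macro, but operating on
  // String values rather than on Expr[String] trees.
  def joinLabels(parts: Seq[String]): String =
    parts
      .reduceOption((s1, s2) => s1 + ", " + s2)
      .getOrElse("")

// JoinLabels.joinLabels(Seq("from = alice", "to = bob")) yields "from = alice, to = bob"
// JoinLabels.joinLabels(Seq.empty) yields ""
```

Using reduceOption instead of reduce covers the case of debug() being called with no arguments at all.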
And so we arrive at our final implementation:
// Requires `import scala.quoted.*` at the top of Debug.scala.
inline def debug(inline exprs: Any*): Unit = ${debugImpl('exprs)}

private def debugImpl(exprs: Expr[Seq[Any]])(using Quotes): Expr[Unit] =
  // Brings asTerm, Literal and Constant into scope.
  import quotes.reflect.*

  // Generates code which computes "<source of e> = <value of e>".
  def showWithValue(e: Expr[_]): Expr[String] =
    '{${Expr(e.show)} + " = " + $e}

  val stringExps: Seq[Expr[String]] = exprs match
    case Varargs(es) =>
      es.map { e =>
        e.asTerm match {
          case Literal(c: Constant) => Expr(c.value.toString)
          case _ => showWithValue(e)
        }
      }
    case e => List(showWithValue(e))

  val concatenatedStringsExp = stringExps
    .reduceOption((e1, e2) => '{$e1 + ", " + $e2})
    .getOrElse('{""})

  '{println($concatenatedStringsExp)}
What's next
We've just scratched the surface of metaprogramming capabilities in Scala 3 / Dotty. First of all, inlining (the "static" metaprogramming variant) has quite a lot of interesting features, as described in Dotty docs:
- recursive inline methods
- specialized inline methods
- using conditionals & matches in inlined methods
- selective summoning (conditional logic depending on available implicits)
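To give a taste of the first and third items, here's a sketch of a recursive inline method using inline conditionals. It closely follows the well-known power example from the Dotty docs; the enclosing InlinePower object is just a name chosen for this illustration. No macros are involved, so it can be used in the same file where it's defined:

```scala
object InlinePower:
  // n is an inline parameter, so for a constant argument the compiler can
  // evaluate each `inline if` and unroll the recursion at compile time.
  inline def power(x: Double, inline n: Int): Double =
    inline if n == 0 then 1.0
    else inline if n % 2 == 1 then x * power(x, n - 1)
    else power(x * x, n / 2)

// InlinePower.power(2.0, 3) evaluates to 8.0; the call site ends up with
// plain multiplications and no recursive calls.
```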
Macros also have other features which we haven't covered, such as quoting & splicing types, summoning implicits in macros and more extensive pattern matching involving quoted patterns. Finally, there's multi-stage programming support, which enables constructing code at run-time.
As mentioned before, all of the code presented here is available on GitHub. Have fun exploring Scala 3 / Dotty!