Existential Crisis: Implementing MapK in Scala 3
Nikita Gazarov
Posted on June 8, 2021
This a record of my attempts to migrate one of my favourite types, MapK
, from Scala 2 to Scala 3. You'll learn why this is not a trivial task and how to work around the new limitations. Hopefully this will help you solve similar problems with other types as well.
See also: r/scala discussion
What is MapK?
MapK[K, V]
is a higher kinded version of Map[K, V]
. Each key-value pair in MapK
is (K[A], V[A])
for some type A
. We don't care what A
actually is, just that the keys and values have consistent types. We don't want to put a String
into a Key[Int]
, and when we ask for the value of Key[Int]
, we want to get an Int
, not an Any
.
type Tuple2K[K[_], V[_], A] = (K[A], V[A])
class MapK[K[_], V[_]] {
def apply[A](key: K[A]): V[A] = ???
def foreach(f: Tuple2K[K, V, _] => Unit): Unit = ???
}
object MapK {
def apply(items: Tuple2K[K, V, _]*): MapK[K, V] = ???
}
MapK is at the heart of my biggest personal project, and of the quote engine we built at Goodcover. It is a very useful construct – being able to express such complex types in Scala allows me to write dynamic-feeling yet type-safe code without boilerplate or macros, and I've been really enjoying this power that Scala 2 afforded me.
Here's the simplest usage of MapK:
case class Key[A](name: String, default: A)
val ageKey = Key("age", default = 20)
val nameKey = Key("name", default = "Alex")
val m = MapK[Key, Option](
ageKey -> Some(30),
nameKey -> Some("Kyle"),
// nameKey -> Some(30) // but not this, compiler won't accept it
)
m(ageKey) // Some(30): Option[Int]
m(nameKey) // Some("Kyle"): Option[String]
Note: Of course MapK has more useful methods than just apply
and foreach
, and we'll get to some of those below. The rest you can find in my Typegrounds repo, but let me complain about things first.
MapK in Scala 2
In Scala 2 we can use existential types to implement MapK very straightforwardly:
type Tuple2K[K[_], V[_], A] = (K[A], V[A])
class MapK[K[_], V[_]](protected val rawMap: Map[K[_], V[_]]) {
def apply[A](key: K[A]): V[A] = {
rawMap(key).asInstanceOf[V[A]]
}
def foreach(f: Tuple2K[K, V, _] => Unit): Unit = {
rawMap.foreach(f.asInstanceOf[((K[_], V[_])) => Unit])
}
}
object MapK {
def apply[K[_], V[_]](items: Tuple2K[K, V, _]*): MapK[K, V] = {
new MapK[K, V](Map(items: _*))
}
}
This minimal implementation allows for our desired usage, and does not need any implicit conversions to accomplish that, the above is all there is to it. It's ugly on the inside, but its public interface is nice, safe, and simple. That is our goal. Let's see how close we can get to this in Scala 3.
Scala 3 Issues
The MapK above does not work in Scala 3. Trying to implement it that way, you'll become fast friends with the unreducible application of higher-kinded type [K[_$1], V[_$2], A] =>> (K[A], V[A]) to wildcard argument"
compiler error. This is because the API we designed relies on existential types that can't be expressed without using forSome
under the hood. For example, our Tuple2K[K, V, _]
is really Tuple2[K[A], V[A]] forSome { type A }
under the hood, and we can't just rewrite this using wildcard types like Tuple2[K[?], V[?]]
because we need the same A
to repeat twice in this type, in K[A]
and V[A]
.
In Scala 3, support for such existential types has been dropped. As far as I understand from this discussion, basically no one has proved that existential types can be made sound in the new type system. Well, I'm not going to be the one to do it, so we'll just go ahead and do our best to hack around the new limitation.
Wrap It Up
The first approach we can try is to simply avoid using unsupported existential types. The reason why our Tuple2K
is problematic is because it's just a type alias, not a concrete type. So we can just define:
case class Tuple2K[T1[_], T2[_], A](first: T1[A], second: T2[A])
and then we won't have issues with forSome
because now Tuple2K[K, V, A]
is just that, it's not a Tuple2[K[A], V[A]] forSome { type A }
anymore. No forSome
– no problem.
This is how we could implement MapK with such a concrete Tuple2K
class:
class MapK[K[_], V[_]] protected(protected val rawMap: Map[K[Any], V[Any]]) {
def apply[A](key: K[A]): V[A] = {
rawMap(key.asInstanceOf[K[Any]]).asInstanceOf[V[A]]
}
def foreach(f: Tuple2K[K, V, ?] => Unit): Unit = {
rawMap.foreach(pair => f(Tuple2K(pair._1, pair._2)))
}
}
object MapK {
def apply[K[_], V[_]](entries: Tuple2K[K, V, ?]*): MapK[K, V] = {
new MapK[K, V](Map(entries.map { pair =>
(pair.first.asInstanceOf[K[Any]], pair.second.asInstanceOf[V[Any]])
}: _*))
}
}
Fun fact: even though
Tuple2K
is not ascala.Tuple2
, it is a case class, so in Scala 3 you can access itsfirst
field as_1
, andsecond
field as_2
. But for clarity, I won't be relying on that in this post.
See full implementation in my Typegrounds repo, mapk/v0
directory.
MapK by CodeDx takes this general approach (there are some differences), and they even made it cross compile for both Scala 2 and Scala 3. Personally I don't need cross-compilation since I need MapK for my end-user project rather than for a library, so I will be looking for the best solution in Scala 3 without regard for Scala 2 compatibility.
Alright. The good news with this MapK implementation is that its public API looks unchanged from my original Scala 2 version. The bad news is – looks are deceiving. Because our Tuple2K
isn't an actual tuple anymore, the foreach
method now needs to unbox a scala.Tuple2
and re-box the contents as Tuple2K
, which comes at a performance cost.
Also, we need to define an extension method to support our desired MapK[Key, Option](intKey -> Some(30))
usage syntax, because again, the built-in ->
produces a scala.Tuple2
, whereas we need a version that produces Tuple2K
:
extension [K[_], A] (key: K[A])
def ->[V[_]](value: V[A]): Tuple2K[K, V, A] = Tuple2K(key, value)
And with this, aside from boxing, this Scala 3 MapK appears to fit our requirements.
The main issue with this solution is that we didn't really solve our problem. We just avoided it in one narrow use case, by switching from one abstract type to one concrete type. But the boxing penalty aside, this solution can produce unpleasant amounts of boilerplate.
For example, we can't implement a def keys: Iterable[K[?]]
method – that is also an unreducible application of higher-kinded type
, because K
is abstract. To "solve" that, we can create yet another wrapper class, case class MapKey[K[A], A](k: K[A])
, and implement def keys: Iterable[MapKey[K, ?]]
instead.
By the way, could we define def keys
as Iterable[K[Any]]
? No, because that would be a lie, that would be exposing unsafe internals to the public: K
is not necessarily covariant.
Dealing With Id
Type
Before we jump to alternative approaches, let's talk about a special type that would be too ridiculous to even consider boxing: type Id[A] = A
. It's just a type alias allowing us to have a fully transparent higher kinded type. Well, almost. More about it here. Anyway, with Id
, we can create a MapK that contains raw values, not boxed in Option or any other type:
val idMap = MapK[Key, Id](
ageKey -> 30,
nameKey -> "Sam"
)
idMap(ageKey) // 30: Int, same as Id[Int]
idMap(nameKey) // "Sam": String, same as Id[String]
It's nice that this part of our MapK API works out of the box with Id
, but perhaps since it's not a real higher kinded type, just a type alias, type inference will fail elsewhere. Consider this:
val m = MapK[Key, Option](
ageKey -> Some(30),
nameKey -> Some("Kyle")
)
def doSomething[A, K[_], V[_]](k: K[A], v: V[A]): Unit = ???
var values: List[Tuple2K[Key, Id, ?]] = Nil
m.foreach { pair =>
doSomething(pair.first, pair.second.get)
doSomething(pair.first, pair.first.default)
values = values :+ (pair.first -> pair.second)
values = values :+ (pair.first -> pair.first.default)
}
Type inference fails on doSomething
, and we need an additional implicit conversion to make this work:
given wrapId[A]: Conversion[A, Id[A]] = a => a
Also, while our ->
extension method takes care of Id
, you'd need more implicit conversions to deal with bigger tuples or other similar types (we'll get to that below).
Parameter Untupling
In the code above we used m.foreach { pair => ... }
. This might be acceptable in older Scala versions, but in Scala 3 we can do this: m.foreach { (k, v) => ... }
. Or, we could, if the callback of our foreach
method accepted a scala.Tuple
. It doesn't, it accepts our custom Tuple2K
class instead. And so because of our wrapping, a nice language feature isn't available.
Polymorphic Function Types
Minor syntax and implicits differences aside, up to this point our MapK
implementation can be made to work on Scala 2 just as well as Scala 3. However, Scala 3 offers a new feature that I have been really looking forward to: polymorphic function types, and MapK is the perfect kind of structure to make good use of it. Let's add a foreachK
method to our MapK:
def foreachK(f: [A] => (K[A], V[A]) => Unit): Unit = {
foreach { pair => f(pair.first, pair.second) }
}
Oh? Doesn't [A] => (K[A], V[A])
remind you of something? That looks suspiciously relevant to the basic idea of our type Tuple2K[K, V, A] = (K[A], V[A])
type. Could we maybe use polymorphic function types to get the desired foreachK { (k, v) => ... }
syntax?
Not really, no. You need to manually ascribe all type parameters when providing a polymorphic function, which kind of defeats the purpose:
m.foreachK { [A] => (k: Key[A], v: Option[A]) => {
doSomething(k, v)
}}
In Scala 2 we used cats FunctionK to express polymorphic functions. You can even roll your own, more specialized version:
trait ForeachK[F[_], G[_]] {
def apply[A](fa: F[A], ga: G[A]): Unit
}
but either way, it requires too much ceremony at call site, having to instantiate a trait and specify all of its type parameters:
m.foreachFnK { new ForeachK[Key, Option] {
def apply[A](k: Key[A], v: Option[A]): Unit = ???
}}
You'll notice that this is more of the same approach – wrapping into concrete types. At least the callback is only boxed once, not once for every MapK entry, but the boxing is, or at least feels, more expensive, creating an anonymous class at each callsite, if I understand correctly.
Newtype for Existentials
Alright, a fresh start. This time, let's solve the problem for real. Let's try the approach proposed by Alexander Konovalov. Here's the basic idea:
type Type[F[_]] <: (Any { type T })
implicit def wrap[F[_], A](value: F[A]): Type[F] =
value.asInstanceOf[Type[F]]
implicit def unwrap[F[_]](value: Type[F]): F[value.T] =
value.asInstanceOf[F[value.T]]
I'm gonna level with you, I don't really understand how this works. I'm not sure how the A
in F[A]
is the same as T
in type T
, and if it isn't, how this doesn't explode at runtime. So, my MapK implementation based on this idea is likely to be suboptimal. That said, here it is:
class MapK[K[_], V[_]] protected(protected val rawMap: Map[Type[K], Type[V]]) {
def keys: Iterable[Type[K]] = rawMap.keys.asInstanceOf[Iterable[Type[K]]]
def apply[A](key: K[A]): V[A] = {
rawMap(key).asInstanceOf[V[A]]
}
def updated[A](key: K[A], value: V[A]): MapK[K, V] = {
MapK.unsafeCoerce(rawMap.updated(key, value))
}
def updated[A](pair: (K[A], V[A])): MapK[K, V] = {
MapK.unsafeCoerce(rawMap.updated(pair._1, pair._2))
}
private def unsafeForeach[A](f: ((K[A], V[A])) => Unit): Unit = {
rawMap.foreach(f.asInstanceOf[(([Type[K], Type[V]])) => Any])
}
def foreach(f: Type.Tuple2[K, V] => Unit): Unit = {
unsafeForeach { (k, v) => f((k, v)) }
}
def foreachK(f: [A] => (K[A], V[A]) => Unit): Unit = {
unsafeForeach { (k, v) => f(k, v) }
}
}
object MapK {
def apply[K[_], V[_]](entries: Type.Tuple2[K, V]*): MapK[K, V] = {
new MapK[K, V](Map(entries.asInstanceOf[Seq[(Type[K], Type[V])]]: _*))
}
private def unsafeCoerce[K[_], V[_]](rawMap: Map[Type[K], Type[V]]): MapK[K, V] = {
new MapK[K, V](rawMap)
}
}
In this pattern, we use Type.Tuple2[K, V]
instead of Tuple2K[K, V, _]
that we used in Scala 2. It's still the same concept, just encoded differently in Scala 3 now:
object Type {
type Tuple2[F[_], G[_]] <: (Any { type T })
type Tuple3[F[_], G[_], H[_]] <: (Any { type T })
}
These types alone are not enough though, what really makes this code work are the implicit conversions between Type.Tuple2
and scala.Tuple2
:
implicit def wrapId[A](a: A): Id[A] = a
implicit def wrapT2[F[_], G[_], A](value: (F[A], G[A])): Type.Tuple2[F, G] =
value.asInstanceOf[Type.Tuple2[F, G]]
implicit def wrapT3[F[_], G[_], H[_], A](value: (F[A], G[A], H[A])): Type.Tuple3[F, G, H] =
value.asInstanceOf[Type.Tuple3[F, G, H]]
implicit def unwrapT2[F[_], G[_]](value: Type.Tuple2[F, G]): (F[value.T], G[value.T]) =
value.asInstanceOf[(F[value.T], G[value.T])]
implicit def unwrapT3[F[_], G[_], H[_]](value: Type.Tuple3[F, G, H]): (F[value.T], G[value.T], H[value.T]) =
value.asInstanceOf[(F[value.T], G[value.T], H[value.T])]
Finally, we can now use our MapK from Scala 3:
val m = MapK[Key, Option](
ageKey -> Some(30),
nameKey -> Some("Kyle")
)
def doSomething[A, K[_], V[_]](k: K[A], v: V[A]): Unit = ???
var values: List[Type.Tuple3[Key, Option, List]] = Nil
m.foreach { pair =>
doSomething(pair._1, pair._2)
doSomething(pair._1, pair._1.default)
values = values :+ (pair._1, pair._2, pair._1.default :: Nil)
}
m.foreachK {
[A] => (k: Key[A], v: Option[A]) =>
values = values :+ (k, v, k.default :: Nil)
}
Let's see how it stacks up against the wrapping approach:
- + No boxing, our implicit conversions are just
asInstanceOf
- + We can offer a properly typed
keys
method - + We can still write enough conversions to implicitly convert
Tuple3
-s intoTuple3K
-s (e.g. see(k, v, k.default :: Nil)
above), but now that doesn't involve wrapping - Similar amounts of boilerplate required for
Id[A]
– see full repo for the code,mapk/v1
directory - As before, unable to use parameter untupling because
Type.Tuple2[K, V]
is not the same type asscala.Tuple2[K, V]
. Even though they happen to have the same runtime representation – that's why ourasInstanceOf
implicit conversions are safe – at compile time they are entirely different types, so some language features that are only available for Scala tuples are not available forType.Tuple2
-s.
I also attempted to simulate parameter untupling by making foreach
accept type Function2[F[_], G[_], Out] <: (Any { type T })
instead of Type.Tuple2[K, V] => Unit
:
def foreach(f: Type.Function2[K, V, Unit]): Unit = {
rawMap.foreach((k, v) => f.asInstanceOf[(Type[K], Type[V]) => Unit](k, v))
}
However, that proved futile. I was not able to make it work without needing to ascribe the callback arguments' type parameters at callsite, which defeats my goal of brevity and simplicity of use.
Path Dependent Types
We have one more avenue to explore. Scala 3 docs dismiss existential types as not overly useful:
Existential types largely overlap with path-dependent types, so the gain of having them is relatively minor.
This StackOverflow answer by Andrey Tyukin explains the basic equivalence. Essentially, any value of type F[T] forSome { type T }
can be converted to a value of type { type T; val value: F[T] }
and back.
However, taken at face value, this requires boxing, and runtime reflection, since { type T; val value: F[T] }
is a structural type. Our newtype approach above does not require boxing, it uses the dirty magic of asInstanceOf
and implicit conversions to avoid runtime cost. Let's try to see if we can apply this dirt here too.
First we need to get rid of val value
, since accessing it requires runtime reflection (which is costly):
type Kind[K[_]] = {
type A
type T = K[A]
}
Kind[K]#T
will play the same role as Type[K]
– a replacement for the K[_]
existential type in Scala 2. Note the #T
. Kind[K]
is a structural type, but Kind[K]#T
is just K[A] forSome { type A }
in Scala 3, essentially. Which is exactly what we need.
Similarly, we define the equivalents to Type.Tuple2
and Type.Tuple3
:
type KTuple2[T1[_], T2[_]] = {
type A
type T = (T1[A], T2[A])
}
type KTuple3[T1[_], T2[_], T3[_]] = {
type A
type T = (T1[A], T2[A], T3[A])
}
Now it would seem that we need implicit conversions between KTuple
-s and scala.Tuple
-s. However, we will never actually have instances of Kind
, KTuple2
or KTuple3
, only instances of Kind#T
, KTuple2#T
, and KTuple3#T
. But those types are just type aliases to real underlying scala.Tuple
types, so you'd think that they wouldn't need any conversions, not even asInstanceOf
.
Well, in practice, not only do we need implicit conversions, but some basic use cases require explicit conversions, which is extra annoying. I guess the compiler just can't infer such complex dependent types.
So these are the conversions that we need (these given
-s are equivalent to implicit def
style conversions in Scala 2):
given wrapId[A]: Conversion[A, Id[A]] = a => a
given KTuple2[T1[_], T2[_], A]: Conversion[(T1[A], T2[A]), KTuple2[T1, T2]#T] = _.asInstanceOf[KTuple2[T1, T2]#T]
given KTuple3[T1[_], T2[_], T3[_], A]: Conversion[(T1[A], T2[A], T3[A]), KTuple3[T1, T2, T3]#T] = _.asInstanceOf[KTuple3[T1, T2, T3]#T]
// Special conversion needed for tuples with Id type (of course...)
given KTuple2_P1[K[_], A]: Conversion[(K[A], A), KTuple2[K, Id]#T] = _.asInstanceOf[KTuple2[K, Id]#T]
Finally, let's see a MapK implementation with all this new path dependent machinery:
class MapK[K[_], V[_]](rawMap: Map[K[Any], V[Any]]) {
def keys: Iterable[Kind[K]#T] = rawMap.keys.asInstanceOf[Iterable[Kind[K]#T]]
def apply[A](key: K[A]): V[A] = {
rawMap(key.asInstanceOf[K[Any]]).asInstanceOf[V[A]]
}
def updated[A](key: K[A], value: V[A]): MapK[K, V] = {
MapK.unsafeCoerce(rawMap.updated(key.asInstanceOf[K[Any]], value.asInstanceOf[V[Any]]))
}
def updated[A](pair: (K[A], V[A])): MapK[K, V] = {
MapK.unsafeCoerce(rawMap.updated(pair._1.asInstanceOf[K[Any]], pair._2.asInstanceOf[V[Any]]))
}
private def unsafeForeach[A](f: ((K[A], V[A])) => Unit): Unit = {
rawMap.foreach(pair => f(pair.asInstanceOf[(K[A], V[A])]))
}
def foreach(f: KTuple2[K, V]#T => Unit): Unit = {
unsafeForeach( (k, v) => f((k, v).asInstanceOf[KTuple2[K, V]#T]))
}
def foreachK(f: [A] => (K[A], V[A]) => Unit): Unit = {
unsafeForeach((k, v) => f(k, v))
}
}
object MapK {
def apply[K[_], V[_]](pairs: KTuple2[K, V]#T*): MapK[K, V] = {
val x: List[KTuple2[K, V]#T] = pairs.toList
val y: List[(K[Any], V[Any])] = x.map(t => t.asInstanceOf[(K[Any], V[Any])])
new MapK(Map(y: _*))
}
private def unsafeCoerce[K[_], V[_]](rawMap: Map[K[Any], V[Any]]): MapK[K, V] = {
new MapK[K, V](rawMap)
}
}
It looks a lot like our newtype solution, just with differently encoded existential types – the basic idea is still the same. Note the #T
-s, as mentioned above.
And here's how we can use this MapK:
val optMap = MapK[Key, Option](
ageKey -> Some(30),
nameKey -> None
)
def doSomething[A, K[_], V[_]](k: K[A], v: V[A]): Unit = ???
var values: List[KTuple3[Key, Option, Id]#T] = Nil
optMap.foreach { (k, v) =>
doSomething(k, v)
doSomething(k, k.default)
values = values :+ KTuple3(k, v, k.default)
}
optMap.foreachK { [A] => (k: Key[A], v: Option[A]) => {
doSomething(k, v)
doSomething(k, k.default)
values = values :+ KTuple3(k, v, k.default)
}}
Do you see it? Parameter untupling in the foreach
callback finally works!
We can say foreach { (k, v) => ... }
instead of foreach { pair => ... }
. This is because our KTuple2[Key, Option]#T
type is a (very complicated) type alias to a real scala.Tuple2[...]
type.
And, since we added an implicit to support Id
, we can use this type in MapK
already (well, in the value position, we'd need another implicit to use Id
in the key position):
val idMap = MapK[Key, Id](
ageKey -> 30,
nameKey -> "Kyle"
)
Full MapK
implementation using dependent types is available in the mapk/v2
directory of the repo.
So we finally made our much coveted foreach { (k, v) => ... }
syntax work, but at what cost?
Type errors produced by the compiler are even more complicated than with the newtype approach. Sometimes they point to the wrong location, e.g. if you have a type error inside a
foreach
callback you might see the compiler complain about(k, v)
arguments in addition to the actual type error. In general, it's hard to distinguish between actually incorrect types, and the compiler being unable to perform type inference.The compiler does not pick up the implicit conversion from
Tuple3[T1[A], T2[A], T3[A]]
toKTuple3[T1, T2, T3]#T
. In fact you can see that we need to manually callKTuple3(k, v, k.default)
when we want to pass a well typed tuple to a method that expects a KTuple3.You'd think – well, at least the
KTuple2
conversion works – but bizarrely it only works because->
is an extension method inArrowAssoc
. For whatever reason,MapK[Key, Option]((ageKey, Some(30))
does not compile whileMapK[Key, Option](ageKey -> Some(30)
does. I would really love to know what's going on there.
I'm a bit surprised that the recommended approach seems to run into compiler limitations the most. Maybe it's just my implementation, though, but that's the best I could figure out with the documentation and materials available today.
Summary
So, which of these approaches do you like better?
None of them are perfect, and honestly, each of them feels like a downgrade from what was possible in Scala 2. We have acquired many nice features with Scala 3, but I will be feeling this loss of expressiveness rather sorely.
That said, I have very little experience with dependent types, so it's entirely possible that a better solution is out there. And even though we may not get good old existential types back in Scala ever again, I hope at least the type inference and compiler errors will improve over time to compensate.
Further Work
I would like to be able to use MapK
with bounded types, e.g. RecordType[Rec <: Record]
. In Scala 2 I needed to build MapK[K[_ <: BaseVal], V[_ <: BaseVal], BaseVal]
(the idea being that A <: BaseVal
for every record) to support that.
However, it appears that my last version of Scala 3 MapK supports this out of the box with the right encodings and incantations, and to be honest I'm a bit surprised that it does. My repo has an example of this – see usage of RecordType
in V2MapKSpec
– but it's not fully developed yet.
Cover image courtesty of EFF Graphics CC BY
Posted on June 8, 2021
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.