Friday 26 February 2021
Vincenzo Bazzucchi, Scala Center
Tuples allow developers to create new types by associating existing types. In
doing so, they are very similar to case classes but unlike them they retain
only the structure of the types (e.g., which type is in which order) rather
than giving each element a name. A tuple can also be seen as a sequence and
therefore a collection of objects, however, whereas homogeneous collections
Set[A] accumulate elements retaining only one type
A), tuples are capable of storing data of different types while preserving
the type of each entry.
In Scala 3, tuples gain power thanks to new operations, additional type safety and fewer restrictions, pointing in the direction of a construct called Heterogeneous Lists (HLists), one of the core data structures in generic programming.
In this post I will take you on a tour of the new Tuple API before looking at how a new language feature, dependent match types, allows to implement such API. I hope that through the two proposed examples, you will develop an intuition about the usage and power of a few new exciting features of Scala 3.
Why generic programming?
HLists and case classes can both be used to define products of types. However
HLists do not require the developer to declare class or field names. This
makes them more convenient in some scenarios, for example in return types. If
List, you can see that
def splitAt(n: Int) produces a
(List[A], List[A]) and not a
case class SplitResult(left: List[A], right:
List[A]) because of the cognitive cost of introducing new names
Moreover, there are infinitely many case classes which share a common structure, which means that they have the same number and type of fields. We might want to apply the same transformations to them, so that such transformations can be defined only once. The Type Astronaut’s Guide to Shapeless proposes the following simple example:
case class Employee(name: String, number: Int, manager: Boolean) case class IceCream(name: String, numCherries: Int, inCone: Boolean)
If you are implementing an operation such as serializing instances of these
types to CSV or JSON, you will realize that the logic is exactly the same and
you will want to implement it only once. This is equivalent to defining the
serialization algorithm for the
(String, Int, Boolean) HList, assuming that
you can map both case classes to it.
A simple CSV encoder
Let’s consider a simple CSV encoder for our
IceCream case classes.
Each record, or line, of a CSV file is a sequence of values separated by a
delimiter, usually a comma or a semicolon. In Scala we can represent each value
as text, using the
String type, and thus each record can be a list of values,
List[String]. Therefore, in order to encode case classes to CSV, we
need to extract each field of the case class and to turn it into a
and then collect all the fields in a list. In this setting,
IceCream could be treated in the same way, because they can be simply be seen
(String, Int, Boolean) which needs to be transformed into a
List[String]. We will first see how to handle this simple scenario before
briefly looking at how to obtain a tuple from a case class.
Assuming that we know how to transform each element of a tuple into a
List[String], can we transform any tuple into a
The answer is yes, and this is possible because Scala 3 introduces types
NonEmptyTuple but also methods
tail which allow
us to define recursive operations on tuples.
Let’s define the
RowEncoder[A] type-class, which describes the capability of
values of type
A to be converted into
Row. To encode a type to
first need to convert each field of the type into a
String: this capability
is defined by the
trait FieldEncoder[A]: def encodeField(a: A): String type Row = List[String] trait RowEncoder[A]: def encodeRow(a: A): Row
We can then add some instances for our base types:
object BaseEncoders: given FieldEncoder[Int] with def encodeField(x: Int) = x.toString given FieldEncoder[Boolean] with def encodeField(x: Boolean) = if x then "true" else "false" given FieldEncoder[String] with def encodeField(x: String) = x // Ideally, we should also escape commas and double quotes end BaseEncoders
Now that all these tools are in place, let’s focus on the hard part:
implementing the transformation of a tuple with an arbitrary number of elements
Row. Similarly to how you may be used to recurse on lists, on
tuples we need to manage two scenarios: the base case (
EmptyTuple) and the
inductive case (
In the following snippet, I prefer to use the context bound syntax even if I need a handle for the instances because it concentrates all the constraints in the type parameter list (and I do not need to come up with any name). After this personal preference disclaimer, let’s see the two cases:
object TupleEncoders: // Base case given RowEncoder[EmptyTuple] with def encodeRow(empty: EmptyTuple) = List.empty // Inductive case given [H: FieldEncoder, T <: Tuple: RowEncoder]: RowEncoder[H *: T] with def encodeRow(tuple: H *: T) = summon[FieldEncoder[H]].encodeField(tuple.head) :: summon[RowEncoder[T]].encodeRow(tuple.tail) end TupleEncoders
If the tuple is empty, we produce an empty list. To encode a non-empty tuple we
invoke the encoder for the first element and we prepend the result to the
created by the encoder of the tail of the tuple.
We can create an entrypoint function and test this implementation:
def tupleToCsv[X <: Tuple : RowEncoder](tuple: X): List[String] = summon[RowEncoder[X]].encodeRow(tuple) tupleToCsv(("Bob", 42, false)) // List("Bob", "42", "false")
How to obtain a tuple from a case class?
Scala 3 introduces the
type-class which provides type-level information about the components and
labels of types. A paragraph from that
is particularly interesting for our use case:
The compiler automatically generates instances of
enums and their cases, case classes and case objects, sealed classes or traits having only case classes and case objects as children.
That’s why we can obtain a tuple from a case class using:
val bob: Employee = Employee("Bob", 42, false) val bobTuple: (String, Int, Boolean) = Tuple.fromProductTyped(bob)
But that is also why we can revert the operation:
val bobAgain: Employee = summon[Mirror.Of[Employee]].fromProduct(bobTuple)
New tuples operations
In the previous example, we saw that we can use
.tail on tuples,
but Scala 3 introduces many other operations, here is a quick overview:
Under the hood: Scala 3 introduces match types
All the operations in the above table use very precise types. For example, the
compiler ensures that
3 *: (4, 5, 6) is a
(Int, Int, Int, Int) or that the
index provided to
apply is strictly inferior to the size of the tuple.
How is this possible?
The core new feature that allows such a flexible implementation of tuples are match types. I invite you to read more about them here.
Let’s see how we can implement the
++ operator using this powerful construct.
We will call our naive version
First let’s define our own tuple:
enum Tup: case EmpT case TCons[H, T <: Tup](head: H, tail: T)
That is a tuple is either empty, or an element
head which precedes another
tuple. Using this recursive definition we can create a tuple in the following
import Tup.* val myTup = TCons(1, TCons(2, EmpT))
It is not very pretty, but it can be easily adapted to provide the same ease of use as the previous examples. To do so we can use another Scala 3 feature: extension methods
import Tup.* extension [A, T <: Tup] (a: A) def *: (t: T): TCons[A, T] = TCons(a, t)
So that we can write:
1 *: "2" *: EmpT
Now let’s focus on
concat, which could look like this:
import Tup.* def concat[L <: Tup, R <: Tup](left: L, right: R): Tup = left match case EmpT => right case TCons(head, tail) => TCons(head, concat(tail, right))
Let’s analyze the algorithm line by line:
R are the type of the left
and right tuple. We require them to be a subtype of
Tup because we want to
concatenate tuples. Then we proceed recursively by case: if the left tuple is
empty, the result of the concatenation is just the right tuple. Otherwise the
result is the current head followed by the result of concatenating the tail
with the other tuple.
If we test the function, it seems to work:
val left = 1 *: 2 *: EmpT val right = 3 *: 4 *: EmpT concat(left, right) // TCons(1,TCons(2,TCons(3, TCons(4,EmpT))))
So everything seems good. However we can ask the compiler to verify that the function behaves as expected. For instance the following code type-checks:
def concat[L <: Tup, R <: Tup](left: L, right: R): Tup = left
More problematic is the fact that this signature prevents us from using a more specific type for our variables or methods:
// This does not compile val res: TCons[Int, TCons[Int, TCons[Int, TCons[Int, EmpT.type]]]] = concat(left, right)
Because the returned type is just a tuple, we do not check anything else. This
means that the function can return an arbitrary tuple, the compiler cannot
check that returned value consists of the concatenation of the two tuples. In
other words, we need a type to indicate that the return of this function is all
the types of
left followed by all the types of the elements of
Can we make it so that the compiler verifies that we are indeed returning a tuple consisting of the correct elements?
In Scala 3 it is now possible, without requiring external libraries!
A new type for the result of
We know that we need to focus on the return type. We can define it exactly as
we have just described it. Let’s call this type
Concat to mirror the name of
type Concat[L <: Tup, R <: Tup] <: Tup = L match case EmpT.type => R case TCons[headType, tailType] => TCons[headType, Concat[tailType, R]]
You can see that the implementation closely follows the one above for the
method. The syntax can be read in the following way: the
Concat type is a
Tup and is obtained by combining types
R which are both
Tup. To use it we need to massage a bit the method
implementation and to change its return type:
def concat[L <: Tup, R <: Tup](left: L, right: R): Concat[L, R] = left match case _: EmpT.type => right case cons: TCons[_, _] => TCons(cons.head, concat(cons.tail, right))
We use here a combination of match types and a form of dependent types called dependent match types (docs here and here). There are some quirks to it as you might have noticed: using lower case types means using type variables and we cannot use pattern matching on the object. I think however that this implementation is extremely concise and readable.
Now the compiler will prevent us from making the above mistake:
def wrong[L <: Tup, R <: Tup](left: L, right: R): Concat[L, R] = left // This does not compile!
We can use an extension method to allow users to write
(1, 2) ++ (3, 4)
concat((1, 2), (3, 4)), similarly to how we implemented
We can use the same approach for other functions on tuples, I invite you to have a look at the source code of the standard library to see how the other operators are implemented.