Refactor with scalafix v0.3

Monday 27 February 2017

Ólafur Páll Geirsson

BLOG

I am happy to announce the release of scalafix v0.3, a library and tool to rewrite Scala source code. Scalafix is developed at the Scala Center with the long-term mission to help automate the migration of between Scala versions. However, as I hope to demonstrate in this post, scalafix can be used for more than just migrating between Scala versions. Scalafix can also be used for ad-hoc library and application migrations.

Scalafix v0.3 uses the new scala.meta semantic API to provide a re-designed Rewrite and Patch API to implement custom refactorings. Let me explain what that means word by word.

Note. This is the second post on scalafix and scala.meta. You might be interested in reading the first post here.

Scala.meta semantic API

Scala.meta recently announced the first release of its semantic API. This release is the product of close collaboration for the past several months between @xeno_by at Twitter and myself at the Scala Center. The objective of the semantic API is to provide operations to query information from the compiler.

The first version of the scala.meta semantic API makes it possible to query for the resolved “symbol” of a name that appears in a Scala source file. A name is a reference to some definition, for example println or scala.Predef.println. A symbol is a unique identifier of a single definition. For example, println from the standard library has the symbol _root_.scala.Predef.println(Ljava/lang/Object;)V.. The compiler is responsible for resolving names to symbols.

Scalahost is a compiler plugin in the scala.meta project that extracts symbols from the compiler and maps them to scala.meta syntax trees. Scalahost emits the extracted symbols into a “semantic database”. The semantic database can be persisted to files on disk and loaded for later analysis. Semantic databases from different compilation units, potentially produced by different versions of the Scala compiler, can be merged. This opens possibilities for large-scale code analysis.

The introduction of the scala.meta semantic API is a game changer for scalafix. The ability to resolve names to symbols opens possibilities for many scalafix rewrites. Before we cover a few example rewrites, let’s look closer at what exactly “rewrite” means.

Rewrite: meta.Tree => Seq[Patch]

In a nutshell, a scalafix Rewrite is a scala.meta.Tree => Seq[Patch] function. The tree is backed by the scala.meta semantic API, so the rewrite is able to query for compiler information such as symbols. A scalafix Patch is a small operation that can produce a diff on a Scala source file. A patch can either be a “token patch” or a “tree patch”.

Token patches are low-level but give full control over how every detail in a source file is handled, for example formatting and comments. Example token patches are Remove(token) and AddLeft(token, toAdd: String), which removes or prepends a string to token, respectively.

Tree patches are high-level and allow the rewrite author to declaratively explain what operation to perform. An example tree patch is AddGlobalImport(importer), which adds a new import to the top of a file if it does not exist. Observe that AddGlobalImport does not worry about token-level details such as whether the user groups imports by prefix (import a.{b, c}) or not (import a.b; import a.c).

Tree and token patches build a small algebra of operations that can be composed to build complex refactorings. A challenge with composing patches is to figure out what to do on conflicts. For example, what happens when one patch renames a token while the other patch removes the same token? The current strategy in scalafix is to try and resolve as many conflicts as possible on the tree patch level. It can be harder to resolve conflicts on the token level since the original intent of the patch is lost. Unsolvable conflicts abort the refactoring. In the future, we hope to support more advanced conflict resolution strategies.

To demonstrate how rewrites are implemented with scalafix v0.3, let’s step through an example use-case.

Example: Xor to Either

The functional programming library cats migrated recently from its Xor data type to Either from the standard library. It requires a few mechanical steps to migrate code to use Either instead of Xor. For example, below is a diff that’s taken from circe’s migration to Either.

-final def either: Xor[HCursor, HCursor] = if (succeeded) Xor.right(any) else Xor.left(any)
+final def either: Either[HCursor, HCursor] = if (succeeded) Right(any) else Left(any)

For this particular rewrite, we are able to get away with only using tree patches. Replace is one tree patch that can be used to replace usage of a reference such as a type, term or a static method.

Replace(Symbol("_root_.cats.data.Xor."), q"Either")
Replace(Symbol("_root_.cats.data.Xor.Left."), q"Left")
Replace(Symbol("_root_.cats.data.Xor.Right."), q"Right")

The Symbol(_root_...data.Xor) part is the scala.meta symbol referencing the class definition of cats.data.Xor. As we saw in the either: Xor[HCursor diff above, references to Xor should become Either after the rewrite.

Symbols are normalized by default, so the cats.data.Xor.Right replace patch will handle the Xor.Right type, Xor.Right companion object as well as the Right.apply constructor method.

To introduce new imports, it’s possible to pass in additionalImports

Replace(Symbol("_root_.cats.data.XorT."), q"EitherT",
        additionalImports = List(importer"cats.data.EitherT")),

Imports can be removed with the RemoveGlobalImport patch

RemoveGlobalImport(importer"cats.data.Xor")

Nothing happens if the import does not appear in the source file. Likewise, it’s OK to add the same import twice, scalafix will de-duplicate it.

The full Xor to Either scalafix rewrite can be found here and its accompanying test suite here.

Try it out!

You can use scalafix both as a library and a tool.

The recommended way to use scalafix as a library is with the sbt-scalahost plugin. By using scalafix as a library, you have full control of how, where and when to run rewrites.

The recommended way to use scalafix as tool is the sbt-scalafix plugin. It’s possible define custom tree patches in the .scalafix.conf configuration file.

I am excited to see what applications the community can build with scala.meta and scalafix. Some promising ideas that have floated around include

  • parse and run rewrites from @deprecated warning messages. This would enable library authors to provide executable migration guides.
  • shim/cross-build libraries with similar APIs. For example, a library could be developed with the scalaz API, and use scalafix to code-generate a version of the library that uses cats instead.
  • build a code search web-interface with “jump to definition” functionality powered by the scala.meta’s semantic database (similar to sxr).
  • replace usage of any2stringadd with string interpolators or explicit .toString.
  • replace usage of scala.Seq, which can be mutable, in favor of scala.collection.immutable.Seq.

If this sounds exciting to you, join us! I am happy to answer any question in the scalafix gitter channel.

PS. I want to thank @ShaneDelmore who has provided invaluable feedback from the early days of scalafix development and come up with several brilliant ideas for scalafix use-cases.