Google Summer of Code 2017 Scala Projects

This page is work in progress for the upcoming GSOC 2017

Google Summer of Code

This year the Scala team applied again for the Google Summer of Code program to work with enthusiastic students on challenging Scala projects

This page provides a list of project ideas. The suggestions are only a starting point for students. We expect students to explore the ideas in much more detail, preferably with their own suggestions and detailed plans on how they want to proceed. Don’t feel constrained by the provided list! We welcome any of your own challenging ideas, but make sure that the proposed project satisfies the main requirements mentioned below.

How to get involved

The best place to propose and discuss your proposals is our “contributors” discussion forum. This way you will get quickly responses from the whole Scala community.

Previous Summer of Code

We encourage you to have a look at our 2016, 2015, 2014, 2013, 2012, 2011, Summer of Code 2010 pages to get an idea on what we and you can expect while working on Scala.

Project Ideas

Here are some project ideas. The list is non-binding and any reasonable project related to Scala that is proposed by a student will be thoroughly reviewed.

String interpolator for XML literals

We have wanted for a long time to deprecate XML literals in Scala and replace them by a string interpolator. That is, instead of

<body>
  ...
</body>

future Scala programs would use an interpolated string

xml"""
  <body>
    ...
  </body>
"""

This project should turn the wish into reality by implementing the xml interpolator. The implementation should re-use most of the existing infrastructure for xml literals. It would probably be a good idea to re-package the existing XML parser in the Scala compiler, and to map interpolated strings to constructors from the existing scala.xml library. Some other aspects need to be re-thought. In particular name-space management should be handled by implicits instead of being hard-coded in the Scala typer.

The project is a success if it can support essentially all legal XML expressions in Scala as interpolated strings. By contrast, support for XML patterns is optional: it would be nice if we could do it, but it’s not required.

This project is ideal for someone who knows XML basics, and is interested to learn enough about aspects of the compiler (either Scala 2.12 or Dotty) to identify what can be re-used.

Supervised by @densh

Shared Async Debugging Library

There are three IDEs for Scala: IntelliJ, Scala IDE and ENSIME. Unfortunately, all three have their own debugger solutions, resulting in duplicated effort across the overburdened tooling community.

This project aims to extract and build on the Scala IDE Async Debugger in a standalone debugging library. The Scala IDE has excellent support for typical async constructs such as Futures and Akka Actors.

Once in the shared library, ENSIME and command line tools can make use of the debugging support, and JetBrains could be approached to make use of the library also, which is part of a general theme of moving towards common tooling components (c.f. sbt server).

There is also scope within this project to improve the visual debugging support in the ENSIME text editors (Emacs, Vim, Atom, VSCode, Sublime) as well as to add features such as expression evaluation and time travel into the debugging library.

https://ensime.github.io/
https://scala-debugger.org/
https://scalacamp.pl/data/async-debugger-slides/index.html#/

Supervised by Sam Halliday (ENSIME) and Chip Senkbeil (scala-debugger) with advice from Iulian Dragos and Wiesław Popielarski (Scala IDE)

Future-proof scala-refactoring to use scala.meta

There are three IDEs for Scala: IntelliJ, Scala IDE and ENSIME. Scala IDE and ENSIME share a common library for refactorings, e.g. “rename method” or “organise imports”, but IntelliJ have their own implementation.

Within the past year the scala.meta / scalafix projects have become mature enough for use in CI tooling and inclusion in IntelliJ. scala.meta is planned to replace Scala macros and its newly released Semantic API allows for many refactoring tasks to be written with this common API.

In a nutshell, scala.meta gives us an opportunity to consolidate our tooling libraries into a single refactoring and automated suggestions library, used by all IDEs and static analysis tools.

This project aims to rewrite the most important parts of scala-refactoring to use scala.meta, in such a way that it can be shared between all three IDEs (integration into any specific IDE is not part of the scope).

This project also aims to form the foundation of a new intentions library, providing agreement between all the IDEs and to build a community-powered static analysis database.

https://github.com/scala-ide/scala-refactoring
https://scalameta.org/

Supervised by Matthias Langer (Scala Refactoring) with advice from Sam Halliday (ENSIME), Wiesław Popielarski / Simon Schäfer (Scala IDE) and Mikhail Mutcianko (IntelliJ).

Case classes a la carte with scala.meta

case classes are a very useful feature of the scala language, but can be limiting. For example, there is no way to modify the internal representation of a case class, leading to heap usage problems for larger applications that generate hundreds of millions of instances (a common problem in monolithic financial services applications).

In this project we would introduce a @data annotation, as a user-land alternative to case class. The default implementation would have feature parity with final case class but opt-in features include: interning (backed by a fast, non-blocking, concurrent, weak hashset, allowing for super-fast instance equality), hashCode caching, fast value equality (backed by instance ids and a cache of previous equality tests), unboxing of Options and primitives (each unbox saving 64bits per instance), Booleans as bitsets on a shared Long, all the way to FastInfoSet-style custom compression dictionaries (e.g. for Strings).

Another usecase would be to allow for automatic typeclass derivation of well-known typeclasses, placed on the companion. e.g. LabelledGeneric, Show, Lens. This would speed up compilations of downstream code and simplify implicit resolution.

https://github.com/fommil/scala-data-classes

Supervised by Sam Halliday with advice from Eugene Burmako.

Dotty Documentation Compiler

Dotty is the future Scala compiler developed at EPFL. Getting into compiler development is usually a very difficult especially when it comes to compilers for advanced languages like Scala.

Luckily, there are two compilers in the Dotty project. One for the language and one which acts as a documentation compiler. The docs compiler, alias “Dottydoc”, uses parts of the main compiler to collect semantic information about the user’s source files, it then creates a full-blown static site similar to Jekyll - featuring an API reference, markdown parsing for documentation, cross-referencing between docs and API and a blog. Gone is the divide between what is API and what is documentation. If you visited the Dotty website, then you’ve seen Dottydoc - that entire site was generated by it.

If you feel that you want to help shape the future of Scala, either by contributing to the Dotty compiler or by dramatically improving the tooling surrounding the Scala language, this is the place to start.

Supervised by @felixmulder

Auto-completion and type information in Scastie

Scastie is an online Scala programming environment. This project would add support for auto-completion and type information to the online code editor. That would require to perform calls to the presentation compiler running on the server-side to retrieve the required informations.

https://github.com/scalacenter/scastie

Supervised by @MasseGuillaume and/or @julienrf

Writing interactive programs with Scastie

Scastie is an online Scala programming environment. Currently, programs written in scastie have limited means of showing an output to users. This project would add support for writing programs showing images and interactive animations running in the browser.

https://github.com/scalacenter/scastie

Supervised by @MasseGuillaume and/or @julienrf

Reimplement the JDK in Scala-Native

Scala-Native is a new target for Scala. Instead of compiling to bytecode and running on the JVM, we use llvm to generate native assembly code. We currently do not compile Java to llvm. Therefore we need to reimplement some parts of JDK such as java.net.*, java.io.*, java.nio.* and java.util.*. We are already reimplementing the JDK and we need your help to increase our coverage. It’s a unique chance to learn how to implement the java api and write low level Scala code.

Supervised by @MasseGuillaume

Slick bug and feature hunt

There are lots of open tickets in the Slick issue tracker. Some which have long time fallen behind. This project would be about tackling as many of them as time permits.

https://github.com/slick/slick/issues

Supervised by @cvogt and/or @szeiger

CBT plugin initiative

CBT, the new build tool for Scala is making fast progress. In order to be a viable alternative to the existing choices, many people will need plugins, which often do not exist yet for CBT. The focus of this project will be converting builds of popular Scala libraries to CBT and writing whatever plugins necessary to pull it off. This project gives you the chance to play with one of the new exciting tools in the Scala world, work on a fairly elegant Scala code base and sharpen your proficiency in parts of the JDK often forgotten by Scala developers.

https://github.com/cvogt/cbt/issues?q=is%3Aopen+is%3Aissue+label%3Aplugin+label%3A%22help+wanted%22

Supervised by @cvogt

CBT IDE integration

CBT, the new build tool for Scala is making fast progress. In order to be a viable alternative to the existing choices, we need good integration with IDEs. Your goal will be to build solid integration with IntelliJ and potentially also tackle Ensime and Scala IDE. This project gives you the chance to play with one of the new exciting tools in the Scala world, work on a fairly elegant Scala code base and sharpen your proficiency in parts of the JDK often forgotten by Scala developers.

https://github.com/cvogt/cbt/issues/123

Supervised by @cvogt

Implementation of the Savina Benchmark Suite using the Reactors Framework

The Savina suite is a benchmark suite for actor-oriented programs. Its goal is to provide standard benchmarks that enable researchers and application developers to compare different actor implementations and identify those that deliver the best performance for a given use-case. The Reactors framework is a novel actor framework based on the reactor programming model, and offers better composition and modularity compared to standard actors. So far, only a part of the Savina suite was ported to the Reactors framework. To make Reactors fully compliant with Savina, the goal of this project is to implement the remaining 21 Savina benchmarks using the Reactors framework.

Savina Benchmark Suite

Reactors Framework

Supervised by @axel22

Benchmarking the new collections

The standard collections are in the process of being redesigned. We want to be sure that the new implementation is as efficient as the current one, if not more efficient. A simple benchmark suite has been implemented but it lacks nice features like performance regression reports via interactive charts. This project would require the student to be familiar with benchmarking (especially within the JVM) and web-based visualization libraries such as d3.

https://github.com/scala/collection-strawman

Supervised by @julienrf

Connecting potential contributors with Scala projects via Scaladex

Scaladex is a cool new tool that maps out the known Scala ecosystem by connecting GitHub users, published binaries, contributor info, documentation info, and release info of Scala projects. The one thing though that Scaladex doesn’t do yet is connecting potential contributors to Scala projects that are on the lookout for contributors. Your task if you take on this project would be to find a way to better connect project maintainers with interested open source contributors, by building on Scaladex, a web application built with Scala and Scala.js.

Supervised by @heathermiller

New line-wrapping algorithm for scalafmt

Scalafmt is a code formatter for Scala. A key feature of scalafmt is that users can define a maximum width (for example 80 characters) so that long code lines get automatically wrapped into shorter lines. Currently, scalafmt uses a token-based best-first search algorithm to wrap long lines. Both dartfmt and clang-format use a similar approach. This token-based best-first algorithm has known issues in scalafmt.

The goal of this project would be to explore and implement an alternative line wrapping algorithm for scalafmt. One promising alternative is the dynamic-programming algorithm employed by rfmt (https://www.researchgate.net/profile/Phillip-Yelland/publication/284724851_A_New_Approach_to_Optimal_Code_Formatting/links/5657f1e608ae4988a7b58558/A-New-Approach-to-Optimal-Code-Formatting.pdf) (see Python implementation). Another promising algorithm is Philip Wadler’s “prettier printer”. The prettier-printer algorithm has been successfully employed by the prettier JavaScript formatter.

Supervised by @olafurpg

Implementing a Benchmark Suite for Big Data and Machine Learning Applications

In this project, the aim is to design and implement several larger Big Data and Machine Learning applications, which will be used to regularly test the performance of the Scala compiler releases, and to test the performance of JVMs running these applications. By the time that the project is completed, you are expected to implement one larger data-intensive application at least for these frameworks: Spark, Flink, Storm, Kafka and DeepLearning4j. Each of the applications will have to be accompanied with a dataset used to run the application.

This project is an excellent opportunity to familiarize yourself with these modern cutting-edge frameworks for distributed computing and machine learning!

… Mentors: please insert your projects here. You can use the following template and submit a PR. Note that student applications will end on April 3, 2017 …

Project name

Project description.

Link to the corresponding code repository, if relevant.

Supervised by @username.

Requirements and Guidelines

General Student Application Requirements

This is the seventh time the Scala project has applied to the Summer of Code, and from previous years’ experience, increased popularity of the language and stories of other mentor organizations we expect a high number of applications. First, be aware of the following:

Make sure that you understand, fulfill and agree to the general Google Summer of Code rules
The work done during GSoC requires some discipline as you have to plan your day-to-day activity by yourself. Nevertheless, you can expect regular contact with your mentors both via the usual means of communication for you project as well as personal guidance via email, chat or phone. The mentor is there for you in case you get stuck or need some guidance during your 3 month coding project.
The official SoC timetable mentions May 30 as the official start of coding. If you have time, you are encouraged to research your proposals even before that (and definitely learn the basics of Scala, if you haven’t done that already).

Student Application Guidelines

Student proposals should be very specific. We want to see evidence that you can succeed in the project. Applications with one-liners and general descriptions definitely won’t make the cut.
Because of the nature of our projects students must have some knowledge of the Scala language. Applicants with Scala programming experience will be preferred. Alternatively, experience with functional programming could suffice, but in your application we want to see evidence that you can quickly be productive in Scala.
You can think of Google Summer of Code as a kind of independent internship. Therefore, we expect you to work full-time during the duration. Applicants with other time commitments are unlikely to be selected. From our previous experience we know that students’ finishing their studies (either Bachelor, Master of PhD) are likely to be overwhelmed by their final work, so please don’t be too optimistic and carefully plan your time for the project.
If you are unsure whether your proposal is suitable, feel free to discuss it on our “contributors” discussion forum. We have many community members on our mailing list who will quickly answer any of your questions regarding the project. Mentors are also constantly monitoring the mailing list. Don’t be afraid to ask questions. We’d love to help you out!

General Proposal Requirements

The proposal will be submitted via the standard web-interface at https://summerofcode.withgoogle.com/, therefore plain text is the best way to go. We expect your application to be in the range of 700-1500 words. Anything less than that will probably not contain enough information for us to determine whether or not you are the right person for the job.

Your proposal should contain at least the following information, but feel free to include anything that you think is relevant:

Please include your name (weird as it may be, people do forget it)
Title of your proposal
Abstract of your proposal
Detailed description of your idea including explanation on why it is innovative (maybe you already have a prototype?), what contribution do you expect to make to the Scala community and why do you think your project is needed. A rough plan of your development and possible architecture sketches.
Description of previous work, existing solutions (links to prototypes or references are more than welcome!)
Write us about yourself and convince that you are the right person for the job (linking to your resume/CV is good but not sufficient)
- Mention the details of your academic studies, any previous work, internships
- Any relevant skills that will help you to achieve the goal (programming languages, frameworks)?
- Any previous open-source projects (or even previous GSoC) you have contributed to?
- Do you plan to have any other commitments during SoC that may affect you work? Any vacations/holidays planned? Please be specific as much as you can.
If you apply to more than one GSoC project, especially if you also apply for a project in another organization, specify which project you prefer. In case two organizations choose to accept your applications, we can then give you the project that is most important to you. Preferring the project of another organization will not influence our decision whether to accept your application.
Contact details (very important!)

Google Summer of Code 2017 Scala Projects