Scala lexer for SciTE and other Scintilla-based editors

Philippe Lhoste
Joined: 2010-09-02

Hello Scala users!

I recently started to learn Scala and so far I am very pleased by what I
see...

I was delighted to see there was support for my favorite editor, SciTE.
I only had to make a few adjustments to use it on Windows (adding .bat
to the command names) and to get the keywords working: not all keyword
lists are usable with the chosen lexer...
For the record, I attach my modified version of the scala.properties
file found in the misc/scala-tool-support/scite directory.

For those who don't know it, SciTE is a text editor based on the
Scintilla source code editing component. It was originally a demo of the
component but has grown into a complete, flexible and lightweight editor.

Scintilla is an excellent component, used in other editors such as the
well-known Notepad++. Unlike many other components, it has hard-coded
lexers written in C++ instead of some simple (or not so simple) pattern
rules.
This allows for very fast and powerful context-based lexing: I know of
no other component doing so well for Perl or HTML highlighting
(including JavaScript or PHP in HTML, etc.).
The downside is that it is difficult to add a new lexer: one needs
basic knowledge of C++, a compiler, some time and the will to recompile
the component.
That's why many people who just want support for their favorite language
pick an existing lexer for a language close to theirs, stick a list of
keywords in the properties files, and hope special syntax isn't too ugly...

That's what was done for Scala: it uses the cpp lexer, which is also
used for Java, JavaScript and lots of other C-like languages.
It isn't so bad for most Scala code. But it doesn't work so well for
nested block comments, multi-line strings or symbols, for example.
See the first screenshot for the issue (I hope small attachments like
these are tolerated on the ML).
But at least it allowed me to advance in my learning (I haven't yet
tried the IDE plugins...), and I appreciate being able to compile from
SciTE and double-click on an error to jump to the faulty line.
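
To illustrate the nested block comment issue mentioned above, here is a
minimal, self-contained C++ sketch. It is not code from the cpp lexer or
from the new Scala lexer, and it does not use the Scintilla API; the
function name blockCommentEnd and its parameters are hypothetical. It
only shows why a scanner that stops at the first "*/" ends a nested
comment too early, while one that counts nesting depth does not.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, for illustration only.
// Returns the index just past the end of the block comment that starts at
// `start` (which must point at "/*"). When `nested` is true, inner "/*"
// openers increase a depth counter, so comments may nest; when false, the
// first "*/" ends the comment, as in C++ or Java.
// If the comment is unterminated, returns text.size().
std::size_t blockCommentEnd(const std::string& text, std::size_t start, bool nested) {
    int depth = 1;
    std::size_t i = start + 2;  // skip the opening "/*"
    while (i + 1 < text.size()) {
        if (nested && text[i] == '/' && text[i + 1] == '*') {
            ++depth;            // one more level to close
            i += 2;
        } else if (text[i] == '*' && text[i + 1] == '/') {
            i += 2;
            if (--depth == 0) return i;  // outermost comment closed
        } else {
            ++i;
        }
    }
    return text.size();
}
```

For the text "/* a /* b */ c */ val x", the non-nested scan stops after
the inner "*/", so " c */" would be styled as code; the depth-counting
scan continues to the outer "*/".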

I have some experience with Scintilla lexers: improving the Lua one and
some others, writing the POV-Ray one (and an unofficial AutoHotkey one),
etc. So, once I got a general picture of the Scala syntax and most of
its features, I started to fork the CPP lexer to make a Scala one,
first removing all the stuff not needed (pre-processor, UUID, regex,
etc.), then adding support for the above features, the difference
between delimiters and operator symbols, etc.

This is a work in progress: I haven't tested the folding at all, and I
want to better enforce some rules, such as moo% being not a single
identifier but an identifier followed by an operator. But it will never
be able to tell where the first identifier ends in gah_%+*dah: it is an
editor, not an IDE, and it doesn't know which identifiers are valid in
the current scope. Hey, just add some spaces!
I might also do some work to improve XML support, but I am not going to
duplicate the full XML lexer code or its list of styles. After all, I
doubt many coders really hard-code XML in their source, except perhaps
for small fragments.
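
As a rough illustration of the moo% rule described above, here is a
small C++ sketch that splits such input into an identifier followed by
an operator. It is not taken from the lexer source: the function names
and the ASCII operator character set are assumptions made for this
example, and it applies a deliberately simplified rule (an alphanumeric
identifier ends at the first operator character). For input like
gah_%+*dah it just stops after "gah_", which may not match how Scala
itself reads that name.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Operator characters for this sketch only (an assumed ASCII subset).
bool isOpChar(char c) {
    return std::string("!#%&*+-/:<=>?@\\^|~").find(c) != std::string::npos;
}

// Length of the alphanumeric identifier at the start of `text`,
// or 0 if `text` does not start with a letter or underscore.
std::size_t plainIdentifierLength(const std::string& text) {
    if (text.empty()) return 0;
    unsigned char first = static_cast<unsigned char>(text[0]);
    if (!std::isalpha(first) && first != '_') return 0;
    std::size_t i = 1;
    while (i < text.size()) {
        unsigned char c = static_cast<unsigned char>(text[i]);
        if (!std::isalnum(c) && c != '_') break;  // stop at operators, spaces, etc.
        ++i;
    }
    return i;
}

// Length of the run of operator characters starting at `pos`.
std::size_t operatorRunLength(const std::string& text, std::size_t pos) {
    std::size_t i = pos;
    while (i < text.size() && isOpChar(text[i])) ++i;
    return i - pos;
}
```

With these helpers, "moo%" yields an identifier of length 3 followed by
an operator run of length 1, so the two parts can be styled differently.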

It already looks good after a couple of hours of hacking: see the second
screenshot and appreciate the progress... ^_^

For those feeling adventurous, I uploaded the file
http://www.autohotkey.net/~PhiLho/SciTE4Scala.zip
(650KB). It contains SciTE.exe and SciLexer.dll (for Windows,
obviously; only SciLexer.dll has changes for Scala, so it might work
with Notepad++; built from the v.2.21 source code), the current version
of the lexer source code, and a scala.properties adapted to the new
lexer.
If you don't have SciTE already, you will need some additional files
from the SciTE binary distribution at http://www.scintilla.org

Suggestions and criticism are welcome!

Renghen Renghen
Joined: 2010-09-06
Re: Scala lexer for SciTE and other Scintilla-based editors
Great work! I use SciTE all the time; it is very lightweight and fun to use.

I will do some testing and let you know if any issues arise.

On Tue, Sep 7, 2010 at 1:20 PM, Philippe Lhoste <PhiLho [at] gmx [dot] net> wrote:
[...]
