package matching


Type Members

  1. class Regex extends Serializable

    This class provides methods for creating and using regular expressions. It is based on the regular expressions of the JDK since 1.4.

    Its main goal is to extract strings that match a pattern, or the subgroups that make it up. For that reason, it is usually used with for comprehensions and matching (see methods for examples).

    A Regex is created from a java.lang.String representation of the regular expression pattern¹. That pattern is compiled during construction, so frequently used patterns should be declared outside loops if performance is a concern. They can, for instance, be declared on a companion object, so that they are initialized only once.
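
    As a minimal sketch of the companion-object suggestion above, the pattern below is compiled once and reused by every instance. The names LogLine and Timestamp are hypothetical and chosen only for illustration.

```scala
import scala.util.matching.Regex

// Hypothetical class, used only to illustrate where the pattern lives.
class LogLine(val text: String) {
  // Reuses the pattern held by the companion object instead of recompiling it.
  def timestamp: Option[String] = LogLine.Timestamp.findFirstIn(text)
}

object LogLine {
  // Compiled once, when the companion object is first initialized.
  val Timestamp: Regex = """\d\d:\d\d:\d\d""".r
}
```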

    The canonical way of creating regex patterns is by using the method r, provided on java.lang.String through an implicit conversion into scala.collection.immutable.WrappedString. Using triple quotes to write these strings avoids having to quote the backslash character (\).
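
    To illustrate the point about triple quotes, the sketch below writes the same pattern twice, once with doubled backslashes and once as a triple-quoted literal. The object name QuotingExample is an assumption made for this example.

```scala
import scala.util.matching.Regex

object QuotingExample {
  // Ordinary string literal: every backslash has to be doubled.
  val escaped: Regex = "\\d+\\.\\d+".r

  // Triple-quoted literal: backslashes are written as they appear in the pattern.
  val raw: Regex = """\d+\.\d+""".r
}
```

    Both values hold the same pattern text, so they match the same input.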

    Using the constructor directly, on the other hand, makes it possible to declare names for subgroups in the pattern.

    For example, both declarations below generate the same regex, but the second one associates names with the subgroups.

    val dateP1 = """(\d\d\d\d)-(\d\d)-(\d\d)""".r
    val dateP2 = new scala.util.matching.Regex("""(\d\d\d\d)-(\d\d)-(\d\d)""", "year", "month", "day")

    There are two ways of using a Regex to find a pattern: calling methods on Regex, such as findFirstIn or findAllIn, or using it as an extractor in a pattern match.

    Note that, when calling findAllIn, the resulting scala.util.matching.Regex.MatchIterator needs to be initialized (by calling hasNext or next(), or causing these to be called) before information about a match can be retrieved:

    val msg = "I love Scala"
    // val start = " ".r.findAllIn(msg).start // throws an IllegalStateException
    val matches = " ".r.findAllIn(msg)
    matches.hasNext // initializes the matcher
    val start = matches.start
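
    One possible way to sidestep the initialization step, sketched here as an alternative rather than as the recommended approach, is findAllMatchIn. It yields one Match per occurrence, and each Match already carries its own position. The object name MatchPositions is hypothetical.

```scala
object MatchPositions {
  val msg = "I love Scala"

  // Each element is a completed Match, so start can be read directly.
  val starts: List[Int] = " ".r.findAllMatchIn(msg).map(_.start).toList
}
```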

    When Regex is used as an extractor in a pattern match, note that it only succeeds if the whole text can be matched. For this reason, one usually calls a method to find the matching substrings, and then uses the Regex as an extractor to break each match into subgroups.

    As an example, the above patterns can be used like this:

    val dateP1(year, month, day) = "2011-07-15"
    // val dateP1(year, month, day) = "Date 2011-07-15" // throws an exception at runtime
    val copyright: String = dateP1 findFirstIn "Date of this document: 2011-07-15" match {
      case Some(dateP1(year, month, day)) => "Copyright "+year
      case None                           => "No copyright"
    }
    val copyright: Option[String] = for {
      dateP1(year, month, day) <- dateP1 findFirstIn "Last modified 2011-07-15"
    } yield year
    def getYears(text: String): Iterator[String] = for (dateP1(year, _, _) <- dateP1 findAllIn text) yield year
    def getFirstDay(text: String): Option[String] = for (m <- dateP2 findFirstMatchIn text) yield m group "day"

    Regex does not provide a method that returns a scala.Boolean. One can use the matches method of java.lang.String or, if Regex is preferred, either ignore the return value or test the resulting Option for emptiness. For example:

    def hasDate(text: String): Boolean = (dateP1 findFirstIn text).nonEmpty
    def printLinesWithDates(lines: Traversable[String]): Unit = {
      lines foreach { line =>
        dateP1 findFirstIn line foreach { _ => println(line) }
      }
    }
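
    The two Boolean tests mentioned above do not behave the same way, which the following sketch tries to make visible. The matches method of java.lang.String requires the whole string to match, while testing the Option accepts a match anywhere in the text. The object name BooleanTests and the method isDate are assumptions made for this example.

```scala
object BooleanTests {
  val dateP1 = """(\d\d\d\d)-(\d\d)-(\d\d)""".r

  // String.matches succeeds only if the entire string is a date.
  // It takes the pattern as text and compiles it again on each call.
  def isDate(text: String): Boolean = text.matches(dateP1.toString)

  // Testing the Option succeeds if a date occurs anywhere in the text.
  def hasDate(text: String): Boolean = dateP1.findFirstIn(text).nonEmpty
}
```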

    There are also methods that can be used to replace the patterns on a text. The substitutions can be simple replacements, or more complex functions. For example:

    val months = Map( 1 -> "Jan", 2 -> "Feb", 3 -> "Mar",
                      4 -> "Apr", 5 -> "May", 6 -> "Jun",
                      7 -> "Jul", 8 -> "Aug", 9 -> "Sep",
                      10 -> "Oct", 11 -> "Nov", 12 -> "Dec")
    import scala.util.matching.Regex.Match
    def reformatDate(text: String) = dateP2 replaceAllIn ( text, (m: Match) =>
      "%s %s, %s" format (months((m group "month").toInt), m group "day", m group "year")
    )
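
    For the simpler case of replacing with a string, a sketch along the following lines may be enough. It assumes the usual java.util.regex convention that $1, $2 and so on in a replacement string refer to the captured subgroups. The object and method names are hypothetical.

```scala
object SimpleReplacements {
  val dateP1 = """(\d\d\d\d)-(\d\d)-(\d\d)""".r

  // Replace every date with a fixed string.
  def redact(text: String): String = dateP1.replaceAllIn(text, "<date>")

  // Reorder the subgroups by referring to them as $1, $2 and $3.
  def toDayFirst(text: String): String = dateP1.replaceAllIn(text, "$3/$2/$1")

  // Replace only the first date that is found.
  def redactFirst(text: String): String = dateP1.replaceFirstIn(text, "<date>")
}
```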

    You can use special pattern syntax constructs like (?idmsux-idmsux)¹ to switch various regex compilation options like CASE_INSENSITIVE or UNICODE_CASE.
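
    As a small illustration of such a construct, the sketch below uses (?i) to switch on CASE_INSENSITIVE for the rest of the pattern. The object name InlineFlags and the method mentions are made up for this example.

```scala
object InlineFlags {
  // (?i) enables case-insensitive matching from this point in the pattern.
  val scalaWord = """(?i)scala""".r

  // Collect every occurrence, whatever its capitalization.
  def mentions(text: String): List[String] = scalaWord.findAllIn(text).toList
}
```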


    Version 1.1, 29/01/2008


    ¹ A detailed description is available in java.util.regex.Pattern.

  2. trait UnanchoredRegex extends Regex
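
    This page gives no description for UnanchoredRegex. As a hedged sketch, assuming an instance is obtained through the unanchored method of Regex, the extractor then succeeds when the pattern occurs anywhere in the text, whereas a plain Regex extractor requires the whole text to match. The object and method names below are hypothetical.

```scala
object UnanchoredExample {
  val dateP1 = """(\d\d\d\d)-(\d\d)-(\d\d)""".r

  // Plain Regex extractor: the whole text must be a date.
  def yearOf(text: String): Option[String] = text match {
    case dateP1(year, _, _) => Some(year)
    case _                  => None
  }

  // Unanchored extractor: a date anywhere in the text is enough.
  def yearIn(text: String): Option[String] = {
    val anywhere = dateP1.unanchored
    text match {
      case anywhere(year, _, _) => Some(year)
      case _                    => None
    }
  }
}
```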

Value Members

  1. object Regex extends Serializable

    This object defines inner classes that describe regex matches and helper objects.
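
    Two of these inner members are Regex.Match and Regex.Groups. The following sketch shows one way they might be used together. The object MatchObjects and its methods are hypothetical names introduced for illustration.

```scala
import scala.util.matching.Regex
import scala.util.matching.Regex.{Groups, Match}

object MatchObjects {
  val dateP1: Regex = """(\d\d\d\d)-(\d\d)-(\d\d)""".r

  // A Match exposes the matched text together with its position in the source.
  def describe(text: String): Option[(String, Int, Int)] =
    dateP1.findFirstMatchIn(text).map((m: Match) => (m.matched, m.start, m.end))

  // Groups extracts the subgroups from an existing Match.
  def yearAndMonth(text: String): Option[(String, String)] =
    dateP1.findFirstMatchIn(text).collect { case Groups(y, m, _) => (y, m) }
}
```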