This page is no longer maintained — Please continue to the home page at www.scala-lang.org

another day, another attempt at optimizing Range.foreach

4 replies
Miguel Garcia 2
Joined: 2011-01-30,
User offline. Last seen 42 years 45 weeks ago.


Good and bad news.

The good news:

1 to 1010987314:,   foreachSum wins, 1.000:  foreach 316.456   while 316.535
1 to 1010987314:,     whileSum wins, 0.998:  foreach 316.782   while 316.005
1 to 1010987314:,     whileSum wins, 0.997:  foreach 316.429   while 315.365
1 to 1155414073:,     whileSum wins, 0.998:  foreach 361.877   while 361.194
1 to 1155414073:,   foreachSum wins, 0.999:  foreach 362.039   while 362.549
1 to 1155414073:,     whileSum wins, 1.000:  foreach 362.113   while 362.058
1 to 1320473226:,     whileSum wins, 0.996:  foreach 413.646   while 411.898
1 to 1320473226:,   foreachSum wins, 0.997:  foreach 412.784   while 414.203
1 to 1320473226:,   foreachSum wins, 0.996:  foreach 412.433   while 413.955
1 to 1509112258:,     whileSum wins, 0.999:  foreach 471.742   while 471.493
1 to 1509112258:,     whileSum wins, 1.000:  foreach 472.143   while 472.107
1 to 1509112258:,     whileSum wins, 0.998:  foreach 472.320   while 471.466
1 to 1724699723:,   foreachSum wins, 0.999:  foreach 538.530   while 538.824
1 to 1724699723:,   foreachSum wins, 0.997:  foreach 538.495   while 539.990
1 to 1724699723:,     whileSum wins, 1.000:  foreach 539.370   while 539.145
1 to 1971085397:,     whileSum wins, 0.998:  foreach 617.124   while 616.190
1 to 1971085397:,     whileSum wins, 0.999:  foreach 616.969   while 616.423
1 to 1971085397:,     whileSum wins, 0.999:  foreach 617.502   while 617.011

The bad news: even after  `ant all.clean build`, the example `Int.MaxValue to Int.MaxValue by 2 foreach println`
fails at runtime with NoClassDefFound for LongIsIntegral.

Status so far:


  @inline final override def foreach[U](f: Int => U) {
    val isCommonCase = (start != Int.MinValue) && (end != Int.MaxValue)
    val isInclu  = isInclusive
    var numSteps = {
      if (isCommonCase) 0 // doesn't matter
      else NumericRange.count[Long](start, end, step, isInclu) // Accessing `length` here is a performance killer.
    }
    val goesDown = (step < 0)
    var i = start
    while (
      if(isCommonCase) {
        if(goesDown)
          if (isInclu) i >= end
          else         i >  end
        else
          if (isInclu) i <= end
          else         i <  end
      } else {
        numSteps -= 1
        numSteps >= 0
      }
    ) {
      f(i) // The cast `(f.asInstanceOf[Int => Unit]).apply(i)` results in slowdown.
      i += step
    }
  }





Miguel
http://lamp.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/


Viktor Klang
Joined: 2008-12-17,
User offline. Last seen 1 year 27 weeks ago.
Re: another day, another attempt at optimizing Range.foreach
I haven't tried this, but just throwing some ideas out there, what about this:

@inline final override def foreach[U](f: Int => U) {
    val callMe = f.asInstanceOf[Int => Unit]
    val uncommon = start == Int.MinValue || end == Int.MaxValue
    val isInclu  = isInclusive
    var numSteps =
      if (uncommon) NumericRange.count[Long](start, end, step, isInclu) // Accessing `length` here is a performance killer.
      else 0
    val goesDown = (step < 0)
    val jmp = if (uncommon) 0 else {
      if (goesDown) {
        if (isInclu) 1 else 2
      } else {
        if (isInclu) 3 else 4
      }
    }
    var i = start
    while (
      (jmp: Int @switch) match {
        case 0 => numSteps -= 1; numSteps >= 0
        case 1 => i >= end
        case 2 => i > end
        case 3 => i <= end
        case 4 => i < end
      }
    ) {
      callMe.apply(i)
      i += step
    }
  }

On Wed, Dec 14, 2011 at 12:19 PM, Miguel Garcia <mgarcia512 [at] yahoo [dot] com> wrote:


Good and bad news.

The good news:

1 to 1010987314:,   foreachSum wins, 1.000:  foreach 316.456   while 316.535
1 to 1010987314:,     whileSum wins, 0.998:  foreach 316.782   while 316.005
1 to 1010987314:,     whileSum wins, 0.997:  foreach 316.429   while 315.365
1 to 1155414073:,     whileSum wins, 0.998:  foreach 361.877   while 361.194
1 to 1155414073:,   foreachSum wins, 0.999:  foreach 362.039   while 362.549
1 to 1155414073:,     whileSum wins, 1.000:  foreach 362.113   while 362.058
1 to 1320473226:,     whileSum wins, 0.996:  foreach 413.646   while 411.898
1 to 1320473226:,   foreachSum wins, 0.997:  foreach 412.784   while 414.203
1 to 1320473226:,   foreachSum wins, 0.996:  foreach 412.433   while 413.955
1 to 1509112258:,     whileSum wins, 0.999:  foreach 471.742   while 471.493
1 to 1509112258:,     whileSum wins, 1.000:  foreach 472.143   while 472.107
1 to 1509112258:,     whileSum wins, 0.998:  foreach 472.320   while 471.466
1 to 1724699723:,   foreachSum wins, 0.999:  foreach 538.530   while 538.824
1 to 1724699723:,   foreachSum wins, 0.997:  foreach 538.495   while 539.990
1 to 1724699723:,     whileSum wins, 1.000:  foreach 539.370   while 539.145
1 to 1971085397:,     whileSum wins, 0.998:  foreach 617.124   while 616.190
1 to 1971085397:,     whileSum wins, 0.999:  foreach 616.969   while 616.423
1 to 1971085397:,     whileSum wins, 0.999:  foreach 617.502   while 617.011

The bad news: even after  `ant all.clean build`, the example `Int.MaxValue to Int.MaxValue by 2 foreach println`
fails at runtime with NoClassDefFound for LongIsIntegral.

Status so far:


  @inline final override def foreach[U](f: Int => U) {
    val isCommonCase = (start != Int.MinValue) && (end != Int.MaxValue)
    val isInclu  = isInclusive
    var numSteps = {
      if (isCommonCase) 0 // doesn't matter
      else NumericRange.count[Long](start, end, step, isInclu) // Accessing `length` here is a performance killer.
    }
    val goesDown = (step < 0)
    var i = start
    while (
      if(isCommonCase) {
        if(goesDown)
          if (isInclu) i >= end
          else         i >  end
        else
          if (isInclu) i <= end
          else         i <  end
      } else {
        numSteps -= 1
        numSteps >= 0
      }
    ) {
      f(i) // The cast `(f.asInstanceOf[Int => Unit]).apply(i)` results in slowdown.
      i += step
    }
  }





Miguel
http://lamp.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/





--
Viktor Klang

Akka Tech LeadTypesafe - Enterprise-Grade Scala from the Experts

Twitter: @viktorklang
Miguel Garcia 2
Joined: 2011-01-30,
User offline. Last seen 42 years 45 weeks ago.
Re: another day, another attempt at optimizing Range.foreach

I got rid of the bad-news-part by replacing the callsite

  NumericRange.count[Long](start, end, step, isInclu)

to invoke instead

   Range.LCount(start, end, step, isInclu)

which is a "by hand" specialization for Long of `NumericRange.count`, added to the Range object:


  @noinline def LCount(start: Long, end: Long, step: Long, isInclusive: Boolean): Int = {
    val upward  = start < end
    val posStep = step  > 0

    if (step == 0) throw new IllegalArgumentException("step cannot be 0.")
    else if (start == end) if (isInclusive) 1 else 0
    else if (upward != posStep) 0
    else {
      val diff      = end - start
      val jumps     = diff / step
      val remainder = diff % step
      val longCount = jumps + (
        if (!isInclusive && 0 == remainder) 0 else 1
      )

      /** The edge cases keep coming.  Since e.g.
       *    Long.MaxValue + 1 == Long.MinValue
       *  we do some more improbable seeming checks lest
       *  overflow turn up as an empty range.
       */
      // The second condition contradicts an empty result.
      val isOverflow = (longCount == 0) && (((start + step) < end) == upward)

      if (longCount > scala.Int.MaxValue || longCount < 0L || isOverflow) {
        val word  = if (isInclusive) "to" else "until"
        val descr = List(start, word, end, "by", step) mkString " "

        throw new IllegalArgumentException(descr + ": seqs cannot contain more than Int.MaxValue elements.")
      }
      longCount.toInt
    }
  }


Today could be the day! :-) We just need another round of improvements, and more benchmarks :-)


Miguel
http://lamp.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/

dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac

Well, I notice the benchmark you are using seems to have a lot of
iterations, and I'm rather doubtful of whether that's a common use
case at all.

My own use cases for Range are usually in one of two forms:

for (i <- array.indices) // not necessarily the method, but as array index
for (i <- 0 to string.length)

Arrays can be long, but not 1010987314-long. Strings -- the kind I
iterate over, anyway -- are usually small.

Anyway, my benchmark project is finally ready. You can find it at
https://github.com/dcsobral/scala-benchmarking-template.

What it does do:

1. You specify a base directory, then add subdirectories of it as
project names, and it will use them as scala home for compiling the
benchmark. I use dist, but you could easily copy or link builds you
want tested to some directory.

2. It tests stuff compiled with and without -optimise. You can easily
make other variations based on flags.

3. It tests range sizes of 10, 100, 1000 and 10000.

4. It uses Caliper to do the testing.

5. It tests a fast, unit-returning operation on single-step closed range.

What it doesn't do:

1. It doesn't test non-unit-returning operations.

2. It doesn't test slow operations.

3. It doesn't test open ranges.

4. It doesn't test stepped ranges.

5. It doesn't test decreasing ranges.

I'm now adding such tests to it.

On Wed, Dec 14, 2011 at 09:46, Miguel Garcia wrote:
>
> I got rid of the bad-news-part by replacing the callsite
>
>   NumericRange.count[Long](start, end, step, isInclu)
>
> to invoke instead
>
>    Range.LCount(start, end, step, isInclu)
>
> which is a "by hand" specialization for Long of `NumericRange.count`, added
> to the Range object:
>
>
>   @noinline def LCount(start: Long, end: Long, step: Long, isInclusive:
> Boolean): Int = {
>     val upward  = start < end
>     val posStep = step  > 0
>
>     if (step == 0) throw new IllegalArgumentException("step cannot be 0.")
>     else if (start == end) if (isInclusive) 1 else 0
>     else if (upward != posStep) 0
>     else {
>       val diff      = end - start
>       val jumps     = diff / step
>       val remainder = diff % step
>       val longCount = jumps + (
>         if (!isInclusive && 0 == remainder) 0 else 1
>       )
>
>       /** The edge cases keep coming.  Since e.g.
>        *    Long.MaxValue + 1 == Long.MinValue
>        *  we do some more improbable seeming checks lest
>        *  overflow turn up as an empty range.
>        */
>       // The second condition contradicts an empty result.
>       val isOverflow = (longCount == 0) && (((start + step) < end) ==
> upward)
>
>       if (longCount > scala.Int.MaxValue || longCount < 0L || isOverflow) {
>         val word  = if (isInclusive) "to" else "until"
>         val descr = List(start, word, end, "by", step) mkString " "
>
>         throw new IllegalArgumentException(descr + ": seqs cannot contain
> more than Int.MaxValue elements.")
>       }
>       longCount.toInt
>     }
>   }
>
>
> Today could be the day! :-) We just need another round of improvements, and
> more benchmarks :-)
>
>
> Miguel
> http://lamp.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/
>

Miguel Garcia 2
Joined: 2011-01-30,
User offline. Last seen 42 years 45 weeks ago.
Re: Re: another day, another attempt at optimizing Range.foreac

Daniel,

The benchmark I'm using can be found in

  scala/test/benchmarks/src/scala/collection/immutable/range-bench.scala

It also covers few iterations (excerpt below). For them, "while wins". What the "new" Range.foreach does is narrowing that gap.

Absolutely, the more benchmarks the better.

1 to 220:,     whileSum wins, 0.195:  foreach 0.025   while 0.005
1 to 220:,     whileSum wins, 0.232:  foreach 0.025   while 0.006
1 to 220:,     whileSum wins, 0.232:  foreach 0.025   while 0.006
1 to 251:,     whileSum wins, 0.180:  foreach 0.050   while 0.009
1 to 251:,     whileSum wins, 0.212:  foreach 0.031   while 0.007
1 to 251:,     whileSum wins, 0.210:  foreach 0.030   while 0.006
1 to 286:,     whileSum wins, 0.225:  foreach 0.034   while 0.008
1 to 286:,     whileSum wins, 0.207:  foreach 0.034   while 0.007
1 to 286:,     whileSum wins, 0.200:  foreach 0.023   while 0.005
1 to 326:,     whileSum wins, 0.218:  foreach 0.037   while 0.008
1 to 326:,     whileSum wins, 0.197:  foreach 0.038   while 0.008
1 to 326:,     whileSum wins, 0.226:  foreach 0.037   while 0.008
1 to 372:,     whileSum wins, 0.230:  foreach 0.042   while 0.010
1 to 372:,     whileSum wins, 0.246:  foreach 0.043   while 0.011
1 to 372:,     whileSum wins, 0.211:  foreach 0.043   while 0.009
1 to 425:,     whileSum wins, 0.222:  foreach 0.046   while 0.010
1 to 425:,     whileSum wins, 0.200:  foreach 0.048   while 0.010
1 to 425:,     whileSum wins, 0.218:  foreach 0.047   while 0.010
1 to 485:,     whileSum wins, 0.207:  foreach 0.057   while 0.012
1 to 485:,     whileSum wins, 0.223:  foreach 0.053   while 0.012
1 to 485:,     whileSum wins, 0.383:  foreach 0.057   while 0.022
1 to 554:,     whileSum wins, 0.197:  foreach 0.071   while 0.014
1 to 554:,     whileSum wins, 0.441:  foreach 0.069   while 0.030
1 to 554:,   foreachSum wins, 0.667:  foreach 0.001   while 0.001
1 to 633:,   foreachSum wins, 0.998:  foreach 0.001   while 0.001
1 to 633:,     whileSum wins, 0.998:  foreach 0.001   while 0.001
1 to 633:,     whileSum wins, 0.500:  foreach 0.001   while 0.000
1 to 723:,   foreachSum wins, 0.333:  foreach 0.000   while 0.001
1 to 723:,     whileSum wins, 0.501:  foreach 0.001   while 0.000
1 to 723:,     whileSum wins, 0.332:  foreach 0.001   while 0.000
1 to 826:,   foreachSum wins, 0.667:  foreach 0.001   while 0.001
1 to 826:,   foreachSum wins, 0.666:  foreach 0.001   while 0.001
1 to 826:,     whileSum wins, 1.000:  foreach 0.001   while 0.001
1 to 944:,   foreachSum wins, 0.666:  foreach 0.001   while 0.001
1 to 944:,     whileSum wins, 0.667:  foreach 0.001   while 0.001
1 to 944:,     whileSum wins, 0.666:  foreach 0.001   while 0.001
1 to 1078:,     whileSum wins, 0.667:  foreach 0.001   while 0.001
1 to 1078:,   foreachSum wins, 0.667:  foreach 0.001   while 0.001
1 to 1078:,   foreachSum wins, 0.400:  foreach 0.001   while 0.002
1 to 1232:,     whileSum wins, 0.666:  foreach 0.001   while 0.001
1 to 1232:,   foreachSum wins, 0.499:  foreach 0.001   while 0.001
1 to 1232:,   foreachSum wins, 0.998:  foreach 0.001   while 0.001
1 to 1408:,   foreachSum wins, 0.500:  foreach 0.001   while 0.001
1 to 1408:,   foreachSum wins, 0.667:  foreach 0.001   while 0.001
1 to 1408:,   foreachSum wins, 0.999:  foreach 0.001   while 0.001
1 to 1609:,   foreachSum wins, 0.749:  foreach 0.001   while 0.001
1 to 1609:,   foreachSum wins, 0.667:  foreach 0.001   while 0.001


Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland