This page is no longer maintained — Please continue to the home page at www.scala-lang.org

Re: Re: another day, another attempt at optimizing Range.foreach

34 replies
Viktor Klang
Joined: 2008-12-17,
User offline. Last seen 1 year 27 weeks ago.
Did you see my badass tableswitch? ;-)

On Wed, Dec 14, 2011 at 12:46 PM, Miguel Garcia <mgarcia512 [at] yahoo [dot] com> wrote:

I got rid of the bad-news-part by replacing the callsite

  NumericRange.count[Long](start, end, step, isInclu)

to invoke instead

   Range.LCount(start, end, step, isInclu)

which is a "by hand" specialization for Long of `NumericRange.count`, added to the Range object:


  @noinline def LCount(start: Long, end: Long, step: Long, isInclusive: Boolean): Int = {
    val upward  = start < end
    val posStep = step  > 0

    if (step == 0) throw new IllegalArgumentException("step cannot be 0.")
    else if (start == end) if (isInclusive) 1 else 0
    else if (upward != posStep) 0
    else {
      val diff      = end - start
      val jumps     = diff / step
      val remainder = diff % step
      val longCount = jumps + (
        if (!isInclusive && 0 == remainder) 0 else 1
      )

      /** The edge cases keep coming.  Since e.g.
       *    Long.MaxValue + 1 == Long.MinValue
       *  we do some more improbable seeming checks lest
       *  overflow turn up as an empty range.
       */
      // The second condition contradicts an empty result.
      val isOverflow = (longCount == 0) && (((start + step) < end) == upward)

      if (longCount > scala.Int.MaxValue || longCount < 0L || isOverflow) {
        val word  = if (isInclusive) "to" else "until"
        val descr = List(start, word, end, "by", step) mkString " "

        throw new IllegalArgumentException(descr + ": seqs cannot contain more than Int.MaxValue elements.")
      }
      longCount.toInt
    }
  }


Today could be the day! :-) We just need another round of improvements, and more benchmarks :-)


Miguel
http://lamp.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/




--
Viktor Klang

Akka Tech LeadTypesafe - Enterprise-Grade Scala from the Experts

Twitter: @viktorklang
Miguel Garcia 2
Joined: 2011-01-30,
User offline. Last seen 42 years 45 weeks ago.
Re: Re: another day, another attempt at optimizing Range.foreac

Viktor,

Your tableswitch is a winner.

I almost start to feel sorry for the few remaining "whileSum wins":

1 to 1010987314:,   foreachSum wins, 0.991:  foreach 311.183   while 314.050
1 to 1010987314:,   foreachSum wins, 0.992:  foreach 310.832   while 313.461
1 to 1010987314:,   foreachSum wins, 1.000:  foreach 313.268   while 313.313
1 to 1155414073:,   foreachSum wins, 0.999:  foreach 357.983   while 358.241
1 to 1155414073:,   foreachSum wins, 0.996:  foreach 356.814   while 358.163
1 to 1155414073:,     whileSum wins, 0.997:  foreach 358.470   while 357.390
1 to 1320473226:,   foreachSum wins, 0.995:  foreach 408.537   while 410.701
1 to 1320473226:,     whileSum wins, 0.998:  foreach 409.137   while 408.431
1 to 1320473226:,     whileSum wins, 0.999:  foreach 408.877   while 408.546
1 to 1509112258:,   foreachSum wins, 0.997:  foreach 466.760   while 467.954
1 to 1509112258:,   foreachSum wins, 0.997:  foreach 467.564   while 468.738
1 to 1509112258:,   foreachSum wins, 0.995:  foreach 465.689   while 467.953
1 to 1724699723:,   foreachSum wins, 0.997:  foreach 533.638   while 535.091
1 to 1724699723:,     whileSum wins, 0.997:  foreach 535.347   while 533.644
1 to 1724699723:,   foreachSum wins, 1.000:  foreach 533.480   while 533.707
1 to 1971085397:,     whileSum wins, 0.997:  foreach 611.528   while 609.835
1 to 1971085397:,     whileSum wins, 1.000:  foreach 610.961   while 610.960
1 to 1971085397:,     whileSum wins, 0.999:  foreach 611.479   while 610.575

Can we call it a day already?


Miguel
http://lamp.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/

adriaanm
Joined: 2010-02-08,
User offline. Last seen 31 weeks 4 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac


2011/12/14 √iktor Ҡlang <viktor [dot] klang [at] gmail [dot] com>
badass tableswitch
from a link Ismael forwarded in the optimizing virtpatmat thread (http://wikis.sun.com/display/HotSpotInternals/PerformanceTechniques):
Switches are profiled but the profile information is poorly used. For now, consider building an initial decision tree if you know one or two cases are really common.
thus, I'm surprised the switch is faster since there are few cases and we know which one is more likely, right? 
also, as pointed out by Martin, what happens when you have multiple functions that are passed to foreach? is your benchmark doing this already? 
Miguel Garcia 2
Joined: 2011-01-30,
User offline. Last seen 42 years 45 weeks ago.
Re: Re: another day, another attempt at optimizing Range.foreac

For those who want to help with benchmarking, here's the aggregated status so far:

// in class scala.collection.immutable.Range

  @inline final override def foreach[U](f: Int => U) {
    val uncommon = start == Int.MinValue || end == Int.MaxValue
    val isInclu  = isInclusive
    var numSteps = {
      if (uncommon) Range.LCount(start, end, step, isInclu) // Accessing `length` here is a performance killer.
      else 0 // doesn't matter
    }
    val jmp = if (uncommon) 0 else {
      if (step < 0) {
        if (isInclu) 1 else 2
      } else {
        if (isInclu) 3 else 4
      }
    }
    var i = start
    while (
      (jmp: Int @scala.annotation.switch) match {
        case 0 => numSteps -= 1; numSteps >= 0
        case 1 => i >= end
        case 2 => i > end
        case 3 => i <= end
        case 4 => i < end
      }
    ) {
      f(i) // The cast `(f.asInstanceOf[Int => Unit]).apply(i)` results in slowdown.
      i += step
    }
  }


// in object scala.collection.immutable.Range

  @noinline def LCount(start: Long, end: Long, step: Long, isInclusive: Boolean): Int = {
    val upward  = start < end
    val posStep = step  > 0

    if (step == 0) throw new IllegalArgumentException("step cannot be 0.")
    else if (start == end) if (isInclusive) 1 else 0
    else if (upward != posStep) 0
    else {
      val diff      = end - start
      val jumps     = diff / step
      val remainder = diff % step
      val longCount = jumps + (
        if (!isInclusive && 0 == remainder) 0 else 1
      )

      /** The edge cases keep coming.  Since e.g.
       *    Long.MaxValue + 1 == Long.MinValue
       *  we do some more improbable seeming checks lest
       *  overflow turn up as an empty range.
       */
      // The second condition contradicts an empty result.
      val isOverflow = (longCount == 0) && (((start + step) < end) == upward)

      if (longCount > scala.Int.MaxValue || longCount < 0L || isOverflow) {
        val word  = if (isInclusive) "to" else "until"
        val descr = List(start, word, end, "by", step) mkString " "

        throw new IllegalArgumentException(descr + ": seqs cannot contain more than Int.MaxValue elements.")
      }
      longCount.toInt
    }
  }



Miguel
http://lamp.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/

dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac

Is this a game of find the errors? Testing for MinValue/MaxValue is
correct for step 1 or -1, but not for any other step. Consider these
other extremes:

1 to 1 by Int.MaxValue
0 to Int.MinValue by Int.MinValue

Which sums up pretty well the problem with Range when compared to
while loops: it's too generic. In a while loop, or a traditional
C-style for loop, it is up to the programmer to write the guard
condition correctly. Since most loops are single-increment or
single-decrement far from int boundaries, it is usually correct. But,
because Range is too generic, its foreach has to handle a number of
boundary conditions that are useless for the simple case.

So we guard ourselves with lazy vals on "last" and "length", paying
lazy's associated costs, and then go right ahead and compute these
values because foreach uses them, and almost everything uses foreach.

At which point I think the only practical solution for range is
separate classes. A statement like (x to y) does not need Range's
heavy mechanics, and both "to" and "until" can easily return a
subclass of Range. That causes problem for decreasing ranges, unless
we introduce dependent types to handle "by - 1" :-), though we could
override "reverse" to produce a simple decreasing range (and vice
versa). In the common use case of "for (i <- x to y)" or "for (i <- x
to y reverse)", that should be enough. The method "indices" could
likewise be changed to return a simpler range.

The thing is... we _have_ iterated through this solution, though I'm
pretty sure "to", "until" and "indices" were not adapted as well to
make the call site monomorphic. I'll look into this if someone doesn't
beat me to it.

As for your proposed solution, I suggest you check whether step is 1
or -1, handle that, and delegate anything else.

On Wed, Dec 14, 2011 at 10:34, Miguel Garcia wrote:
>
> For those who want to help with benchmarking, here's the aggregated status
> so far:
>
> // in class scala.collection.immutable.Range
>
>   @inline final override def foreach[U](f: Int => U) {
>     val uncommon = start == Int.MinValue || end == Int.MaxValue
>     val isInclu  = isInclusive
>     var numSteps = {
>       if (uncommon) Range.LCount(start, end, step, isInclu) // Accessing
> `length` here is a performance killer.
>       else 0 // doesn't matter
>
>     }
>     val jmp = if (uncommon) 0 else {
>       if (step < 0) {
>
>         if (isInclu) 1 else 2
>       } else {
>         if (isInclu) 3 else 4
>       }
>     }
>     var i = start
>     while (
>       (jmp: Int @scala.annotation.switch) match {
>
>         case 0 => numSteps -= 1; numSteps >= 0
>         case 1 => i >= end
>         case 2 => i > end
>         case 3 => i <= end
>         case 4 => i < end
>       }
>     ) {
>       f(i) // The cast `(f.asInstanceOf[Int => Unit]).apply(i)` results in
> slowdown.
>       i += step
>     }
>   }
>
>
> // in object scala.collection.immutable.Range
>
>
>   @noinline def LCount(start: Long, end: Long, step: Long, isInclusive:
> Boolean): Int = {
>     val upward  = start < end
>     val posStep = step  > 0
>
>     if (step == 0) throw new IllegalArgumentException("step cannot be 0.")
>     else if (start == end) if (isInclusive) 1 else 0
>     else if (upward != posStep) 0
>     else {
>       val diff      = end - start
>       val jumps     = diff / step
>       val remainder = diff % step
>       val longCount = jumps + (
>         if (!isInclusive && 0 == remainder) 0 else 1
>       )
>
>       /** The edge cases keep coming.  Since e.g.
>        *    Long.MaxValue + 1 == Long.MinValue
>        *  we do some more improbable seeming checks lest
>        *  overflow turn up as an empty range.
>        */
>       // The second condition contradicts an empty result.
>       val isOverflow = (longCount == 0) && (((start + step) < end) ==
> upward)
>
>       if (longCount > scala.Int.MaxValue || longCount < 0L || isOverflow) {
>         val word  = if (isInclusive) "to" else "until"
>         val descr = List(start, word, end, "by", step) mkString " "
>
>         throw new IllegalArgumentException(descr + ": seqs cannot contain
> more than Int.MaxValue elements.")
>       }
>       longCount.toInt
>     }
>   }
>
>
>
> Miguel
> http://lamp.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/
>

Miguel Garcia 2
Joined: 2011-01-30,
User offline. Last seen 42 years 45 weeks ago.
Re: Re: another day, another attempt at optimizing Range.foreac

Daniel,

As you correctly point out the version I posted doesn't identify properly those "uncommon" cases. Still, I believe the current code structure can cope with them (as long as the rhs of the assignment to "val uncommon" is formulated in such a way to actually cover the uncommon cases).


Miguel


Matthew Pocock 3
Joined: 2010-07-30,
User offline. Last seen 42 years 45 weeks ago.
Re: Re: another day, another attempt at optimizing Range.foreac
Sorry if you've already thought about this already but... my recollection is that hotspot will optimistically inline things assuming there is only one implementation of a non-final method available (or that the code is only ever invoked such that it chains into a single implementation?) and then back out of this decision if it discovers a 2nd implementation. However, I'm rather hazy about how much context it takes into account. Are we sure that the performance being measured in these benchmarks will stand in a typical application without hotspot backing out its inlining? Range.foreach is likely to be invoked with many different bodies in a typical application. This is the sort of subtlety that leaves me always leaning towards inlining everything we reasonably can through the compiler, particularly when this lets us also remove boxing and do further inlining, rather than relying heavily upon hotspot.
Matthew

On 14 December 2011 13:20, Daniel Sobral <dcsobral [at] gmail [dot] com> wrote:
Is this a game of find the errors? Testing for MinValue/MaxValue is
correct for step 1 or -1, but not for any other step. Consider these
other extremes:

1 to 1 by Int.MaxValue
0 to Int.MinValue by Int.MinValue

Which sums up pretty well the problem with Range when compared to
while loops: it's too generic. In a while loop, or a traditional
C-style for loop, it is up to the programmer to write the guard
condition correctly. Since most loops are single-increment or
single-decrement far from int boundaries, it is usually correct. But,
because Range is too generic, its foreach has to handle a number of
boundary conditions that are useless for the simple case.

So we guard ourselves with lazy vals on "last" and "length", paying
lazy's associated costs, and then go right ahead and compute these
values because foreach uses them, and almost everything uses foreach.

At which point I think the only practical solution for range is
separate classes. A statement like (x to y) does not need Range's
heavy mechanics, and both "to" and "until" can easily return a
subclass of Range. That causes problem for decreasing ranges, unless
we introduce dependent types to handle "by - 1" :-), though we could
override "reverse" to produce a simple decreasing range (and vice
versa). In the common use case of "for (i <- x to y)" or "for (i <- x
to y reverse)", that should be enough. The method "indices" could
likewise be changed to return a simpler range.

The thing is... we _have_ iterated through this solution, though I'm
pretty sure "to", "until" and "indices" were not adapted as well to
make the call site monomorphic. I'll look into this if someone doesn't
beat me to it.

As for your proposed solution, I suggest you check whether step is 1
or -1, handle that, and delegate anything else.

On Wed, Dec 14, 2011 at 10:34, Miguel Garcia <mgarcia512 [at] yahoo [dot] com> wrote:
>
> For those who want to help with benchmarking, here's the aggregated status
> so far:
>
> // in class scala.collection.immutable.Range
>
>   @inline final override def foreach[U](f: Int => U) {
>     val uncommon = start == Int.MinValue || end == Int.MaxValue
>     val isInclu  = isInclusive
>     var numSteps = {
>       if (uncommon) Range.LCount(start, end, step, isInclu) // Accessing
> `length` here is a performance killer.
>       else 0 // doesn't matter
>
>     }
>     val jmp = if (uncommon) 0 else {
>       if (step < 0) {
>
>         if (isInclu) 1 else 2
>       } else {
>         if (isInclu) 3 else 4
>       }
>     }
>     var i = start
>     while (
>       (jmp: Int @scala.annotation.switch) match {
>
>         case 0 => numSteps -= 1; numSteps >= 0
>         case 1 => i >= end
>         case 2 => i > end
>         case 3 => i <= end
>         case 4 => i < end
>       }
>     ) {
>       f(i) // The cast `(f.asInstanceOf[Int => Unit]).apply(i)` results in
> slowdown.
>       i += step
>     }
>   }
>
>
> // in object scala.collection.immutable.Range
>
>
>   @noinline def LCount(start: Long, end: Long, step: Long, isInclusive:
> Boolean): Int = {
>     val upward  = start < end
>     val posStep = step  > 0
>
>     if (step == 0) throw new IllegalArgumentException("step cannot be 0.")
>     else if (start == end) if (isInclusive) 1 else 0
>     else if (upward != posStep) 0
>     else {
>       val diff      = end - start
>       val jumps     = diff / step
>       val remainder = diff % step
>       val longCount = jumps + (
>         if (!isInclusive && 0 == remainder) 0 else 1
>       )
>
>       /** The edge cases keep coming.  Since e.g.
>        *    Long.MaxValue + 1 == Long.MinValue
>        *  we do some more improbable seeming checks lest
>        *  overflow turn up as an empty range.
>        */
>       // The second condition contradicts an empty result.
>       val isOverflow = (longCount == 0) && (((start + step) < end) ==
> upward)
>
>       if (longCount > scala.Int.MaxValue || longCount < 0L || isOverflow) {
>         val word  = if (isInclusive) "to" else "until"
>         val descr = List(start, word, end, "by", step) mkString " "
>
>         throw new IllegalArgumentException(descr + ": seqs cannot contain
> more than Int.MaxValue elements.")
>       }
>       longCount.toInt
>     }
>   }
>
>
>
> Miguel
> http://lamp.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/
>



--
Daniel C. Sobral

I travel to the future all the time.



--
Dr Matthew PocockIntegrative Bioinformatics Group, School of Computing Science, Newcastle Universitymailto: turingatemyhamster [at] gmail [dot] com gchat: turingatemyhamster [at] gmail [dot] commsn: matthew_pocock [at] yahoo [dot] co [dot] uk irc.freenode.net: drdozerskype: matthew.pococktel: (0191) 2566550mob: +447535664143
extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac

I have what I believe is a big improvement. I went back to first
principles and changed a bunch of other stuff to support the foreach
implementation, which emerges looking like this:

/** A nice simple foreach for hotspot to smother in kisses.
*/
@inline final override def foreach[@specialized(Unit) U](f: Int => U) {
var i: Int = start
// The value which should terminate the loop, one step beyond the
// last member of the range. The value may overflow Int and wrap
// around, but it doesn't matter because a call to foreach will
// overflow in the same way and terminate on schedule.
val terminate = lastElement + step
while (i != terminate) {
f(i)
i += step
}
}

lastElement is an eager val calculated at construction time (there are
no lazy vals.) I find this version profiles faster than anything yet
tried, and has the bonus of looking a lot less "designed for the
benchmark."

Josh Marcus 2
Joined: 2011-05-23,
User offline. Last seen 42 years 45 weeks ago.
Re: Re: another day, another attempt at optimizing Range.foreac

+1

Our experience is that de-inlining or "dynamic deoptimization" is a
very real phenomenon, and can lead to misleading microbenchmark
results.

Here's one straightforward description:
http://www.ibm.com/developerworks/library/j-jtp12214/#5.0

An example is that there are certain optimizations that HotSpot can
perform when it observes that only a single class is
implementing an interface -- but when HotSpot realizes that its
assumptions are wrong, it has to "dyanmically deoptimize".

If you are running -XX:+PrintCompilation, a line with "made not
entrant" means that a previous optimization has been discarded.

--j

On Wed, Dec 14, 2011 at 9:13 AM, Matthew Pocock
wrote:
> Sorry if you've already thought about this already but... my recollection is
> that hotspot will optimistically inline things assuming there is only one
> implementation of a non-final method available (or that the code is only
> ever invoked such that it chains into a single implementation?) and then
> back out of this decision if it discovers a 2nd implementation. However, I'm
> rather hazy about how much context it takes into account. Are we sure that
> the performance being measured in these benchmarks will stand in a typical
> application without hotspot backing out its inlining? Range.foreach is
> likely to be invoked with many different bodies in a typical application.
> This is the sort of subtlety that leaves me always leaning towards inlining
> everything we reasonably can through the compiler, particularly when this
> lets us also remove boxing and do further inlining, rather than relying
> heavily upon hotspot.
>
> Matthew
>
>
> On 14 December 2011 13:20, Daniel Sobral wrote:
>>
>> Is this a game of find the errors? Testing for MinValue/MaxValue is
>> correct for step 1 or -1, but not for any other step. Consider these
>> other extremes:
>>
>> 1 to 1 by Int.MaxValue
>> 0 to Int.MinValue by Int.MinValue
>>
>> Which sums up pretty well the problem with Range when compared to
>> while loops: it's too generic. In a while loop, or a traditional
>> C-style for loop, it is up to the programmer to write the guard
>> condition correctly. Since most loops are single-increment or
>> single-decrement far from int boundaries, it is usually correct. But,
>> because Range is too generic, its foreach has to handle a number of
>> boundary conditions that are useless for the simple case.
>>
>> So we guard ourselves with lazy vals on "last" and "length", paying
>> lazy's associated costs, and then go right ahead and compute these
>> values because foreach uses them, and almost everything uses foreach.
>>
>> At which point I think the only practical solution for range is
>> separate classes. A statement like (x to y) does not need Range's
>> heavy mechanics, and both "to" and "until" can easily return a
>> subclass of Range. That causes problem for decreasing ranges, unless
>> we introduce dependent types to handle "by - 1" :-), though we could
>> override "reverse" to produce a simple decreasing range (and vice
>> versa). In the common use case of "for (i <- x to y)" or "for (i <- x
>> to y reverse)", that should be enough. The method "indices" could
>> likewise be changed to return a simpler range.
>>
>> The thing is... we _have_ iterated through this solution, though I'm
>> pretty sure "to", "until" and "indices" were not adapted as well to
>> make the call site monomorphic. I'll look into this if someone doesn't
>> beat me to it.
>>
>> As for your proposed solution, I suggest you check whether step is 1
>> or -1, handle that, and delegate anything else.
>>
>> On Wed, Dec 14, 2011 at 10:34, Miguel Garcia wrote:
>> >
>> > For those who want to help with benchmarking, here's the aggregated
>> > status
>> > so far:
>> >
>> > // in class scala.collection.immutable.Range
>> >
>> >   @inline final override def foreach[U](f: Int => U) {
>> >     val uncommon = start == Int.MinValue || end == Int.MaxValue
>> >     val isInclu  = isInclusive
>> >     var numSteps = {
>> >       if (uncommon) Range.LCount(start, end, step, isInclu) // Accessing
>> > `length` here is a performance killer.
>> >       else 0 // doesn't matter
>> >
>> >     }
>> >     val jmp = if (uncommon) 0 else {
>> >       if (step < 0) {
>> >
>> >         if (isInclu) 1 else 2
>> >       } else {
>> >         if (isInclu) 3 else 4
>> >       }
>> >     }
>> >     var i = start
>> >     while (
>> >       (jmp: Int @scala.annotation.switch) match {
>> >
>> >         case 0 => numSteps -= 1; numSteps >= 0
>> >         case 1 => i >= end
>> >         case 2 => i > end
>> >         case 3 => i <= end
>> >         case 4 => i < end
>> >       }
>> >     ) {
>> >       f(i) // The cast `(f.asInstanceOf[Int => Unit]).apply(i)` results
>> > in
>> > slowdown.
>> >       i += step
>> >     }
>> >   }
>> >
>> >
>> > // in object scala.collection.immutable.Range
>> >
>> >
>> >   @noinline def LCount(start: Long, end: Long, step: Long, isInclusive:
>> > Boolean): Int = {
>> >     val upward  = start < end
>> >     val posStep = step  > 0
>> >
>> >     if (step == 0) throw new IllegalArgumentException("step cannot be
>> > 0.")
>> >     else if (start == end) if (isInclusive) 1 else 0
>> >     else if (upward != posStep) 0
>> >     else {
>> >       val diff      = end - start
>> >       val jumps     = diff / step
>> >       val remainder = diff % step
>> >       val longCount = jumps + (
>> >         if (!isInclusive && 0 == remainder) 0 else 1
>> >       )
>> >
>> >       /** The edge cases keep coming.  Since e.g.
>> >        *    Long.MaxValue + 1 == Long.MinValue
>> >        *  we do some more improbable seeming checks lest
>> >        *  overflow turn up as an empty range.
>> >        */
>> >       // The second condition contradicts an empty result.
>> >       val isOverflow = (longCount == 0) && (((start + step) < end) ==
>> > upward)
>> >
>> >       if (longCount > scala.Int.MaxValue || longCount < 0L ||
>> > isOverflow) {
>> >         val word  = if (isInclusive) "to" else "until"
>> >         val descr = List(start, word, end, "by", step) mkString " "
>> >
>> >         throw new IllegalArgumentException(descr + ": seqs cannot
>> > contain
>> > more than Int.MaxValue elements.")
>> >       }
>> >       longCount.toInt
>> >     }
>> >   }
>> >
>> >
>> >
>> > Miguel
>> > http://lamp.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/
>> >
>>
>>
>>
>> --
>> Daniel C. Sobral
>>
>> I travel to the future all the time.
>
>
>
>
> --
> Dr Matthew Pocock
> Integrative Bioinformatics Group, School of Computing Science, Newcastle
> University
> mailto: turingatemyhamster [at] gmail [dot] com
> gchat: turingatemyhamster [at] gmail [dot] com
> msn: matthew_pocock [at] yahoo [dot] co [dot] uk
> irc.freenode.net: drdozer
> skype: matthew.pocock
> tel: (0191) 2566550
> mob: +447535664143
>

Ismael Juma 2
Joined: 2011-01-22,
User offline. Last seen 42 years 45 weeks ago.
Re: Re: another day, another attempt at optimizing Range.foreac
On Wed, Dec 14, 2011 at 2:13 PM, Matthew Pocock <turingatemyhamster [at] gmail [dot] com> wrote:
Sorry if you've already thought about this already but

That is what Martin mentioned yesterday.
Ismael 
Viktor Klang
Joined: 2008-12-17,
User offline. Last seen 1 year 27 weeks ago.
Re: Re: another day, another attempt at optimizing Range.foreac


On Wed, Dec 14, 2011 at 3:37 PM, Paul Phillips <paulp [at] improving [dot] org> wrote:
On Wed, Dec 14, 2011 at 6:34 AM, Ismael Juma <mlists [at] juma [dot] me [dot] uk> wrote:
> This does look great if it handles all the corner-cases that have been
> discussed. I am suspicious of complex code that relies on HotSpot to do
> sophisticated optimisations in order to perform well.

I've learned to temper my bold pronouncements about the presence or
absence of bugs in Range, but I'm hip to all the corners.  Everyone
will have a chance to unmask any failures on that front in the near
future.

unit tests are nice :-)

--
Viktor Klang

Akka Tech LeadTypesafe - Enterprise-Grade Scala from the Experts

Twitter: @viktorklang
ijuma
Joined: 2008-08-20,
User offline. Last seen 22 weeks 2 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac
On Wed, Dec 14, 2011 at 2:30 PM, Paul Phillips <paulp [at] improving [dot] org> wrote:
I have what I believe is a big improvement.  I went back to first
principles and changed a bunch of other stuff to support the foreach
implementation, which emerges looking like this:

 /** A nice simple foreach for hotspot to smother in kisses.
  */
 @inline final override def foreach[@specialized(Unit) U](f: Int => U) {
   var i: Int = start
   // The value which should terminate the loop, one step beyond the
   // last member of the range. The value may overflow Int and wrap
   // around, but it doesn't matter because a call to foreach will
   // overflow in the same way and terminate on schedule.
   val terminate = lastElement + step
   while (i != terminate) {
     f(i)
     i += step
   }
 }

lastElement is an eager val calculated at construction time (there are
no lazy vals.) I find this version profiles faster than anything yet
tried, and has the bonus of looking a lot less "designed for the
benchmark."

This does look great if it handles all the corner-cases that have been discussed. I am suspicious of complex code that relies on HotSpot to do sophisticated optimisations in order to perform well.
Best,Ismael 
extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac

2011/12/14 √iktor Ҡlang :
> unit tests are nice :-)

Unitests? These are related to unicorns in some way? I enjoy your
active imagination, viktor.

extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac

On Wed, Dec 14, 2011 at 6:34 AM, Ismael Juma wrote:
> This does look great if it handles all the corner-cases that have been
> discussed. I am suspicious of complex code that relies on HotSpot to do
> sophisticated optimisations in order to perform well.

I've learned to temper my bold pronouncements about the presence or
absence of bugs in Range, but I'm hip to all the corners. Everyone
will have a chance to unmask any failures on that front in the near
future.

Johannes Rudolph 2
Joined: 2010-02-12,
User offline. Last seen 42 years 45 weeks ago.
Re: Re: another day, another attempt at optimizing Range.foreac

On Wed, Dec 14, 2011 at 3:13 PM, Matthew Pocock
wrote:
> Sorry if you've already thought about this already but... my recollection is
> that hotspot will optimistically inline things assuming there is only one
> implementation of a non-final method available (or that the code is only
> ever invoked such that it chains into a single implementation?) and then
> back out of this decision if it discovers a 2nd implementation. However, I'm
> rather hazy about how much context it takes into account. Are we sure that
> the performance being measured in these benchmarks will stand in a typical
> application without hotspot backing out its inlining? Range.foreach is
> likely to be invoked with many different bodies in a typical application.

What I've observed is that Hotspot in some cases then just inlines one
level up in the call chain and going on from there inlines several
levels deep so it effectively specializes each instance separately.

> This is the sort of subtlety that leaves me always leaning towards inlining
> everything we reasonably can through the compiler, particularly when this
> lets us also remove boxing and do further inlining, rather than relying
> heavily upon hotspot.

IMO, quite the opposite.

Overall I've lately came to distrusting the optimizer and 'blind'
inlining on the Scala side. It obviously does useful things in the
best case, but to improve matters consistently it would need a
somewhat sound model of the JIT specifics of the target JVM and try to
improve things by trying out optimizations but also rate the overall
*results* of the optimization steps regarding the model of the JVM to
decide if an optimization is worthwhile in the end. For example,
inlining in the Scala compiler is only then useful if closures or
boxing can be eliminated afterwards. Everything else will more likely
interfere with the way Hotspot's JIT will inline stuff.
What the inliner currently does is applying a simple heuristic when it
first decides if a method should be inlined, but it doesn't check
afterwards if this really produces any useful results. You would
probably need a completely different design for the optimizer to be
able to backtrack in those cases.

Viktor Klang
Joined: 2008-12-17,
User offline. Last seen 1 year 27 weeks ago.
Re: Re: another day, another attempt at optimizing Range.foreac


On Wed, Dec 14, 2011 at 3:44 PM, Paul Phillips <paulp [at] improving [dot] org> wrote:
2011/12/14 √iktor Ҡlang <viktor [dot] klang [at] gmail [dot] com>:
> unit tests are nice :-)

Unitests? These are related to unicorns in some way? I enjoy your
active imagination, viktor.

Yes, we should reminisce about this over a beer at the next tech meeting


--
Viktor Klang

Akka Tech LeadTypesafe - Enterprise-Grade Scala from the Experts

Twitter: @viktorklang
Ismael Juma 2
Joined: 2011-01-22,
User offline. Last seen 42 years 45 weeks ago.
Re: Re: another day, another attempt at optimizing Range.foreac
On Wed, Dec 14, 2011 at 2:50 PM, Johannes Rudolph <johannes [dot] rudolph [at] googlemail [dot] com> wrote:
What I've observed is that Hotspot in some cases then just inlines one
level up in the call chain and going on from there inlines several
levels deep so it effectively specializes each instance separately.

The key phrase here is "in some cases". Very brittle because results can change dramatically if small changes cause you to go over the inlining budget. And this issue has been an issue in the past:
http://blog.juma.me.uk/2009/10/26/new-jvm-options-and-scala-iteration-performance/ 
Overall I've lately came to distrusting the optimizer and 'blind'
inlining on the Scala side.

Blind inlining is obviously bad. 
but to improve matters consistently it would need a
somewhat sound model of the JIT specifics of the target JVM and try to
improve things by trying out optimizations but also rate the overall
*results* of the optimization steps regarding the model of the JVM to
decide if an optimization is worthwhile in the end. For example,
inlining in the Scala compiler is only then useful if closures or
boxing can be eliminated afterwards.

This would be nice, indeed.
Best,Ismael
dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac

2011/12/14 √iktor Ҡlang :
>
>
> On Wed, Dec 14, 2011 at 3:37 PM, Paul Phillips wrote:
>>
>> On Wed, Dec 14, 2011 at 6:34 AM, Ismael Juma wrote:
>> > This does look great if it handles all the corner-cases that have been
>> > discussed. I am suspicious of complex code that relies on HotSpot to do
>> > sophisticated optimisations in order to perform well.
>>
>> I've learned to temper my bold pronouncements about the presence or
>> absence of bugs in Range, but I'm hip to all the corners.  Everyone
>> will have a chance to unmask any failures on that front in the near
>> future.
>
>
> unit tests are nice :-)

I was going to mention this on another message, but ended up
forgetting it. Long ago, when people started realizing Range was full
of bugs, I had a scalacheck test for a number of methods. However,
taking three arbitrary integers caused a lot of trouble, since some
ranges are invalid (more than Int.MaxValue elements), and many
resulted in exceedingly slow tests.

Right now there's a scalacheck range test, but I'm not sure the
generated ranges cover enough cases to reliably catch all errors.

Matthew Pocock 3
Joined: 2010-07-30,
User offline. Last seen 42 years 45 weeks ago.
Re: Re: another day, another attempt at optimizing Range.foreac


On 14 December 2011 14:50, Johannes Rudolph <johannes [dot] rudolph [at] googlemail [dot] com> wrote:

For example,
inlining in the Scala compiler is only then useful if closures or
boxing can be eliminated afterwards. Everything else will more likely
interfere with the way Hotspot's JIT will inline stuff.

I can see this. The whole point is to be able to get rid of object allocation, dead code and unnecessary indirection.  
What the inliner currently does is applying a simple heuristic when it
first decides if a method should be inlined, but it doesn't check
afterwards if this really produces any useful results. You would
probably need a completely different design for the optimizer to be
able to backtrack in those cases.

Oh, this is a pity. Things like this make me want to take a few months out of my day job and get a proper understanding of the compiler. I kind of assumed that inlining and dead-code elimination and boxing-elimination all act recursively but obviously this is rather naive.
Matthew 


--
Johannes

-----------------------------------------------
Johannes Rudolph
http://virtual-void.net



--
Dr Matthew PocockIntegrative Bioinformatics Group, School of Computing Science, Newcastle Universitymailto: turingatemyhamster [at] gmail [dot] com gchat: turingatemyhamster [at] gmail [dot] commsn: matthew_pocock [at] yahoo [dot] co [dot] uk irc.freenode.net: drdozerskype: matthew.pococktel: (0191) 2566550mob: +447535664143
Simon Ochsenreither
Joined: 2011-07-17,
User offline. Last seen 42 years 45 weeks ago.
Re: Re: another day, another attempt at optimizing Range.foreac
There are also a few additional test cases in https://issues.scala-lang.org/browse/SI-4370 (although a bit more focused on Long Ranges). Maybe they help, too.
dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac

On Wed, Dec 14, 2011 at 11:20, Daniel Sobral wrote:
>
> At which point I think the only practical solution for range is
> separate classes. A statement like (x to y) does not need Range's
> heavy mechanics, and both "to" and "until" can easily return a
> subclass of Range. That causes problem for decreasing ranges, unless
> we introduce dependent types to handle "by - 1" :-), though we could
> override "reverse" to produce a simple decreasing range (and vice
> versa). In the common use case of "for (i <- x to y)" or "for (i <- x
> to y reverse)", that should be enough. The method "indices" could
> likewise be changed to return a simpler range.
>
> The thing is... we _have_ iterated through this solution, though I'm
> pretty sure "to", "until" and "indices" were not adapted as well to
> make the call site monomorphic. I'll look into this if someone doesn't
> beat me to it.

Ok, I have implemented and tested subclassing Range for single-step
cases, two classes for closed ranges (incrementing and decrementing),
and one for open ranges. I have changed RichInt's "to" and "until" as
described, and provided "reverse" methods on the subclassed ranges
that return a subclass as well, so that one can do something like (0
until s.length reverse) and still benefit from the improvements.

The results for unit-returning functions are fantastic, consistently
matching (actually, beating by a small margin) while loops, even at
very small lengths. All range tests pass too, so I don't think it is
broken. :-) Note that any use of "by" negates all benefits of this
change.

So, if whatever other improvement to foreach falls short of this in
some cases, I propose combining it with this.

The code can be found on my subrange branch at
http://github.com/dcsobral/scala, and the full benchmark can be found
at http://github.com/dcsobral/scala-benchmarking-template. This
benchmark includes many different uses of Range now, which gives me a
lot of confidence in the results. I'll see if I can put up a web page
with the results tomorrow.

extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac

On Wed, Dec 14, 2011 at 12:31 PM, Daniel Sobral wrote:
> http://github.com/dcsobral/scala

It's the glorious return of Range.ByOne! That I can see, that differs
from my first attempt only in the constant 1, and that attempt was
apparently 10x slower than the status quo. (It's believable that a
static known constant could make that big a difference, especially
when that constant is 1.) I forget what happened with Range.ByOne, and
as I look into that, here's something funny from two years and two
months ago.

https://github.com/scala/scala/commit/136c1cce62

class ByOne(start: Int, end: Int) extends Range(start, end, 1) {
override final def foreach[U](f: Int => U) {
var i = start
while (i < end) {
f(i)
i += 1
}
}
}

class ByPosStep(start: Int, end: Int, step: Int) extends
Range(start, end, step) {
override final def foreach[U](f: Int => U) {
var i = start
while (i < end) {
f(i)
i += step
}
}
}

class ByNegStep(start: Int, end: Int, step: Int) extends
Range(start, end, step) {
override final def foreach[U](f: Int => U) {
var i = start
while (i > end) {
f(i)
i += step
}
}
}

The more things change... those lasted less than a day, replaced by
"trait ByOne"

https://github.com/scala/scala/commit/221f2a8d72

dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac

On Wed, Dec 14, 2011 at 19:08, Paul Phillips wrote:
> On Wed, Dec 14, 2011 at 12:31 PM, Daniel Sobral wrote:
>> http://github.com/dcsobral/scala
>
> It's the glorious return of Range.ByOne! That I can see, that differs

Yes, I know. :-) Actually, I thought in your attempt you didn't get
RichInt to return ByOne, but I see from your commit that, yes, you
did, so I don't know why you didn't see results from that.

The original didn't have @inline, though, for whatever that's worth.
And I'm very dubious about changing it into a trait as well, but I see
it was Odersky who did that.

> from my first attempt only in the constant 1, and that attempt was
> apparently 10x slower than the status quo.  (It's believable that a
> static known constant could make that big a difference, especially
> when that constant is 1.) I forget what happened with Range.ByOne, and
> as I look into that, here's something funny from two years and two
> months ago.
>
> https://github.com/scala/scala/commit/136c1cce62
>
>    class ByOne(start: Int, end: Int) extends Range(start, end, 1) {
>      override final def foreach[U](f: Int => U) {
>        var i = start
>        while (i < end) {
>          f(i)
>          i += 1
>        }
>      }
>    }
>
>    class ByPosStep(start: Int, end: Int, step: Int) extends
> Range(start, end, step) {
>      override final def foreach[U](f: Int => U) {
>        var i = start
>        while (i < end) {
>          f(i)
>          i += step
>        }
>      }
>    }
>
>    class ByNegStep(start: Int, end: Int, step: Int) extends
> Range(start, end, step) {
>      override final def foreach[U](f: Int => U) {
>        var i = start
>        while (i > end) {
>          f(i)
>          i += step
>        }
>      }
>    }
>
> The more things change... those lasted less than a day, replaced by
> "trait ByOne"
>
> https://github.com/scala/scala/commit/221f2a8d72

dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac

I'm presently uploading the results to
http://microbenchmarks.appspot.com/run/dcsobral [at] gmail [dot] com/org.example.Benchmark.
I'm not sure if others can see these results or not. Each run, right
now, is anonymous (A, B, C, etc), but I'll rename them tomorrow when I
get back to work and see exactly in what order they did execute. You
can, meanwhile, pick your favorites. ;-)

On Wed, Dec 14, 2011 at 18:31, Daniel Sobral wrote:
> On Wed, Dec 14, 2011 at 11:20, Daniel Sobral wrote:
>>
>> At which point I think the only practical solution for range is
>> separate classes. A statement like (x to y) does not need Range's
>> heavy mechanics, and both "to" and "until" can easily return a
>> subclass of Range. That causes problem for decreasing ranges, unless
>> we introduce dependent types to handle "by - 1" :-), though we could
>> override "reverse" to produce a simple decreasing range (and vice
>> versa). In the common use case of "for (i <- x to y)" or "for (i <- x
>> to y reverse)", that should be enough. The method "indices" could
>> likewise be changed to return a simpler range.
>>
>> The thing is... we _have_ iterated through this solution, though I'm
>> pretty sure "to", "until" and "indices" were not adapted as well to
>> make the call site monomorphic. I'll look into this if someone doesn't
>> beat me to it.
>
> Ok, I have implemented and tested subclassing Range for single-step
> cases, two classes for closed ranges (incrementing and decrementing),
> and one for open ranges. I have changed RichInt's "to" and "until" as
> described, and provided "reverse" methods on the subclassed ranges
> that return a subclass as well, so that one can do something like (0
> until s.length reverse) and still benefit from the improvements.
>
> The results for unit-returning functions are fantastic, consistently
> matching (actually, beating by a small margin) while loops, even at
> very small lengths. All range tests pass too, so I don't think it is
> broken. :-) Note that any use of "by" negates all benefits of this
> change.
>
> So, if whatever other improvement to foreach falls short of this in
> some cases, I propose combining it with this.
>
> The code can be found on my subrange branch at
> http://github.com/dcsobral/scala, and the full benchmark can be found
> at http://github.com/dcsobral/scala-benchmarking-template. This
> benchmark includes many different uses of Range now, which gives me a
> lot of confidence in the results. I'll see if I can put up a web page
> with the results tomorrow.
>
> --
> Daniel C. Sobral
>
> I travel to the future all the time.

odersky
Joined: 2008-07-29,
User offline. Last seen 45 weeks 6 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac

On Wed, Dec 14, 2011 at 10:33 PM, Daniel Sobral wrote:
> On Wed, Dec 14, 2011 at 19:08, Paul Phillips wrote:
>> On Wed, Dec 14, 2011 at 12:31 PM, Daniel Sobral wrote:
>>> http://github.com/dcsobral/scala
>>
>> It's the glorious return of Range.ByOne! That I can see, that differs
>
> Yes, I know. :-) Actually, I thought in your attempt you didn't get
> RichInt to return ByOne, but I see from your commit that, yes, you
> did, so I don't know why you didn't see results from that.
>
> The original didn't have @inline, though, for whatever that's worth.
> And I'm very dubious about changing it into a trait as well, but I see
> it was Odersky who did that.
>
I am not sure why anymore. I normally do that only if I am forced to.
Maybe the reason that forced me went away. In any case a subclassing
solution should work fine, even though I still think macros will give
us the simplest route to optimal performance. But please, no changes
in the signatures of any method in collections or RichInt.

Cheers

Ismael Juma 2
Joined: 2011-01-22,
User offline. Last seen 42 years 45 weeks ago.
Re: Re: another day, another attempt at optimizing Range.foreac
On Thu, Dec 15, 2011 at 8:15 AM, martin odersky <martin [dot] odersky [at] epfl [dot] ch> wrote:
even though I still think macros will give us the simplest route to optimal performance

Are macros being considered for 2.10?
Best, Ismael 
Ismael Juma 2
Joined: 2011-01-22,
User offline. Last seen 42 years 45 weeks ago.
Re: Re: another day, another attempt at optimizing Range.foreac
On Thu, Dec 15, 2011 at 10:13 AM, martin odersky <martin [dot] odersky [at] epfl [dot] ch> wrote:
No decision yet (need a concrete SIP first),  but it's a possibility, yes.

Thanks. I agree that this would give us the simplest route to optimal performance, which would be a very nice thing.
Best,Ismael
odersky
Joined: 2008-07-29,
User offline. Last seen 45 weeks 6 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac

On Thu, Dec 15, 2011 at 11:04 AM, Ismael Juma wrote:
> On Thu, Dec 15, 2011 at 8:15 AM, martin odersky
> wrote:
>>
>> even though I still think macros will give us the simplest route to
>> optimal performance
>
>
> Are macros being considered for 2.10?

No decision yet (need a concrete SIP first), but it's a possibility, yes.

dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac

On Thu, Dec 15, 2011 at 06:15, martin odersky wrote:
> I am not sure why anymore. I normally do that only if I am forced to.
> Maybe the reason that forced me went away. In any case a subclassing
> solution should work fine, even though I still think macros will give
> us the simplest route to optimal performance. But please, no changes
> in the signatures of any method in collections or RichInt.

This question doesn't become moot even with macros. The main problem
is not inlining, but handling border cases: wrap arounds, empty ranges
(end < start with positive step, etc), and length greater than allowed
in a collection. The latter can be ignored by a macro if it can assert
no reference to the range is escaping, but the others can only be
handled if the range is being composed with literals. So, whatever
else happens, the macro will face the same problems we are facing
here.

Given the results I obtained with single-step ranges, I suspect macros
won't be able to do any better.

As for signatures in collections, well, I don't know if we can reap
any gains without that. I suspect making foreach monomorphic for
single-step ranges (where step is a known value, which makes the
implementation simpler) is a big deal, and I don't know if simple
subclassing (but passing Range around) will be enough. I'll get some
benchmarks in that too, to see if that's true or not.

extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac

On Thu, Dec 15, 2011 at 3:40 PM, Daniel Sobral wrote:
> This question doesn't become moot even with macros. The main problem
> is not inlining, but handling border cases: wrap arounds, empty ranges
> (end < start with positive step, etc), and length greater than allowed
> in a collection.

I must violently agree with this, although not entirely in the
details. Everything ends up coming back to the corners. The logic
has to go somewhere, and you can't even do it all at construction
because of this very unfortunate situation:

Int.MinValue to Int.MaxValue foreach println // this needs to throw
an exception
10 to 10 foreach println // this needs to NOT
throw an exception
Int.MinValue to Int.MaxValue by 10 // this needs to NOT
throw an exception

You can evade several of the corners daniel lists above by modeling it this way:

val start: Int
val step: Int
val terminal: Int

"terminal" is the value one "step" past the last actual member of the
collection. Now the loop is always the same:

var i = start
while (i != terminal) { f(i) ; i += step }

This works for all wraparounds (because if i wraps, so will terminal)
and all empty ranges (because you precalculate terminal, and if the
range is empty, set terminal == start). But there's no good way to
handle "x to y" vs "x to y to z", and that corner has a grotesque
impact on the attempt to see this thing optimized.

extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac

On Thu, Dec 15, 2011 at 3:40 PM, Daniel Sobral wrote:
> As for signatures in collections, well, I don't know if we can reap
> any gains without that. I suspect making foreach monomorphic for
> single-step ranges (where step is a known value, which makes the
> implementation simpler) is a big deal, and I don't know if simple
> subclassing (but passing Range around) will be enough. I'll get some
> benchmarks in that too, to see if that's true or not.

I too have observed the statically known constant step having a huge
impact. I'm tempted to generate some classes to cover likely step
values... at least the ones which have custom bytecode instructions
(iconst_1 through iconst_5.)

dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac

On Thu, Dec 15, 2011 at 22:41, Paul Phillips wrote:
> You can evade several of the corners daniel lists above by modeling it this way:
>
>   val start: Int
>    val step: Int
> val terminal: Int
>
> "terminal" is the value one "step" past the last actual member of the
> collection.  Now the loop is always the same:
>
>  var i = start
>  while (i != terminal) { f(i) ; i += step }
>
> This works for all wraparounds (because if i wraps, so will terminal)
> and all empty ranges (because you precalculate terminal, and if the
> range is empty, set terminal == start).  But there's no good way to
> handle "x to y" vs "x to y to z", and that corner has a grotesque
> impact on the attempt to see this thing optimized.

Not _all_ wrap arounds as stated. Try this one:

Int.MinValue to Int.MaxValue by (Int.MinValue / 2).abs

extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac

On Fri, Dec 16, 2011 at 5:08 AM, Daniel Sobral wrote:
> Int.MinValue to Int.MaxValue by (Int.MinValue / 2).abs

Nice. Int.MinValue is the worst, it's tempting to exclude it by fiat.
I was also reminded of this one:

scala> math.abs(Int.MinValue) == Int.MinValue
res0: Boolean = true

There's a similar angle available which I resisted because it
introduces another counter, but since it's a ++ counter maybe it's not
so bad.

var i = start
var count = 0
while (count < length) {
f(i)
i += step
count += 1
}

This should still come in under 35 bytes anywya.

I have to abandon this range business ASAP and get back to the billion
other things. Yesterday I tried out implementing public static fields
for final vals: scala never generates them but hotspot clearly
privileges them and I'm getting the feeling this contributes to our
difficulties.

Viktor Klang
Joined: 2008-12-17,
User offline. Last seen 1 year 27 weeks ago.
Re: Re: another day, another attempt at optimizing Range.foreac


On Fri, Dec 16, 2011 at 3:12 PM, Paul Phillips <paulp [at] improving [dot] org> wrote:
On Fri, Dec 16, 2011 at 5:08 AM, Daniel Sobral <dcsobral [at] gmail [dot] com> wrote:
> Int.MinValue to Int.MaxValue by (Int.MinValue / 2).abs

Nice.  Int.MinValue is the worst, it's tempting to exclude it by fiat.
 I was also reminded of this one:

scala> math.abs(Int.MinValue)

I would expect this to throw an exception
 
== Int.MinValue
res0: Boolean = true

There's a similar angle available which I resisted because it
introduces another counter, but since it's a ++ counter maybe it's not
so bad.

var i = start
var count = 0
while (count < length) {
 f(i)
 i += step
 count += 1
}

This should still come in under 35 bytes anywya.

I have to abandon this range business ASAP and get back to the billion
other things.  Yesterday I tried out implementing public static fields
for final vals: scala never generates them but hotspot clearly
privileges them and I'm getting the feeling this contributes to our
difficulties.



--
Viktor Klang

Akka Tech LeadTypesafe - Enterprise-Grade Scala from the Experts

Twitter: @viktorklang
extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: Re: another day, another attempt at optimizing Range.foreac

2011/12/16 √iktor Ҡlang :
>> scala> math.abs(Int.MinValue)
>
> I would expect this to throw an exception

The java people assessed that a note in the documentation was sufficient.

"Note that if the argument is equal to the value of Integer.MIN_VALUE,
the most negative representable int value, the result is that same
value, which is negative."

I like the out-of-our-hands vibe, the rote listing of facts as if
addressing a small child ("...which is negative.") and the absence of
any reasoning.

/** Returns the argument, times two.
* @note timesTwo(562) = -4.
* @note -4 is not the same as 562*2.
* @note that -4 will almost certainly screw you.
* @author java library team
*/
def timesTwo(x: Int): Int

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland