This page is no longer maintained — Please continue to the home page at www.scala-lang.org

inefficient PartialFunctions, functionWithDefault

15 replies
ido
Joined: 2011-05-02,
User offline. Last seen 42 years 45 weeks ago.

Hi,

This came up once:
http://stackoverflow.com/questions/4064859/is-the-partialfunction-design...
http://article.gmane.org/gmane.comp.lang.scala.internals/4030/match=func...

Have there been any changes until now?

I just noticed my extractors being called twice.

thank you very much,
ido

dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: inefficient PartialFunctions, functionWithDefault
Doug Tangren
Joined: 2009-12-10,
User offline. Last seen 42 years 45 weeks ago.
Re: inefficient PartialFunctions, functionWithDefault
On Sun, Jan 15, 2012 at 11:34 AM, Daniel Sobral <dcsobral [at] gmail [dot] com> wrote:
On Sun, Jan 15, 2012 at 13:48, Ido Tamir <idomtamir [at] gmail [dot] com> wrote:
> Hi,
>
> This came up once:
> http://stackoverflow.com/questions/4064859/is-the-partialfunction-design-inefficient
> http://article.gmane.org/gmane.comp.lang.scala.internals/4030/match=functionwithdefault
>
> Have there been any changes until now?
>
> I just noticed my extractors being called twice.

See:

https://github.com/scala/scala/commit/b34615a
https://github.com/scala/scala/commit/579e999
https://github.com/scala/scala/commit/5fb26c6

--

Good to hear. We've experimented with optimizations in unfiltered with this as well http://code.technically.us/post/13107005975/or-else-its-rilly-complex  
Daniel C. Sobral

I travel to the future all the time.

Pavel Pavlov
Joined: 2011-12-01,
User offline. Last seen 42 years 45 weeks ago.
Re: inefficient PartialFunctions, functionWithDefault
Am I the only one who see that it is broken?
See my comments here:

https://github.com/scala/scala/commit/5fb26c6a889cf1609823338df8783bf880...

воскресенье, 15 января 2012 г. 23:34:20 UTC+7 пользователь Daniel Sobral написал:
On Sun, Jan 15, 2012 at 13:48, Ido Tamir <idom [dot] [dot] [dot] [at] gmail [dot] com> wrote:
> Hi,
>
> This came up once:
> http://stackoverflow.com/questions/4064859/is-the-partialfunction-design-inefficient
> http://article.gmane.org/gmane.comp.lang.scala.internals/4030/match=functionwithdefault
>
> Have there been any changes until now?
>
> I just noticed my extractors being called twice.

See:

https://github.com/scala/scala/commit/b34615a
https://github.com/scala/scala/commit/579e999
https://github.com/scala/scala/commit/5fb26c6

--
Daniel C. Sobral

I travel to the future all the time.

Pavel Pavlov
Joined: 2011-12-01,
User offline. Last seen 42 years 45 weeks ago.
Fwd: inefficient PartialFunctions, functionWithDefault
Hi there.

I've just tried to use Martin's optimized PF.orElse to solve old problem with double-evaluated pattern matchers.

I've reimplemented PF.lift:
https://github.com/pavelpavlov/scala/commit/bb6291606b4e8d0987208e919695e71313e5244a

and played a bit with example from
http://stackoverflow.com/questions/4064859/is-the-partialfunction-design-inefficient

First results are good:
============================
.....
defined module Ex1
defined module Ex2
pf: PartialFunction[Int,String] = <function1>

scala> if(pf.isDefinedAt(2)) pf(2)
Ex1
Ex2
Ex1
Ex2
res0: Any = 2

scala> pf.lift(2)
Ex1
Ex2
res1: Option[String] = Some(2)
============================

Then they going worse:
============================
scala> pf.lift(3)
Ex1
Ex2
java.lang.ClassCastException: scala.PartialFunction$FallBackToken$ cannot be cast to java.lang.String
        at $anonfun$1.apply(<console>:21)
        at $anonfun$1.apply(<console>:21)
        at scala.PartialFunction$$anonfun$lift$1.apply(PartialFunction.scala:102)
        at scala.PartialFunction$$anonfun$lift$1.apply(PartialFunction.scala:102)
============================

With a bit more playing it becomes clear that the bug is not mine, it is already exist in new orElse implementation:
============================
scala> :paste
// Entering paste mode (ctrl-D to finish)

val pf1 : PartialFunction[Int,String] = {
  case 11 => "hello"
}
val pf2 : PartialFunction[Int,AnyRef] = {
  case 22 => List(1,2)
}

// Exiting paste mode, now interpreting.

pf1: PartialFunction[Int,String] = <function1>
pf2: PartialFunction[Int,AnyRef] = <function1>

scala> pf1(11)
res4: String = hello

scala> pf2(22)
res5: AnyRef = List(1, 2)

scala> val pf3 = pf1 orElse pf2
pf3: PartialFunction[Int,AnyRef] = <function1>

scala> pf3(11)
res6: AnyRef = hello

scala> pf3(22)
java.lang.ClassCastException: scala.collection.immutable.$colon$colon cannot be cast to java.lang.String
        at $anonfun$1.apply(<console>:7)
        at $anonfun$1.apply(<console>:7)
============================

So, the problem is scalac generates bytecode for PF.apply which casts result of missingCase() call to apply's return type while orElse attaches there another PF which can have arbitrary return type.



---------- Пересланное сообщение ----------
От кого: Pavel Pavlov <pavel [dot] e [dot] pavlov [at] gmail [dot] com>
Дата: 16 января 2012 г. 3:16
Тема: Re: [scala-user] inefficient PartialFunctions, functionWithDefault
Кому: scala-user [at] googlegroups [dot] com
Копия: Ido Tamir <idomtamir [at] gmail [dot] com>


Am I the only one who see that it is broken?
See my comments here:

https://github.com/scala/scala/commit/5fb26c6a889cf1609823338df8783bf880769b3f#commitcomment-869069

воскресенье, 15 января 2012 г. 23:34:20 UTC+7 пользователь Daniel Sobral написал:
On Sun, Jan 15, 2012 at 13:48, Ido Tamir <idom [dot] [dot] [dot] [at] gmail [dot] com> wrote:
> Hi,
>
> This came up once:
> http://stackoverflow.com/questions/4064859/is-the-partialfunction-design-inefficient
> http://article.gmane.org/gmane.comp.lang.scala.internals/4030/match=functionwithdefault
>
> Have there been any changes until now?
>
> I just noticed my extractors being called twice.

See:

https://github.com/scala/scala/commit/b34615a
https://github.com/scala/scala/commit/579e999
https://github.com/scala/scala/commit/5fb26c6

--
Daniel C. Sobral

I travel to the future all the time.


Pavel Pavlov
Joined: 2011-12-01,
User offline. Last seen 42 years 45 weeks ago.
Re: inefficient PartialFunctions, functionWithDefault


понедельник, 16 января 2012 г. 8:30:16 UTC+7 пользователь Pavel Pavlov написал:

java.lang.ClassCastException: scala.collection.immutable.$colon$colon cannot be cast to java.lang.String
        at $anonfun$1.apply(<console>:7)
        at $anonfun$1.apply(<console>:7)

So, the problem is scalac generates bytecode for PF.apply which casts result of missingCase() call to apply's return type while orElse attaches there another PF which can have arbitrary return type.


I believe I found a way to fix this problem as well as other orElse-related problems (synchronized blocks, AbstractPF unfit for manual extending etc.) here:
https://github.com/pavelpavlov/scala/commit/bdf2c627aba4d3e8b5105df9ac1e...

PF.lift should work as expected (pattern matchers will be invoked once).
Some futher optimizations for PF.lift are here:
https://github.com/pavelpavlov/scala/commit/27a967138dfbc022cb71b36e9e62...

Please note that the code does not work for now - I did not touch PF literals generation in the compiler.
It's just an idea how to fix the orElse issue - I even didn't try to compile it yet.


If this solution haven't any holes, it can serve as a base for faster dealing with PFs in many places:
Code like "if (pf.isDefinedAt(x)) use(pf(x))" can be rewritten to
  pf.lift(x) match {
    case Some(y) => use(y)
    case None =>
  }
thus halfing the count of pattern matcher evaluations (and in most cases avoiding the issue with double side-effects in matchers).

For example, TraversableLike.collect can be rewritten as:
  def collect[B, That](pf: PartialFunction[A, B])(implicit bf: CanBuildFrom[Repr, B, That]): That = {
    val b = bf(repr)
    val f = pf.lift
    for (x <- this) f(x) match {
      case Some(y) => b += y
      case None =>
    }
    b.result
  }

Any thoughts, comments?

extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: Re: inefficient PartialFunctions, functionWithDefault


On Sun, Jan 15, 2012 at 11:07 PM, Pavel Pavlov <pavel [dot] e [dot] pavlov [at] gmail [dot] com> wrote:
Any thoughts, comments?

I can't look at it right now but I had started a new approach last night when you reported the CCE and it looked almost just like this, so based on the cursory inspection I think this is the right way.  I can't comment with much authority though because I don't know all martin's motivations for the way it was done.  Regardless, mad props for finding a serious problem and shipping a possible fix before I can even get out of the gate.
My spidey sense is telling me there's some variance issue standing between your code and working, but not sure.
Pavel Pavlov
Joined: 2011-12-01,
User offline. Last seen 42 years 45 weeks ago.
Re: inefficient PartialFunctions, functionWithDefault
Hi all,

I've finished with the first draft of the PF reimplementation.

The code is here:
https://github.com/pavelpavlov/scala/tree/pf-reimpl

What is done:
1. All the PF operations implemented through applyOrElse method instead of isDefinedAt/apply pair where possible (except isDefinedAt, of course).
2. PF literals generated by the compiler now implement applyOrElse method.
3. PF literals with exhaustive matchers are generated in special way. This will lead in slightly smaller classfiles and may be somewhat faster due to specialized implementation of PF operations in trait PartialFunction.Total.
4. orElse optimization reimplemented. This fix the issue I've reported in this thread as well.
5. Lift/unlift methods are slightly optimized (pf.lift.unlift now returns pf, not double wrapper over pf).
6. All the PF methods now avoid using temporary objects (options, closures, etc.) where possible.
7. I've browsed all those countless old threads in maillists and checked that all the issues discussed there are fixed.
8. I've added two new imperative-style combinators: run & runWith. They can help to speedup collect-like methods.
For example:

  def collect[B, That](pf: PartialFunction[A, B])(implicit bf: CanBuildFrom[Repr, B, That]): That = {
    val b = bf(repr)
    val action = pf runWith { y => b += y }
    this foreach action
    b.result
  }

In this code no any single temporary object is created in the loop. And of course, pf's pattern matchers are executed once for each collection's element.

9. All tests passed except one evil code snippet known as t4269, which crashes the compiler. I'll try to fix it later.
10. Virtpatmat is not yet supported.

What I'm going to do next:

1. Collect as much feedback as I can.
2. Implement PF literals generation for new pattern matcher.
3. Pass the code through the whole series of cosmetic procedures (code polishing, branch massaging etc.)
4. Create pull request.

I will be grateful to all who can help me with tests, docs & benchmarks.

Regards,
Pavel.

extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: Re: inefficient PartialFunctions, functionWithDefault

On Thu, Feb 2, 2012 at 7:35 PM, Pavel Pavlov wrote:
>
> The code is here:
> https://github.com/pavelpavlov/scala/tree/pf-reimpl

Awesome!

I merged master into it to save you a little trouble.

https://github.com/paulp/scala/tree/pf-reimpl

I think I can answer this question:

- def orElseFast[A1 <: A, B1 >: B](that: PartialFunction[A1, B1]) :
PartialFunction[A1, B1] =
- orElse(that)
+ new OrElse[A1, B1] (this, that)
+ //TODO: why not overload it with orElse(that: F1): F1?

"Why not" is because type inference is a fragile companion and that
particular overload will most likely hit it like thor's hammer. But I
haven't tried it, so don't take my word for it.

> 9. All tests passed except one evil code snippet known as t4269, which crashes the compiler. I'll try to fix it later.

If you look at SI-4269 you'll see I didn't have the easiest time with
it the first time around.

dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: Re: inefficient PartialFunctions, functionWithDefault

Looks awesome. If at all possible, I'd like for a SIP to be made for

dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: Re: inefficient PartialFunctions, functionWithDefault

I've given it a look, and I have some comments that are all about
style, not content. Please note that I'm not part of Typesafe or EPFL,
so this is strictly IMHO.

I think abbreviations are ok when carefully chosen, but acronyms are
bad. The guilty parts here are PFLiteral, XPFLiteral and APF. I can't
even understand the latter: you replaced the class on runtime with a
type on PartialFunction that points to a class on runtime, while at
the same time introducing an acronym?

I don't quite understand what's the purpose of
AbstractPartialFunction2, or why isn't it replacing
AbstractPartialFunction. However, given that Function2 stands for a
function of arity two, I think that the 2 in AbstractPartialFunction2
can be very confusing.

What's up with X? XPFLiteral, enterApplyX... please choose more
meaningful names.

On Fri, Feb 3, 2012 at 01:35, Pavel Pavlov wrote:
> Hi all,
>
> I've finished with the first draft of the PF reimplementation.
>
> The code is here:
> https://github.com/pavelpavlov/scala/tree/pf-reimpl
>
> What is done:
> 1. All the PF operations implemented through applyOrElse method instead of
> isDefinedAt/apply pair where possible (except isDefinedAt, of course).
> 2. PF literals generated by the compiler now implement applyOrElse method.
> 3. PF literals with exhaustive matchers are generated in special way. This
> will lead in slightly smaller classfiles and may be somewhat faster due to
> specialized implementation of PF operations in trait PartialFunction.Total.
> 4. orElse optimization reimplemented. This fix the issue I've reported in
> this thread as well.
> 5. Lift/unlift methods are slightly optimized (pf.lift.unlift now returns
> pf, not double wrapper over pf).
> 6. All the PF methods now avoid using temporary objects (options, closures,
> etc.) where possible.
> 7. I've browsed all those countless old threads in maillists and checked
> that all the issues discussed there are fixed.
> 8. I've added two new imperative-style combinators: run & runWith. They can
> help to speedup collect-like methods.
> For example:
>
>
>   def collect[B, That](pf: PartialFunction[A, B])(implicit bf:
> CanBuildFrom[Repr, B, That]): That = {
>     val b = bf(repr)
>     val action = pf runWith { y => b += y }
>     this foreach action
>     b.result
>   }
>
> In this code no any single temporary object is created in the loop. And of
> course, pf's pattern matchers are executed once for each collection's
> element.
>
> 9. All tests passed except one evil code snippet known as t4269, which
> crashes the compiler. I'll try to fix it later.
> 10. Virtpatmat is not yet supported.
>
> What I'm going to do next:
>
> 1. Collect as much feedback as I can.
> 2. Implement PF literals generation for new pattern matcher.
> 3. Pass the code through the whole series of cosmetic procedures (code
> polishing, branch massaging etc.)
> 4. Create pull request.
>
> I will be grateful to all who can help me with tests, docs & benchmarks.
>
> Regards,
> Pavel.
>

Pavel Pavlov
Joined: 2011-12-01,
User offline. Last seen 42 years 45 weeks ago.
Re: Re: inefficient PartialFunctions, functionWithDefault
Hi Daniel,

Thank you for reviewing this.


4 февраля 2012 г. 5:55 пользователь Daniel Sobral <dcsobral [at] gmail [dot] com> написал:
I've given it a look, and I have some comments that are all about
style, not content. Please note that I'm not part of Typesafe or EPFL,
so this is strictly IMHO.

I think abbreviations are ok when carefully chosen, but acronyms are
bad. The guilty parts here are PFLiteral, XPFLiteral

I had three reasons for this:
1) I've tried to choose shortest possible names for them as the compiler generates tons of closures extending these classes. So I thought that some economy on classfile sizes is worth here.
2) These classes are not part of the public interface. They are referenced only once in the compiler (well, twice if we count StdNames.scala) and are not refecenced from library code at all.
3) I wanted to specialize PF literals with exhaustive matchers as they can be generated more effectively and had troubles trying to invent meaningful names for two sorts of basic classes for PF literals.

So if someone have good pair of names in mind and (1) is not an issue, I will rename them without a regret.

 
and APF. I can't
even understand the latter: you replaced the class on runtime with a
type on PartialFunction that points to a class on runtime, while at
the same time introducing an acronym?

I don't quite understand what's the purpose of
AbstractPartialFunction2, or why isn't it replacing
AbstractPartialFunction. However, given that Function2 stands for a
function of arity two, I think that the 2 in AbstractPartialFunction2
can be very confusing.


Well, this is really tricky bootstrapping issue.
The problem is that compiler & library are mutually dependent as compiler generates closures based on the library's classes and library have a lot of closures itself.
What's worse, there are many manually defined partial functions which extends AbstractPartialFunction class.

When I had tried to build my code, I first need to build new library with the old compiler, and this сaused some trouble.
So all this mess is just temporary bootstrapping workarounds.

I think it should be integrated in two steps:
1) When (and if) this code will be in trunk, it eventually should became a new starr. At this point, AbstractPartialFunction class will no longer be used by the compiler.
2) Then I'll perform some cleanup:
  - remove old AbstractPartialFunction
  - rename AbstractPartialFunction2 to AbstractPartialFunction
  - remove APF type alias

 
enterApplyX... please choose more meaningful names.

Ok, thanks. Will fix this. All this code in UnCurry is in under construction state for now. I definitely will polish it after I support new pattern matcher and fix the bugs.

Regards,
Pavel.

dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: Re: inefficient PartialFunctions, functionWithDefault

2012/2/3 Pavel Pavlov :
> Hi Daniel,
>
> Thank you for reviewing this.
>
>
> 4 февраля 2012 г. 5:55 пользователь Daniel Sobral
> написал:
>
>> I've given it a look, and I have some comments that are all about
>> style, not content. Please note that I'm not part of Typesafe or EPFL,
>> so this is strictly IMHO.
>>
>> I think abbreviations are ok when carefully chosen, but acronyms are
>> bad. The guilty parts here are PFLiteral, XPFLiteral
>
>
> I had three reasons for this:
> 1) I've tried to choose shortest possible names for them as the compiler
> generates tons of closures extending these classes. So I thought that some
> economy on classfile sizes is worth here.
> 2) These classes are not part of the public interface. They are referenced
> only once in the compiler (well, twice if we count StdNames.scala) and are
> not refecenced from library code at all.
> 3) I wanted to specialize PF literals with exhaustive matchers as they can
> be generated more effectively and had troubles trying to invent meaningful
> names for two sorts of basic classes for PF literals.
>
> So if someone have good pair of names in mind and (1) is not an issue, I
> will rename them without a regret.

Fair enough. But that's the X about?

>
>
>>
>> and APF. I can't
>> even understand the latter: you replaced the class on runtime with a
>> type on PartialFunction that points to a class on runtime, while at
>> the same time introducing an acronym?
>>
>> I don't quite understand what's the purpose of
>> AbstractPartialFunction2, or why isn't it replacing
>> AbstractPartialFunction. However, given that Function2 stands for a
>> function of arity two, I think that the 2 in AbstractPartialFunction2
>> can be very confusing.
>>
>
> Well, this is really tricky bootstrapping issue.
> The problem is that compiler & library are mutually dependent as compiler
> generates closures based on the library's classes and library have a lot of
> closures itself.
> What's worse, there are many manually defined partial functions which
> extends AbstractPartialFunction class.
>
> When I had tried to build my code, I first need to build new library with
> the old compiler, and this сaused some trouble.
> So all this mess is just temporary bootstrapping workarounds.
>
> I think it should be integrated in two steps:
> 1) When (and if) this code will be in trunk, it eventually should became a
> new starr. At this point, AbstractPartialFunction class will no longer be
> used by the compiler.
> 2) Then I'll perform some cleanup:
>   - remove old AbstractPartialFunction
>   - rename AbstractPartialFunction2 to AbstractPartialFunction
>   - remove APF type alias
>
>
>>
>> enterApplyX... please choose more meaningful names.
>
>
> Ok, thanks. Will fix this. All this code in UnCurry is in under construction
> state for now. I definitely will polish it after I support new pattern
> matcher and fix the bugs.

As far as I know, that's what starr -- those jar libraries one has to
download -- is all about. I really can't help you with this, but I
think there is a solution, and there's no better list than this one to
bring it up. :-)

Pavel Pavlov
Joined: 2011-12-01,
User offline. Last seen 42 years 45 weeks ago.
Re: Re: inefficient PartialFunctions, functionWithDefault


4 февраля 2012 г. 8:36 пользователь Daniel Sobral <dcsobral [at] gmail [dot] com> написал:
Fair enough. But that's the X about?


`X` is for `exhaustive`. Don't beat me for it :)
 
extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: Re: inefficient PartialFunctions, functionWithDefault

2012/2/3 Pavel Pavlov :
> 1) I've tried to choose shortest possible names for them as the compiler
> generates tons of closures extending these classes. So I thought that some
> economy on classfile sizes is worth here.

In the past I've considered renaming all the function,
partialfunction, etc. classes to something like scala.r.f1,
scala.r.f2, for exactly this reason. It's pretty shocking how many
times these names are duplicated across classfiles, and although in a
different universe compression might have saved the day, in this one
it doesn't.

Pavel Pavlov
Joined: 2011-12-01,
User offline. Last seen 42 years 45 weeks ago.
Re: Re: inefficient PartialFunctions, functionWithDefault


4 февраля 2012 г. 10:43 пользователь Paul Phillips <paulp [at] improving [dot] org> написал:
2012/2/3 Pavel Pavlov <pavel [dot] e [dot] pavlov [at] gmail [dot] com>:
> 1) I've tried to choose shortest possible names for them as the compiler
> generates tons of closures extending these classes. So I thought that some
> economy on classfile sizes is worth here.

In the past I've considered renaming all the function,
partialfunction, etc. classes to something like scala.r.f1,
scala.r.f2, for exactly this reason.  It's pretty shocking how many
times these names are duplicated across classfiles, and although in a
different universe compression might have saved the day, in this one
it doesn't.

Well, I've tried to do this:
https://github.com/pavelpavlov/scala/tree/closize

However the overall size difference is surprisingly negligible. There:
 - (1) is base version (branch pf-reimpl):
https://github.com/pavelpavlov/scala/commit/164c81f980b3414b345d6e2114dbf1c8c16923a9
 - in (2) I've removed Serializable from every "anon$ extends AbstractFunctionN with Serializable" generated by the compiler and put it into the library: "class AbstractFunctionN extends FunctionN with Serializable":
https://github.com/pavelpavlov/scala/commit/8238c940f2a4d5f83901b19ac251bf2eb81a41f1
 - in (3) I've replaced all class names by 2/3-character abbreviations:
https://github.com/pavelpavlov/scala/commit/cc17884ed989bf95b947fae87baa6f93cb9f1d49

There are the results:

                          (1)          (2)          (3)      (2)-(1)  (3)-(2)   (3)-(1)
scala-library.jar      7,249,310    7,217,142    7,191,714   -32,168  -25,428   -57,596
scala-library.jar/*   16,748,616   16,698,761   16,657,898   -49,855  -40,863   -90,718
scala-compiler.jar    14,021,784   13,914,189   13,849,748  -107,575  -64,441  -172,016
scala-compiler.jar/*  29,199,389   29,031,503   28,933,025  -167,886  -98,478  -266,364


Regards,
Pavel.

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland