
typerefs and allocations

3 replies
extempore
Since I will go nuts if I allow myself to work on this for one more minute, I will just document something now.  I have spent the last several days ignoring things to wrestle with a bear.  The bear won, but I got in a few weak blows.
Attempted TypeRef creations: 10525999
         Surviving uniques: 569797
 Doomed allocations avoided: 0
These are the statistics from when I started for quick.lib.  This means that we allocate ten million typerefs, 94% of which are discarded after discovering that uniques already holds an equivalent typeref.  If that's not already obviously suboptimal, note that we must calculate the hashcode for each of those 9.4M, which means a reasonably expensive murmurhash3 which itself creates yet more garbage.
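The wasteful pattern can be sketched with a toy interner (this is a simplified stand-in, not scalac's actual TypeRef or uniques table — all names here are illustrative): the hash is computed eagerly in the constructor, so every attempted creation pays for a MurmurHash3 pass even when the instance is immediately discarded in favor of an equivalent entry already interned.

```scala
import scala.collection.mutable
import scala.util.hashing.MurmurHash3

// Toy model of the pattern described above (not scalac's real TypeRef):
// the hash is a val computed at construction, so every attempted creation
// pays for a murmur pass even if the fresh instance is then discarded.
final class Ref(val sym: String, val args: List[String]) {
  override val hashCode: Int =
    MurmurHash3.orderedHash(sym :: args)      // paid on every attempt
  override def equals(other: Any): Boolean = other match {
    case r: Ref => sym == r.sym && args == r.args
    case _      => false
  }
}

object Ref {
  // stand-in for `uniques`: maps each interned Ref to itself
  private val uniques = mutable.HashMap.empty[Ref, Ref]
  var attempted, survived = 0

  def apply(sym: String, args: List[String]): Ref = {
    attempted += 1
    val fresh = new Ref(sym, args)            // allocated and hashed unconditionally
    uniques.getOrElseUpdate(fresh, { survived += 1; fresh })
  }
}
```

In this toy, `getOrElseUpdate` stands in for the uniques lookup; the point is that both the allocation of `fresh` and its hash are paid before we can even ask whether an equivalent instance already exists.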
As of now I have those numbers at:

Attempted TypeRef creations: 10525999
         Surviving uniques: 569797
 Doomed allocations avoided: 1863767
I am morally certain that the last number can be much higher, but woe to he who gets too ambitious here.  (Or maybe that's "whoa.") The major difficulty is as usual the introduction of cycles.  If I rework the tpeCache mechanism in Symbol a bit (something I wish I'd pursued before despair enveloped me) I think I can make that part predictable.  I also discovered that one does not want to try to cache types separately from symbols unless they are never transformed, and even trivial types like packages are transformed in erasure.  So it's a matter of populating the tpeCache of applicable symbols without cycling on the caching mechanism, but while still being able to use the cache subsequently.
What would be helpful is for anyone to share what they know about which symbols are *in principle* always mapped to the same typeref (during a single phase).  I estimate at least the following:
If args.isEmpty:

 - a class without typeParams whose prefix is ThisType(aPackageClass).
 - a type parameter without typeParams.  (This is probably incomplete.)
If args.nonEmpty (and I didn't do anything with this yet, but some of these come in by the tens of thousands):
  - a parameterized top level class which has been parameterized on its own type parameters to stand in as dummies so it can be kind *. (This whole setup "desperately needs cleaning up" as moors put it.)
Finally, the most deflating news is that the profiler is fairly impressed by these gains but the wall clock is not.  My working theory at this point is that collecting allocation information takes the profiler time out of proportion with the impact in real life; so it heavily exaggerates the significance of short-lived allocations (okay, probably it's the naive profile output reader doing the exaggerating.) Obviously hotspot can't eliminate either the allocation or the calculation of the hashcode given that the hashcode is in the constructor and we use the hashcode to look up the pre-existing object, but the amortized allocation cost seems to be close to zero.  The wall clock gain appears to be entirely from eliminating the hashcode calculation.
Since the cached typerefs can key on symbol and symbols use reference equality, I was able to use IdentityHashMap for this, which also lowers the burden on uniques.
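The symbol-keyed cache can be sketched like this (toy stand-ins for Symbol and the cached typeref, not scalac's actual classes): because symbols compare by reference identity, `java.util.IdentityHashMap` resolves a hit without computing any structural hash at all, which is exactly what takes the pressure off uniques.

```scala
import java.util.IdentityHashMap

final class Sym(val name: String)       // stand-in for a compiler symbol
final class TypeRefLike(val sym: Sym)   // stand-in for a cached typeref

object TypeRefCache {
  // identity-keyed: lookup uses System.identityHashCode and eq, no murmur
  private val cache = new IdentityHashMap[Sym, TypeRefLike]
  var misses = 0

  def typeRefOf(sym: Sym): TypeRefLike = {
    val cached = cache.get(sym)         // reference-equality lookup
    if (cached ne null) cached
    else {
      misses += 1
      val fresh = new TypeRefLike(sym)  // would go through uniques for real
      cache.put(sym, fresh)
      fresh
    }
  }
}
```

Note the flip side of identity keying: two symbols that happen to carry the same name are distinct keys, which is the desired behavior here since symbols themselves use reference equality.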
Martin Odersky
Re: typerefs and allocations

Hi Paul,

Can we avoid the redundant allocations by prechecking in the typeRef
factory method whether a reference exists? That would leave murmur as
the only bottleneck. Do we know that the benefits of murmur outweigh
the cost, or should we investigate a cheaper hash?

Cheers

Martin

Sent from my phone

Viktor Klang
Re: Re: typerefs and allocations


On Fri, Dec 30, 2011 at 5:50 PM, Martin Odersky <odersky [at] gmail [dot] com> wrote:
Hi Paul,

Can we avoid the redundant allocations by prechecking in the typeRef factory method whether a reference exists? That would leave murmur as the only bottleneck. Do we know that the benefits of murmur outweigh the cost, or should we investigate a cheaper hash?

Switch to Phil's hash?
 

--
Viktor Klang

Akka Tech Lead
Typesafe - Enterprise-Grade Scala from the Experts

Twitter: @viktorklang
extempore
Re: typerefs and allocations


On Fri, Dec 30, 2011 at 8:50 AM, Martin Odersky <odersky [at] gmail [dot] com> wrote:
Can we avoid the redundant allocations by prechecking in the typeRef factory method whether a reference exists?

That's where I'm doing it.  The challenge is that it's easy to go right through the factory method again while you're trying to check whether the typeref being constructed can be cached, because calling almost anything on a symbol can lead to its tpeCache being populated, which goes right through the factory again.
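The re-entrancy hazard can be illustrated with a hypothetical guard (none of these names are scalac's, and the real fix may look quite different): while the factory is deciding whether a typeref is cacheable, any query on the symbol may call straight back into the factory, so a flag detects the nested call and routes it around the caching logic instead of recursing into it.

```scala
// Hypothetical sketch of guarding the factory against its own re-entrancy:
// nested calls triggered while we are populating take a plain, non-caching
// path, breaking the cycle described above.
object GuardedFactory {
  final class Sym(val name: String) { var depends: Option[Sym] = None }
  final class Tpe(val sym: Sym)

  private var populating = false
  var plainConstructions = 0       // instrumentation for the sketch

  def typeRef(sym: Sym): Tpe =
    if (populating) {
      plainConstructions += 1
      new Tpe(sym)                 // nested call: construct, don't cache
    } else {
      populating = true
      try {
        // touching the symbol here may call typeRef again (the cycle);
        // the guard makes that nested call take the plain path above
        sym.depends.foreach(typeRef)
        new Tpe(sym)
      } finally populating = false
    }
}
```

A single boolean works in a single-threaded compiler pass; anything concurrent would need a thread-local or a different design entirely.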
That would leave murmur as the only bottleneck. Do we know that the benefits of murmur outweigh the cost, or should we investigate a cheaper hash?

We saw immediate speed benefits from murmur, but we could use it a lot better.  We can keep it on the back burner, but for now I think the hash is fine and we should just be smarter about avoiding hashing.
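One way to "be smarter about avoiding hashing" can be sketched under the same toy assumptions (not the actual patch): make the murmur hash lazy, so a typeref served out of a symbol-keyed cache, which never needs to be compared against uniques, never pays for hashing at all, while one that does reach uniques pays at most once.

```scala
import scala.util.hashing.MurmurHash3

// Toy typeref whose murmur hash is computed on demand and cached,
// rather than eagerly in the constructor.
final class LazyHashedRef(val sym: String, val args: List[String]) {
  var hashComputations = 0         // instrumentation for the sketch
  private var cached = 0
  private var hashed = false

  override def hashCode: Int = {
    if (!hashed) {                 // first call only: pay for murmur once
      hashComputations += 1
      cached = MurmurHash3.orderedHash(sym :: args)
      hashed = true
    }
    cached
  }
}
```

This trades a branch and two fields per instance for skipping the hash entirely on the (majority) path where the cache answers by identity.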

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland