This page is no longer maintained — Please continue to the home page at www.scala-lang.org

Does anyone know if case classes cache hashCode?

10 replies
pkolaczk
Joined: 2010-01-14,
User offline. Last seen 2 years 38 weeks ago.

Hi,

I'm writing software that does lots of hash map lookups.
These lookups take quite a lot of time.

I use case class objects as hash map keys - can it be the problem?
Do the case classes store the hash code somewhere, or is it recomputed
everytime it is called?

Regards,
Piotr Kołaczkowski

H-star Development
Joined: 2010-04-14,
User offline. Last seen 2 years 26 weeks ago.
Re: Does anyone know if case classes cache hashCode?

use a profiler and/or override hashcode yourself
Am 18.10.2010 11:46, schrieb Piotr Kołaczkowski:
> Hi,
>
> I'm writing software that does lots of hash map lookups.
> These lookups take quite a lot of time.
>
> I use case class objects as hash map keys - can it be the problem?
> Do the case classes store the hash code somewhere, or is it recomputed
> everytime it is called?
>
> Regards,
> Piotr Kołaczkowski
>
>
>

Stephen Tu
Joined: 2010-02-24,
User offline. Last seen 42 years 45 weeks ago.
Re: Does anyone know if case classes cache hashCode?
the hash code is recomputed for each invocation. you can see that as follows:

$ scalac -Xprint:clean caseclass.scala
[...]
    override def hashCode(): Int = ScalaRunTime.this._hashCode(Foo.this);

and _hashCode looks something like:

  def _hashCode(x: Product): Int = {
    val arr =  x.productArity
    var code = arr
    var i = 0
    while (i < arr) {
      val elem = x.productElement(i)
      code = code * 41 + (if (elem == null) 0 else elem.##)
      i += 1
    }
    code
  }



2010/10/18 Piotr Kołaczkowski <pkolaczk [at] elka [dot] pw [dot] edu [dot] pl>
Hi,

I'm writing software that does lots of hash map lookups.
These lookups take quite a lot of time.

I use case class objects as hash map keys - can it be the problem?
Do the case classes store the hash code somewhere, or is it recomputed everytime it is called?

Regards,
Piotr Kołaczkowski



Kevin Wright 2
Joined: 2010-05-30,
User offline. Last seen 26 weeks 4 days ago.
Re: Does anyone know if case classes cache hashCode?
If you want it cached, you should be able to implement this yourself as a lazy val:
    override lazy val hashCode(): Int = ScalaRunTime.this._hashCode(Foo.this);
if a definition of hashCode is already present, the compiler won't add it again for a case class.

2010/10/18 Stephen Tu <steve33671 [at] gmail [dot] com>
the hash code is recomputed for each invocation. you can see that as follows:

$ scalac -Xprint:clean caseclass.scala
[...]
    override def hashCode(): Int = ScalaRunTime.this._hashCode(Foo.this);

and _hashCode looks something like:

  def _hashCode(x: Product): Int = {
    val arr =  x.productArity
    var code = arr
    var i = 0
    while (i < arr) {
      val elem = x.productElement(i)
      code = code * 41 + (if (elem == null) 0 else elem.##)
      i += 1
    }
    code
  }



2010/10/18 Piotr Kołaczkowski <pkolaczk [at] elka [dot] pw [dot] edu [dot] pl>
Hi,

I'm writing software that does lots of hash map lookups.
These lookups take quite a lot of time.

I use case class objects as hash map keys - can it be the problem?
Do the case classes store the hash code somewhere, or is it recomputed everytime it is called?

Regards,
Piotr Kołaczkowski






--
Kevin Wright

mail / gtalk / msn : kev [dot] lee [dot] wright [at] gmail [dot] com
pulse / skype: kev.lee.wright
twitter: @thecoda

pkolaczk
Joined: 2010-01-14,
User offline. Last seen 2 years 38 weeks ago.
Re: Does anyone know if case classes cache hashCode?

Nice one. :)
Scala surprises me all the time :)
Thanks.

W dniu 2010-10-18 12:37, Kevin Wright pisze:
> If you want it cached, you should be able to implement this yourself as
> a lazy val:
>
> override lazy val hashCode(): Int =
> ScalaRunTime.this._hashCode(Foo.this);
>
> if a definition of hashCode is already present, the compiler won't add
> it again for a case class.
>
>
> 2010/10/18 Stephen Tu
> >
>
> the hash code is recomputed for each invocation. you can see that as
> follows:
>
> $ scalac -Xprint:clean caseclass.scala
> [...]
> override def hashCode(): Int =
> ScalaRunTime.this._hashCode(Foo.this);
>
> and _hashCode looks something like:
>
> def _hashCode(x: Product): Int = {
> val arr = x.productArity
> var code = arr
> var i = 0
> while (i < arr) {
> val elem = x.productElement(i)
> code = code * 41 + (if (elem == null) 0 else elem.##)
> i += 1
> }
> code
> }
>
>
>
> 2010/10/18 Piotr Kołaczkowski
> >
>
> Hi,
>
> I'm writing software that does lots of hash map lookups.
> These lookups take quite a lot of time.
>
> I use case class objects as hash map keys - can it be the problem?
> Do the case classes store the hash code somewhere, or is it
> recomputed everytime it is called?
>
> Regards,
> Piotr Kołaczkowski
>
>
>
>
>
>
> --
> Kevin Wright
>
> mail / gtalk / msn :
> kev [dot] lee [dot] wright [at] gmail [dot] com
>
> pulse / skype: kev.lee.wright
> twitter: @thecoda
>

nilskp
Joined: 2009-01-30,
User offline. Last seen 1 year 27 weeks ago.
Re: Does anyone know if case classes cache hashCode?
On Mon, Oct 18, 2010 at 5:37 AM, Kevin Wright <kev [dot] lee [dot] wright [at] gmail [dot] com> wrote:
If you want it cached, you should be able to implement this yourself as a lazy val:
    override lazy val hashCode(): Int = ScalaRunTime.this._hashCode(Foo.this);

Lazy vs non-lazy in this case should be considered. If every instance is always used as key, it shouldn't be lazy as that will just incur unnecessary memory overhead on each read.
pkolaczk
Joined: 2010-01-14,
User offline. Last seen 2 years 38 weeks ago.
Re: Does anyone know if case classes cache hashCode?

W dniu 2010-10-18 15:18, Nils Kilden-Pedersen pisze:
> On Mon, Oct 18, 2010 at 5:37 AM, Kevin Wright
> > wrote:
>
> If you want it cached, you should be able to implement this yourself
> as a lazy val:
>
> override lazy val hashCode(): Int =
> ScalaRunTime.this._hashCode(Foo.this);
>
>
> Lazy vs non-lazy in this case should be considered. If every instance is
> always used as key, it shouldn't be lazy as that will just incur
> unnecessary memory overhead on each read.
>

Ok, I changed it to val and now it runs much faster.

Additionally, just to check, I temporarily replaced the Scala immutable
Map with java.util.HashMap and it seems that gets are many times faster
(about 3x-4x) fot the Java HashMap. Is it ok (because of immutability)
or should I expect any performance improvements of the Scala immutable
HashMap in the future?

Regards,
Piotr Kołaczkowski

H-star Development
Joined: 2010-04-14,
User offline. Last seen 2 years 26 weeks ago.
Re: Re: Does anyone know if case classes cache hashCode?

out of curiosity, what happens if you use scala's mutable hashmap?
Am 18.10.2010 15:43, schrieb Piotr Kołaczkowski:
>
> W dniu 2010-10-18 15:18, Nils Kilden-Pedersen pisze:
>> On Mon, Oct 18, 2010 at 5:37 AM, Kevin Wright
>> > > wrote:
>>
>> If you want it cached, you should be able to implement this yourself
>> as a lazy val:
>>
>> override lazy val hashCode(): Int =
>> ScalaRunTime.this._hashCode(Foo.this);
>>
>>
>> Lazy vs non-lazy in this case should be considered. If every instance is
>> always used as key, it shouldn't be lazy as that will just incur
>> unnecessary memory overhead on each read.
>>
>
> Ok, I changed it to val and now it runs much faster.
>
> Additionally, just to check, I temporarily replaced the Scala
> immutable Map with java.util.HashMap and it seems that gets are many
> times faster (about 3x-4x) fot the Java HashMap. Is it ok (because of
> immutability) or should I expect any performance improvements of the
> Scala immutable HashMap in the future?
>
> Regards,
> Piotr Kołaczkowski
>
>
>

pkolaczk
Joined: 2010-01-14,
User offline. Last seen 2 years 38 weeks ago.
Re: Does anyone know if case classes cache hashCode?

W dniu 2010-10-18 16:12, HamsterofDeath pisze:
> out of curiosity, what happens if you use scala's mutable hashmap?

It was faster than the immutable one, but not as fast as the Java one.
Strangely, I tested it once again after caching hashCode, and now the
mutable one seems to be slower even than the immutable one... But I
haven't run "scientific" tests to check this, so this might not be
generally true. I will try to make proper performance tests soon.

BTW: I checked that on the **client** JVM, purposely, because our
application is intended to run in the browser, as an applet.

Regards,
Piotr Kołaczkowski

extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: Re: Does anyone know if case classes cache hashCode?

On Mon, Oct 18, 2010 at 06:13:43PM +0200, Piotr Kołaczkowski wrote:
> It was faster than the immutable one, but not as fast as the Java one.
> Strangely, I tested it once again after caching hashCode, and now the
> mutable one seems to be slower even than the immutable one... But I
> haven't run "scientific" tests to check this, so this might not be
> generally true. I will try to make proper performance tests soon.

You might also want to consider this ticket:

https://lampsvn.epfl.ch/trac/scala/ticket/2537

Not long ago I checked in a new option:

-Ymurmur

It achieves a far superior hashing distribution but is significantly
slower to calculate.

You are right that a lazy val in case classes could make a big
difference, but there's no way for the compiler to know that it's safe.
(The fields may be immutable but there's no way to know the hashCodes of
those fields are.) When we pursue breaking down case class functionality
better, one thing I would like is a way to signal from userspace that
the class is truly immutable and the hashcode should be cached. For now
you'd have to do it manually. If you use murmur:

override lazy val hashCode =
scala.runtime.ScalaRunTime._hashCodeMurmur(this)

...then you might get good results. Maybe.

pkolaczk
Joined: 2010-01-14,
User offline. Last seen 2 years 38 weeks ago.
Re: Does anyone know if case classes cache hashCode?

Thanks!

Expensive hashing is not a problem for me, because the key set in my
hashmap rarely changes (and I keep the caches). What changes often, are
the values. I need extremely fast single element access (mostly read).
We build a simulation engine, where the state of the system is stored in
the hashmap. The system needs to backtrack sometimes - cancel the last
point of simulation and restart from the previous state with a smaller
step. Thus, we benefit from the immutability and persistence very much.

By the way, is there any method to change the fill factor of
an immutable hash map in Scala?

Regards,
Piotr

W dniu 2010-10-18 18:44, Paul Phillips pisze:
> On Mon, Oct 18, 2010 at 06:13:43PM +0200, Piotr Kołaczkowski wrote:
>> It was faster than the immutable one, but not as fast as the Java one.
>> Strangely, I tested it once again after caching hashCode, and now the
>> mutable one seems to be slower even than the immutable one... But I
>> haven't run "scientific" tests to check this, so this might not be
>> generally true. I will try to make proper performance tests soon.
>
> You might also want to consider this ticket:
>
> https://lampsvn.epfl.ch/trac/scala/ticket/2537
>
> Not long ago I checked in a new option:
>
> -Ymurmur
>
> It achieves a far superior hashing distribution but is significantly
> slower to calculate.
>
> You are right that a lazy val in case classes could make a big
> difference, but there's no way for the compiler to know that it's safe.
> (The fields may be immutable but there's no way to know the hashCodes of
> those fields are.) When we pursue breaking down case class functionality
> better, one thing I would like is a way to signal from userspace that
> the class is truly immutable and the hashcode should be cached. For now
> you'd have to do it manually. If you use murmur:
>
> override lazy val hashCode =
> scala.runtime.ScalaRunTime._hashCodeMurmur(this)
>
> ...then you might get good results. Maybe.
>

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland