JVM Optimizations and val.

29 replies
edmondo1984
Dear Scala users, is it necessary to turn all vals into "final vals" to allow the JVM to perform optimizations, or is that useless?
Best Regards
Edmondo
Geir Hedemark
Re: JVM Optimizations and val.

On 2011, Dec 27, at 1:50 PM, Edmondo Porcu wrote:
> Dear Scala users,
> is it necessary to turn all the vals into "final vals" to allow JVM to perform optimizations, or is that useless?

I am wondering why you are asking, because I really can't see a case in which this kind of hand optimization would be in order.

Personally, I wouldn't bother thinking about this until I actually had a performance issue. In those cases where I do find performance issues, I almost never end up doing this kind of optimization.

To answer your question: If you are assigning a literal string to your val, the final would be useless because the string would be put into a lookup table anyway. If you are creating a val inside of a loop, try moving it outside the loop if you can (due to readability issues, not performance). If you are doing anything else I would expect the JIT compiler to sort it out after a few thousand iterations until I was proven wrong.

YMMV. I haven't actually tested any of this, and I have no idea how your code looks.

yours
Geir
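
The "lookup table" remark above can be illustrated with a minimal sketch. It assumes the standard JVM behaviour that identical string literals are interned, so two vals initialised with the same literal end up referring to one object. The object and member names are made up for illustration.

```scala
// Hypothetical names; shows that equal string literals share one interned instance.
object InternDemo {
  val a = "xyzzy"
  val b = "xyzzy"

  // Reference equality: both vals point at the same interned String object.
  def sameInstance: Boolean = a eq b
}
```

Because the literal is shared either way, marking such a val final does not change how the string itself is stored.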

H-star Development
Re: JVM Optimizations and val.

do not walk this path as it leads to only darkness and despair.

rule of thumb: trust the scala compiler and the vm. usually they can apply the obvious optimizations by themselves. especially the (-server)vm was able to surprise me in positive ways.
if there is still a problem: trust the result of a profiler.

if you feel like doing experiments, read some stuff on micro benchmarking and just try which way is faster.

to answer your question:
i know about a multithreading-related optimization the vm can perform on vals, but not on vars. but i do not know of an optimization that applies only to final vals.


H-star Development
Re: JVM Optimizations and val.

On Tue, 27 Dec 2011 14:08:40 +0100, Geir Hedemark wrote:
> If you are creating a val inside of a loop, try moving it
> outside the loop if you can (due to readability issues, not performance).

i would suggest putting everything *into* the loop to make it more readable - this way it is clear that the val is only used inside the loop :). it also avoids errors: if you have to make it a var, you might forget to update that var's value after a few code changes, and you've got yourself a bug.

the server vm moves declarations out of the loop btw (at least some years ago it did, i tested that)


Tim P
Re: JVM Optimizations and val.

Maybe I've over-indulged on the Christmas spirit, and my head is not
clear on this subject - but what exactly is a "final val" - what does
the final do and why is it a necessary part of the language, even?

Viktor Klang
Re: JVM Optimizations and val.


On Tue, Dec 27, 2011 at 2:28 PM, Tim Pigden wrote:
> Maybe I've over-indulged on the Christmas spirit, and my head is not
> clear on this subject - but what exactly is a "final val" - what does
> the final do and why is it a necessary part of the language, even?

class F {
  val x = 5
}

class Pigdog extends F {
  override val x = 10
}
Cheers,
Viktor Klang
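
The snippet above shows a plain val being overridden in a subclass. As a minimal sketch of the contrast (class and member names here are hypothetical), marking the val final forbids exactly that override:

```scala
// Hypothetical classes: a plain val can be overridden, a final val cannot.
class Base {
  val x = 5          // subclasses may override this
  final val y = 5    // subclasses may not override this
}

class Derived extends Base {
  override val x = 10
  // override val y = 10   // would not compile: y is final in Base
}
```

Here `new Derived().x` yields the overriding value, while `y` is fixed by `Base`.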
Geir Hedemark
Re: JVM Optimizations and val.

On 2011, Dec 27, at 2:25 PM, Dennis Haupt wrote:
>> To answer your question: If you are assigning a literal string to your
>> val, the final would be useless because the string would be put into a lookup
>> table anyway. If you are creating a val inside of a loop, try moving it
>> outside the loop if you can (due to readability issues, not performance).
> i would suggest to put everything *into* the loop to make it more readable - this way it is clear that the val is only used inside the loop :). it also avoids errors: if you have to make it a var, it might be possible that you forget to update that vars value after a few code changes and you got yourself a bug.
>
> the server vm moves declarations out of the loop btw (at least some years ago it did, i tested that)

We are not talking about the same thing: I am suggesting doing

val myConst = "xyzzy"
something.foreach{item => ...}

instead of

something.foreach{ item => val myConst = "xyzzy" ... }

The last val will eventually be moved outside the loop by the JIT compiler (or possibly the Scala compiler?), but I think it obscures what really goes on in the loop. It has nothing to do with the loop, so get it out.

yours
Geir
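
The two placements discussed in this post can be written out as a runnable sketch. The collection, the names and the use of map instead of foreach are assumptions made for illustration; the point is only that both forms compute the same result, so the choice is about readability.

```scala
// Hypothetical example: the same constant declared outside and inside the closure.
object HoistDemo {
  val items = List("a", "b", "c")

  // Constant declared once, next to the loop rather than inside it.
  val suffix = "xyzzy"
  val hoisted = items.map(item => item + suffix)

  // Same constant declared inside the closure body.
  val inlined = items.map { item =>
    val localSuffix = "xyzzy"
    item + localSuffix
  }
}
```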

H-star Development
Re: JVM Optimizations and val.

there is also no reason to put it into a scope where it is not used.

my suggestion:
instead of:
val x = ...
val result = list.map(... uses x ... )

write

val result = {
  val x = ...
  list.map(.... uses x ... )
}

:)
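
A concrete version of the block-scoping suggestion, as a small sketch with made-up values: the helper val exists only inside the block that computes the result.

```scala
// Hypothetical values; `factor` is visible only inside the block that needs it.
object ScopeDemo {
  val list = List(1, 2, 3)

  val result = {
    val factor = 10            // not accessible outside this block
    list.map(n => n * factor)
  }
}
```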


Pavel Pavlov
Re: JVM Optimizations and val.


On Tuesday, 27 December 2011 at 20:20:05 UTC+7, Dennis Haupt wrote:
> i know about a multithreading related optimization the vm can perform on vals, but not on vars. but i do not know of an optimization that applies only to final vals.

The VM knows nothing about vals and vars. At the VM level there are only final and non-final fields.
Final fields can be optimized by the VM much more aggressively, as they don't need to be re-read from shared memory after every potential monitorenter or volatile load (i.e. after every virtual/non-inlined call).

However, scalac already translates vals into final fields plus accessors. For example, this code:
  class F { val x = 5 }
is translated into:
  public class F implements ScalaObject {
    public int x() {
      return x;
    }
    public F() { }
    private final int x = 5;
  }

So if the VM is able to inline the accessor, all final-field optimizations will apply.

edmondo1984
Re: JVM Optimizations and val.
Dear Pavel, thank you for your answer. You are getting to the point.
Having a final val also makes the accessor final, right?
That would make it much more likely that all calls to the getter will be inlined... right?
Best Regards


Pavel Pavlov
Re: JVM Optimizations and val.


On Tuesday, 27 December 2011 at 22:42:04 UTC+7, edmondo1984 wrote:
> Dear Pavel, thank you for your answer. You are getting to the point.
> Having a final val also makes the accessor final, right?

No. It would be impossible to override the val in that case - see Viktor's example.

> That would make it much more likely that all calls to the getter will be inlined... right?

Of course. But scalac can make accessors final for anonymous/private classes, closures and other "good enough" cases.
You can check whether it does so by playing with scalac+jad (or some other class file decompiler).

edmondo1984
Re: JVM Optimizations and val.
I am confused, doesn't Viktor's example show how to override vals?
Best



d_m
Re: JVM Optimizations and val.

On Tue, Dec 27, 2011 at 02:20:05PM +0100, Dennis Haupt wrote:
> rule of thumb: trust the scala compiler and the vm. usually they can
> apply the obvious optimizations by themselves. especially the
> (-server)vm was able to surprise me in positive ways.
>
> if there is still a problem: trust the result of a profiler.

I would say that the rule should be "hope that scalac and hotspot will
optimize your code, but don't trust them to." Anyone working with large
arrays of data has independently discovered that for/foreach are much
slower than while-loops (although this will probably change in 2.10,
hurrah!) and there are other similar situations.

Looking at the generated bytecode can often be useful (especially for
finding instances of boxing) but profiling and testing are the only
ways to be sure.

So: write expressive code. Hope that Scalac/Hotspot will optimize it.
Confirm (or deny) that they do. Profile/refactor if necessary.

There are definitely situations where making a method final is
important (e.g. when you want to enable method inlining) but "final"
should only be used when determined to be necessary, not globally.
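
For readers who want to try the for/foreach versus while comparison mentioned above, here is a minimal sketch of the two loop styles over an array. The object and method names are hypothetical, and the sketch makes no claim about which is faster on a given Scala or JVM version; that has to be measured.

```scala
// Hypothetical helpers: the same sum written with foreach and with a while-loop.
object LoopDemo {
  // Closure-based traversal.
  def sumForeach(xs: Array[Int]): Long = {
    var total = 0L
    xs.foreach(x => total += x)
    total
  }

  // Index-based traversal with an explicit counter.
  def sumWhile(xs: Array[Int]): Long = {
    var total = 0L
    var i = 0
    while (i < xs.length) {
      total += xs(i)
      i += 1
    }
    total
  }
}
```

Both methods return the same value for the same input, so either can be dropped into a benchmark or profiler run.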

ichoran
Re: JVM Optimizations and val.


On Tue, Dec 27, 2011 at 7:50 AM, Edmondo Porcu wrote:
> Dear Scala users, is it necessary to turn all the vals into "final vals" to allow JVM to perform optimizations, or is that useless?

The JVM is usually pretty good about figuring out when something can be promoted from a method access (which a non-final val must be, in case it is overridden) to a field access.

However, you need it if you want a numeric constant to be used in a fully optimized manner.

My canonical example is the low-quality linear congruential random number generator from the Computer Language Benchmarks Game:

object RandomNumber {
  val IM = 139968
  val IA = 3877
  val IC = 29573
  private var seed = 42

  def scaledTo(max: Double) = {
    seed = (seed * IA + IC) % IM
    max * seed / IM
  }
}
is the obvious way to write it, but

object RandomNumber {
  final val IM = 139968
  final val IA = 3877
  final val IC = 29573
  private var seed = 42

  def scaledTo(max: Double) = {
    seed = (seed * IA + IC) % IM
    max * seed / IM
  }
}
is about 30% faster (even though there is apparently no difference whatsoever in the meaning of the code--how can an object's vals not be final?).

So--final val for optimization, yes, good idea, at least for numeric constants.

  --Rex
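
To check the remark that the two versions mean the same thing, the generators can be placed side by side under different names and stepped in lockstep. This is a sketch: `PlainLcg`, `FinalLcg` and `LcgCheck` are made-up names, and it verifies only that the outputs agree, not the speed difference.

```scala
// Hypothetical names; the bodies follow the two RandomNumber variants above.
object PlainLcg {
  val IM = 139968
  val IA = 3877
  val IC = 29573
  private var seed = 42

  def scaledTo(max: Double): Double = {
    seed = (seed * IA + IC) % IM
    max * seed / IM
  }
}

object FinalLcg {
  final val IM = 139968
  final val IA = 3877
  final val IC = 29573
  private var seed = 42

  def scaledTo(max: Double): Double = {
    seed = (seed * IA + IC) % IM
    max * seed / IM
  }
}

object LcgCheck {
  // Advances both generators together and compares every value produced.
  def sameSequence(n: Int): Boolean =
    (1 to n).forall(_ => PlainLcg.scaledTo(1.0) == FinalLcg.scaledTo(1.0))
}
```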

Pavel Pavlov
Re: JVM Optimizations and val.


On Tuesday, 27 December 2011 at 22:54:47 UTC+7, Pavel Pavlov wrote:
> On Tuesday, 27 December 2011 at 22:42:04 UTC+7, edmondo1984 wrote:
>> Dear Pavel, thank you for your answer. You are getting to the point.
>> Having a final val also makes the accessor final, right?
>
> No. It would be impossible to override the val in that case - see Viktor's example.

Sorry, I misread your question. Yes, in that case the accessor will be final.
 
Marcus Downing
Re: JVM Optimizations and val.

Surely an object is by definition final - since you can't extend it - so all its vals should be considered final as well? So Rex's example shows only that the compiler could be doing even more to optimise this case?

Pavel Pavlov
Re: JVM Optimizations and val.
There is another optimization with final fields in Java:
When you declare a static final field, javac does its best to evaluate the field's initializer at compile time.
For example, this Java code

class A {
  static final String s1 = "foo";
  static final int i1 = 123;
}
class B {
  static final String s2 = "s1 = " + A.s1 + "; i1 = " + A.i1;
  static final int i2 = (A.i1 * A.i1) % 25 - 7;
}

will be optimized by javac into:

class B {
  static final String s2 = "s1 = foo; i1 = 123";
  static final int i2 = -3;
}

The same code in Scala will be evaluated at run time.
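
A possible nuance, offered as a sketch rather than a verified statement about any particular scalac version: the Scala language specification describes "constant value definitions", written as `final val` with no type annotation and a constant expression on the right-hand side, and the compiler is allowed to substitute their values at use sites. Under that assumption the integer part of the Java example could be written as below. The object names are hypothetical, and inspecting the generated class files is the only way to confirm what a given compiler actually does.

```scala
// Hypothetical objects mirroring the integer constants of the Java example.
object ConstA {
  final val i1 = 123   // no type annotation: a constant value definition
}

object ConstB {
  // Built only from constants; the arithmetic works out to
  // (123 * 123) % 25 - 7 = 15129 % 25 - 7 = 4 - 7 = -3.
  final val i2 = (ConstA.i1 * ConstA.i1) % 25 - 7
}
```

With plain `val` instead of `final val`, the initializer is an ordinary field initialisation that runs when the object is first accessed.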




Pavel Pavlov
Re: JVM Optimizations and val.
Very interesting. And the bytecode difference is:

public final class RandomNumber1$ {
  private final int IM = 0x222c0;
  public int IM() { return IM; }
  public double scaledTo(double max) {
    seed_$eq((seed() * IA() + IC()) % IM());
    return (max * (double)seed()) / (double)IM();
  }
}

public final class RandomNumber2$ {
  private final int IM;
  public final int IM() { return 0x222c0; }
  public double scaledTo(double max) {
    seed_$eq((seed() * 3877 + 29573) % 0x222c0);
    return (max * (double)seed()) / (double)0x222c0;
  }
}

So it looks to me like a kind of phase-ordering problem in scalac. I see no reason why these two class files should differ at all.



Geir Hedemark
Re: JVM Optimizations and val.

On 2011, Dec 27, at 5:06 PM, Rex Kerr wrote:
> My canonical example is the low-quality linear congruential random number generator from the Computer Languages Benchmark Game :
(...)
> is about 30% faster (even though there is apparently no difference whatsoever in the meaning of the code--how can an object's vals not be final?).
>
> So--final val for optimization, yes, good idea, at least for numeric constants.

I disagree.

What you say is only true if you are creating truly immense numbers of random numbers, and only then if the runtime is a problem. Who cares if the call takes thirty or fifty microseconds? 1)

I have worked with all too many developers who spent ages thinking about stuff like this, only to be blindsided by the O(2^n) algorithm they implemented. Most developers are not library developers. They are application developers who assemble library calls into cool stuff.

Prematurely optimizing is a slightly better idea than writing your own random number generator. That still doesn't mean it is a good idea.

yours
Geir

1) Some people do care. In that case, you know why, and you will be able to explain why you have chosen to use Scala as your tool of choice. The advice offered in this thread is general, which is why I think it is wrong.

H-star Development
Re: JVM Optimizations and val.
such optimization findings are dangerous. they spread, people believe in them - and after the problem is long fixed, they still act as if it wasn't because they don't notice.

btw. is there still a speed difference with the server vm? it should inline everything anyway



ichoran
Re: JVM Optimizations and val.
On Tue, Dec 27, 2011 at 11:22 AM, Geir Hedemark wrote:
> On 2011, Dec 27, at 5:06 PM, Rex Kerr wrote:
>> My canonical example is the low-quality linear congruential random number generator from the Computer Languages Benchmark Game :
>> (...)
>> is about 30% faster (even though there is apparently no difference whatsoever in the meaning of the code--how can an object's vals not be final?).
>>
>> So--final val for optimization, yes, good idea, at least for numeric constants.
>
> I disagree.
>
> What you say is only true if you are creating truly immense numbers of random numbers, and only then if the runtime is a problem. Who cares if the call takes thirty or fifty microseconds? 1)
>
> I have worked with all too many developers who spent ages thinking about stuff like this, only to be blindsided by the O(2^n) algorithm they implemented.

I have seen the work of all too many developers who say things like that, and then find that they're unable to produce high-performance code even when they need to, because they have resolutely avoided learning anything that might help them, and because they have put so much work into a low-performance strategy that refactoring for performance is an impractical amount of work when they belatedly realize that performance is going to be an issue.

Just because some people don't understand the performance characteristics of the algorithms they are using (and yet pay attention to small performance improvements) is not an argument against knowing how to write high-performance code.

Algorithmic complexity is one of the top things to pay attention to if you're writing performance code.  It's extremely important.  Instead of saying "write first, then benchmark!", one could advise people to consider whether the code was heavily used, and if there are likely to be many items in a collection, and then either ignore performance considerations or choose an appropriate collection.

Memory usage is also quite important for large applications where garbage collection starts becoming expensive. Primitive vs. boxed types make a huge difference. Then multiple dispatch becomes an issue. And finally there are optimizations like this one: knowing what the JVM can do for you, what it can't, and how to help it out.

So, I reiterate: final val for optimization of numeric constants is a good idea.  If you don't need your numeric constants optimized, of course, don't bother.  In case you do, now you know what to try (for the time being--hopefully eventually the compiler will get smarter and will do more of these things for you).

  --Rex

Geir Hedemark
Re: JVM Optimizations and val.

On 2011, Dec 27, at 5:41 PM, Rex Kerr wrote:
> Algorithmic complexity is one of the top things to pay attention to if you're writing performance code. It's extremely important. Instead of saying "write first, then benchmark!", one could advise people to consider whether the code was heavily used, and if there are likely to be many items in a collection, and then either ignore performance considerations or choose an appropriate collection.

I agree wholeheartedly. I would be perfectly happy to accept well-reasoned explanations for why a bit of code is going to be called an immense number of times and why that can't be avoided. People who are good at managing resources - doesn't matter if it is money, time, power or memory - usually do that anyway when they create a resource budget.

yours
Geir

ichoran
Re: JVM Optimizations and val.
Just measured it, and there is a speed difference with the server VM as of  Sun JVM 1.6.0_26 and 1.7.0-ea-b143.  It's actually 50%, not 30%, as I had said.  (That is, the final val version is 2x faster.)

JRockit lessens the difference a bit, but everything is slower (as is typical).

A 2x slowdown in numeric code is too much to ignore out of perceived danger for numerically intensive operations.  I wish it weren't there; it seems like it really shouldn't be.  But that's the way it is for now.

  --Rex




H-star Development
Re: JVM Optimizations and val.
i could not resist:

on a java 8 server vm, the final val object version is the fastest. the non final val object is as fast as a static java equivalent. i could not get the java version as fast as the scala object.

after inlining everything everywhere - no difference left - i still got differences :). then i changed the execution order. the first one that was benchmarked was always faster than its twin. repeating the benchmark twice then showed equal (but slower!) results than the first run. i have no idea how this is possible. it also happens on the java7 vm.

i attached my file as a proof. my output:
executing 100000000 warmup calls of not final object
warmup took 1023
executing 60000000 "real" calls of not final object
execution took 499, which is 1.2024048096192385E8 operations per second
executing 100000000 warmup calls of final object
warmup took 586
executing 60000000 "real" calls of final object
execution took 344, which is 1.7441860465116277E8 operations per second //first test of final vals
executing 100000000 warmup calls of not final object
warmup took 1279
executing 60000000 "real" calls of not final object
execution took 745, which is 8.053691275167786E7 operations per second // second test of non final vals. became slower. huh?
executing 100000000 warmup calls of final object
warmup took 854
executing 60000000 "real" calls of final object
execution took 473, which is 1.2684989429175475E8 operations per second // second test of final vals. wtf???

so basically, i am once more pretty sure that not trying to do micro optimizations is a good choice. only do it if you are really forced to. your investment might be turned upside down on the next vm/hardware upgrade.






On 27.12.2011 at 18:05, Rex Kerr wrote:
Just measured it, and there is a speed difference with the server VM as of  Sun JVM 1.6.0_26 and 1.7.0-ea-b143.  It's actually 50%, not 30%, as I had said.  (That is, the final val version is 2x faster.)

JRockit lessens the difference a bit, but everything is slower (as is typical).

A 2x slowdown in numeric code is too much to ignore out of perceived danger for numerically intensive operations.  I wish it weren't there; it seems like it really shouldn't be.  But that's the way it is for now.

  --Rex

On Tue, Dec 27, 2011 at 11:42 AM, HamsterofDeath <h-star [at] gmx [dot] de> wrote:
such optimization findings are dangerous. they spread, people believe in them - and after the problem is long fixed, they still act as if it wasn't because they don't notice.

btw. is there still a speed difference with the server vm? it should inline everything anyway

On 27.12.2011 at 17:33, Pavel Pavlov wrote:
Very interesting. And the bytecode difference is:

public final class RandomNumber1$ {
  private final int IM = 0x222c0;
  public int IM() { return IM; }
  public double scaledTo(double max) {
    seed_$eq((seed() * IA() + IC()) % IM());
    return (max * (double)seed()) / (double)IM();
  }
}

public final class RandomNumber2$ {
  private final int IM;
  public final int IM() { return 0x222c0; }
  public double scaledTo(double max) {
    seed_$eq((seed() * 3877 + 29573) % 0x222c0);
    return (max * (double)seed()) / (double)0x222c0;
  }
}

So it looks to me like a sort of phase-ordering problem in scalac. I see no reason why these two classfiles should differ at all.


On Tuesday, 27 December 2011, 23:06:59 UTC+7, Rex Kerr wrote:


On Tue, Dec 27, 2011 at 7:50 AM, Edmondo Porcu <edmond [dot] [dot] [dot] [at] gmail [dot] com> wrote:
Dear Scala users, is it necessary to turn all the vals into "final vals" to allow JVM to perform optimizations, or is that useless?

The JVM is usually pretty good about figuring out when something can be promoted from a method access (which a non-final val must be, in case it is overridden) to a field access.

However, you need it if you want a numeric constant to be used in a fully optimized manner.

My canonical example is the low-quality linear congruential random number generator from the Computer Languages Benchmark Game :

object RandomNumber {
  val IM = 139968
  val IA = 3877
  val IC = 29573
  private var seed = 42

  def scaledTo(max: Double) = {
    seed = (seed * IA + IC) % IM
    max * seed / IM
  }
}
is the obvious way to write it, but

object RandomNumber {
  final val IM = 139968
  final val IA = 3877
  final val IC = 29573
  private var seed = 42

  def scaledTo(max: Double) = {
    seed = (seed * IA + IC) % IM
    max * seed / IM
  }
}
is about 30% faster (even though there is apparently no difference whatsoever in the meaning of the code--how can an object's vals not be final?).

So--final val for optimization, yes, good idea, at least for numeric constants.

  --Rex




Pavel Pavlov
Joined: 2011-12-01,
User offline. Last seen 42 years 45 weeks ago.
Re: JVM Optimizations and val.
It seems that HotSpot doesn't bother optimizing access to final static fields, for example by converting them into constants.
For classfiles generated by javac such an optimization would be pointless, since javac already performs constant folding.

On the other hand, a constant-returning method is a very important optimization case besides final fields.
And folding a call to such a method into a constant is much easier than folding a call to a field accessor: inlining is enough here.

So I think it is scalac's job to fold constants into accessors.
Moreover, it can completely remove the fields for constant-initialized vals (it is hopeless to leave such an optimization to the VM).

It would be great if it could translate the example above (  class F { val x = 5 } ) into bytecode like this:
  public class F implements ScalaObject {
    public int x() { return 5; }
    // look ma, no instance field!
    // private final int x = 5;
  }

This way memory usage can be somewhat decreased.
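
The difference between a field-backed val and a constant-returning accessor can be observed from source level with Java reflection. This is only a sketch: FVal, FDef and FieldCheck are illustrative names, and `def x` stands in by hand for the accessor-only translation proposed above.

```scala
// FVal, FDef and FieldCheck are illustrative names, not from the post.
class FVal { val x = 5 }       // the value is stored in an instance field
class FDef { def x: Int = 5 }  // accessor only; it returns the literal

object FieldCheck {
  // True if the class itself declares a JVM field with the given name.
  def hasField(c: Class[_], name: String): Boolean =
    c.getDeclaredFields.exists(_.getName == name)
}
```

`FieldCheck.hasField(classOf[FVal], "x")` reports the field, while the same call on `classOf[FDef]` does not, so every FDef instance is one int smaller.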



ichoran
Joined: 2009-08-14,
User offline. Last seen 2 years 3 weeks ago.
Re: JVM Optimizations and val.
Based on your other benchmarking code, I'm pretty sure you're measuring multiple inlining through the "lots" method (or the lack thereof), or something similar, in addition to the actual execution time.  For microbenchmarking, I really don't know a good way to do it aside from writing a while loop by hand to get up to at least a few thousand processor cycles.  Any convenience function like lots that I use eventually starts running afoul of optimization rules.

Microbenchmarking is not straightforward given the complexity of the JVM (including optimizations).  That doesn't mean that micro-optimizations aren't important, just that you need to test carefully, run things multiple times and in multiple orders, and so on.

FWIW, even with the strangeness, your tests _do_ show a ~2x speedup with final val.

  --Rex
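
A rough sketch of the hand-written-loop approach described above, not the attached benchmark file. The names (LcgPlain, LcgFinal, Bench) and the iteration counts are assumptions. Each variant gets its own while loop, the results are summed so the calls are less likely to be dropped as unused, and the order is swapped every round.

```scala
// Names and iteration counts are illustrative, not from the thread.
object LcgPlain {
  val IM = 139968
  val IA = 3877
  val IC = 29573
  private var seed = 42
  def scaledTo(max: Double): Double = {
    seed = (seed * IA + IC) % IM
    max * seed / IM
  }
}

object LcgFinal {
  final val IM = 139968
  final val IA = 3877
  final val IC = 29573
  private var seed = 42
  def scaledTo(max: Double): Double = {
    seed = (seed * IA + IC) % IM
    max * seed / IM
  }
}

object Bench {
  // One hand-written while loop per variant, with no closure-taking helper in between.
  // Returns (elapsed nanoseconds, checksum of all generated values).
  def runPlain(n: Int): (Long, Double) = {
    var sum = 0.0
    var i = 0
    val start = System.nanoTime()
    while (i < n) { sum += LcgPlain.scaledTo(1.0); i += 1 }
    (System.nanoTime() - start, sum)
  }

  def runFinal(n: Int): (Long, Double) = {
    var sum = 0.0
    var i = 0
    val start = System.nanoTime()
    while (i < n) { sum += LcgFinal.scaledTo(1.0); i += 1 }
    (System.nanoTime() - start, sum)
  }

  def main(args: Array[String]): Unit = {
    val n = 50000000
    var round = 0
    while (round < 4) {
      // Swap the order every round to expose order-dependent effects.
      val plainFirst = round % 2 == 0
      val first = if (plainFirst) runPlain(n) else runFinal(n)
      val second = if (plainFirst) runFinal(n) else runPlain(n)
      val (plain, fin) = if (plainFirst) (first, second) else (second, first)
      println(s"round $round: plain ${plain._1 / 1000000} ms, final ${fin._1 / 1000000} ms, checksums ${plain._2} / ${fin._2}")
      round += 1
    }
  }
}
```

The first round or two mostly measure warmup; only the later rounds are worth comparing, and even those should be repeated across JVM versions before drawing conclusions.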






E. Labun
Joined: 2010-06-20,
User offline. Last seen 42 years 45 weeks ago.
Re: JVM Optimizations and val.

Hi, Pavel

On 2011-12-27 17:17, Pavel Pavlov wrote:
> There is another optimization with final fields in Java:
> When you declare a static final field, javac does its best to evaluate the field's initializer at compile time.
> For example this Java code
>
> class A {
> static final String s1 = "foo";
> static final int i1 = 123;
> }
> class B {
> static final String s2 = "s1 = " + A.s1 + "; i1 = " + A.i1;
> static final int i2 = (A.i1 * A.i1) % 25 - 7;
> }
>
> will be optimized by javac into:
>
> class B {
> static final String s2 = "s1 = foo; i1 = 123";
> static final int i2 = -3;
> }
>
> The same code in Scala will be evaluated at run time.

The last statement is not completely true. Actually Scala performs some optimizations.
Class B (must use "object" to mimic static members) will be compiled to (an equivalent of):

object B {
final val s2 = "s1 = foo; i1 = " + 123
final val i2 = -3
}

which is close to the Java optimization.
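
The folded values are easy to check by hand: 123 * 123 = 15129, 15129 % 25 = 4, and 4 - 7 = -3. A small sketch of the Scala counterpart of the Java example, using the same names A and B, that can be run to confirm the values (it shows the results only, not at which stage they are computed):

```scala
object A {
  final val s1 = "foo"
  final val i1 = 123
}

object B {
  final val s2 = "s1 = " + A.s1 + "; i1 = " + A.i1
  final val i2 = (A.i1 * A.i1) % 25 - 7   // 15129 % 25 = 4, and 4 - 7 = -3
}
```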

See the attached test + output of the compiler's typer-phase.
The above optimization will be also performed for non-final vals in Scala.
(I suppose, the same should be true for Java)

Tested with Scala 2.9.1.

Regards,
Eugen

E. Labun
Joined: 2010-06-20,
User offline. Last seen 42 years 45 weeks ago.
Re: JVM Optimizations and val.

On 2011-12-27 19:07, HamsterofDeath wrote:
> execution took 473, which is 1.2684989429175475E8 operations per second // second test of final
> vals. wtf???

Too fast? Theoretically, a method call can be optimized away, if the method does not return anything
(or the returned value is not used) and otherwise does not have any side effects...

Pavel Pavlov
Joined: 2011-12-01,
User offline. Last seen 42 years 45 weeks ago.
Re: JVM Optimizations and val.
Hi Eugen,

On Wednesday, 28 December 2011, 8:01:57 UTC+7, Eugen Labun wrote:

> The same code in Scala will be evaluated at run time.

The last statement is not completely true. Actually Scala performs some optimizations.
Class B (must use "object" to mimic static members) will be compiled to (an equivalent of):

object B {
   final val s2 = "s1 = foo; i1 = " + 123
   final val i2 = -3
}

which is close to the Java optimization.

You are right, scalac folds constants in that case (I've just double checked it).

However, the whole picture is trickier than I had imagined.

Consider the code:

object A {
  final val s1_         = "foo"
  final val s1T: String = "foo"
  final val i1_      = 123
  final val i1T: Int = 123
}

object B {
  final val s2__         = "s1 = " + A.s1_ + "; i1 = " + A.i1_;
  final val s2_T: String = "s1 = " + A.s1_ + "; i1 = " + A.i1_;
  final val s2T_         = "s1 = " + A.s1T + "; i1 = " + A.i1T;
  final val i2__         = (A.i1_ * A.i1_) % 25 - 7;
  final val i2_T: Int    = (A.i1_ * A.i1_) % 25 - 7;
  final val i2T_         = (A.i1T * A.i1T) % 25 - 7;
}

It is translated by scalac into:

public final class A$ implements ScalaObject {
  public final String s1_() { return "foo"; }
  public final String s1T() { return s1T; }

  public final int i1_() { return 123; }
  public final int i1T() { return i1T; }

  private final String s1_;
  private final String s1T = "foo";

  private final int i1_;
  private final int i1T = 123;
}

public final class B$ implements ScalaObject {
  public final String s2__() { return s2__; }
  public final String s2_T() { return s2_T; }
  public final String s2T_() { return s2T_; }

  public final int i2__() { return -3; }
  public final int i2_T() { return i2_T; }
  public final int i2T_() { return i2T_; }

  private B$() {
    s2T_ = (new StringBuilder()).append("s1 = ").append(A$.MODULE$.s1T()).append("; i1 = ").append(BoxesRunTime.boxToInteger(A$.MODULE$.i1T())).toString();
    i2T_ = (A$.MODULE$.i1T() * A$.MODULE$.i1T()) % 25 - 7;
  }

  private final String s2__ = (new StringBuilder()).append("s1 = foo; i1 = ").append(BoxesRunTime.boxToInteger(123)).toString();
  private final String s2_T = (new StringBuilder()).append("s1 = foo; i1 = ").append(BoxesRunTime.boxToInteger(123)).toString();
  private final String s2T_;
  private final int i2__;
  private final int i2_T = -3;
  private final int i2T_;
}

Here you can see that a type annotation on a final val turns it from a constant into a non-constant one.
Surprisingly enough (for me) such scalac behavior is not an error - it strictly corresponds to the language specification:
"A constant value definition is of the form 'final val x = e' where e is a constant expression. The final modifier must be present and no type annotation may be given." (SLS 4.1)
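
One way to observe this rule without reading bytecode is initialization order. In the sketch below (class names are made up for illustration) each class reads `limit` before the line that defines it has run. References to the constant definition are replaced by the literal, while the annotated one is an ordinary field that is still 0 at that point. The compiler may warn about the forward reference.

```scala
// Illustrative classes; the names are not from the thread.
class Untyped {
  val early: Int = limit      // evaluated before the definition below has run
  final val limit = 10        // constant value definition: uses are replaced by 10
}

class Typed {
  val early: Int = limit      // evaluated before the field below is assigned
  final val limit: Int = 10   // type annotation: an ordinary field, still 0 above
}
```

With these definitions `new Untyped().early` is 10 and `new Typed().early` is 0, although both `limit` members read as 10 once construction has finished.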
 

The above optimization will be also performed for non-final vals in Scala.

I cannot see this. Can you provide an example?

(I suppose, the same should be true for Java)

No. In Java, a non-final field is always a 'var', not a 'val'. So it is incorrect to use its initial value instead of re-reading it from memory.

E. Labun
Joined: 2010-06-20,
User offline. Last seen 42 years 45 weeks ago.
Re: JVM Optimizations and val.

On 2011-12-28 19:16, Pavel Pavlov wrote:
> Surprisingly enough (for me) such scalac behavior is not an error - it strictly corresponds to the
> language specification:
> "A constant value definition is of the form 'final val x = e' where e is a constant expression. The
> final modifier must be present *and no type annotation may be given*." (SLS 4.1)

That is not intuitive to me, either.

> The above optimization will be also performed for non-final vals in Scala.
>
> I cannot see this. Can you provide an example?

I just compiled my previous example without 'final' modifiers, and the result was exactly the same
(binary identical).

> (I suppose, the same should be true for Java)
>
> No. For Java, non-final field is always 'var', not 'val'. So it is incorrect to use its initial
> value instead of re-reading it from memory.

Of course. But computing the initial value (at compile-time) could still benefit from
folding/inlining of constants.

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland