This page is no longer maintained — Please continue to the home page at www.scala-lang.org

Re: home on the range

12 replies
extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.

2011/12/13 √iktor Ҡlang :
> You usually want to stay below 35 instructions per method, to get them
> inlined when hot.

Where 35 means 35 distinct opcodes? Here's what the inner loop looks
like right now:

public final void foreachDownIn(scala.Function1);
0: aload_0
1: invokevirtual #151515; //Method start:()I
4: istore_2
5: iload_2
6: aload_0
7: invokevirtual #111111; //Method end:()I
10: if_icmplt 30
13: aload_1
14: iload_2
15: invokeinterface #5f5f5f, 2; //InterfaceMethod
scala/Function1.apply$mcVI$sp:(I)V
20: iload_2
21: aload_0
22: invokevirtual #141414; //Method step:()I
25: iadd
26: istore_2
27: goto 5
30: return

And here's the foreach.

public final void foreach(scala.Function1);
0: aload_0
1: invokevirtual #141414; //Method step:()I
4: iconst_0
5: if_icmpge 31
8: aload_0
9: invokevirtual #1c1c1c; //Method isInclusive:()Z
12: ifeq 23
15: aload_0
16: aload_1
17: invokevirtual #4f4f4f; //Method foreachDownIn:(Lscala/Function1;)V
20: goto 51
23: aload_0
24: aload_1
25: invokevirtual #545454; //Method foreachDownEx:(Lscala/Function1;)V
28: goto 51
31: aload_0
32: invokevirtual #1c1c1c; //Method isInclusive:()Z
35: ifeq 46
38: aload_0
39: aload_1
40: invokevirtual #5a5a5a; //Method foreachUpIn:(Lscala/Function1;)V
43: goto 51
46: aload_0
47: aload_1
48: invokevirtual #575757; //Method foreachUpEx:(Lscala/Function1;)V
51: return

Ismael Juma 2
Joined: 2011-01-22,
User offline. Last seen 42 years 45 weeks ago.
Re: home on the range
First of all, great work Paul. Thanks for spending your time on this.
On Tue, Dec 13, 2011 at 11:40 AM, Paul Phillips <paulp [at] improving [dot] org> wrote:
2011/12/13 √iktor Ҡlang <viktor [dot] klang [at] gmail [dot] com>:
> You usually want to stay below 35 instructions per method, to get them
> inlined when hot.

Where 35 means 35 distinct opcodes? Here's what the inner loop looks
like right now:

I would say that the only way to deal with these things sanely is to have a set of benchmarks with different characteristics (e.g. loop with a small number of iterations, loop with a large number of iterations, loop with simple code inside, loop with complex code inside, nested loops with few/many iterations in inner/outer, etc.).
It's too easy to improve some cases and make others worse otherwise.
There are some improvements that make things obviously better, of course (where boxing is avoided, for example).
Best,Ismael
Viktor Klang
Joined: 2008-12-17,
User offline. Last seen 1 year 27 weeks ago.
Re: home on the range
Have you've tried to pass "end" as a parameter, then you can use the same check for inclusive and exclusive, just add +1 to end?

Also, what happens if you make end on range a final val?

On Tue, Dec 13, 2011 at 12:58 PM, Ismael Juma <ismael [at] juma [dot] me [dot] uk> wrote:
First of all, great work Paul. Thanks for spending your time on this.
On Tue, Dec 13, 2011 at 11:40 AM, Paul Phillips <paulp [at] improving [dot] org> wrote:
2011/12/13 √iktor Ҡlang <viktor [dot] klang [at] gmail [dot] com>:
> You usually want to stay below 35 instructions per method, to get them
> inlined when hot.

Where 35 means 35 distinct opcodes? Here's what the inner loop looks
like right now:

I would say that the only way to deal with these things sanely is to have a set of benchmarks with different characteristics (e.g. loop with a small number of iterations, loop with a large number of iterations, loop with simple code inside, loop with complex code inside, nested loops with few/many iterations in inner/outer, etc.).
It's too easy to improve some cases and make others worse otherwise.
There are some improvements that make things obviously better, of course (where boxing is avoided, for example).
Best,Ismael



--
Viktor Klang

Akka Tech LeadTypesafe - Enterprise-Grade Scala from the Experts

Twitter: @viktorklang
Tiark Rompf
Joined: 2009-02-18,
User offline. Last seen 42 years 45 weeks ago.
Re: home on the range

On Dec 13, 2011, at 12:40 PM, Paul Phillips wrote:
> 2011/12/13 √iktor Ҡlang :
>> You usually want to stay below 35 instructions per method, to get them
>> inlined when hot.
>
> Where 35 means 35 distinct opcodes?

Is that opcodes or bytecode size? I was under the assumption that -XX:MaxInlineSize is measured in bytes, not opcodes.

- Tiark

Ismael Juma 2
Joined: 2011-01-22,
User offline. Last seen 42 years 45 weeks ago.
Re: home on the range
On Tue, Dec 13, 2011 at 12:04 PM, Tiark Rompf <tiark [dot] rompf [at] epfl [dot] ch> wrote:
Is that opcodes or bytecode size? I was under the assumption that -XX:MaxInlineSize is measured in bytes, not opcodes.

Yes. Some inlining parameters:
FreqInlineSizeMaxInlineSizeInlineSmallCodeMaxTrivialSizeNodeCountInliningCutoffMaxNodeLimitInlineThrowMaxSizeMaxTrivialSize

A good resource:
http://wikis.sun.com/display/HotSpotInternals/Inlining
Best,Ismael
Pavel Pavlov
Joined: 2011-12-01,
User offline. Last seen 42 years 45 weeks ago.
Re: home on the range


вторник, 13 декабря 2011 г. 19:04:15 UTC+7 пользователь tiark написал:
On Dec 13, 2011, at 12:40 PM, Paul Phillips wrote:
> 2011/12/13 √iktor Ҡlang <viktor [dot] [dot] [dot] [at] gmail [dot] com>:
>> You usually want to stay below 35 instructions per method, to get them
>> inlined when hot.
>
> Where 35 means 35 distinct opcodes?

Is that opcodes or bytecode size? I was under the assumption that -XX:MaxInlineSize is measured in bytes, not opcodes.

It is bytecode size, of course:

"Method inlining eliminates the overhead of method dispatching. The client compiler uses a static inlining strategy: Methods with a size less than 35 bytes are inlined. This size is decreased for each nested inlining because the number of inlining candidates grows at each level. "

See section 2.2 of this paper:

@article{Kotzmann:2008:DJH:1369396.1370017,
 author = {Kotzmann, Thomas and Wimmer, Christian and M\"{o}ssenb\"{o}ck, Hanspeter and Rodriguez, Thomas and Russell, Kenneth and Cox, David},
 title = {Design of the Java HotSpot\™ client compiler for Java 6},
 journal = {ACM Trans. Archit. Code Optim.},
 volume = {5},
 issue = {1},
 month = {May},
 year = {2008},
 issn = {1544-3566},
 pages = {7:1--7:32},
 articleno = {7},
 numpages = {32},
 url = {http://doi.acm.org/10.1145/1369396.1370017},
 doi = {http://doi.acm.org/10.1145/1369396.1370017},
 acmid = {1370017},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {Java, compiler, deoptimization, intermediate representation, just-in-time compilation, optimization, register allocation},
}

odersky
Joined: 2008-07-29,
User offline. Last seen 45 weeks 6 days ago.
Re: home on the range

One other thing that's important: Always test with call sites calling
different methods (i.e. do a Range#foreach with a set of different
closures). If you do it only for a single one, you risk getting
unrealistic inlining performance. Then the micro-benchmark results
will not be repeatable in real code.

Cheers

Viktor Klang
Joined: 2008-12-17,
User offline. Last seen 1 year 27 weeks ago.
Re: home on the range


On Tue, Dec 13, 2011 at 1:04 PM, Tiark Rompf <tiark [dot] rompf [at] epfl [dot] ch> wrote:
On Dec 13, 2011, at 12:40 PM, Paul Phillips wrote:
> 2011/12/13 √iktor Ҡlang <viktor [dot] klang [at] gmail [dot] com>:
>> You usually want to stay below 35 instructions per method, to get them
>> inlined when hot.
>
> Where 35 means 35 distinct opcodes?

Is that opcodes or bytecode size? I was under the assumption that -XX:MaxInlineSize is measured in bytes, not opcodes.

Yes of course, thanks for correcting my poor wording.

Cheers,

 

- Tiark



--
Viktor Klang

Akka Tech LeadTypesafe - Enterprise-Grade Scala from the Experts

Twitter: @viktorklang
extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: home on the range

2011/12/13 √iktor Ҡlang :
> Have you've tried to pass "end" as a parameter, then you can use the same
> check for inclusive and exclusive, just add +1 to end?

Ha ha, don't remind me of all the bugs in range I've already either
caused or fixed. It's really a slippery class, one always thinks
"gee, I could simplify it by doing this..." and then walks right into
a boundary condition.

(In this case, Int.MaxValue - 2 to Int.MaxValue.)

> Also, what happens if you make end on range a final val?

Nothing. That was one of the things I had going most of the time on
the which I unrolled as irrelevant. I tried taking things away one at
a time, and what's in there was what was left when taking anything
away also took away the results.

Viktor Klang
Joined: 2008-12-17,
User offline. Last seen 1 year 27 weeks ago.
Re: home on the range


On Tue, Dec 13, 2011 at 3:35 PM, Paul Phillips <paulp [at] improving [dot] org> wrote:
2011/12/13 √iktor Ҡlang <viktor [dot] klang [at] gmail [dot] com>:
> Have you've tried to pass "end" as a parameter, then you can use the same
> check for inclusive and exclusive, just add +1 to end?

Ha ha, don't remind me of all the bugs in range I've already either
caused or fixed.  It's really a slippery class, one always thinks
"gee, I could simplify it by doing this..." and then walks right into
a boundary condition.

(In this case, Int.MaxValue - 2 to Int.MaxValue.)

> Also, what happens if you make end on range a final val?

Nothing.  That was one of the things I had going most of the time on
the which I unrolled as irrelevant.  I tried taking things away one at
a time, and what's in there was what was left when taking anything
away also took away the results.

Alright, and what if you just provide the end as a method parameter instead of calling the end-member?

--
Viktor Klang

Akka Tech LeadTypesafe - Enterprise-Grade Scala from the Experts

Twitter: @viktorklang
extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: home on the range

2011/12/13 √iktor Ҡlang :
> Alright, and what if you just provide the end as a method parameter instead
> of calling the end-member?

Yep. Nothing. In fact for a while I had all four implementations
taking (f, start, end, step) as parameters, in the interests of having
no data in the loop which wasn't known local. No difference.

Viktor Klang
Joined: 2008-12-17,
User offline. Last seen 1 year 27 weeks ago.
Re: home on the range


On Tue, Dec 13, 2011 at 4:10 PM, Paul Phillips <paulp [at] improving [dot] org> wrote:
2011/12/13 √iktor Ҡlang <viktor [dot] klang [at] gmail [dot] com>:
> Alright, and what if you just provide the end as a method parameter instead
> of calling the end-member?

Yep.  Nothing.  In fact for a while I had all four implementations
taking (f, start, end, step) as parameters, in the interests of having
no data in the loop which wasn't known local.  No difference.

Sometimes optimization is more like using a divining rod... ;)


--
Viktor Klang

Akka Tech LeadTypesafe - Enterprise-Grade Scala from the Experts

Twitter: @viktorklang
d_m
Joined: 2010-11-11,
User offline. Last seen 35 weeks 2 days ago.
Re: home on the range

On Tue, Dec 13, 2011 at 01:30:32PM +0100, martin odersky wrote:
> One other thing that's important: Always test with call sites calling
> different methods (i.e. do a Range#foreach with a set of different
> closures). If you do it only for a single one, you risk getting
> unrealistic inlining performance. Then the micro-benchmark results
> will not be repeatable in real code.

This is a great point, and something I've seen before.

It's always a rude awakening to see a benchmark run x5-10 slower after,
and it's easy to get confused about what's going on in situations like:

test 1: fast
test 2: slow
(test 1: slow, if you happen to run it again)

Also, thanks to Paul, Martin, and everyone else who's been working on
making Range#foreach faster!

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland