
880 MB

extempore

A snapshot of the heap after running through the Scala sbt build, post-processed into a count of how many separate copies of each string are found on the heap. It starts reasonably enough like this:

1 matchesPT :
1 failed of type :
1 stack after interpret:
1 stack at the beginning of block

After 93,641 unique strings we get to the 2s:

2 getting typeinfo at the beginning of block
2 %-
2 (which expands to)

Line 133,637 starts the 3s:

3 )
3 of which in failed :
3 of which in implicits :

Jumping forward, line 194,641 starts the 10s:

10 $anonfun$transform$
10 $plus
10 <refinement>
10 /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Classes/classes.jar
10 /scratch/trunk1/lib/scala-compiler.jar

Line 203,799 begins the 50s:

50 '
50 /scratch/trunk1/src/compiler/scala/reflect/runtime/ScalaToJava.scala
50 /scratch/trunk1/src/compiler/scala/tools/nsc/backend/icode/Printers.scala
50 /scratch/trunk1/src/compiler/scala/tools/nsc/plugins/Plugin.scala
50 /scratch/trunk1/src/compiler/scala/tools/nsc/settings/StandardScalaSettings.scala

At line 206,373 we enter the 100s:

100 #|
100 <~
100 ChoiceSetting
100 Forward
100 Implementation

Line 210,820 opens the 1000s:

1000 ListSet
1007 PrintStream
1009 Typed
1012 contextError
1012 ctx
1012 firePropertyChange

Line 210,944 opens the 10,000s:

10008 isTerm
10038 isType
10101 Name
10173 symbol
10495 children
10547 setType

And I'll include the six-figure club in its entirety.

100507 p
104843 String
110642 Function2
134476 that
137226 java
142073 collection
143074 java.lang
143345 !=
143345 ==
164973 Object
176175 Function1
214742 wait
214867 (classOf[java.lang.InterruptedException])
416501 scala
782459 x$1

Congratulations to x$1, today's winner! Who will be the first million-copy string? Stay tuned!
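
For readers who want to see the shape of the tally above, here is a minimal sketch. It is not the tooling used for the post; the `DuplicateStrings` object and the tiny simulated "heap" are assumptions made purely for illustration. It groups string instances by content and lists them in ascending order of copy count, as the listing above does.

```scala
object DuplicateStrings {
  /** Count how many separate instances share each string content,
    * sorted by ascending count, with ties broken by content.
    * groupBy uses String equality, so instances are grouped by content. */
  def tally(instances: Seq[String]): Seq[(Int, String)] =
    instances
      .groupBy(identity)
      .map { case (content, copies) => (copies.size, content) }
      .toSeq
      .sortBy { case (count, content) => (count, content) }

  def main(args: Array[String]): Unit = {
    // new String(...) forces a fresh heap instance for every copy.
    val winners  = Seq.fill(3)(new String("x$1"))
    val runners  = Seq.fill(2)(new String("scala"))
    val singles  = Seq("matchesPT")
    val fakeHeap = winners ++ runners ++ singles
    tally(fakeHeap).foreach { case (count, content) =>
      println(s"$count $content")
    }
  }
}
```

On a real heap the instances would come from a heap dump rather than a hand-built sequence; that part is deliberately left out here.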

Joshua.Suereth
Re: 880 MB
Seems like a good intern could help reduce the clutter....


On Fri, Sep 9, 2011 at 10:59 AM, Paul Phillips <paulp [at] improving [dot] org> wrote:

odersky
Re: 880 MB


On Fri, Sep 9, 2011 at 5:46 PM, Josh Suereth <joshua [dot] suereth [at] gmail [dot] com> wrote:
Seems like a good intern could help reduce the clutter....


Interesting. I had also noticed that the discipline of using names instead of strings was gradually dissipating. This is the result. (For reference: names are optimized, interned strings. Neal Gafter once tried to replace names with interned strings in javac and got a general 5-10% slowdown. So don't try that.)

Cheers

 -- Martin
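
As a rough illustration of what "names are interned" means in the post above, here is a minimal sketch. `Name` and `NameTable` below are hypothetical stand-ins written for this example, not the compiler's actual classes, and the real implementation (which this thread does not show) is presumably more optimized. The idea is only that a table hands out one shared object per distinct content, so repeated uses cost no extra copies and can be compared by reference.

```scala
import scala.collection.mutable

object NameTableSketch {
  /** Hypothetical name type: one instance per distinct content within a table. */
  final class Name(val text: String) {
    override def toString: String = text
  }

  /** Hypothetical table that returns the same Name for equal contents. */
  final class NameTable {
    private val names = mutable.HashMap.empty[String, Name]

    /** Look up the shared Name for this content, creating it on first use. */
    def apply(text: String): Name =
      names.getOrElseUpdate(text, new Name(text))

    /** Number of distinct contents seen so far. */
    def size: Int = names.size
  }

  def main(args: Array[String]): Unit = {
    val table = new NameTable
    val a = table("x$1")
    val b = table(new String("x$1")) // a different String instance, same content
    println(a eq b)     // true: both lookups return the one shared Name
    println(table.size) // 1
  }
}
```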

Joshua.Suereth
Re: 880 MB
So, it sounds like for a single instance of the compiler, using the Naming char interned store is a great idea. This SBT build has a few instances of the compiler open at a time, but that should account for, at most, 4 or 5 copies of the same string from Namers if the store is being used correctly. So although interning would reduce that duplication to 1, I think reducing the duplication from half a million to 5 would be a pretty big win.
How would one go about 'fixing' this issue? I'm offering to help where I can, because reducing this duplication will drastically help the SBT build.
- Josh
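
The arithmetic in this reply can be sketched as a small simulation. Everything here is an assumption for illustration: `Store` is a hypothetical per-compiler string store, and the instance and use counts are made up. With one store per compiler instance, the number of live copies of a string is bounded by the number of instances; without a store, it grows with the number of uses.

```scala
import java.util.{Collections, IdentityHashMap}
import scala.collection.mutable

object PerCompilerCopies {
  /** Hypothetical per-compiler store: keeps one canonical String per content. */
  final class Store {
    private val canon = mutable.HashMap.empty[String, String]
    def intern(s: String): String = canon.getOrElseUpdate(s, s)
  }

  /** Number of distinct String instances, compared by reference, among refs. */
  def distinctInstances(refs: Seq[String]): Int = {
    val seen = Collections.newSetFromMap(
      new IdentityHashMap[String, java.lang.Boolean]())
    refs.foreach(r => seen.add(r))
    seen.size
  }

  /** Simulate `compilers` instances, each producing `uses` fresh copies of
    * the same text, optionally routed through that instance's own Store. */
  def simulate(compilers: Int, uses: Int, useStore: Boolean): Int = {
    val refs = (0 until compilers).flatMap { _ =>
      val store = new Store
      (0 until uses).map { _ =>
        val fresh = new String("x$1") // always a new heap instance
        if (useStore) store.intern(fresh) else fresh
      }
    }
    distinctInstances(refs)
  }

  def main(args: Array[String]): Unit = {
    println(simulate(compilers = 5, uses = 1000, useStore = true))  // 5
    println(simulate(compilers = 5, uses = 1000, useStore = false)) // 5000
  }
}
```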

On Fri, Sep 9, 2011 at 4:28 PM, martin odersky <martin [dot] odersky [at] epfl [dot] ch> wrote:



extempore
Re: 880 MB

I burned the whole weekend profiling and tweaking sbt. I'll have a report today.

Doug Tangren
Re: 880 MB
This sounds awesome, guys. Perhaps one of you could put up a blog post on performance profiling Scala code and potential pitfalls. When you say `names` I assume you are referring to symbols [1].

[1]: http://www.scala-lang.org/api/current/scala/Symbol.html

-Doug Tangren
http://lessis.me

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland