This page is no longer maintained — Please continue to the home page at www.scala-lang.org

Processing a list efficiently in Scala

8 replies
wimxa
Joined: 2009-01-02,
User offline. Last seen 42 years 45 weeks ago.

I thought that the FP way of processing lists is:
- Take the first element
- Do with it whatever you need to do
- Repeat for the tail of the list

that is:

val list = List.range(1, 100000)
def method(list: List[Int], acc: Int): Int = {
if(list.length % 1000 == 0) println(list.length)
method(list.tail, acc + list.head)
}
println(method(list, 0))

Executing the above takes minutes. While it's a stupid example trying to sum
the integers, the imperative (or at least mutable) solution:

val list = List.range(1, 100000)
var acc = 0
for(e <- list) acc += e
println(acc)

comes in in 0 ms :)

So - is my assumption bad? I.e. what is the normal way to work with >=
0.1M-element lists? Any code examples on the Net for this?

Erik Engbrecht
Joined: 2008-12-19,
User offline. Last seen 3 years 18 weeks ago.
Re: Processing a list efficiently in Scala
The first one throws an exception because eventually list.tail will throw a NoSuchElement exception, also list.length (in your "if") is a linear time operation, so it is going to horribly dominate the constant time addition operation.  Take it out and your function will almost instantly throw an exception.
On Fri, Jan 2, 2009 at 1:57 PM, wimxa <wimxa [at] yahoo [dot] com> wrote:

I thought that the FP way of processing lists is:
- Take the first element
- Do with it whatever you need to do
- Repeat for the tail of the list

that is:

   val list = List.range(1, 100000)
   def method(list: List[Int], acc: Int): Int = {
     if(list.length % 1000 == 0) println(list.length)
     method(list.tail, acc + list.head)
   }
   println(method(list, 0))

Executing the above takes minutes. While it's a stupid example trying to sum
the integers, the imperative (or at least mutable) solution:

   val list = List.range(1, 100000)
   var acc = 0
   for(e <- list) acc += e
   println(acc)

comes in in 0 ms :)

So - is my assumption bad? I.e. what is the normal way to work with >=
0.1M-element lists? Any code examples on the Net for this?
--
View this message in context: http://www.nabble.com/Processing-a-list-efficiently-in-Scala-tp21255583p21255583.html
Sent from the Scala - User mailing list archive at Nabble.com.




--
http://erikengbrecht.blogspot.com/
Erik Engbrecht
Joined: 2008-12-19,
User offline. Last seen 3 years 18 weeks ago.
Re: Processing a list efficiently in Scala
Oh, you can also try:list.reduceLeft(_ + _)...which works but will fail if the list is empty, or:list.foldLeft(0)(_ + _)...which works for any list.

On Fri, Jan 2, 2009 at 2:08 PM, Erik Engbrecht <erik [dot] engbrecht [at] gmail [dot] com> wrote:
The first one throws an exception because eventually list.tail will throw a NoSuchElement exception, also list.length (in your "if") is a linear time operation, so it is going to horribly dominate the constant time addition operation.  Take it out and your function will almost instantly throw an exception.
On Fri, Jan 2, 2009 at 1:57 PM, wimxa <wimxa [at] yahoo [dot] com> wrote:

I thought that the FP way of processing lists is:
- Take the first element
- Do with it whatever you need to do
- Repeat for the tail of the list

that is:

   val list = List.range(1, 100000)
   def method(list: List[Int], acc: Int): Int = {
     if(list.length % 1000 == 0) println(list.length)
     method(list.tail, acc + list.head)
   }
   println(method(list, 0))

Executing the above takes minutes. While it's a stupid example trying to sum
the integers, the imperative (or at least mutable) solution:

   val list = List.range(1, 100000)
   var acc = 0
   for(e <- list) acc += e
   println(acc)

comes in in 0 ms :)

So - is my assumption bad? I.e. what is the normal way to work with >=
0.1M-element lists? Any code examples on the Net for this?
--
View this message in context: http://www.nabble.com/Processing-a-list-efficiently-in-Scala-tp21255583p21255583.html
Sent from the Scala - User mailing list archive at Nabble.com.




--
http://erikengbrecht.blogspot.com/



--
http://erikengbrecht.blogspot.com/
wimxa
Joined: 2009-01-02,
User offline. Last seen 42 years 45 weeks ago.
Re: Processing a list efficiently in Scala

Erik Engbrecht wrote:
>
> The first one throws an exception because eventually list.tail will throw
> a
> NoSuchElement exception, also list.length (in your "if") is a linear time
> operation, so it is going to horribly dominate the constant time addition
> operation. Take it out and your function will almost instantly throw an
> exception.
>
Erik, sure I know it needs to be tested (list.head throws the exception). I
forgot about the list.length - I'll just use list.isEmpty instead. Thanks
for finding the problem!

David Pollak
Joined: 2008-12-16,
User offline. Last seen 42 years 45 weeks ago.
Re: Processing a list efficiently in Scala


On Fri, Jan 2, 2009 at 10:57 AM, wimxa <wimxa [at] yahoo [dot] com> wrote:

I thought that the FP way of processing lists is:
- Take the first element
- Do with it whatever you need to do
- Repeat for the tail of the list

that is:

   val list = List.range(1, 100000)
   def method(list: List[Int], acc: Int): Int = {
     if(list.length % 1000 == 0) println(list.length)
     method(list.tail, acc + list.head)
   }
   println(method(list, 0))

In each loop, you're doing an O(n) operation (calculating the list size), so you're running O(n^2).
Try:def method(list: List[Int], acc: Int): Int =    if (list.isEmpty) acc else method(list.tail, list.head + acc)
or to be more Functional: def sum(in: List[Int], acc: Int): Int = in match {   case Nil => acc    case x :: xs => sum(xs, acc + x) } Thanks,
David

Executing the above takes minutes. While it's a stupid example trying to sum
the integers, the imperative (or at least mutable) solution:

   val list = List.range(1, 100000)
   var acc = 0
   for(e <- list) acc += e
   println(acc)

comes in in 0 ms :)

So - is my assumption bad? I.e. what is the normal way to work with >=
0.1M-element lists? Any code examples on the Net for this?
--
View this message in context: http://www.nabble.com/Processing-a-list-efficiently-in-Scala-tp21255583p21255583.html
Sent from the Scala - User mailing list archive at Nabble.com.




--
Lift, the simply functional web framework http://liftweb.net
Collaborative Task Management http://much4.us
Follow me: http://twitter.com/dpp
Git some: http://github.com/dpp
Gabriel Claramunt
Joined: 2009-01-03,
User offline. Last seen 42 years 45 weeks ago.
Re: Processing a list efficiently in Scala

Just out of curiosity, I did a micro-benchmark (yes, they're kind of
meaningless in the JVM but it was fun ;-) ) and I got:
Averages:
standard for-loop 260440 ns
recursive with pattern matching 253952 ns
recursive with if-else 248810 ns
reduce left 312948 ns

Interestingly enough, the recursive functions were consistently faster.

The code for each function is:

def loop(l:List[Int])={
var acc=0;
for(el<-l) acc+=el
acc
}

def recLoopPM(l:List[Int]):Int= l match {
case Nil => 0;
case x::xs => x+recLoop(xs)
}

def recLoop(l:List[Int]):Int= if (l.isEmpty) 0 else l.head +
recLoop(l.tail)

def reduceLoop(l:List[Int])= l.reduceLeft(_+_)

The range was 1... 7000 (what I could get without blowing the stack),and the
test runs the code 10000 times to "warm up" the JVM and then runs it again
10000 times to calculate the averages.

At least, none of the techniques seems to be singificantly faster or slower
than the others...

Gabriel Claramunt
http://gabrielsw.blogspot.com

DamonF
Joined: 2009-01-03,
User offline. Last seen 42 years 45 weeks ago.
Re: Processing a list efficiently in Scala

For completeness, could someone post whether or not scala optimizes
tail-recursive functions at this time? (see
http://en.wikipedia.org/wiki/Tail_recursion).

Thanks,
Damon

Ricky Clarkson
Joined: 2008-12-19,
User offline. Last seen 3 years 2 weeks ago.
Re: Processing a list efficiently in Scala
It's guaranteed to for self tail calls, but the rules are a bit more involved for tail calls to other methods.

2009/1/3 DamonF <damonfeldman [at] verizon [dot] net>


For completeness, could someone post whether or not scala optimizes
tail-recursive functions at this time?  (see
http://en.wikipedia.org/wiki/Tail_recursion).

Thanks,
Damon
--
View this message in context: http://www.nabble.com/Processing-a-list-efficiently-in-Scala-tp21255583p21267080.html
Sent from the Scala - User mailing list archive at Nabble.com.


David Hall
Joined: 2008-12-28,
User offline. Last seen 42 years 45 weeks ago.
Re: Processing a list efficiently in Scala

A little more detail: self-tail calls to methods that the compiler
knows can't be overridden, i.e. only private or final methods.

On Sat, Jan 3, 2009 at 9:27 AM, Ricky Clarkson wrote:
> It's guaranteed to for self tail calls, but the rules are a bit more
> involved for tail calls to other methods.
>
> 2009/1/3 DamonF
>>
>>
>> For completeness, could someone post whether or not scala optimizes
>> tail-recursive functions at this time? (see
>> http://en.wikipedia.org/wiki/Tail_recursion).
>>
>> Thanks,
>> Damon
>> --
>> View this message in context:
>> http://www.nabble.com/Processing-a-list-efficiently-in-Scala-tp21255583p...
>> Sent from the Scala - User mailing list archive at Nabble.com.
>>
>
>

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland