This page is no longer maintained — Please continue to the home page at www.scala-lang.org

about "combinations"

2 replies
H-star Development
Joined: 2010-04-14,
User offline. Last seen 2 years 26 weeks ago.

i noticed a new method "combinations" on seqlike. i needed such a method
a while ago so i coded one, and i made a performance comparison.
the code:

def forEachCombination[T, ANY](elements: IndexedSeq[T],
combinationSize: Int, functionToCall: (IndexedSeq[T]) => ANY) {
forEachCombination(elements, combinationSize, functionToCall, new
ArrayBuffer[T](combinationSize), 0,
elements.length - combinationSize + 1)
}

private def forEachCombination[T, U, ANY](elements: IndexedSeq[T],
remainingToTake: Int,
functionToCall:
(IndexedSeq[T]) => ANY, mutableCombinationStore: ArrayBuffer[T],
beginAt: Int,
lastIndexToTake: Int) {
if (remainingToTake == 0) {
functionToCall(mutableCombinationStore)
} else {
beginAt until lastIndexToTake foreach (i => {
mutableCombinationStore += elements(i)
forEachCombination(elements, remainingToTake - 1,
functionToCall, mutableCombinationStore, i + 1,
lastIndexToTake + 1)
mutableCombinationStore.remove(mutableCombinationStore.size - 1)
})
}
}

the result is exactly the same, the combinations are even generated in
the same order :). however, my method is a lot faster. even if i make a
copy of the arraybuffer at each functionToCall.apply-call, my algorithm
still is up to 5x faster then scala's combinationsiterator. if
recycling the arraybuffer is allowed, the speed difference goes up to 10x.

i have 2 suggestions:
1. combinations on indexedseq could be overridden. if you use my method
to build a listbuffer of sequences and get its iterator, it's still
faster than the seqlike.combinations (about 2x) in my testcases.
2. offer a method "foreachCombination" instead of "combinations" or
offer both.

my test cases may not be perfect and i might have done some micro
benchmarking mistakes, but the difference is rather big so i think the
little impurities in my benchmarks shouldn't matter.

Rüdiger Keller
Joined: 2010-01-24,
User offline. Last seen 42 years 45 weeks ago.
Re: about "combinations"

Ever heard of the term "combinatorial explosion"? As the number of
possible combinations can easily grow excessively large, it's in
general probably not a good idea to compute them eagerly. Also, your
ArrayBuffer won't be able to hold them all if you have more than 2^31
combinations.

It's debatable whether an additional faster method doing eager
computation is worth the added maintenance cost.

Regards,
Rüdiger

2011/5/11 HamsterofDeath :
> i noticed a new method "combinations" on seqlike. i needed such a method
> a while ago so i coded one, and i made a performance comparison.
> the code:
>
>  def forEachCombination[T, ANY](elements: IndexedSeq[T],
> combinationSize: Int, functionToCall: (IndexedSeq[T]) => ANY) {
>    forEachCombination(elements, combinationSize, functionToCall, new
> ArrayBuffer[T](combinationSize), 0,
>      elements.length - combinationSize + 1)
>  }
>
>  private def forEachCombination[T, U, ANY](elements: IndexedSeq[T],
> remainingToTake: Int,
>                                             functionToCall:
> (IndexedSeq[T]) => ANY, mutableCombinationStore: ArrayBuffer[T],
>                                             beginAt: Int,
> lastIndexToTake: Int) {
>    if (remainingToTake == 0) {
>      functionToCall(mutableCombinationStore)
>    } else {
>      beginAt until lastIndexToTake foreach (i => {
>        mutableCombinationStore += elements(i)
>        forEachCombination(elements, remainingToTake - 1,
> functionToCall, mutableCombinationStore, i + 1,
>          lastIndexToTake + 1)
>        mutableCombinationStore.remove(mutableCombinationStore.size - 1)
>      })
>    }
>  }
>
> the result is exactly the same, the combinations are even generated in
> the same order :). however, my method is a lot faster. even if i make a
> copy of the arraybuffer at each functionToCall.apply-call, my algorithm
> still is up to 5x faster then scala's combinationsiterator.  if
> recycling the arraybuffer is allowed, the speed difference goes up to 10x.
>
> i have 2 suggestions:
> 1. combinations on indexedseq could be overridden. if you use my method
> to build a listbuffer of sequences and get its iterator, it's still
> faster than the seqlike.combinations (about 2x) in my testcases.
> 2. offer a method "foreachCombination" instead of "combinations" or
> offer both.
>
> my test cases may not be perfect and i might have done some micro
> benchmarking mistakes, but the difference is rather big so i think the
> little impurities in my benchmarks shouldn't matter.
>
>

H-star Development
Joined: 2010-04-14,
User offline. Last seen 2 years 26 weeks ago.
Re: about "combinations"

the foreach-suggestion does not have to suffer from the size limit - you
could use a list instead of the arraybuffer if you really are planning
on doing calculations that take millenia or two. for the usual cases, an
arraybuffer should be fine. it's not eager either.

Am 11.05.2011 23:15, schrieb Rüdiger Keller:
> Ever heard of the term "combinatorial explosion"? As the number of
> possible combinations can easily grow excessively large, it's in
> general probably not a good idea to compute them eagerly. Also, your
> ArrayBuffer won't be able to hold them all if you have more than 2^31
> combinations.
>
> It's debatable whether an additional faster method doing eager
> computation is worth the added maintenance cost.
>
> Regards,
> Rüdiger
>
>
> 2011/5/11 HamsterofDeath :
>> i noticed a new method "combinations" on seqlike. i needed such a method
>> a while ago so i coded one, and i made a performance comparison.
>> the code:
>>
>> def forEachCombination[T, ANY](elements: IndexedSeq[T],
>> combinationSize: Int, functionToCall: (IndexedSeq[T]) => ANY) {
>> forEachCombination(elements, combinationSize, functionToCall, new
>> ArrayBuffer[T](combinationSize), 0,
>> elements.length - combinationSize + 1)
>> }
>>
>> private def forEachCombination[T, U, ANY](elements: IndexedSeq[T],
>> remainingToTake: Int,
>> functionToCall:
>> (IndexedSeq[T]) => ANY, mutableCombinationStore: ArrayBuffer[T],
>> beginAt: Int,
>> lastIndexToTake: Int) {
>> if (remainingToTake == 0) {
>> functionToCall(mutableCombinationStore)
>> } else {
>> beginAt until lastIndexToTake foreach (i => {
>> mutableCombinationStore += elements(i)
>> forEachCombination(elements, remainingToTake - 1,
>> functionToCall, mutableCombinationStore, i + 1,
>> lastIndexToTake + 1)
>> mutableCombinationStore.remove(mutableCombinationStore.size - 1)
>> })
>> }
>> }
>>
>> the result is exactly the same, the combinations are even generated in
>> the same order :). however, my method is a lot faster. even if i make a
>> copy of the arraybuffer at each functionToCall.apply-call, my algorithm
>> still is up to 5x faster then scala's combinationsiterator. if
>> recycling the arraybuffer is allowed, the speed difference goes up to 10x.
>>
>> i have 2 suggestions:
>> 1. combinations on indexedseq could be overridden. if you use my method
>> to build a listbuffer of sequences and get its iterator, it's still
>> faster than the seqlike.combinations (about 2x) in my testcases.
>> 2. offer a method "foreachCombination" instead of "combinations" or
>> offer both.
>>
>> my test cases may not be perfect and i might have done some micro
>> benchmarking mistakes, but the difference is rather big so i think the
>> little impurities in my benchmarks shouldn't matter.
>>
>>

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland