On Tue, 2 Nov 2010, Eliot Miranda wrote:
> On Tue, Nov 2, 2010 at 12:25 PM, Eliot Miranda <
[hidden email]>wrote:
>
>>
>>
>> On Tue, Nov 2, 2010 at 11:50 AM, Eliot Miranda <
[hidden email]>wrote:
>>
>>>
snip
>>> The difference is due to the instantiation "Object new", and instantiation
>>> is slow on the current Cog because I haven't implemented the instantiation
>>> primitives in machine code, and this limitation is hopefully temporary. So
>>> you might leave it be. However, changing the code to use {} for teh
>>> instantiation speeds things up markedly:
Good to know. :)
>>>
>>> Collection>>newFold: binaryBlock
>>> | firstValue nextValue |
>>> firstValue := nextValue := {}.
>>> self do:
>>> [:each |
>>> nextValue := firstValue == nextValue
>>> ifTrue: [each]
>>> ifFalse: [binaryBlock value: nextValue value: each]].
>>> ^nextValue == firstValue
>>> ifTrue: [self errorEmptyCollection]
>>> ifFalse: [nextValue]
>>>
>>> | c r |
>>> c := #('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me').
>>> { Time millisecondsToRun: [1 to: 1000000 do: [:i| c fold: [:a :b | a, ' ',
>>> b]]].
>>> Time millisecondsToRun: [1 to: 1000000 do: [:i| c newFold: [:a :b | a, '
>>> ', b]]].
>>> Time millisecondsToRun: [1 to: 1000000 do: [:i| c reduce: [:a :b | a, ' ',
>>> b]]] }
>>> #(5076 5008 5052)
>>>
>>>
>>> | c |
>>> c := #('if').
>>> { Time millisecondsToRun: [1 to: 1000000 do: [:i| c fold: [:a :b | a, ' ',
>>> b]]].
>>> Time millisecondsToRun: [1 to: 1000000 do: [:i| c newFold: [:a :b | a,
>>> ' ', b]]].
>>> Time millisecondsToRun: [1 to: 1000000 do: [:i| c reduce: [:a :b | a, '
>>> ', b]]] }
>>> #(247 226 220)
>>>
>>>
>>> (N.B. the above are rough measurements!!).
>>>
>>> So how about changing fold: to replace Object new with {} and keep it?
>>>
>>
>> And of course one can consider using thisContext instead of Object new or
>> {}, since thisContext has to be instantiated anyway for the closure in
>> fold:, and so results in less memory pressure. This needs careful
>> measurement; you have the test bed set up for that; my workspace hacks are
>> showing too much variability to distinguish.
You're right, it's about the allocation costs. I came up with the
following benchmark:
#(1 10 100 1000 10000 100000 1000000) collect: [ :arraySize |
(1 to: 5) collect: [ :run |
| collection |
collection := Array new: arraySize.
Smalltalk garbageCollect.
[ 1 to: 10000000 // arraySize do: [ :i | collection fold: [ :a :b | ] ] ] timeToRun ] ].
But the noise is too high on my pc to get reliable results for larger
array sizes. For smaller sizes allocation/gc time gives the difference.
For larges sizes I can't decide if there's a difference at all. Probably
there's none.
Levente