Smalltalk › Squeak › Squeak - Dev

Collection's #fold: vs #reduce:

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

21 messages Options

Levente Uzonyi-2

Collection's #fold: vs #reduce:

Hi,

we have these two methods which do exactly the same thing. #reduce: was
added by Andreas during the developement of Squeak 4.1. #fold was added by
Eliot for Cog VMMaker compatibility. One of them is superfluous. I can
image the following solutions:

1) Replace senders of #fold: in VMMaker to use #reduce:, and remove
#fold: from the image.

2) Replace #fold:'s implementation to self reduce: aBlock. I benchmarked
the two methods and #reduce: is a bit faster on CogVM. There's no
difference on SqueakVM.

Cheers,
Levente

Eliot Miranda-2

Re: Collection's #fold: vs #reduce:

Hi Levente,

On Tue, Nov 2, 2010 at 10:16 AM, Levente Uzonyi <[hidden email]> wrote:

Hi,

we have these two methods which do exactly the same thing. #reduce: was added by Andreas during the developement of Squeak 4.1. #fold was added by Eliot for Cog VMMaker compatibility. One of them is superfluous. I can image the following solutions:

1) Replace senders of #fold: in VMMaker to use #reduce:, and remove #fold: from the image.

2) Replace #fold:'s implementation to self reduce: aBlock. I benchmarked the two methods and #reduce: is a bit faster on CogVM. There's no difference on SqueakVM.

I noticed the same thing. I prefer the fold: implementation so I'm bummed it's slightly slower ;) Personally I like fold: as a name (it's shorter and more cuddly) and since we don't have map: (we have collect:) I don't find the need to use reduce: compelling. But that's my preference. I won't object if you replace fold: with reduce:/ I do note that Gilad used fold: in Newspeak.

What do you prefer?

What do others prefer?

I know, choices, choices :)

best

Eliot

Cheers,
Levente

Frank Shearar

Re: Collection's #fold: vs #reduce:

On 2010/11/02 19:45, Eliot Miranda wrote:

> Hi Levente,
>
> On Tue, Nov 2, 2010 at 10:16 AM, Levente Uzonyi <[hidden email]
> <mailto:[hidden email]>> wrote:
>
> Hi,
>
> we have these two methods which do exactly the same thing. #reduce:
> was added by Andreas during the developement of Squeak 4.1. #fold
> was added by Eliot for Cog VMMaker compatibility. One of them is
> superfluous. I can image the following solutions:
>
> 1) Replace senders of #fold: in VMMaker to use #reduce:, and remove
> #fold: from the image.
>
> 2) Replace #fold:'s implementation to self reduce: aBlock. I
> benchmarked the two methods and #reduce: is a bit faster on CogVM.
> There's no difference on SqueakVM.
>
>
> I noticed the same thing. I prefer the fold: implementation so I'm
> bummed it's slightly slower ;) Personally I like fold: as a name (it's
> shorter and more cuddly) and since we don't have map: (we have collect:)
> I don't find the need to use reduce: compelling. But that's my
> preference. I won't object if you replace fold: with reduce:/ I do
> note that Gilad used fold: in Newspeak.
>
> What do you prefer?
>
> What do others prefer?
>
> I know, choices, choices :)

I like both, and both are well-entrenched terms in the FP world (fold in
Haskell, and reduce in Common Lisp, for instance).

One extra datum is that there's already a #reduce (Number>>reduce and
Float>>reduce) so perhaps #fold: might be a better name just because it
clashes with nothing in Trunk.

frank

Levente Uzonyi-2

Re: Collection's #fold: vs #reduce:

In reply to this post by Eliot Miranda-2

On Tue, 2 Nov 2010, Eliot Miranda wrote:

> Hi Levente,
>
> On Tue, Nov 2, 2010 at 10:16 AM, Levente Uzonyi <[hidden email]> wrote:
>
>> Hi,
>>
>> we have these two methods which do exactly the same thing. #reduce: was
>> added by Andreas during the developement of Squeak 4.1. #fold was added by
>> Eliot for Cog VMMaker compatibility. One of them is superfluous. I can image
>> the following solutions:
>>
>> 1) Replace senders of #fold: in VMMaker to use #reduce:, and remove #fold:
>> from the image.
>>
>> 2) Replace #fold:'s implementation to self reduce: aBlock. I benchmarked
>> the two methods and #reduce: is a bit faster on CogVM. There's no difference
>> on SqueakVM.
>>
>
> I noticed the same thing. I prefer the fold: implementation so I'm bummed
> it's slightly slower ;) Personally I like fold: as a name (it's shorter and

The difference is about 1%, but it's reproducible. And yes, the
implementation of #fold: is a bit nicer. :)

> more cuddly) and since we don't have map: (we have collect:) I don't find
> the need to use reduce: compelling. But that's my preference. I won't
> object if you replace fold: with reduce:/ I do note that Gilad used fold:
> in Newspeak.
>
> What do you prefer?

I prefer the second option, because both names are widely used, but
some people are only aware of the one which their previously used
languages have. So this way we can avoid questions like "Why isn't there a
method for folding in Squeak?".

Levente

>
> What do others prefer?
>
> I know, choices, choices :)
>
> best
> Eliot
>
>
>
>>
>>
>> Cheers,
>> Levente
>>
>>
>

Eliot Miranda-2

Re: Collection's #fold: vs #reduce:

On Tue, Nov 2, 2010 at 11:31 AM, Levente Uzonyi <[hidden email]> wrote:

On Tue, 2 Nov 2010, Eliot Miranda wrote:

Hi Levente,

On Tue, Nov 2, 2010 at 10:16 AM, Levente Uzonyi <[hidden email]> wrote:

Hi,

we have these two methods which do exactly the same thing. #reduce: was
added by Andreas during the developement of Squeak 4.1. #fold was added by
Eliot for Cog VMMaker compatibility. One of them is superfluous. I can image
the following solutions:

1) Replace senders of #fold: in VMMaker to use #reduce:, and remove #fold:
from the image.

2) Replace #fold:'s implementation to self reduce: aBlock. I benchmarked
the two methods and #reduce: is a bit faster on CogVM. There's no difference
on SqueakVM.

I noticed the same thing. I prefer the fold: implementation so I'm bummed
it's slightly slower ;) Personally I like fold: as a name (it's shorter and

The difference is about 1%, but it's reproducible. And yes, the implementation of #fold: is a bit nicer. :)

The difference is due to the instantiation "Object new", and instantiation is slow on the current Cog because I haven't implemented the instantiation primitives in machine code, and this limitation is hopefully temporary. So you might leave it be. However, changing the code to use {} for teh instantiation speeds things up markedly:

Collection>>newFold: binaryBlock

| firstValue nextValue |

firstValue := nextValue := {}.

self do:

[:each |

nextValue := firstValue == nextValue

ifTrue: [each]

ifFalse: [binaryBlock value: nextValue value: each]].

^nextValue == firstValue

ifTrue: [self errorEmptyCollection]

ifFalse: [nextValue]

| c r |

c := #('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me').

{ Time millisecondsToRun: [1 to: 1000000 do: [:i| c fold: [:a :b | a, ' ', b]]].

Time millisecondsToRun: [1 to: 1000000 do: [:i| c newFold: [:a :b | a, ' ', b]]].

Time millisecondsToRun: [1 to: 1000000 do: [:i| c reduce: [:a :b | a, ' ', b]]] }

#(5076 5008 5052)

| c |

c := #('if').

{ Time millisecondsToRun: [1 to: 1000000 do: [:i| c fold: [:a :b | a, ' ', b]]].

Time millisecondsToRun: [1 to: 1000000 do: [:i| c newFold: [:a :b | a, ' ', b]]].

Time millisecondsToRun: [1 to: 1000000 do: [:i| c reduce: [:a :b | a, ' ', b]]] }

#(247 226 220)

(N.B. the above are rough measurements!!).

So how about changing fold: to replace Object new with {} and keep it?

2¢

Eliot

more cuddly) and since we don't have map: (we have collect:) I don't find
the need to use reduce: compelling. But that's my preference. I won't
object if you replace fold: with reduce:/ I do note that Gilad used fold:
in Newspeak.

What do you prefer?

I prefer the second option, because both names are widely used, but some people are only aware of the one which their previously used languages have. So this way we can avoid questions like "Why isn't there a
method for folding in Squeak?".

Levente

What do others prefer?

I know, choices, choices :)

best
Eliot

Cheers,
Levente

Nicolas Cellier

Re: Collection's #fold: vs #reduce:

In reply to this post by Levente Uzonyi-2

2010/11/2 Levente Uzonyi <[hidden email]>:

>> What do you prefer?
>
> I prefer the second option, because both names are widely used, but some
> people are only aware of the one which their previously used languages have.
> So this way we can avoid questions like "Why isn't there a
> method for folding in Squeak?".
>
>
> Levente
>

I'd say which selector are implemented in Pharo and Cuis ?
We should better converge.

Nicolas

Andreas.Raab

Re: Collection's #fold: vs #reduce:

In reply to this post by Levente Uzonyi-2

On 11/2/2010 11:31 AM, Levente Uzonyi wrote:

>> more cuddly) and since we don't have map: (we have collect:) I don't find
>> the need to use reduce: compelling. But that's my preference. I won't
>> object if you replace fold: with reduce:/ I do note that Gilad used fold:
>> in Newspeak.
>>
>> What do you prefer?
>
> I prefer the second option, because both names are widely used, but some
> people are only aware of the one which their previously used languages
> have. So this way we can avoid questions like "Why isn't there a
> method for folding in Squeak?".

If you go by familiarity, reduce is by far the obvious choice. Most of
the popular dynamic languages use reduce:

PHP: http://php.net/manual/en/function.array-reduce.php
Javascript:
https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Array/Reduce
Python: http://docs.python.org/library/functions.html#reduce

What languages use fold?

Cheers,
- Andreas

Eliot Miranda-2

Re: Collection's #fold: vs #reduce:

In reply to this post by Eliot Miranda-2

On Tue, Nov 2, 2010 at 11:50 AM, Eliot Miranda <[hidden email]> wrote:

On Tue, Nov 2, 2010 at 11:31 AM, Levente Uzonyi <[hidden email]> wrote:

On Tue, 2 Nov 2010, Eliot Miranda wrote:

Hi Levente,

On Tue, Nov 2, 2010 at 10:16 AM, Levente Uzonyi <[hidden email]> wrote:

Hi,

we have these two methods which do exactly the same thing. #reduce: was
added by Andreas during the developement of Squeak 4.1. #fold was added by
Eliot for Cog VMMaker compatibility. One of them is superfluous. I can image
the following solutions:

1) Replace senders of #fold: in VMMaker to use #reduce:, and remove #fold:
from the image.

2) Replace #fold:'s implementation to self reduce: aBlock. I benchmarked
the two methods and #reduce: is a bit faster on CogVM. There's no difference
on SqueakVM.

I noticed the same thing. I prefer the fold: implementation so I'm bummed
it's slightly slower ;) Personally I like fold: as a name (it's shorter and

The difference is about 1%, but it's reproducible. And yes, the implementation of #fold: is a bit nicer. :)

The difference is due to the instantiation "Object new", and instantiation is slow on the current Cog because I haven't implemented the instantiation primitives in machine code, and this limitation is hopefully temporary. So you might leave it be. However, changing the code to use {} for teh instantiation speeds things up markedly:

Collection>>newFold: binaryBlock
| firstValue nextValue |
firstValue := nextValue := {}.

self do:
[:each |
nextValue := firstValue == nextValue

ifTrue: [each]
ifFalse: [binaryBlock value: nextValue value: each]].

^nextValue == firstValue
ifTrue: [self errorEmptyCollection]
ifFalse: [nextValue]

| c r |
c := #('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me').
{ Time millisecondsToRun: [1 to: 1000000 do: [:i| c fold: [:a :b | a, ' ', b]]].

Time millisecondsToRun: [1 to: 1000000 do: [:i| c newFold: [:a :b | a, ' ', b]]].
Time millisecondsToRun: [1 to: 1000000 do: [:i| c reduce: [:a :b | a, ' ', b]]] }

#(5076 5008 5052)

| c |
c := #('if').
{ Time millisecondsToRun: [1 to: 1000000 do: [:i| c fold: [:a :b | a, ' ', b]]].
Time millisecondsToRun: [1 to: 1000000 do: [:i| c newFold: [:a :b | a, ' ', b]]].

Time millisecondsToRun: [1 to: 1000000 do: [:i| c reduce: [:a :b | a, ' ', b]]] }
#(247 226 220)

(N.B. the above are rough measurements!!).

So how about changing fold: to replace Object new with {} and keep it?

And of course one can consider using thisContext instead of Object new or {}, since thisContext has to be instantiated anyway for the closure in fold:, and so results in less memory pressure. This needs careful measurement; you have the test bed set up for that; my workspace hacks are showing too much variability to distinguish.

This is too ugly to contemplate, but at first blush it seems faster:

fold: binaryBlock

"Evaluate the block with the first two elements of the receiver,

then with the result of the first evaluation and the next element,

and so on. Answer the result of the final evaluation. If the receiver

is empty, raise an error. If the receiver has a single element, answer

that element."

"#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me') fold: [:a :b | a, ' ', b]"

| firstValue nextValue |

firstValue := nextValue := "something that can't be in the receiver"

[:each |

nextValue := firstValue == nextValue

ifTrue: [each]

ifFalse: [binaryBlock value: nextValue value: each]].

self do: firstValue.

^nextValue == firstValue

ifTrue: [self errorEmptyCollection]

ifFalse: [nextValue]

2¢
Eliot

more cuddly) and since we don't have map: (we have collect:) I don't find
the need to use reduce: compelling. But that's my preference. I won't
object if you replace fold: with reduce:/ I do note that Gilad used fold:
in Newspeak.

What do you prefer?

I prefer the second option, because both names are widely used, but some people are only aware of the one which their previously used languages have. So this way we can avoid questions like "Why isn't there a
method for folding in Squeak?".

Levente

What do others prefer?

I know, choices, choices :)

best
Eliot

Cheers,
Levente

Eliot Miranda-2

Re: Collection's #fold: vs #reduce:

On Tue, Nov 2, 2010 at 12:25 PM, Eliot Miranda <[hidden email]> wrote:

On Tue, Nov 2, 2010 at 11:50 AM, Eliot Miranda <[hidden email]> wrote:

On Tue, Nov 2, 2010 at 11:31 AM, Levente Uzonyi <[hidden email]> wrote:

On Tue, 2 Nov 2010, Eliot Miranda wrote:

Hi Levente,

On Tue, Nov 2, 2010 at 10:16 AM, Levente Uzonyi <[hidden email]> wrote:

Hi,

we have these two methods which do exactly the same thing. #reduce: was
added by Andreas during the developement of Squeak 4.1. #fold was added by
Eliot for Cog VMMaker compatibility. One of them is superfluous. I can image
the following solutions:

1) Replace senders of #fold: in VMMaker to use #reduce:, and remove #fold:
from the image.

2) Replace #fold:'s implementation to self reduce: aBlock. I benchmarked
the two methods and #reduce: is a bit faster on CogVM. There's no difference
on SqueakVM.

I noticed the same thing. I prefer the fold: implementation so I'm bummed
it's slightly slower ;) Personally I like fold: as a name (it's shorter and

The difference is about 1%, but it's reproducible. And yes, the implementation of #fold: is a bit nicer. :)

The difference is due to the instantiation "Object new", and instantiation is slow on the current Cog because I haven't implemented the instantiation primitives in machine code, and this limitation is hopefully temporary. So you might leave it be. However, changing the code to use {} for teh instantiation speeds things up markedly:

Collection>>newFold: binaryBlock
| firstValue nextValue |
firstValue := nextValue := {}.

self do:
[:each |
nextValue := firstValue == nextValue

ifTrue: [each]
ifFalse: [binaryBlock value: nextValue value: each]].

^nextValue == firstValue
ifTrue: [self errorEmptyCollection]
ifFalse: [nextValue]

| c r |
c := #('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me').
{ Time millisecondsToRun: [1 to: 1000000 do: [:i| c fold: [:a :b | a, ' ', b]]].

Time millisecondsToRun: [1 to: 1000000 do: [:i| c newFold: [:a :b | a, ' ', b]]].
Time millisecondsToRun: [1 to: 1000000 do: [:i| c reduce: [:a :b | a, ' ', b]]] }

#(5076 5008 5052)

| c |
c := #('if').
{ Time millisecondsToRun: [1 to: 1000000 do: [:i| c fold: [:a :b | a, ' ', b]]].
Time millisecondsToRun: [1 to: 1000000 do: [:i| c newFold: [:a :b | a, ' ', b]]].

Time millisecondsToRun: [1 to: 1000000 do: [:i| c reduce: [:a :b | a, ' ', b]]] }
#(247 226 220)

(N.B. the above are rough measurements!!).

So how about changing fold: to replace Object new with {} and keep it?

And of course one can consider using thisContext instead of Object new or {}, since thisContext has to be instantiated anyway for the closure in fold:, and so results in less memory pressure. This needs careful measurement; you have the test bed set up for that; my workspace hacks are showing too much variability to distinguish.

This is too ugly to contemplate, but at first blush it seems faster:

fold: binaryBlock
"Evaluate the block with the first two elements of the receiver,

then with the result of the first evaluation and the next element,
and so on. Answer the result of the final evaluation. If the receiver

is empty, raise an error. If the receiver has a single element, answer
that element."

"#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me') fold: [:a :b | a, ' ', b]"

| firstValue nextValue |
firstValue := nextValue := "something that can't be in the receiver"

[:each |
nextValue := firstValue == nextValue
ifTrue: [each]

ifFalse: [binaryBlock value: nextValue value: each]].
self do: firstValue.

^nextValue == firstValue
ifTrue: [self errorEmptyCollection]
ifFalse: [nextValue]

and if you're really whoorish about speed this is even faster (because the Cog code generator is so naive)

fold: binaryBlock

"Evaluate the block with the first two elements of the receiver,

then with the result of the first evaluation and the next element,

and so on. Answer the result of the final evaluation. If the receiver

is empty, raise an error. If the receiver has a single element, answer

that element."

"#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me') fold: [:a :b | a, ' ', b]"

| firstValue nextValue |

self do: (firstValue := nextValue := "something that can't be in the receiver"

[:each |

nextValue := firstValue == nextValue

ifTrue: [each]

ifFalse: [binaryBlock value: nextValue value: each]]).

^nextValue == firstValue

ifTrue: [self errorEmptyCollection]

ifFalse: [nextValue]

2¢
Eliot

more cuddly) and since we don't have map: (we have collect:) I don't find
the need to use reduce: compelling. But that's my preference. I won't
object if you replace fold: with reduce:/ I do note that Gilad used fold:
in Newspeak.

What do you prefer?

I prefer the second option, because both names are widely used, but some people are only aware of the one which their previously used languages have. So this way we can avoid questions like "Why isn't there a
method for folding in Squeak?".

Levente

What do others prefer?

I know, choices, choices :)

best
Eliot

Cheers,
Levente

Andreas.Raab

Re: Collection's #fold: vs #reduce:

In reply to this post by Eliot Miranda-2

On 11/2/2010 12:25 PM, Eliot Miranda wrote:
> And of course one can consider using thisContext instead of Object new
> or {}, since thisContext has to be instantiated anyway for the closure
> in fold:, and so results in less memory pressure. This needs careful
> measurement; you have the test bed set up for that; my workspace hacks
> are showing too much variability to distinguish.

I'm sorry but if your application is limited by the performance of
reduce/fold you're doing it wrong. Can we discuss *clarity* of the
solution over performance please? If we'd really need a high-performance
variant, then *nothing* beats a trivial custom implementation in Array
along the lines of:

Array>>reduce: aBlock

result := self at: 1.
2 to: self size do:[:i| result := aBlock value: result value: (self
at: 1)].

-- "Premature optimization is the root of all evil" --

Cheers,
- Andreas

Eliot Miranda-2

Re: Collection's #fold: vs #reduce:

On Tue, Nov 2, 2010 at 12:37 PM, Andreas Raab <[hidden email]> wrote:

On 11/2/2010 12:25 PM, Eliot Miranda wrote:

And of course one can consider using thisContext instead of Object new
or {}, since thisContext has to be instantiated anyway for the closure
in fold:, and so results in less memory pressure. This needs careful
measurement; you have the test bed set up for that; my workspace hacks
are showing too much variability to distinguish.

I'm sorry but if your application is limited by the performance of reduce/fold you're doing it wrong. Can we discuss *clarity* of the solution over performance please? If we'd really need a high-performance variant, then *nothing* beats a trivial custom implementation in Array along the lines of:

Array>>reduce: aBlock

result := self at: 1.
2 to: self size do:[:i| result := aBlock value: result value: (self at: 1)].

-- "Premature optimization is the root of all evil" --

I know. I'm joking ;) It's still interesting to see how code generator fallibilities make measurable differences. I did say I thought the whoorish speed differences were too ugly to contemplate.

best

Eliot

Cheers,
- Andreas

Levente Uzonyi-2

Re: Collection's #fold: vs #reduce:

In reply to this post by Andreas.Raab

On Tue, 2 Nov 2010, Andreas Raab wrote:

> On 11/2/2010 11:31 AM, Levente Uzonyi wrote:
>>> more cuddly) and since we don't have map: (we have collect:) I don't find
>>> the need to use reduce: compelling. But that's my preference. I won't
>>> object if you replace fold: with reduce:/ I do note that Gilad used fold:
>>> in Newspeak.
>>>
>>> What do you prefer?
>>
>> I prefer the second option, because both names are widely used, but some
>> people are only aware of the one which their previously used languages
>> have. So this way we can avoid questions like "Why isn't there a
>> method for folding in Squeak?".
>
> If you go by familiarity, reduce is by far the obvious choice. Most of the
> popular dynamic languages use reduce:
>
> PHP: http://php.net/manual/en/function.array-reduce.php
> Javascript:
> https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Array/Reduce
> Python: http://docs.python.org/library/functions.html#reduce
>
> What languages use fold?

Mostly functional languages. Here is a list of popular languages, that
implement fold/reduce:
http://en.wikipedia.org/wiki/Fold_(higher-order_function)#Implementation

So users of Erlang, F#, Haskell, OCaml, Scala, SML will probably look for
#fold:.

Levente

>
> Cheers,
> - Andreas
>
>

Juan Vuletich-4

Re: Collection's #fold: vs #reduce:

In reply to this post by Nicolas Cellier

Nicolas Cellier wrote:
>
> I'd say which selector are implemented in Pharo and Cuis ?
> We should better converge.
>
> Nicolas
>

Hi folks,

Cuis doesn't include any of them yet. I can add whatever people prefer.
What I'd do different is the implementation:

fold: aBinaryBlock
"Evaluate the block with the first two elements of the receiver,
then with the result of the first evaluation and the next element,
and so on. Answer the result of the final evaluation. If the receiver
is empty, raise an error. If the receiver has a single element, answer
that element."
"
#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me') fold: [:a :b |
a, ' ', b]
"
| noPreviousValue |
noPreviousValue := Object new. "something that can't be in the
receiver"
^self inject: noPreviousValue into: [ :previousValue :each |
previousValue == noPreviousValue
ifTrue: [ each ]
ifFalse: [ aBinaryBlock value: previousValue value: each ]]

This is easier to understand, and it also makes clear the relation
between #fold: (or #reduce:) and #inject:into: .

Cheers,
Juan Vuletich

Eliot Miranda-2

Re: Collection's #fold: vs #reduce:

On Tue, Nov 2, 2010 at 1:23 PM, Juan Vuletich <[hidden email]> wrote:

Nicolas Cellier wrote:

I'd say which selector are implemented in Pharo and Cuis ?
We should better converge.

Nicolas

Hi folks,

Cuis doesn't include any of them yet. I can add whatever people prefer. What I'd do different is the implementation:

fold: aBinaryBlock

"Evaluate the block with the first two elements of the receiver,
then with the result of the first evaluation and the next element,
and so on. Answer the result of the final evaluation. If the receiver
is empty, raise an error. If the receiver has a single element, answer
that element."
"
#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me') fold: [:a :b | a, ' ', b]
"
| noPreviousValue |
noPreviousValue := Object new. "something that can't be in the receiver"
^self inject: noPreviousValue into: [ :previousValue :each |
previousValue == noPreviousValue
ifTrue: [ each ]
ifFalse: [ aBinaryBlock value: previousValue value: each ]]

This is easier to understand, and it also makes clear the relation between #fold: (or #reduce:) and #inject:into: .

I disagree. inject:into: is not particularly easy to understand, whereas both the existing fold: and reduce: are understandable in terms of do:. Also the use of inject:into: isn't really buying you anything since the logic in the block within the inject:into: usage is as complex as that within fold: or reduce:. Further, this /is/ a lot slower because of the extra level of evaluation (there's an extra block activation around each element).

best,

Eliot

Cheers,
Juan Vuletich

Juan Vuletich-4

Re: Collection's #fold: vs #reduce:

Eliot Miranda wrote:

>
>
> On Tue, Nov 2, 2010 at 1:23 PM, Juan Vuletich <[hidden email]
> <mailto:[hidden email]>> wrote:
>
> Nicolas Cellier wrote:
>
>
> I'd say which selector are implemented in Pharo and Cuis ?
> We should better converge.
>
> Nicolas
>
>
>
> Hi folks,
>
> Cuis doesn't include any of them yet. I can add whatever people
> prefer. What I'd do different is the implementation:
>
> fold: aBinaryBlock
>
> "Evaluate the block with the first two elements of the receiver,
> then with the result of the first evaluation and the next element,
> and so on. Answer the result of the final evaluation. If the
> receiver
> is empty, raise an error. If the receiver has a single element,
> answer
> that element."
> "
> #('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me') fold: [:a
> :b | a, ' ', b]
> "
> | noPreviousValue |
> noPreviousValue := Object new. "something that can't be in
> the receiver"
> ^self inject: noPreviousValue into: [ :previousValue :each |
> previousValue == noPreviousValue
> ifTrue: [ each ]
> ifFalse: [ aBinaryBlock value: previousValue value: each ]]
>
> This is easier to understand, and it also makes clear the relation
> between #fold: (or #reduce:) and #inject:into: .
>
>
> I disagree. inject:into: is not particularly easy to understand,
> whereas both the existing fold: and reduce: are understandable in
> terms of do:.

I say this is easier to understand, given that #inject:into: is already
understood. #inject:into: and #fold: / #reduce: are very close in what
they do. So close, that it is best to show how they differ. That's what
my implementation does. Showing the use of #inject:into: (with proper
names for block args) is a didactic bonus.

> Also the use of inject:into: isn't really buying you anything since
> the logic in the block within the inject:into: usage is as complex as
> that within fold: or reduce:. Further, this /is/ a lot slower because
> of the extra level of evaluation (there's an extra block activation
> around each element).
>
>
> best,
> Eliot

No, it is not slower:
[100000 timesRepeat: [#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to'
'me') fold: [:a :b | a, ' ', b]]] timeToRun 2710 | 879
[100000 timesRepeat: [#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to'
'me') foldx: [:a :b | a, ' ', b]]] timeToRun 2723 | 874
[100000 timesRepeat: [#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to'
'me') reduce: [:a :b | a, ' ', b]]] timeToRun 2780 | 881

#foldx: is my implementation. In all cases, the first number is regular
interpreter Squeak 4.2.4beta1U.app, the second number is Cog Squeak
5.8b10-1.app on a 1.6GHz Mac mini.

The argument about complexity is more aesthetic than anything. My
version doesn't need assignment to outer temps, and I see that 'cleaner'
and closer to FP.

Cheers,
Juan Vuletich

Eliot Miranda-2

Re: Collection's #fold: vs #reduce:

On Tue, Nov 2, 2010 at 1:50 PM, Juan Vuletich <[hidden email]> wrote:

Eliot Miranda wrote:

On Tue, Nov 2, 2010 at 1:23 PM, Juan Vuletich <[hidden email] <mailto:[hidden email]>> wrote:

Nicolas Cellier wrote:

I'd say which selector are implemented in Pharo and Cuis ?
We should better converge.

Nicolas

Hi folks,

Cuis doesn't include any of them yet. I can add whatever people
prefer. What I'd do different is the implementation:

fold: aBinaryBlock

"Evaluate the block with the first two elements of the receiver,
then with the result of the first evaluation and the next element,
and so on. Answer the result of the final evaluation. If the
receiver
is empty, raise an error. If the receiver has a single element,
answer
that element."
"
#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me') fold: [:a
:b | a, ' ', b]
"
| noPreviousValue |
noPreviousValue := Object new. "something that can't be in
the receiver"
^self inject: noPreviousValue into: [ :previousValue :each |
previousValue == noPreviousValue
ifTrue: [ each ]
ifFalse: [ aBinaryBlock value: previousValue value: each ]]

This is easier to understand, and it also makes clear the relation
between #fold: (or #reduce:) and #inject:into: .

I disagree. inject:into: is not particularly easy to understand, whereas both the existing fold: and reduce: are understandable in terms of do:.

I say this is easier to understand, given that #inject:into: is already understood. #inject:into: and #fold: / #reduce: are very close in what they do. So close, that it is best to show how they differ. That's what my implementation does. Showing the use of #inject:into: (with proper names for block args) is a didactic bonus.

Also the use of inject:into: isn't really buying you anything since the logic in the block within the inject:into: usage is as complex as that within fold: or reduce:. Further, this /is/ a lot slower because of the extra level of evaluation (there's an extra block activation around each element).

best,
Eliot

No, it is not slower:
[100000 timesRepeat: [#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me') fold: [:a :b | a, ' ', b]]] timeToRun 2710 | 879
[100000 timesRepeat: [#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me') foldx: [:a :b | a, ' ', b]]] timeToRun 2723 | 874
[100000 timesRepeat: [#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me') reduce: [:a :b | a, ' ', b]]] timeToRun 2780 | 881

Be very careful to run these under the same conditions. You'd need to run each of these immediately following a GC to get meaningful results. Also use 1 to: n do for the outer loop since timesRepeat: uses do: and so the measurement harness is probably a significant fraction of the run-time. Also measure the single-element cost. BTW, I would use (1 to: 10) fold: [:a :b| a + b] to reduce the overhead of string concatenation & allocation and focus more on the costs of the various fold: implementations.

#foldx: is my implementation. In all cases, the first number is regular interpreter Squeak 4.2.4beta1U.app, the second number is Cog Squeak 5.8b10-1.app on a 1.6GHz Mac mini.

The argument about complexity is more aesthetic than anything.

Indeed; and so a matter of taste. My aesthetics end up preferring fold: over the inject:into: based one :/

My version doesn't need assignment to outer temps, and I see that 'cleaner' and closer to FP.

But the assignment to the outer temp is hidden in inject:into:; it's still there ;)

Cheers,
Juan Vuletich

Igor Stasenko

Re: Collection's #fold: vs #reduce:

In reply to this post by Juan Vuletich-4

.snip.

>
> fold: aBinaryBlock
> "Evaluate the block with the first two elements of the receiver,
> then with the result of the first evaluation and the next element,
> and so on. Answer the result of the final evaluation. If the receiver
> is empty, raise an error. If the receiver has a single element, answer
> that element."
> "
> #('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me') fold: [:a :b | a, '
> ', b]

just a 0.5 penny..

wow.. you trying to get a fastest way to fold a string,
by using a concatenaiton.. i think its not most effective way, at
least for given example :)

I think that using streams, it will be much faster.

(String streamContents: [:s |
(#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me') do: [:each | s
nextPutAll: each; nextPut: $ ]) allButLast

--
Best regards,
Igor Stasenko AKA sig.

Levente Uzonyi-2

Re: Collection's #fold: vs #reduce:

On Tue, 2 Nov 2010, Igor Stasenko wrote:

> .snip.
>>
>> fold: aBinaryBlock
>> "Evaluate the block with the first two elements of the receiver,
>> then with the result of the first evaluation and the next element,
>> and so on. Answer the result of the final evaluation. If the receiver
>> is empty, raise an error. If the receiver has a single element, answer
>> that element."
>> "
>> #('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me') fold: [:a :b | a, '
>> ', b]
>
> just a 0.5 penny..
>
> wow.. you trying to get a fastest way to fold a string,
> by using a concatenaiton.. i think its not most effective way, at
> least for given example :)

I'm sure nobody is trying that. It's just an example in a comment. And
this thread is not meant to be about performance at all. We have two
implementations of the same thing. To simplify the system, I thought that
I'll remove one of them. Both versions are easy to understand, so I though
it will be a good idea to keep the faster one.

Levente

>
> I think that using streams, it will be much faster.
>
> (String streamContents: [:s |
> (#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me') do: [:each | s
> nextPutAll: each; nextPut: $ ]) allButLast
>
>
>
> --
> Best regards,
> Igor Stasenko AKA sig.
>
>

Juan Vuletich-4

Re: Collection's #fold: vs #reduce:

In reply to this post by Eliot Miranda-2

Eliot Miranda wrote:

>
>
> On Tue, Nov 2, 2010 at 1:50 PM, Juan Vuletich <[hidden email]
> <mailto:[hidden email]>> wrote:
>
> Eliot Miranda wrote:
>
>
>
> On Tue, Nov 2, 2010 at 1:23 PM, Juan Vuletich
> <[hidden email] <mailto:[hidden email]>
> <mailto:[hidden email] <mailto:[hidden email]>>> wrote:
>
> Nicolas Cellier wrote:
>
>
> I'd say which selector are implemented in Pharo and Cuis ?
> We should better converge.
>
> Nicolas
>
>
> Hi folks,
>
> Cuis doesn't include any of them yet. I can add whatever people
> prefer. What I'd do different is the implementation:
>
> fold: aBinaryBlock
>
> "Evaluate the block with the first two elements of the
> receiver,
> then with the result of the first evaluation and the
> next element,
> and so on. Answer the result of the final evaluation.
> If the
> receiver
> is empty, raise an error. If the receiver has a single
> element,
> answer
> that element."
> "
> #('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me')
> fold: [:a
> :b | a, ' ', b]
> "
> | noPreviousValue |
> noPreviousValue := Object new. "something that can't be in
> the receiver"
> ^self inject: noPreviousValue into: [ :previousValue :each |
> previousValue == noPreviousValue
> ifTrue: [ each ]
> ifFalse: [ aBinaryBlock value: previousValue
> value: each ]]
>
> This is easier to understand, and it also makes clear the
> relation
> between #fold: (or #reduce:) and #inject:into: .
>
>
> I disagree. inject:into: is not particularly easy to
> understand, whereas both the existing fold: and reduce: are
> understandable in terms of do:.
>
>
> I say this is easier to understand, given that #inject:into: is
> already understood. #inject:into: and #fold: / #reduce: are very
> close in what they do. So close, that it is best to show how they
> differ. That's what my implementation does. Showing the use of
> #inject:into: (with proper names for block args) is a didactic
> bonus.
>
>
> Also the use of inject:into: isn't really buying you anything
> since the logic in the block within the inject:into: usage is
> as complex as that within fold: or reduce:. Further, this /is/
> a lot slower because of the extra level of evaluation (there's
> an extra block activation around each element).
>
>
> best,
> Eliot
>
>
> No, it is not slower:
> [100000 timesRepeat: [#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up'
> 'to' 'me') fold: [:a :b | a, ' ', b]]] timeToRun 2710 | 879
> [100000 timesRepeat: [#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up'
> 'to' 'me') foldx: [:a :b | a, ' ', b]]] timeToRun 2723 | 874
> [100000 timesRepeat: [#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up'
> 'to' 'me') reduce: [:a :b | a, ' ', b]]] timeToRun 2780 | 881
>
>
> Be very careful to run these under the same conditions. You'd need to
> run each of these immediately following a GC to get meaningful
> results. Also use 1 to: n do for the outer loop since timesRepeat:
> uses do: and so the measurement harness is probably a significant
> fraction of the run-time. Also measure the single-element cost. BTW,
> I would use (1 to: 10) fold: [:a :b| a + b] to reduce the overhead of
> string concatenation & allocation and focus more on the costs of the
> various fold: implementations.

You're right wrt performance (only tried Cog on this example):
Smalltalk garbageCollect. [1 to: 200000 do: [ :i | (1 to: 100) fold:
[:a :b| a + b] ]] timeToRun 1367
Smalltalk garbageCollect. [1 to: 200000 do: [ :i | (1 to: 100)
foldx: [:a :b| a + b] ]] timeToRun 1656
Smalltalk garbageCollect. [1 to: 200000 do: [ :i | (1 to: 100)
reduce: [:a :b| a + b] ]] timeToRun 1306

>
>
>
> #foldx: is my implementation. In all cases, the first number is
> regular interpreter Squeak 4.2.4beta1U.app, the second number is
> Cog Squeak 5.8b10-1.app on a 1.6GHz Mac mini.
>
> The argument about complexity is more aesthetic than anything.
>
>
> Indeed; and so a matter of taste. My aesthetics end up preferring
> fold: over the inject:into: based one :/
>
>
>
> My version doesn't need assignment to outer temps, and I see that
> 'cleaner' and closer to FP.
>
>
>
> But the assignment to the outer temp is hidden in inject:into:; it's
> still there ;)

:)

Cheers,
Juan Vuletich

Eliot Miranda-2

Re: Collection's #fold: vs #reduce:

Hi Juan,

On Tue, Nov 2, 2010 at 2:35 PM, Juan Vuletich <[hidden email]> wrote:

Eliot Miranda wrote:

On Tue, Nov 2, 2010 at 1:50 PM, Juan Vuletich <[hidden email] <mailto:[hidden email]>> wrote:

Eliot Miranda wrote:

On Tue, Nov 2, 2010 at 1:23 PM, Juan Vuletich
<[hidden email] <mailto:[hidden email]>
<mailto:[hidden email] <mailto:[hidden email]>>> wrote:

Nicolas Cellier wrote:

I'd say which selector are implemented in Pharo and Cuis ?
We should better converge.

Nicolas

Hi folks,

Cuis doesn't include any of them yet. I can add whatever people
prefer. What I'd do different is the implementation:

fold: aBinaryBlock

"Evaluate the block with the first two elements of the
receiver,
then with the result of the first evaluation and the
next element,
and so on. Answer the result of the final evaluation.
If the
receiver
is empty, raise an error. If the receiver has a single
element,
answer
that element."
"
#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me')
fold: [:a
:b | a, ' ', b]
"
| noPreviousValue |
noPreviousValue := Object new. "something that can't be in
the receiver"
^self inject: noPreviousValue into: [ :previousValue :each |
previousValue == noPreviousValue
ifTrue: [ each ]
ifFalse: [ aBinaryBlock value: previousValue
value: each ]]

This is easier to understand, and it also makes clear the
relation
between #fold: (or #reduce:) and #inject:into: .

I disagree. inject:into: is not particularly easy to
understand, whereas both the existing fold: and reduce: are
understandable in terms of do:.

I say this is easier to understand, given that #inject:into: is
already understood. #inject:into: and #fold: / #reduce: are very
close in what they do. So close, that it is best to show how they
differ. That's what my implementation does. Showing the use of
#inject:into: (with proper names for block args) is a didactic
bonus.

Also the use of inject:into: isn't really buying you anything
since the logic in the block within the inject:into: usage is
as complex as that within fold: or reduce:. Further, this /is/
a lot slower because of the extra level of evaluation (there's
an extra block activation around each element).

best,
Eliot

No, it is not slower:
[100000 timesRepeat: [#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up'
'to' 'me') fold: [:a :b | a, ' ', b]]] timeToRun 2710 | 879
[100000 timesRepeat: [#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up'
'to' 'me') foldx: [:a :b | a, ' ', b]]] timeToRun 2723 | 874
[100000 timesRepeat: [#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up'
'to' 'me') reduce: [:a :b | a, ' ', b]]] timeToRun 2780 | 881

Be very careful to run these under the same conditions. You'd need to run each of these immediately following a GC to get meaningful results. Also use 1 to: n do for the outer loop since timesRepeat: uses do: and so the measurement harness is probably a significant fraction of the run-time. Also measure the single-element cost. BTW, I would use (1 to: 10) fold: [:a :b| a + b] to reduce the overhead of string concatenation & allocation and focus more on the costs of the various fold: implementations.

You're right wrt performance (only tried Cog on this example):
Smalltalk garbageCollect. [1 to: 200000 do: [ :i | (1 to: 100) fold: [:a :b| a + b] ]] timeToRun 1367
Smalltalk garbageCollect. [1 to: 200000 do: [ :i | (1 to: 100) foldx: [:a :b| a + b] ]] timeToRun 1656
Smalltalk garbageCollect. [1 to: 200000 do: [ :i | (1 to: 100) reduce: [:a :b| a + b] ]] timeToRun 1306

Yes, alas one has to be hyper-careful. For example if I run the following in my current image:

Smalltalk garbageCollect.

{ [1 to: 1000000 do: [ :i | (1 to: 100) fold: [:a :b| a + b] ]] timeToRun.

[1 to: 1000000 do: [ :i | (1 to: 100) foldAlt: [:a :b| a + b] ]] timeToRun.
[1 to: 1000000 do: [ :i | (1 to: 100) reduce: [:a :b| a + b] ]] timeToRun }

chances are that the second one will be fastest, irrespective of which implementation is used. So change the order to foldAlt:, fold: reduce: and fold: will win, etc.

Levente is very careful to measure the implementations precisely, so we can trust his numbers.

cheers,

Eliot

#foldx: is my implementation. In all cases, the first number is
regular interpreter Squeak 4.2.4beta1U.app, the second number is
Cog Squeak 5.8b10-1.app on a 1.6GHz Mac mini.

The argument about complexity is more aesthetic than anything.

Indeed; and so a matter of taste. My aesthetics end up preferring fold: over the inject:into: based one :/

My version doesn't need assignment to outer temps, and I see that
'cleaner' and closer to FP.

But the assignment to the outer temp is hidden in inject:into:; it's still there ;)

:)

Cheers,
Juan Vuletich