Hi guys,
Collection defines #sum:, #detectSum: and #sumNumbers:, all of which accomplish the same (in principal) but with subtle differences: #sum: - uses #inject:into: with #anyOne as the injected element and will thus fail for empty collections #detectSum: - uses “nextValue + sum” instead of “sum + nextValue” which makes it a lot slower when dealing with large numbers (primitive fails and number conversion is necessary) #sumNumbers: - same as #sum but doesn’t fail for empty collections. Only good for numbers though. Benchmarks: [ 100 timesRepeat: [ (1 to: 1000000) sum: #yourself ] ] timeToRun 18062 [ 100 timesRepeat: [ (1 to: 1000000) detectSum: #yourself ] ] timeToRun 42391 [ 100 timesRepeat: [ (1 to: 1000000) sumNumbers: #yourself ] ] timeToRun 18096 Can we settle for a single implementation? Such as (modified from #sumNumbers:): newSum: aBlock ^ self inject: (self ifEmpty: [ 0 ] ifNotEmpty: [ self anyOne ]) into: [ :sum :each | sum + (aBlock value: each) ] This implementation combines the best of the three implementations I think. Benchmark: [ 100 timesRepeat: [ (1 to: 1000000) newSum: #yourself ] ] timeToRun 17955 BTW, there is also the message #sum, which suffers from the same problem as #sum: (the implementation is basically a copy). #sum could be implemented as sum ^ self sum: [ :x | x ] As for the name for the new method, I suggest #sum:. Cheers, Max |
Hi Max, Interesting results... replacing the #yourself in newSum: cuts the runtime by a factor of two. using an [:x | x ] block cost 10% compared to a direct version of sum. So the fastest sum is: sum ^ self inject: (self ifEmpty: [ ^ 0 ] ifNotEmpty: [ self anyOne ]) into: [ :sum :each | sum + each ] And sum ^ self sum: [:x | x ] is 10% slower. And sum ^ self sum: #yourself is 100% slower (i.e. x2) It may be wise to avoid using #yourself for a block. Thierry 2015-12-01 9:17 GMT+01:00 Max Leske <[hidden email]>: Hi guys, |
In reply to this post by Max Leske
The differences between #sum: and #sumNumbers: are well described in
the respective method comments, but I agree that this may be a problem as most people will go for #sum: even for basic numeric collections. (And I personally got bit by this.) So maybe #sum: could adopt #sumNumbers: implementation, and the current #sum: implementation would be moved to #domainSum: (or some other appropriate name...)? As for #detectSum: I think this method has confusing name and is functionally duplicate to #sum:. The swapped arguments there may make sense when you overload #+, however the same thing can be achieved from the provided block ... On 12/01, Max Leske wrote: > Hi guys, > > Collection defines #sum:, #detectSum: and #sumNumbers:, all of which accomplish the same (in principal) but with subtle differences: > > #sum: > - uses #inject:into: with #anyOne as the injected element and will thus fail for empty collections > > #detectSum: > - uses “nextValue + sum” instead of “sum + nextValue” which makes it a lot slower when dealing with large numbers (primitive fails and number conversion is necessary) > > #sumNumbers: > - same as #sum but doesn’t fail for empty collections. Only good for numbers though. > > > Benchmarks: > > [ 100 timesRepeat: [ (1 to: 1000000) sum: #yourself ] ] timeToRun 18062 > [ 100 timesRepeat: [ (1 to: 1000000) detectSum: #yourself ] ] timeToRun 42391 > [ 100 timesRepeat: [ (1 to: 1000000) sumNumbers: #yourself ] ] timeToRun 18096 > > > > Can we settle for a single implementation? Such as (modified from #sumNumbers:): > > newSum: aBlock > ^ self > inject: (self > ifEmpty: [ 0 ] > ifNotEmpty: [ self anyOne ]) > into: [ :sum :each | sum + (aBlock value: each) ] > > This implementation combines the best of the three implementations I think. Benchmark: > > [ 100 timesRepeat: [ (1 to: 1000000) newSum: #yourself ] ] timeToRun 17955 > > > BTW, there is also the message #sum, which suffers from the same problem as #sum: (the implementation is basically a copy). #sum could be implemented as > > sum > ^ self sum: [ :x | x ] > > > As for the name for the new method, I suggest #sum:. > > Cheers, > Max -- Peter |
In reply to this post by Thierry Goubier
I am all for a cleanup, the current situation is confusing.
The basic #sum should be fast AND work for empty collections with 0 as starting element. I know why the #anyOne is used, and that use case should be preserved, but it is less common IMHO. > On 01 Dec 2015, at 09:38, Thierry Goubier <[hidden email]> wrote: > > Hi Max, > > Interesting results... > > replacing the #yourself in newSum: cuts the runtime by a factor of two. > using an [:x | x ] block cost 10% compared to a direct version of sum. > > So the fastest sum is: > > sum > ^ self > inject: > (self ifEmpty: [ ^ 0 ] ifNotEmpty: [ self anyOne ]) > into: [ :sum :each | sum + each ] > > And > > sum > ^ self sum: [:x | x ] > > is 10% slower. > > And > > sum > ^ self sum: #yourself > > is 100% slower (i.e. x2) > > It may be wise to avoid using #yourself for a block. > > Thierry > > > 2015-12-01 9:17 GMT+01:00 Max Leske <[hidden email]>: > Hi guys, > > Collection defines #sum:, #detectSum: and #sumNumbers:, all of which accomplish the same (in principal) but with subtle differences: > > #sum: > - uses #inject:into: with #anyOne as the injected element and will thus fail for empty collections > > #detectSum: > - uses “nextValue + sum” instead of “sum + nextValue” which makes it a lot slower when dealing with large numbers (primitive fails and number conversion is necessary) > > #sumNumbers: > - same as #sum but doesn’t fail for empty collections. Only good for numbers though. > > > Benchmarks: > > [ 100 timesRepeat: [ (1 to: 1000000) sum: #yourself ] ] timeToRun 18062 > [ 100 timesRepeat: [ (1 to: 1000000) detectSum: #yourself ] ] timeToRun 42391 > [ 100 timesRepeat: [ (1 to: 1000000) sumNumbers: #yourself ] ] timeToRun 18096 > > > > Can we settle for a single implementation? Such as (modified from #sumNumbers:): > > newSum: aBlock > ^ self > inject: (self > ifEmpty: [ 0 ] > ifNotEmpty: [ self anyOne ]) > into: [ :sum :each | sum + (aBlock value: each) ] > > This implementation combines the best of the three implementations I think. Benchmark: > > [ 100 timesRepeat: [ (1 to: 1000000) newSum: #yourself ] ] timeToRun 17955 > > > BTW, there is also the message #sum, which suffers from the same problem as #sum: (the implementation is basically a copy). #sum could be implemented as > > sum > ^ self sum: [ :x | x ] > > > As for the name for the new method, I suggest #sum:. > > Cheers, > Max > |
On Tue, Dec 1, 2015 at 4:45 PM, Sven Van Caekenberghe <[hidden email]> wrote:
> I am all for a cleanup, the current situation is confusing. > The basic #sum should be fast AND work for empty collections with 0 as starting element. > I know why the #anyOne is used, and that use case should be preserved, but it is less common IMHO. I'm curious to be learn why? cheers -ben > >> On 01 Dec 2015, at 09:38, Thierry Goubier <[hidden email]> wrote: >> >> Hi Max, >> >> Interesting results... >> >> replacing the #yourself in newSum: cuts the runtime by a factor of two. >> using an [:x | x ] block cost 10% compared to a direct version of sum. >> >> So the fastest sum is: >> >> sum >> ^ self >> inject: >> (self ifEmpty: [ ^ 0 ] ifNotEmpty: [ self anyOne ]) >> into: [ :sum :each | sum + each ] >> >> And >> >> sum >> ^ self sum: [:x | x ] >> >> is 10% slower. >> >> And >> >> sum >> ^ self sum: #yourself >> >> is 100% slower (i.e. x2) >> >> It may be wise to avoid using #yourself for a block. >> >> Thierry >> >> >> 2015-12-01 9:17 GMT+01:00 Max Leske <[hidden email]>: >> Hi guys, >> >> Collection defines #sum:, #detectSum: and #sumNumbers:, all of which accomplish the same (in principal) but with subtle differences: >> >> #sum: >> - uses #inject:into: with #anyOne as the injected element and will thus fail for empty collections >> >> #detectSum: >> - uses “nextValue + sum” instead of “sum + nextValue” which makes it a lot slower when dealing with large numbers (primitive fails and number conversion is necessary) >> >> #sumNumbers: >> - same as #sum but doesn’t fail for empty collections. Only good for numbers though. >> >> >> Benchmarks: >> >> [ 100 timesRepeat: [ (1 to: 1000000) sum: #yourself ] ] timeToRun 18062 >> [ 100 timesRepeat: [ (1 to: 1000000) detectSum: #yourself ] ] timeToRun 42391 >> [ 100 timesRepeat: [ (1 to: 1000000) sumNumbers: #yourself ] ] timeToRun 18096 >> >> >> >> Can we settle for a single implementation? Such as (modified from #sumNumbers:): >> >> newSum: aBlock >> ^ self >> inject: (self >> ifEmpty: [ 0 ] >> ifNotEmpty: [ self anyOne ]) >> into: [ :sum :each | sum + (aBlock value: each) ] >> >> This implementation combines the best of the three implementations I think. Benchmark: >> >> [ 100 timesRepeat: [ (1 to: 1000000) newSum: #yourself ] ] timeToRun 17955 >> >> >> BTW, there is also the message #sum, which suffers from the same problem as #sum: (the implementation is basically a copy). #sum could be implemented as >> >> sum >> ^ self sum: [ :x | x ] >> >> >> As for the name for the new method, I suggest #sum:. >> >> Cheers, >> Max >> > > |
Note that the sum with #anyOne is wrong. #(1 2 3 4 5) inject: #(1 2 3 4 5) anyOne into: [ :sum :each | sum + each ] returns 16 instead of 15. Thierry 2015-12-01 10:18 GMT+01:00 Ben Coman <[hidden email]>: On Tue, Dec 1, 2015 at 4:45 PM, Sven Van Caekenberghe <[hidden email]> wrote: |
In reply to this post by Ben Coman
> On 01 Dec 2015, at 10:18, Ben Coman <[hidden email]> wrote: > > On Tue, Dec 1, 2015 at 4:45 PM, Sven Van Caekenberghe <[hidden email]> wrote: >> I am all for a cleanup, the current situation is confusing. >> The basic #sum should be fast AND work for empty collections with 0 as starting element. > >> I know why the #anyOne is used, and that use case should be preserved, but it is less common IMHO. > > I'm curious to be learn why? How many times have you summed a collection of things, where (1) you could not start from the number 0 and (2) the operation was #+ ? Me, never, and if I would need that, I would do an #inject:into: myself. Any counter examples that occur a lot ? > cheers -ben > >> >>> On 01 Dec 2015, at 09:38, Thierry Goubier <[hidden email]> wrote: >>> >>> Hi Max, >>> >>> Interesting results... >>> >>> replacing the #yourself in newSum: cuts the runtime by a factor of two. >>> using an [:x | x ] block cost 10% compared to a direct version of sum. >>> >>> So the fastest sum is: >>> >>> sum >>> ^ self >>> inject: >>> (self ifEmpty: [ ^ 0 ] ifNotEmpty: [ self anyOne ]) >>> into: [ :sum :each | sum + each ] >>> >>> And >>> >>> sum >>> ^ self sum: [:x | x ] >>> >>> is 10% slower. >>> >>> And >>> >>> sum >>> ^ self sum: #yourself >>> >>> is 100% slower (i.e. x2) >>> >>> It may be wise to avoid using #yourself for a block. >>> >>> Thierry >>> >>> >>> 2015-12-01 9:17 GMT+01:00 Max Leske <[hidden email]>: >>> Hi guys, >>> >>> Collection defines #sum:, #detectSum: and #sumNumbers:, all of which accomplish the same (in principal) but with subtle differences: >>> >>> #sum: >>> - uses #inject:into: with #anyOne as the injected element and will thus fail for empty collections >>> >>> #detectSum: >>> - uses “nextValue + sum” instead of “sum + nextValue” which makes it a lot slower when dealing with large numbers (primitive fails and number conversion is necessary) >>> >>> #sumNumbers: >>> - same as #sum but doesn’t fail for empty collections. Only good for numbers though. >>> >>> >>> Benchmarks: >>> >>> [ 100 timesRepeat: [ (1 to: 1000000) sum: #yourself ] ] timeToRun 18062 >>> [ 100 timesRepeat: [ (1 to: 1000000) detectSum: #yourself ] ] timeToRun 42391 >>> [ 100 timesRepeat: [ (1 to: 1000000) sumNumbers: #yourself ] ] timeToRun 18096 >>> >>> >>> >>> Can we settle for a single implementation? Such as (modified from #sumNumbers:): >>> >>> newSum: aBlock >>> ^ self >>> inject: (self >>> ifEmpty: [ 0 ] >>> ifNotEmpty: [ self anyOne ]) >>> into: [ :sum :each | sum + (aBlock value: each) ] >>> >>> This implementation combines the best of the three implementations I think. Benchmark: >>> >>> [ 100 timesRepeat: [ (1 to: 1000000) newSum: #yourself ] ] timeToRun 17955 >>> >>> >>> BTW, there is also the message #sum, which suffers from the same problem as #sum: (the implementation is basically a copy). #sum could be implemented as >>> >>> sum >>> ^ self sum: [ :x | x ] >>> >>> >>> As for the name for the new method, I suggest #sum:. >>> >>> Cheers, >>> Max >>> >> >> > |
In reply to this post by Thierry Goubier
> On 01 Dec 2015, at 10:24, Thierry Goubier <[hidden email]> wrote: > > Note that the sum with #anyOne is wrong. > > #(1 2 3 4 5) inject: #(1 2 3 4 5) anyOne into: [ :sum :each | sum + each ] > > returns 16 instead of 15. You have subtract the element you picked with #anyOne ! > Thierry > > 2015-12-01 10:18 GMT+01:00 Ben Coman <[hidden email]>: > On Tue, Dec 1, 2015 at 4:45 PM, Sven Van Caekenberghe <[hidden email]> wrote: > > I am all for a cleanup, the current situation is confusing. > > The basic #sum should be fast AND work for empty collections with 0 as starting element. > > > I know why the #anyOne is used, and that use case should be preserved, but it is less common IMHO. > > I'm curious to be learn why? > cheers -ben > > > > >> On 01 Dec 2015, at 09:38, Thierry Goubier <[hidden email]> wrote: > >> > >> Hi Max, > >> > >> Interesting results... > >> > >> replacing the #yourself in newSum: cuts the runtime by a factor of two. > >> using an [:x | x ] block cost 10% compared to a direct version of sum. > >> > >> So the fastest sum is: > >> > >> sum > >> ^ self > >> inject: > >> (self ifEmpty: [ ^ 0 ] ifNotEmpty: [ self anyOne ]) > >> into: [ :sum :each | sum + each ] > >> > >> And > >> > >> sum > >> ^ self sum: [:x | x ] > >> > >> is 10% slower. > >> > >> And > >> > >> sum > >> ^ self sum: #yourself > >> > >> is 100% slower (i.e. x2) > >> > >> It may be wise to avoid using #yourself for a block. > >> > >> Thierry > >> > >> > >> 2015-12-01 9:17 GMT+01:00 Max Leske <[hidden email]>: > >> Hi guys, > >> > >> Collection defines #sum:, #detectSum: and #sumNumbers:, all of which accomplish the same (in principal) but with subtle differences: > >> > >> #sum: > >> - uses #inject:into: with #anyOne as the injected element and will thus fail for empty collections > >> > >> #detectSum: > >> - uses “nextValue + sum” instead of “sum + nextValue” which makes it a lot slower when dealing with large numbers (primitive fails and number conversion is necessary) > >> > >> #sumNumbers: > >> - same as #sum but doesn’t fail for empty collections. Only good for numbers though. > >> > >> > >> Benchmarks: > >> > >> [ 100 timesRepeat: [ (1 to: 1000000) sum: #yourself ] ] timeToRun 18062 > >> [ 100 timesRepeat: [ (1 to: 1000000) detectSum: #yourself ] ] timeToRun 42391 > >> [ 100 timesRepeat: [ (1 to: 1000000) sumNumbers: #yourself ] ] timeToRun 18096 > >> > >> > >> > >> Can we settle for a single implementation? Such as (modified from #sumNumbers:): > >> > >> newSum: aBlock > >> ^ self > >> inject: (self > >> ifEmpty: [ 0 ] > >> ifNotEmpty: [ self anyOne ]) > >> into: [ :sum :each | sum + (aBlock value: each) ] > >> > >> This implementation combines the best of the three implementations I think. Benchmark: > >> > >> [ 100 timesRepeat: [ (1 to: 1000000) newSum: #yourself ] ] timeToRun 17955 > >> > >> > >> BTW, there is also the message #sum, which suffers from the same problem as #sum: (the implementation is basically a copy). #sum could be implemented as > >> > >> sum > >> ^ self sum: [ :x | x ] > >> > >> > >> As for the name for the new method, I suggest #sum:. > >> > >> Cheers, > >> Max > >> > > > > > > |
2015-12-01 10:26 GMT+01:00 Sven Van Caekenberghe <[hidden email]>:
I think he means the suggested implementation for newSum is wrong, just because of this reason.
|
In reply to this post by Sven Van Caekenberghe-2
2015-12-01 10:26 GMT+01:00 Sven Van Caekenberghe <[hidden email]>:
wokring with things like Color (Color wheel:10) sum "works" (Color wheel:10) newSum:#yourself " would not work"
|
In reply to this post by Nicolai Hess-3-2
2015-12-01 10:29 GMT+01:00 Nicolai Hess <[hidden email]>:
Exactly ;) Thierry |
In reply to this post by Nicolai Hess-3-2
The basic question for me is, what should
#() sum return. Right now, it is an error, I would very much like that for this common case the result would be 0. There is a lot of power (easy of use) in a unary selector, we should not destroy that with semantics that force a test before using it. If we agree on that, we could change the #sum to | sum sample | self isEmpty ifTrue: [ ^ 0 ]. sample := self anyOne. sum := self inject: sample into: [ :accum :each | accum + each ]. ^ sum - sample And the same for #sum: As for #detectSum: that should be removed I think. > On 01 Dec 2015, at 10:33, Nicolai Hess <[hidden email]> wrote: > > > > 2015-12-01 10:26 GMT+01:00 Sven Van Caekenberghe <[hidden email]>: > > > On 01 Dec 2015, at 10:18, Ben Coman <[hidden email]> wrote: > > > > On Tue, Dec 1, 2015 at 4:45 PM, Sven Van Caekenberghe <[hidden email]> wrote: > >> I am all for a cleanup, the current situation is confusing. > >> The basic #sum should be fast AND work for empty collections with 0 as starting element. > > > >> I know why the #anyOne is used, and that use case should be preserved, but it is less common IMHO. > > > > I'm curious to be learn why? > > How many times have you summed a collection of things, where (1) you could not start from the number 0 and (2) the operation was #+ ? Me, never, and if I would need that, I would do an #inject:into: myself. Any counter examples that occur a lot ? > > wokring with things like Color > > (Color wheel:10) sum "works" > (Color wheel:10) newSum:#yourself " would not work" > > > > > cheers -ben > > > >> > >>> On 01 Dec 2015, at 09:38, Thierry Goubier <[hidden email]> wrote: > >>> > >>> Hi Max, > >>> > >>> Interesting results... > >>> > >>> replacing the #yourself in newSum: cuts the runtime by a factor of two. > >>> using an [:x | x ] block cost 10% compared to a direct version of sum. > >>> > >>> So the fastest sum is: > >>> > >>> sum > >>> ^ self > >>> inject: > >>> (self ifEmpty: [ ^ 0 ] ifNotEmpty: [ self anyOne ]) > >>> into: [ :sum :each | sum + each ] > >>> > >>> And > >>> > >>> sum > >>> ^ self sum: [:x | x ] > >>> > >>> is 10% slower. > >>> > >>> And > >>> > >>> sum > >>> ^ self sum: #yourself > >>> > >>> is 100% slower (i.e. x2) > >>> > >>> It may be wise to avoid using #yourself for a block. > >>> > >>> Thierry > >>> > >>> > >>> 2015-12-01 9:17 GMT+01:00 Max Leske <[hidden email]>: > >>> Hi guys, > >>> > >>> Collection defines #sum:, #detectSum: and #sumNumbers:, all of which accomplish the same (in principal) but with subtle differences: > >>> > >>> #sum: > >>> - uses #inject:into: with #anyOne as the injected element and will thus fail for empty collections > >>> > >>> #detectSum: > >>> - uses “nextValue + sum” instead of “sum + nextValue” which makes it a lot slower when dealing with large numbers (primitive fails and number conversion is necessary) > >>> > >>> #sumNumbers: > >>> - same as #sum but doesn’t fail for empty collections. Only good for numbers though. > >>> > >>> > >>> Benchmarks: > >>> > >>> [ 100 timesRepeat: [ (1 to: 1000000) sum: #yourself ] ] timeToRun 18062 > >>> [ 100 timesRepeat: [ (1 to: 1000000) detectSum: #yourself ] ] timeToRun 42391 > >>> [ 100 timesRepeat: [ (1 to: 1000000) sumNumbers: #yourself ] ] timeToRun 18096 > >>> > >>> > >>> > >>> Can we settle for a single implementation? Such as (modified from #sumNumbers:): > >>> > >>> newSum: aBlock > >>> ^ self > >>> inject: (self > >>> ifEmpty: [ 0 ] > >>> ifNotEmpty: [ self anyOne ]) > >>> into: [ :sum :each | sum + (aBlock value: each) ] > >>> > >>> This implementation combines the best of the three implementations I think. Benchmark: > >>> > >>> [ 100 timesRepeat: [ (1 to: 1000000) newSum: #yourself ] ] timeToRun 17955 > >>> > >>> > >>> BTW, there is also the message #sum, which suffers from the same problem as #sum: (the implementation is basically a copy). #sum could be implemented as > >>> > >>> sum > >>> ^ self sum: [ :x | x ] > >>> > >>> > >>> As for the name for the new method, I suggest #sum:. > >>> > >>> Cheers, > >>> Max > >>> > >> > >> > > > > > |
2015-12-01 11:46 GMT+01:00 Sven Van Caekenberghe <[hidden email]>: The basic question for me is, what should Yes, this is just that a 0 value is returned for an empty collection, and, if you're expecting a sum of Colors, then this is a bit strange. Returning an error on a sum for an empty collection is maybe an overall cleaner concept Which can be expressed as: I prefer this code: aCollectionOfColors isEmpty ifTrue: [ sumOfColors := aDefaultColor ] ifFalse: [ sumOfColors := aCollectionOfColors sum ] To this code: sumOfColors := aCollectionOfColors sum. sumOfColors = 0 ifTrue: [ sumOfColors := aDefaultColor ]
Ok, now I really like this implementation. Elegant. Thierry
|
On 12/01, Thierry Goubier wrote:
> 2015-12-01 11:46 GMT+01:00 Sven Van Caekenberghe <[hidden email]>: > > > The basic question for me is, what should > > > > #() sum > > > > return. Right now, it is an error, I would very much like that for this > > common case the result would be 0. There is a lot of power (easy of use) in > > a unary selector, we should not destroy that with semantics that force a > > test before using it. > > > > Yes, this is just that a 0 value is returned for an empty collection, and, > if you're expecting a sum of Colors, then this is a bit strange. > > Returning an error on a sum for an empty collection is maybe an overall > cleaner concept > > Which can be expressed as: > > I prefer this code: > > aCollectionOfColors isEmpty > ifTrue: [ sumOfColors := aDefaultColor ] > ifFalse: [ sumOfColors := aCollectionOfColors sum ] > > To this code: > > sumOfColors := aCollectionOfColors sum. > sumOfColors = 0 > ifTrue: [ sumOfColors := aDefaultColor ] or sumOfColors := (aCollectionOfColors ifEmpty: [ {aDefaultColor} ]) sum. > > > > > > If we agree on that, we could change the #sum to > > > > | sum sample | > > self isEmpty ifTrue: [ ^ 0 ]. > > sample := self anyOne. > > sum := self inject: sample into: [ :accum :each | accum + each ]. > > ^ sum - sample > > > > Ok, now I really like this implementation. Elegant. Why the temp variables? self ifEmpty: [ ^ 0 ]. ^ self inject: self anyOne negated into: [ :sum :each | sum + each ] But more importantly, if the empty/default is 0 then what is the reason for #anyOne? Otherwise you are mixing things. self ifEmpty: [ ^ 0 ]. ^ self inject: 0 into: [ :sum :each | sum + each ]. Peter > > Thierry > > > > > > And the same for #sum: > > > > As for #detectSum: that should be removed I think. > > > > > On 01 Dec 2015, at 10:33, Nicolai Hess <[hidden email]> wrote: > > > > > > > > > > > > 2015-12-01 10:26 GMT+01:00 Sven Van Caekenberghe <[hidden email]>: > > > > > > > On 01 Dec 2015, at 10:18, Ben Coman <[hidden email]> wrote: > > > > > > > > On Tue, Dec 1, 2015 at 4:45 PM, Sven Van Caekenberghe <[hidden email]> > > wrote: > > > >> I am all for a cleanup, the current situation is confusing. > > > >> The basic #sum should be fast AND work for empty collections with 0 > > as starting element. > > > > > > > >> I know why the #anyOne is used, and that use case should be > > preserved, but it is less common IMHO. > > > > > > > > I'm curious to be learn why? > > > > > > How many times have you summed a collection of things, where (1) you > > could not start from the number 0 and (2) the operation was #+ ? Me, never, > > and if I would need that, I would do an #inject:into: myself. Any counter > > examples that occur a lot ? > > > > > > wokring with things like Color > > > > > > (Color wheel:10) sum "works" > > > (Color wheel:10) newSum:#yourself " would not work" > > > > > > > > > > > > > cheers -ben > > > > > > > >> > > > >>> On 01 Dec 2015, at 09:38, Thierry Goubier <[hidden email]> > > wrote: > > > >>> > > > >>> Hi Max, > > > >>> > > > >>> Interesting results... > > > >>> > > > >>> replacing the #yourself in newSum: cuts the runtime by a factor of > > two. > > > >>> using an [:x | x ] block cost 10% compared to a direct version of > > sum. > > > >>> > > > >>> So the fastest sum is: > > > >>> > > > >>> sum > > > >>> ^ self > > > >>> inject: > > > >>> (self ifEmpty: [ ^ 0 ] ifNotEmpty: [ self > > anyOne ]) > > > >>> into: [ :sum :each | sum + each ] > > > >>> > > > >>> And > > > >>> > > > >>> sum > > > >>> ^ self sum: [:x | x ] > > > >>> > > > >>> is 10% slower. > > > >>> > > > >>> And > > > >>> > > > >>> sum > > > >>> ^ self sum: #yourself > > > >>> > > > >>> is 100% slower (i.e. x2) > > > >>> > > > >>> It may be wise to avoid using #yourself for a block. > > > >>> > > > >>> Thierry > > > >>> > > > >>> > > > >>> 2015-12-01 9:17 GMT+01:00 Max Leske <[hidden email]>: > > > >>> Hi guys, > > > >>> > > > >>> Collection defines #sum:, #detectSum: and #sumNumbers:, all of which > > accomplish the same (in principal) but with subtle differences: > > > >>> > > > >>> #sum: > > > >>> - uses #inject:into: with #anyOne as the injected element and will > > thus fail for empty collections > > > >>> > > > >>> #detectSum: > > > >>> - uses “nextValue + sum” instead of “sum + nextValue” which makes it > > a lot slower when dealing with large numbers (primitive fails and number > > conversion is necessary) > > > >>> > > > >>> #sumNumbers: > > > >>> - same as #sum but doesn’t fail for empty collections. Only good for > > numbers though. > > > >>> > > > >>> > > > >>> Benchmarks: > > > >>> > > > >>> [ 100 timesRepeat: [ (1 to: 1000000) sum: #yourself ] ] timeToRun > > 18062 > > > >>> [ 100 timesRepeat: [ (1 to: 1000000) detectSum: #yourself ] ] > > timeToRun 42391 > > > >>> [ 100 timesRepeat: [ (1 to: 1000000) sumNumbers: #yourself ] ] > > timeToRun 18096 > > > >>> > > > >>> > > > >>> > > > >>> Can we settle for a single implementation? Such as (modified from > > #sumNumbers:): > > > >>> > > > >>> newSum: aBlock > > > >>> ^ self > > > >>> inject: (self > > > >>> ifEmpty: [ 0 ] > > > >>> ifNotEmpty: [ self anyOne ]) > > > >>> into: [ :sum :each | sum + (aBlock value: each) ] > > > >>> > > > >>> This implementation combines the best of the three implementations I > > think. Benchmark: > > > >>> > > > >>> [ 100 timesRepeat: [ (1 to: 1000000) newSum: #yourself ] ] timeToRun > > 17955 > > > >>> > > > >>> > > > >>> BTW, there is also the message #sum, which suffers from the same > > problem as #sum: (the implementation is basically a copy). #sum could be > > implemented as > > > >>> > > > >>> sum > > > >>> ^ self sum: [ :x | x ] > > > >>> > > > >>> > > > >>> As for the name for the new method, I suggest #sum:. > > > >>> > > > >>> Cheers, > > > >>> Max > > > >>> > > > >> > > > >> > > > > > > > > > > > > > > > > > > > -- Peter |
In reply to this post by Sven Van Caekenberghe-2
On 01-12-15 11:46, Sven Van Caekenberghe wrote:
> The basic question for me is, what should > > #() sum > > return. Right now, it is an error, I would very much like that for this common case the result would be 0. There is a lot of power (easy of use) in a unary selector, we should not destroy that with semantics that force a test before using it. I like the error, it aligns with most of our collection protocol. It shows the need for #sum:ifEmpty: though Stephan |
> On 01 Dec 2015, at 12:45, Stephan Eggermont <[hidden email]> wrote: > > On 01-12-15 11:46, Sven Van Caekenberghe wrote: >> The basic question for me is, what should >> >> #() sum >> >> return. Right now, it is an error, I would very much like that for this common case the result would be 0. There is a lot of power (easy of use) in a unary selector, we should not destroy that with semantics that force a test before using it. > > I like the error, it aligns with most of our collection protocol. I hate the error, a lot ;-) > It shows the need for #sum:ifEmpty: though Yes, as long as #() sum == 0 I want the simplest case to be simple, having a non-0 default is a special case IMHO > Stephan > > > |
Well, I’m glad I've struck a nerve there :)
I agree that signalling an exception isn’t wrong, it’s just annoying because you have to do the check manually each time. I think I like #sum:ifEmpty:. When you look through the methods in Collection you’ll see #sum, #sum: and #sum:ifEmpty together in one place and it will be clear how to use them. > On 01 Dec 2015, at 12:52, Sven Van Caekenberghe <[hidden email]> wrote: > > >> On 01 Dec 2015, at 12:45, Stephan Eggermont <[hidden email]> wrote: >> >> On 01-12-15 11:46, Sven Van Caekenberghe wrote: >>> The basic question for me is, what should >>> >>> #() sum >>> >>> return. Right now, it is an error, I would very much like that for this common case the result would be 0. There is a lot of power (easy of use) in a unary selector, we should not destroy that with semantics that force a test before using it. >> >> I like the error, it aligns with most of our collection protocol. > > I hate the error, a lot ;-) +1 > >> It shows the need for #sum:ifEmpty: though > > Yes, as long as #() sum == 0 +100 That allows for nice Smalltalk like “myNumbers sum isZero …”. > > I want the simplest case to be simple, having a non-0 default is a special case IMHO > >> Stephan >> >> >> > > |
In reply to this post by Stephan Eggermont-3
#sum:ifEmpty: looks nice
> On 01 Dec 2015, at 12:45, Stephan Eggermont <[hidden email]> wrote: > > On 01-12-15 11:46, Sven Van Caekenberghe wrote: >> The basic question for me is, what should >> >> #() sum >> >> return. Right now, it is an error, I would very much like that for this common case the result would be 0. There is a lot of power (easy of use) in a unary selector, we should not destroy that with semantics that force a test before using it. > > I like the error, it aligns with most of our collection protocol. > It shows the need for #sum:ifEmpty: though > > Stephan > > > |
In reply to this post by Sven Van Caekenberghe-2
Hi,
> On Dec 1, 2015, at 12:52 PM, Sven Van Caekenberghe <[hidden email]> wrote: > > >> On 01 Dec 2015, at 12:45, Stephan Eggermont <[hidden email]> wrote: >> >> On 01-12-15 11:46, Sven Van Caekenberghe wrote: >>> The basic question for me is, what should >>> >>> #() sum >>> >>> return. Right now, it is an error, I would very much like that for this common case the result would be 0. There is a lot of power (easy of use) in a unary selector, we should not destroy that with semantics that force a test before using it. >> >> I like the error, it aligns with most of our collection protocol. > > I hate the error, a lot ;-) > >> It shows the need for #sum:ifEmpty: though > > Yes, as long as #() sum == 0 That won’t work :). > I want the simplest case to be simple, having a non-0 default is a special case IMHO That is why you have sumNumbers:. We could also add Collection>>sumNumbers. We had this discussion at length before Pharo 4, and this is when we agreed to add sumNumbers: and let sum: be generic (like the name says it should be) and not assume that it should work with Numbers. Cheers, Doru > >> Stephan >> >> >> > > -- www.tudorgirba.com "Speaking louder won't make the point worthier." |
> On 01 Dec 2015, at 15:11, Tudor Girba <[hidden email]> wrote: > > Hi, > >> On Dec 1, 2015, at 12:52 PM, Sven Van Caekenberghe <[hidden email]> wrote: >> >> >>> On 01 Dec 2015, at 12:45, Stephan Eggermont <[hidden email]> wrote: >>> >>> On 01-12-15 11:46, Sven Van Caekenberghe wrote: >>>> The basic question for me is, what should >>>> >>>> #() sum >>>> >>>> return. Right now, it is an error, I would very much like that for this common case the result would be 0. There is a lot of power (easy of use) in a unary selector, we should not destroy that with semantics that force a test before using it. >>> >>> I like the error, it aligns with most of our collection protocol. >> >> I hate the error, a lot ;-) >> >>> It shows the need for #sum:ifEmpty: though >> >> Yes, as long as #() sum == 0 > > That won’t work :). But wouldn’t it be nice to have #sum:ifEmpty:? I think that this is a nice way to describe summing. Otherwise I have to do ifEmpty: [ something ] ifNotEmpty: [ :col | col sum: … ] Cheers. Uko > >> I want the simplest case to be simple, having a non-0 default is a special case IMHO > > That is why you have sumNumbers:. We could also add Collection>>sumNumbers. > > We had this discussion at length before Pharo 4, and this is when we agreed to add sumNumbers: and let sum: be generic (like the name says it should be) and not assume that it should work with Numbers. > > Cheers, > Doru > >> >>> Stephan >>> >>> >>> >> >> > > -- > www.tudorgirba.com > > "Speaking louder won't make the point worthier." > > |
Free forum by Nabble | Edit this page |