Hello all,
In my explorations today I happened on the implementation of the #sum method: Collection>>sum "This is implemented using a variant of the normal inject:into: pattern. The reason for this is that it is not known whether we're in the normal number line, i.e. whether 0 is a good initial value for the sum. Consider a collection of measurement objects, 0 would be the unitless value and would not be appropriate to add with the unit-ed objects." | sum sample | sample _ self anyOne. sum _ self inject: sample into: [:accum :each | accum + each]. ^ sum - sample Personally, I view the code in the core image to, among other things, serve the role of showing new Smalltalk programmers good ways of using the language. This is exactly the kind of thing I would *not* want people to see. The obvious issue here is that #inject:into: doesn't fit this case very well. But rather then go to these kinds of steps a much better solution would be to simply notice that a specialized variation of #inject:into: (that is in fact quite common) is needed here. Namely a reduction method that just uses the first element as the starting value, conceptually: Collection>>reduce: aBlock ^ self allButFirst inject: self first into: aBlock As it turns out there exists a more efficient implementation of this method in Lukas' Magritte-Model package. I would propose his method be promoted to a core method so #sum could be rewritten as: sum "sum the reciever" ^ self reduce: [:accum :each| accum + each ] Thanks, Jason |
> As it turns out there exists a more efficient implementation of this > method in Lukas' Magritte-Model package. I would propose his method > be promoted to a core method so #sum could be rewritten as: > > sum > "sum the reciever" > ^ self reduce: [:accum :each| accum + each ] This method is called #fold: in VisualWorks, GNU Smalltalk and others. Paolo |
In reply to this post by Jason Johnson-5
> > Collection>>reduce: aBlock > ^ self allButFirst inject: self first into: aBlock > #allButFirst copies the whole collection, where in the current #sum implementation there is no such copy. we should avoid slowing down the method. Stef |
>> Collection>>reduce: aBlock >> ^ self allButFirst inject: self first into: aBlock >> > > #allButFirst copies the whole collection, where in the current #sum > implementation there is no such copy. we should avoid slowing down the > method. That's just an idea, the actual implementation would be marker := value := Object new. self do: [ :each | value := marker == value ifTrue: [ each ] ifFalse: [ aBlock value: value value: each ] ] with an optimized implementation for SequenceableCollection. Paolo |
In reply to this post by Paolo Bonzini-2
On Mon, Aug 11, 2008 at 3:48 PM, Paolo Bonzini <[hidden email]> wrote:
> >> As it turns out there exists a more efficient implementation of this >> method in Lukas' Magritte-Model package. I would propose his method >> be promoted to a core method so #sum could be rewritten as: >> >> sum >> "sum the reciever" >> ^ self reduce: [:accum :each| accum + each ] > > This method is called #fold: in VisualWorks, GNU Smalltalk and others. > > Paolo Ah ok. It's called reduce in python and Lisp, and that's also what Lukas named it in his package. |
In reply to this post by Stéphane Rollandin
On Mon, Aug 11, 2008 at 5:16 PM, Stéphane Rollandin
<[hidden email]> wrote: > >> >> Collection>>reduce: aBlock >> ^ self allButFirst inject: self first into: aBlock >> > > #allButFirst copies the whole collection, where in the current #sum > implementation there is no such copy. we should avoid slowing down the > method. > > Stef I know full well what allButFirst does. That's why I said *conceptually*. I wanted to describe the idea in as concise a manner as possible. Please read the method *I actually suggested to be included*. No copy done there. |
In reply to this post by Jason Johnson-5
Jason Johnson wrote:
> sum > "sum the reciever" > ^ self reduce: [:accum :each| accum + each ] Yuck. (Not to the solution, necessarily, but to the whole idea of "sum".) This is the kind of problem that can't be solved well without a type-inferencing system (for instance, Haskell can implement polymorphic sum over possibly-empty lists well, so long as the instance of the type class of the elements of the collection has a definition for an appropriate zero), or explicit guidance from the programmer. sumWithZero: aZero ^ self inject: aZero into: [:accum :each | each + accum] ...and remove sum entirely. Note that sum doesn't work on empty collections (!) but sumWithZero: does. |
Tony Garnock-Jones wrote:
> a definition for an > appropriate zero Another option is to make zero in Smalltalk inhabit multiple types! (In a similar way to the way nil in Smalltalk inhabits multiple types) Then, sum ^ self inject: 0 into: [:accum :each | accum + each] ...would work, so long as the elements treated 0 as a zero for #+. |
In reply to this post by Jason Johnson-5
>> This method is called #fold: in VisualWorks, GNU Smalltalk and others.
>> >> Paolo > > Ah ok. It's called reduce in python and Lisp, and that's also what > Lukas named it in his package. I think #fold: is the Haskell name (they have both foldl and foldr), but probably it comes from somewhere else. Eliot should know... Paolo |
In reply to this post by Jason Johnson-5
Jason Johnson a écrit :
> I know full well what allButFirst does. That's why I said > *conceptually*. I wanted to describe the idea in as concise a manner > as possible. Please read the method *I actually suggested to be > included*. No copy done there. wow, I did not mean any offense. just my 2 cents. sorry if it felt like I was giving you a lesson or something... Stef |
In reply to this post by Jason Johnson-5
BTW, does anybody have close contact (i.e., working in the same
building, etc.) with Travis Griggs, who has the author initials of these methods? The record indicates that he hasn't turn the signature in, and there are a few methods by him: 'TAG'->methods by Travis Griggs Collection >> floor Collection >> average Collection >> sqrt Collection >> truncated Collection >> log Collection >> reciprocal Collection >> median Collection >> min Collection >> ceiling Collection >> negated Collection >> max Collection >> abs Collection >> range Collection >> rounded Collection >> squared Collection >> sum SequenceableCollection >> @ Yes, the current implementation of #sum isn't cool^^; -- Yoshiki |
In reply to this post by Paolo Bonzini-2
On Mon, Aug 11, 2008 at 9:21 AM, Paolo Bonzini <[hidden email]> wrote:
I wish I knew for sure. I think it is called fold in certain lisps too. But I defer to Wikipedia. They prefer fold but also give reduce, accumulate, compress & inject (their emphasis) as other names.
|
In reply to this post by Paolo Bonzini-2
On Mon, Aug 11, 2008 at 6:21 PM, Paolo Bonzini <[hidden email]> wrote:
>>> This method is called #fold: in VisualWorks, GNU Smalltalk and others. >>> >>> Paolo >> >> Ah ok. It's called reduce in python and Lisp, and that's also what >> Lukas named it in his package. > > I think #fold: is the Haskell name (they have both foldl and foldr), but > probably it comes from somewhere else. Eliot should know... > > Paolo I don't know if Haskell has what I'm calling the "reduce operator". They have foldl and foldr, which are like #inject:into: but have a slightly different behavior. The easiest way to understand it is to imagine a list; #(1 2 3) as: 1 : 2 : 3 : [] where [] is an empty list and : is an operator the appends the left side to the right. So foldl and foldr both replace the : with the function they are passed and the [] with the initial value they are passed. The difference between the two is which way they then resolve. So if they did provide a "resolve" operation (i.e. take one less value and use the first element of the list as the "initial value") I would expect there to be two of them. #inject:into: behaves like a right fold (reduction moves from left to right) but with the [] on the front of the list instead of the end. |
In reply to this post by Tony Garnock-Jones-2
On Mon, Aug 11, 2008 at 6:07 PM, Tony Garnock-Jones <[hidden email]> wrote:
> Jason Johnson wrote: >> >> sum >> "sum the reciever" >> ^ self reduce: [:accum :each| accum + each ] > > Yuck. (Not to the solution, necessarily, but to the whole idea of "sum".) > > This is the kind of problem that can't be solved well without a > type-inferencing system (for instance, Haskell can implement polymorphic sum > over possibly-empty lists well, so long as the instance of the type class of > the elements of the collection has a definition for an appropriate zero), or > explicit guidance from the programmer. > > sumWithZero: aZero > ^ self inject: aZero into: [:accum :each | each + accum] > > ...and remove sum entirely. Note that sum doesn't work on empty collections > (!) but sumWithZero: does. I think I understand what you're saying, but I disagree. In a late bound system we try it and if the types are not what our code expected it fails. In this case the only problem is the case of: #() sum. The problem is we don't know what type of objects the collection was expected to have (and the collection could be hetrogenious anyway) so there is no suitable "nothing" answer. So I would propose that #sum on an empty collection is simply an error (and I believe later in the list it is said to actually do that). |
In reply to this post by Jason Johnson-5
Jason Johnson wrote:
> #inject:into: behaves like a right fold (reduction moves from left to > right) but with the [] on the front of the list instead of the end. It's actually behaving like a *left* fold. The leftness/rightness of a fold is named after the left/right associativity that the combining operator is given -- the difference between ((([] : 1) : 2) : 3) -- a left fold, with (:) invoked -- left-associatively and (1 : (2 : (3 : []))) -- a right fold, with (:) invoked -- right-associatively In haskell, foldr (:) [] =~= id and foldl (flip (:)) [] === reverse Regards, Tony |
In reply to this post by Jason Johnson-5
Jason Johnson wrote:
> So I would propose that #sum > on an empty collection is simply an error (and I believe later in the > list it is said to actually do that). Yes, an error is currently signalled. Which makes the general case: aCollection ifEmpty: [0] ifNotEmpty: [aCollection sum]. ... which, to me, is a bad code smell. Then again, I've been an FPer for a long while. The ifEmpty:ifNotEmpty: corresponds to a "reduce" in SRFI-1 [1] terminology, rather than a proper "fold", and folds are easier to use to write correct general code, in my experience. Maybe sumWithZero: was a silly name -- sumFrom: is perhaps better? I guess this is all bike-sheds, though, since #sum is firmly established, awkwardness of type, of interface and of implementation notwithstanding. sumFrom: x ^ self inject: x into: [:accum :each | accum + each] Regards, Tony [1]: http://srfi.schemers.org/srfi-1/srfi-1.html |
In reply to this post by Tony Garnock-Jones-2
On Tue, Aug 12, 2008 at 11:21 AM, Tony Garnock-Jones <[hidden email]> wrote:
> Jason Johnson wrote: >> >> #inject:into: behaves like a right fold (reduction moves from left to >> right) but with the [] on the front of the list instead of the end. > > It's actually behaving like a *left* fold. The leftness/rightness of a fold > is named after the left/right associativity that the combining operator is > given -- the difference between > > ((([] : 1) : 2) : 3) -- a left fold, with (:) invoked > -- left-associatively > > and > > (1 : (2 : (3 : []))) -- a right fold, with (:) invoked > -- right-associatively > > In haskell, > > foldr (:) [] =~= id > > and > > foldl (flip (:)) [] === reverse > > Regards, > Tony Ah right, left is the one that can work on infinite sets (in Haskell at least). The diagram I looked at before I sent this message appeared to have right fold mean the computation moves from left to right. |
In reply to this post by Tony Garnock-Jones-2
On 12-Aug-08, at 2:30 AM, Tony Garnock-Jones wrote: > Yes, an error is currently signalled. Which makes the general case: > > aCollection ifEmpty: [0] ifNotEmpty: [aCollection sum]. > > .. which, to me, is a bad code smell. Then again, I've been an FPer > for a long while. It's a code smell from an OO perspective too. This is a bit like the #join: debate that comes up every now and then when a Ruby/Python/Perl ex-pat asks how it is that we don't have a method for joining an array of strings into a single string. The short answer is that it's because #join: only makes sense when all the elements of a collection are strings, and therefore, it's bad design. This is similar. I'd say the best way to improve #sum would be to delete it. Ok, compatibility is an issue, breaking existing code would be bad etc. So leave it as is and point anyone who complains to #inject:into:. If that's really too verbose, then what's really needed is a more specific class that understands what this collection of numbers represents. Colin |
> This is a bit like the #join: debate that comes up every now and then
> when a Ruby/Python/Perl ex-pat asks how it is that we don't have a > method for joining an array of strings into a single string. The short > answer is that it's because #join: only makes sense when all the > elements of a collection are strings, and therefore, it's bad design. I disagree. Indeed, if you think of #join: as SequenceableCollection>>join: sep ^self fold: [ :a :b | a, sep, b ] (well, right, if the collection is empty the behavior is different), it becomes obvious to have an optimized implementation return an object of class "self first species". Paolo |
In reply to this post by Colin Putney
Colin Putney wrote:
> I'd say the best way to improve #sum would be to delete > it. Amen. |
Free forum by Nabble | Edit this page |