Frank Shearar uploaded a new version of Collections to project The Inbox:
http://source.squeak.org/inbox/Collections-fbs.489.mcz ==================== Summary ==================== Name: Collections-fbs.489 Author: fbs Time: 20 August 2012, 4:51:36.018 pm UUID: d1834809-65b5-46c0-9b26-fea7ec04a946 Ancestors: Collections-ul.488 SequenceableCollection >> flatten turns a nested SequenceableCollection - a collection of collections of ... - into a simple collection. Thus, #(1 (2 3) (4)) flatten = #(1 2 3 4). =============== Diff against Collections-ul.488 =============== Item was added: + ----- Method: SequenceableCollection>>flatten (in category 'copying') ----- + flatten + | recur stream | + stream := WriteStream on: (Array new: self size). "A lower bound on the final stream count." + recur := [:coll | coll + do: [:each | + each isCollection + ifTrue: [recur value: each] + ifFalse: [stream nextPut: each]]]. + recur value: self. + ^ stream contents.! |
On 20 August 2012 16:51, <[hidden email]> wrote:
> Frank Shearar uploaded a new version of Collections to project The Inbox: > http://source.squeak.org/inbox/Collections-fbs.489.mcz > > ==================== Summary ==================== > > Name: Collections-fbs.489 > Author: fbs > Time: 20 August 2012, 4:51:36.018 pm > UUID: d1834809-65b5-46c0-9b26-fea7ec04a946 > Ancestors: Collections-ul.488 > > SequenceableCollection >> flatten turns a nested SequenceableCollection - a collection of collections of ... - into a simple collection. Thus, #(1 (2 3) (4)) flatten = #(1 2 3 4). > > =============== Diff against Collections-ul.488 =============== > > Item was added: > + ----- Method: SequenceableCollection>>flatten (in category 'copying') ----- > + flatten > + | recur stream | > + stream := WriteStream on: (Array new: self size). "A lower bound on the final stream count." > + recur := [:coll | coll > + do: [:each | > + each isCollection > + ifTrue: [recur value: each] > + ifFalse: [stream nextPut: each]]]. > + recur value: self. > + ^ stream contents.! I've had to implement this a few times now, so I thought it time to offer it up for general consumption. My original implementation was recursive, using #collect: and #inject:into:. This of course used a lot of #copy sends. Using a WriteStream seems more efficient, both in time trials and in looking at MessageTally outputs (where GC usage was a fraction of the recursive implementation, and 40% of CPU time was actually spent in primitives). frank |
Even more efficient might be to: rather than flattening the
collection, another possible solution might be to employ Brent Pinkney's Medley collection in the first place, which treats collections embedded into collections as one flat collection without ever needing to flatten them. Attached, just in case you're interested. - Chris On Mon, Aug 20, 2012 at 10:54 AM, Frank Shearar <[hidden email]> wrote: > On 20 August 2012 16:51, <[hidden email]> wrote: >> Frank Shearar uploaded a new version of Collections to project The Inbox: >> http://source.squeak.org/inbox/Collections-fbs.489.mcz >> >> ==================== Summary ==================== >> >> Name: Collections-fbs.489 >> Author: fbs >> Time: 20 August 2012, 4:51:36.018 pm >> UUID: d1834809-65b5-46c0-9b26-fea7ec04a946 >> Ancestors: Collections-ul.488 >> >> SequenceableCollection >> flatten turns a nested SequenceableCollection - a collection of collections of ... - into a simple collection. Thus, #(1 (2 3) (4)) flatten = #(1 2 3 4). >> >> =============== Diff against Collections-ul.488 =============== >> >> Item was added: >> + ----- Method: SequenceableCollection>>flatten (in category 'copying') ----- >> + flatten >> + | recur stream | >> + stream := WriteStream on: (Array new: self size). "A lower bound on the final stream count." >> + recur := [:coll | coll >> + do: [:each | >> + each isCollection >> + ifTrue: [recur value: each] >> + ifFalse: [stream nextPut: each]]]. >> + recur value: self. >> + ^ stream contents.! > > I've had to implement this a few times now, so I thought it time to > offer it up for general consumption. My original implementation was > recursive, using #collect: and #inject:into:. This of course used a > lot of #copy sends. Using a WriteStream seems more efficient, both in > time trials and in looking at MessageTally outputs (where GC usage was > a fraction of the recursive implementation, and 40% of CPU time was > actually spent in primitives). > > frank > BrpExtensions-cmm.10.mcz (7K) Download Attachment |
On 20 August 2012 17:17, Chris Muller <[hidden email]> wrote:
> Even more efficient might be to: rather than flattening the > collection, another possible solution might be to employ Brent > Pinkney's Medley collection in the first place, which treats > collections embedded into collections as one flat collection without > ever needing to flatten them. > > Attached, just in case you're interested. Ah, right: Medley >> #do: iterates over the (sub-)collections, and because Medley is-a Collection, that means most (all?) of the other things will work out the box - #collect:, #inject:into: and so on - because normally we define these actions in terms of #do:. Neat! Another way to flatten without flattening would be to have an external iterator: PreOrderTraversal on: myNestedCollection could define all the usual collection methods, and use the underlying collections' #do:. (And such an iterator could iterate over heterogeneous structures too.) Either way, you don't need to garbage collect objects you didn't create in the first place. And for those too lazy to open up the mcz, BrpExtensions also has: * Ruby-like conditions: 1 if: (a > b). (I'm wondering where #unless: is though) * Object >> #asArray/#asCollection (which we've discussed here rather recently) * Emphasising key absence for maps: Dictionary >> #at:ifAbsent:ifPresent: I'm a bit curious as to the SimplifiedChineseEnvironment lurking there, since LanguageEnvironment >> #leadingChar already does what SCE's does. (Cruft, I guess, since nice updated LE's only recently (2011/01/05) frank > - Chris > > > On Mon, Aug 20, 2012 at 10:54 AM, Frank Shearar <[hidden email]> wrote: >> On 20 August 2012 16:51, <[hidden email]> wrote: >>> Frank Shearar uploaded a new version of Collections to project The Inbox: >>> http://source.squeak.org/inbox/Collections-fbs.489.mcz >>> >>> ==================== Summary ==================== >>> >>> Name: Collections-fbs.489 >>> Author: fbs >>> Time: 20 August 2012, 4:51:36.018 pm >>> UUID: d1834809-65b5-46c0-9b26-fea7ec04a946 >>> Ancestors: Collections-ul.488 >>> >>> SequenceableCollection >> flatten turns a nested SequenceableCollection - a collection of collections of ... - into a simple collection. Thus, #(1 (2 3) (4)) flatten = #(1 2 3 4). >>> >>> =============== Diff against Collections-ul.488 =============== >>> >>> Item was added: >>> + ----- Method: SequenceableCollection>>flatten (in category 'copying') ----- >>> + flatten >>> + | recur stream | >>> + stream := WriteStream on: (Array new: self size). "A lower bound on the final stream count." >>> + recur := [:coll | coll >>> + do: [:each | >>> + each isCollection >>> + ifTrue: [recur value: each] >>> + ifFalse: [stream nextPut: each]]]. >>> + recur value: self. >>> + ^ stream contents.! >> >> I've had to implement this a few times now, so I thought it time to >> offer it up for general consumption. My original implementation was >> recursive, using #collect: and #inject:into:. This of course used a >> lot of #copy sends. Using a WriteStream seems more efficient, both in >> time trials and in looking at MessageTally outputs (where GC usage was >> a fraction of the recursive implementation, and 40% of CPU time was >> actually spent in primitives). >> >> frank >> > > > |
On Mon, Aug 20, 2012 at 10:04 AM, Frank Shearar <[hidden email]> wrote:
> Another way to flatten without flattening would be to have an external > iterator: PreOrderTraversal on: myNestedCollection could define all > the usual collection methods, and use the underlying collections' > #do:. (And such an iterator could iterate over heterogeneous > structures too.) I quite like this approach. I used it in Filesystem, for walking directory trees. See FSPreorderGuide, FSPostorderGuide and FSBreadthFirstGuide. The term "Guide" comes from the fact that they work with Visitors, guiding them around the filesystem. Colin |
On 20 August 2012 18:29, Colin Putney <[hidden email]> wrote:
> On Mon, Aug 20, 2012 at 10:04 AM, Frank Shearar <[hidden email]> wrote: >> Another way to flatten without flattening would be to have an external >> iterator: PreOrderTraversal on: myNestedCollection could define all >> the usual collection methods, and use the underlying collections' >> #do:. (And such an iterator could iterate over heterogeneous >> structures too.) > > I quite like this approach. I used it in Filesystem, for walking > directory trees. See FSPreorderGuide, FSPostorderGuide and > FSBreadthFirstGuide. The term "Guide" comes from the fact that they > work with Visitors, guiding them around the filesystem. My Zipper library uses the same idea: take a look at https://github.com/frankshearar/zipr/blob/master/lib/zipr/zipper.rb from about line 373 onwards. (I would link to my Smalltalk zipper, but its implementation isn't near as polished as the Ruby one. Clearly I wrote the Ruby one after learning from my mistakes in implementing the Smalltalk one!) OK, it's a bit complicated by the fact that the various navigation routines use a trampoline to iterate-ify a recursive method, and it uses the Either monad for "try go left and if you can do this, otherwise do that" logic, but the idea should still be quite clear: it's handy to have things over which you can iterate providing basic navigation - down, left, right and so on - and tying the bits together with a more sophisticated traversal mechanism. Especially if said traversal can map, fold and do all sorts of interesting things. In FileSystem's case, I imagine that the various visitors say "this is how you can iterate over the little bit of the overall structure I represent". I've heard Visitor described as "Visitor = Iteration + Double Dispatch." frank > Colin > |
In reply to this post by Frank Shearar-3
> I'm a bit curious as to the SimplifiedChineseEnvironment lurking
> there, since LanguageEnvironment >> #leadingChar already does what > SCE's does. (Cruft, I guess, since nice updated LE's only recently > (2011/01/05) Yeah, that can probably be removed. |
Free forum by Nabble | Edit this page |