The Inbox: Collections-fbs.489.mcz

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

The Inbox: Collections-fbs.489.mcz

commits-2
Frank Shearar uploaded a new version of Collections to project The Inbox:
http://source.squeak.org/inbox/Collections-fbs.489.mcz

==================== Summary ====================

Name: Collections-fbs.489
Author: fbs
Time: 20 August 2012, 4:51:36.018 pm
UUID: d1834809-65b5-46c0-9b26-fea7ec04a946
Ancestors: Collections-ul.488

SequenceableCollection >> flatten turns a nested SequenceableCollection - a collection of collections of ... -  into a simple collection. Thus, #(1 (2 3) (4)) flatten = #(1 2 3 4).

=============== Diff against Collections-ul.488 ===============

Item was added:
+ ----- Method: SequenceableCollection>>flatten (in category 'copying') -----
+ flatten
+ | recur stream |
+ stream := WriteStream on: (Array new: self size). "A lower bound on the final stream count."
+ recur := [:coll | coll
+ do: [:each |
+ each isCollection
+ ifTrue: [recur value: each]
+ ifFalse: [stream nextPut: each]]].
+ recur value: self.
+ ^ stream contents.!


Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-fbs.489.mcz

Frank Shearar-3
On 20 August 2012 16:51,  <[hidden email]> wrote:

> Frank Shearar uploaded a new version of Collections to project The Inbox:
> http://source.squeak.org/inbox/Collections-fbs.489.mcz
>
> ==================== Summary ====================
>
> Name: Collections-fbs.489
> Author: fbs
> Time: 20 August 2012, 4:51:36.018 pm
> UUID: d1834809-65b5-46c0-9b26-fea7ec04a946
> Ancestors: Collections-ul.488
>
> SequenceableCollection >> flatten turns a nested SequenceableCollection - a collection of collections of ... -  into a simple collection. Thus, #(1 (2 3) (4)) flatten = #(1 2 3 4).
>
> =============== Diff against Collections-ul.488 ===============
>
> Item was added:
> + ----- Method: SequenceableCollection>>flatten (in category 'copying') -----
> + flatten
> +       | recur stream |
> +       stream := WriteStream on: (Array new: self size). "A lower bound on the final stream count."
> +       recur := [:coll | coll
> +               do: [:each |
> +                       each isCollection
> +                               ifTrue: [recur value: each]
> +                               ifFalse: [stream nextPut: each]]].
> +       recur value: self.
> +       ^ stream contents.!

I've had to implement this a few times now, so I thought it time to
offer it up for general consumption. My original implementation was
recursive, using #collect: and #inject:into:. This of course used a
lot of #copy sends. Using a WriteStream seems more efficient, both in
time trials and in looking at MessageTally outputs (where GC usage was
a fraction of the recursive implementation, and 40% of CPU time was
actually spent in primitives).

frank

Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-fbs.489.mcz

Chris Muller-3
Even more efficient might be to:  rather than flattening the
collection, another possible solution might be to employ Brent
Pinkney's Medley collection in the first place, which treats
collections embedded into collections as one flat collection without
ever needing to flatten them.

Attached, just in case you're interested.

  - Chris


On Mon, Aug 20, 2012 at 10:54 AM, Frank Shearar <[hidden email]> wrote:

> On 20 August 2012 16:51,  <[hidden email]> wrote:
>> Frank Shearar uploaded a new version of Collections to project The Inbox:
>> http://source.squeak.org/inbox/Collections-fbs.489.mcz
>>
>> ==================== Summary ====================
>>
>> Name: Collections-fbs.489
>> Author: fbs
>> Time: 20 August 2012, 4:51:36.018 pm
>> UUID: d1834809-65b5-46c0-9b26-fea7ec04a946
>> Ancestors: Collections-ul.488
>>
>> SequenceableCollection >> flatten turns a nested SequenceableCollection - a collection of collections of ... -  into a simple collection. Thus, #(1 (2 3) (4)) flatten = #(1 2 3 4).
>>
>> =============== Diff against Collections-ul.488 ===============
>>
>> Item was added:
>> + ----- Method: SequenceableCollection>>flatten (in category 'copying') -----
>> + flatten
>> +       | recur stream |
>> +       stream := WriteStream on: (Array new: self size). "A lower bound on the final stream count."
>> +       recur := [:coll | coll
>> +               do: [:each |
>> +                       each isCollection
>> +                               ifTrue: [recur value: each]
>> +                               ifFalse: [stream nextPut: each]]].
>> +       recur value: self.
>> +       ^ stream contents.!
>
> I've had to implement this a few times now, so I thought it time to
> offer it up for general consumption. My original implementation was
> recursive, using #collect: and #inject:into:. This of course used a
> lot of #copy sends. Using a WriteStream seems more efficient, both in
> time trials and in looking at MessageTally outputs (where GC usage was
> a fraction of the recursive implementation, and 40% of CPU time was
> actually spent in primitives).
>
> frank
>



BrpExtensions-cmm.10.mcz (7K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-fbs.489.mcz

Frank Shearar-3
On 20 August 2012 17:17, Chris Muller <[hidden email]> wrote:
> Even more efficient might be to:  rather than flattening the
> collection, another possible solution might be to employ Brent
> Pinkney's Medley collection in the first place, which treats
> collections embedded into collections as one flat collection without
> ever needing to flatten them.
>
> Attached, just in case you're interested.

Ah, right: Medley >> #do: iterates over the (sub-)collections, and
because Medley is-a Collection, that means most (all?) of the other
things will work out the box - #collect:, #inject:into: and so on -
because normally we define these actions in terms of #do:. Neat!

Another way to flatten without flattening would be to have an external
iterator: PreOrderTraversal on: myNestedCollection could define all
the usual collection methods, and use the underlying collections'
#do:. (And such an iterator could iterate over heterogeneous
structures too.)

Either way, you don't need to garbage collect objects you didn't
create in the first place.

And for those too lazy to open up the mcz, BrpExtensions also has:
* Ruby-like conditions: 1 if: (a > b). (I'm wondering where #unless: is though)
* Object >> #asArray/#asCollection (which we've discussed here rather recently)
* Emphasising key absence for maps: Dictionary >> #at:ifAbsent:ifPresent:

I'm a bit curious as to the SimplifiedChineseEnvironment lurking
there, since LanguageEnvironment >> #leadingChar already does what
SCE's does. (Cruft, I guess, since nice updated LE's only recently
(2011/01/05)

frank

>   - Chris
>
>
> On Mon, Aug 20, 2012 at 10:54 AM, Frank Shearar <[hidden email]> wrote:
>> On 20 August 2012 16:51,  <[hidden email]> wrote:
>>> Frank Shearar uploaded a new version of Collections to project The Inbox:
>>> http://source.squeak.org/inbox/Collections-fbs.489.mcz
>>>
>>> ==================== Summary ====================
>>>
>>> Name: Collections-fbs.489
>>> Author: fbs
>>> Time: 20 August 2012, 4:51:36.018 pm
>>> UUID: d1834809-65b5-46c0-9b26-fea7ec04a946
>>> Ancestors: Collections-ul.488
>>>
>>> SequenceableCollection >> flatten turns a nested SequenceableCollection - a collection of collections of ... -  into a simple collection. Thus, #(1 (2 3) (4)) flatten = #(1 2 3 4).
>>>
>>> =============== Diff against Collections-ul.488 ===============
>>>
>>> Item was added:
>>> + ----- Method: SequenceableCollection>>flatten (in category 'copying') -----
>>> + flatten
>>> +       | recur stream |
>>> +       stream := WriteStream on: (Array new: self size). "A lower bound on the final stream count."
>>> +       recur := [:coll | coll
>>> +               do: [:each |
>>> +                       each isCollection
>>> +                               ifTrue: [recur value: each]
>>> +                               ifFalse: [stream nextPut: each]]].
>>> +       recur value: self.
>>> +       ^ stream contents.!
>>
>> I've had to implement this a few times now, so I thought it time to
>> offer it up for general consumption. My original implementation was
>> recursive, using #collect: and #inject:into:. This of course used a
>> lot of #copy sends. Using a WriteStream seems more efficient, both in
>> time trials and in looking at MessageTally outputs (where GC usage was
>> a fraction of the recursive implementation, and 40% of CPU time was
>> actually spent in primitives).
>>
>> frank
>>
>
>
>

Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-fbs.489.mcz

Colin Putney-3
On Mon, Aug 20, 2012 at 10:04 AM, Frank Shearar <[hidden email]> wrote:
> Another way to flatten without flattening would be to have an external
> iterator: PreOrderTraversal on: myNestedCollection could define all
> the usual collection methods, and use the underlying collections'
> #do:. (And such an iterator could iterate over heterogeneous
> structures too.)

I quite like this approach. I used it in Filesystem, for walking
directory trees. See FSPreorderGuide, FSPostorderGuide and
FSBreadthFirstGuide. The term "Guide" comes from the fact that they
work with Visitors, guiding them around the filesystem.

Colin

Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-fbs.489.mcz

Frank Shearar-3
On 20 August 2012 18:29, Colin Putney <[hidden email]> wrote:

> On Mon, Aug 20, 2012 at 10:04 AM, Frank Shearar <[hidden email]> wrote:
>> Another way to flatten without flattening would be to have an external
>> iterator: PreOrderTraversal on: myNestedCollection could define all
>> the usual collection methods, and use the underlying collections'
>> #do:. (And such an iterator could iterate over heterogeneous
>> structures too.)
>
> I quite like this approach. I used it in Filesystem, for walking
> directory trees. See FSPreorderGuide, FSPostorderGuide and
> FSBreadthFirstGuide. The term "Guide" comes from the fact that they
> work with Visitors, guiding them around the filesystem.

My Zipper library uses the same idea: take a look at
https://github.com/frankshearar/zipr/blob/master/lib/zipr/zipper.rb
from about line 373 onwards. (I would link to my Smalltalk zipper, but
its implementation isn't near as polished as the Ruby one. Clearly I
wrote the Ruby one after learning from my mistakes in implementing the
Smalltalk one!)

OK, it's a bit complicated by the fact that the various navigation
routines use a trampoline to iterate-ify a recursive method, and it
uses the Either monad for "try go left and if you can do this,
otherwise do that" logic, but the idea should still be quite clear:
it's handy to have things over which you can iterate providing basic
navigation - down, left, right and so on - and tying the bits together
with a more sophisticated traversal mechanism. Especially if said
traversal can map, fold and do all sorts of interesting things.

In FileSystem's case, I imagine that the various visitors say "this is
how you can iterate over the little bit of the overall structure I
represent". I've heard Visitor described as "Visitor = Iteration +
Double Dispatch."

frank

> Colin
>

Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-fbs.489.mcz

Chris Muller-3
In reply to this post by Frank Shearar-3
> I'm a bit curious as to the SimplifiedChineseEnvironment lurking
> there, since LanguageEnvironment >> #leadingChar already does what
> SCE's does. (Cruft, I guess, since nice updated LE's only recently
> (2011/01/05)

Yeah, that can probably be removed.