moose collection extensions

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
24 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: moose collection extensions

Stéphane Ducasse
>> Levente you should realize that these methods have been defined and used
>> more than a couple of years ago.
>
> Most of the methods I suggested were added to squeak long before they started working on the project.

I do not understand what you want to say but this is not important.
We were working on vw in 1996 -> 2008 for Moose. So probably some came for Squeak.
What I want to say is that judging is always driven by a context.

> I don't think that growing a Set is faster than growing an Array, then converting it to a Set, but let's see the numbers:
>
> "Prepare an array with 10000 elements, each element is an array with one integer. 10% is duplicate."
> data := (1 to: 9000) collect: [ :each | { SmallInteger maxVal atRandom } ].
> data := data, (data last: 1000).
> data := data shuffled.
>
> (1 to: 10) collect: [ :run |
> [ (data gather: [ :each | each ]) asSet ] timeToRun ]. ===> #(14 15 14 15 14 14 15 14 15 14).
>
> (1 to: 10) collect: [ :run |
> [ data flatCollectAsSet: [ :each | each ] ] timeToRun ]. ===> #(29 29 29 28 29 28 28 29 29 29)

This is something I can understand. So we will fix that.

stef
_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: moose collection extensions

Lukas Renggli
In reply to this post by Stéphane Ducasse
Attached is #foldLeft: and #foldRight: from PetitParser. These are
generalized #fold: operators that I use for years in Magritte already.

Lukas

2009/12/28 Stéphane Ducasse <[hidden email]>:

> Lukas
>
> I know that you also have a cool lib of other highorder function.
> Where can we find it?
>
> Stef
>
> On Dec 27, 2009, at 5:26 PM, Stéphane Ducasse wrote:
>
>> hi
>>
>> here are the collection extensions we use in Moose.
>> I copied them to PharoTaskForces so that we can discuss and tweak the code if wanted.
>>
>> My favorite is
>>       flatCollect:/flatCollectAsSet:
>>       groupedBy:
>>
>>       There are really useful.
>>
>> Stef
>>
>>
>> testFlatCollectArray
>>       "self debug: #testFlatCollectArray"
>>
>>       self assert: ((#((1 2) (3 4) (5 3)) flatCollect: [ :each ]) = #(1 2 3 4 5 3)).
>>       self assert: ((#((1 2) (2 3) (1 3 4)) flatCollect: [:each]) = #(1 2 2 3 1 3 4)).
>>
>>       self assert: ((#((1 2) (2 3) () ()) flatCollect: [:each]) = #(1 2 2 3)).
>>
>>       self assert: ((#((1 2) (2 3) (1 3 4)) flatCollect: [:each| Array with: each])
>>                                       =  #(#(1 2) #(2 3) #(1 3 4))).
>>
>>       self assert: ((#((1 2) (2 3) (1 3 4)) flatCollect: [:each| Set with: each])
>>                                       =  #(#(1 2) #(2 3) #(1 3 4))).
>>
>>
>> testFlatCollectSet
>>       "self debug: #testFlatCollectSet"
>>
>>       self assert: ((#((1 2) (1 2) (1 3 4)) asSet  flatCollect: [:each]) = #(1 1 2 3 4) asSet).
>>       self assert: ((#() asSet flatCollect: [:each]) = #() asSet).
>>
>>       self assert:  ((#((1 2) () (1 3 4)) asSet  flatCollect: [:each]) = #(1 1 2 3 4) asSet).
>>       self assert:  ((#((1 2) #((99)) (1 3 4)) asSet  flatCollect: [:each])
>>                                       = #(1 1 2 3 4 (99)) asSet).
>>       self assert:  ((#((1 2) #(()) (1 3 4)) asSet  flatCollect: [:each])
>>                                       = #(1 1 2 3 4 ()) asSet).
>>
>> testCollectAsSet
>>       "self debug: #testCollectAsSet"
>>
>>       self assert: ((#() collectAsSet: [:each | each odd]) = Set new).
>>       self assert: (#(1 2 3 4 5 6) collectAsSet: [:each | each odd])
>>                                        = (Set with: true with: false).
>>       self assert: (#(1 3 5 7 9 11) collectAsSet: [:each | each odd])
>>                                       = (Set with: true).
>>
>>       self assert: (#(1 2 3 4 5 4 3 2 1) collectAsSet: [:each | each]) = (1 to: 5) asSet.
>>
>>
>> testGroupedByArray
>>       "self debug: #testGroupedByArray"
>>
>>       | res |
>>       res := #(1 2 3 4 5) groupedBy: [:each | each odd].
>>       self assert:   (res at: true) = #(1 3 5).
>>       self assert: (res at: false) = #(2 4)
>>
>>
>>
>> Set>>flatCollect: aBlock
>>
>>
>>       ^self flatCollectAsSet: aBlock
>>
>>
>> Symbol>>value
>>       "Allow this object to act as a ValueHolder on itself."
>>
>>       ^self
>>
>> OrderedCollection>>removeAtIndex: anIndex
>>       "Remove the element of the collection at position anIndex.  Answer the object removed."
>>
>>       | obj |
>>       obj := self at: anIndex.
>>       self removeIndex: anIndex + firstIndex - 1.
>>       ^obj
>>
>> Collection
>> ==============================
>>
>> collectAsSet: aBlock
>>       "Evaluates aBlock for each element of the receiver and collects
>>       the resulting values into a Set."
>>
>>       "This is an efficient shorthand for [ (self collect: aBlock) asSet ]."
>>       "originally developed by a. kuhn and released under MIT."
>>
>>       ^self inject: Set new into: [ :set :each |
>>               set add: (aBlock value: each); yourself ].
>>
>>
>> copyEmpty: aSize
>>       "Answer a copy of the receiver that contains no elements.
>>
>>       This method should be redefined in subclasses that add
>>       instance variables, so that the state of those variables
>>       is preserved"
>>
>>       ^self class new: aSize
>>
>>
>> flatCollect: aBlock
>>       "Evaluate aBlock for each of the receiver's elements and answer the
>>       list of all resulting values flatten one level. Assumes that aBlock returns some kind
>>       of collection for each element. Equivalent to the lisp's mapcan"
>>       "original written by a. Kuhn and released under MIT"
>>
>>       | stream |
>>       self isEmpty ifTrue: [ ^ self copy ].
>>       stream := (self species new: 0) nsWriteStream.
>>       self do: [ :each | stream nextPutAll: (aBlock value: each) ].
>>       ^ stream contents
>>
>> flatCollectAsSet: aBlock
>>       "Evaluate aBlock for each of the receiver's elements and answer the
>>       list of all resulting values flatten one level. Assumes that aBlock returns some kind
>>       of collection for each element. Equivalent to the lisp's mapcan"
>>
>>       "original written by a. Kuhn and released under MIT"
>>
>>       | set |
>>       self isEmpty ifTrue: [^self copy ].
>>       set := Set new.
>>       self do: [ :each |
>>               set addAll: (aBlock value: each) ].
>>       ^set
>>
>>
>> flatten
>>       "Recursively collect each non-collection element of the receiver and its descendant
>>       collections.  Please note, this implementation assumes that strings are to be treated
>>       as objects rather than as collection."
>>
>>       ^self gather: [ :each ]
>>
>> groupedBy: aBlock
>>       "Return a dictionary whose keys are the result of evaluating aBlock for all elements in
>>        the collection, and the value for each key is the collection of elements that evaluated
>>        to that key. e.g.
>>            #(1 2 3 4 5) groupedBy: [:each | each odd]
>>          a Dictionary
>>            true ---> #( 1 3 5)
>>            false --> #(2 4)
>>       originally developed by a. kuhn and released under MIT."
>>
>>       | result |
>>       result := Dictionary new.
>>       self do:
>>               [:each | | key collection |
>>               key := aBlock value: each.
>>               collection := result at: key ifAbsentPut: [OrderedCollection new].
>>               collection add: each].
>>       self species ~~ OrderedCollection ifTrue:
>>               ["Convert the result collections to be the right type.
>>                 Note that it should be safe to modify the dictionary
>>                 while iterating because we only replace values for existing keys"
>>               result keysAndValuesDo:
>>                       [:key :value | result at: key put: (self species withAll: value)]].
>>
>>       ^result
>>
>> includesAll: aCollection
>>       "Answer true if the receiver includes all elements of aCollection with at
>>       least as many occurrences as in aCollection. For a less strict comparison
>>       please refer to supersetOf: and its inverse subsetOf:."
>>
>>
>>       ^(aCollection isCollection) and: [
>>               aCollection size <= self size and: [
>>                       aCollection allSatisfy: [ :each |
>>                               (aCollection occurrencesOf: each) <= (self occurrencesOf: each) ]]]
>>
>> nilSafeGroupedBy: aBlock
>>       ^ self groupedBy: [ :each |
>>               | value |
>>               value := aBlock value: each.
>>               value ifNil: [ UndefinedObject ].
>>       ]
>>
>> selectAsSet: aBlock
>>       "Evaluate aBlock with each of the receiver's elements as the argument.
>>       Collect into a new set, only those elements for which
>>       aBlock evaluates to true.  Answer the new collection."
>>
>>       | newSet |
>>       newSet := Set new.
>>       self do: [:each | (aBlock value: each) ifTrue: [newSet add: each]].
>>       ^newSet
>>
>> shuffle
>>       "Swaps the receiver's elements at random."
>>
>>       self shuffle: (self size * self size log) asInteger
>>
>> sum: aSymbolOrBlock
>>
>>       ^self
>>               inject: 0
>>               into: [:sum :each | sum + (aSymbolOrBlock value: each)]
>>
>> shuffle: times
>>       "Swaps random elements of the receiver."
>>
>>       | size random |
>>       size := self size.
>>       random := Random new.
>>       times timesRepeat: [
>>               self swap: (random next * size) floor + 1 with: (random next * size) floor + 1
>>       ].
>>
>>
>>
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>


--
Lukas Renggli
http://www.lukas-renggli.ch

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

SequenceableCollection-folding.st (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: moose collection extensions

Adrian Kuhn
In reply to this post by Stéphane Ducasse
Stéphane Ducasse <stephane.ducasse@...> writes:

> > I don't think that growing a Set is faster than growing an Array, then
> > converting it to a Set, but let's see the numbers:
> >
> > "Prepare an array with 10000 elements, each element is an array with one
> > integer. 10% is duplicate."
>
> This is something I can understand. So we will fix that.

This is rather a matter of memory. The use case in Moose back then used to be
 closer to 10 million objects with a set of eventual size of 100 element. So
 the intend is to avoid allocating 40 MB of memory when all you need is a set
 of 100 elements. So at least in Moose, you should keep it :)

--AA


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: moose collection extensions

Stéphane Ducasse
> I don't think that growing a Set is faster than growing an Array, then
>>> converting it to a Set, but let's see the numbers:
>>>
>>> "Prepare an array with 10000 elements, each element is an array with one
>>> integer. 10% is duplicate."
>>
>> This is something I can understand. So we will fix that.
>
> This is rather a matter of memory. The use case in Moose back then used to be
> closer to 10 million objects with a set of eventual size of 100 element. So
> the intend is to avoid allocating 40 MB of memory when all you need is a set
> of 100 elements. So at least in Moose, you should keep it :)

Context :)
Thanks.

I will add a comment to flatCollectAsSet: going in that direction.

Stef


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
12