Hi. I was needing something like the SQL select distinct, that doesn't take into account repeated objects. I didn't found anything useful in Collection, and thus, I have implemented this:
Collection >> collectDistinct: aBlock collectDistinct: aBlock "Evaluate aBlock with each of the receiver's elements as the argument. Collect the resulting values into a Set, thus repeated objects will not be present. Answer the new collection." | newSet | newSet := self species new asSet. self do: [:each | newSet add: (aBlock value: each)]. ^ newSet Is there a better way ? Do you think it make sense to put this in Pharo ? Cheers Mariano _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
self collect: aBlock as: Set
2010/3/10 Mariano Martinez Peck <[hidden email]>: > Hi. I was needing something like the SQL select distinct, that doesn't take > into account repeated objects. I didn't found anything useful in Collection, > and thus, I have implemented this: > > Collection >> collectDistinct: aBlock > > > collectDistinct: aBlock > "Evaluate aBlock with each of the receiver's elements as the argument. > Collect the resulting values into a Set, thus repeated objects will not > be present. > Answer the new collection." > > | newSet | > newSet := self species new asSet. > self do: [:each | newSet add: (aBlock value: each)]. > ^ newSet > > > Is there a better way ? Do you think it make sense to put this in Pharo ? > > Cheers > > Mariano > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
hahhha thanks Nicolas..I was sure somewhere it should be...I was blind...weird I check there!!!
thanks a lot. mariano On Wed, Mar 10, 2010 at 2:07 PM, Nicolas Cellier <[hidden email]> wrote: self collect: aBlock as: Set _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Nicolas Cellier
On 10 March 2010 15:07, Nicolas Cellier
<[hidden email]> wrote: > self collect: aBlock as: Set > One more time, i found this very beatiful and useful extension to Collection protocol. And i am proud being a witness when this thing is born! :) > 2010/3/10 Mariano Martinez Peck <[hidden email]>: >> Hi. I was needing something like the SQL select distinct, that doesn't take >> into account repeated objects. I didn't found anything useful in Collection, >> and thus, I have implemented this: >> >> Collection >> collectDistinct: aBlock >> >> >> collectDistinct: aBlock >> "Evaluate aBlock with each of the receiver's elements as the argument. >> Collect the resulting values into a Set, thus repeated objects will not >> be present. >> Answer the new collection." >> >> | newSet | >> newSet := self species new asSet. >> self do: [:each | newSet add: (aBlock value: each)]. >> ^ newSet >> >> >> Is there a better way ? Do you think it make sense to put this in Pharo ? >> >> Cheers >> >> Mariano >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > -- Best regards, Igor Stasenko AKA sig. _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
2010/3/11 Igor Stasenko <[hidden email]>:
> On 10 March 2010 15:07, Nicolas Cellier > <[hidden email]> wrote: >> self collect: aBlock as: Set >> > > One more time, i found this very beatiful and useful extension to > Collection protocol. > And i am proud being a witness when this thing is born! :) > Yes, I remember saying I was not convinced some times ago... http://lists.squeakfoundation.org/pipermail/squeak-dev/2008-June/129457.html Being able to change own opinion is a good thing ! Nicolas >> 2010/3/10 Mariano Martinez Peck <[hidden email]>: >>> Hi. I was needing something like the SQL select distinct, that doesn't take >>> into account repeated objects. I didn't found anything useful in Collection, >>> and thus, I have implemented this: >>> >>> Collection >> collectDistinct: aBlock >>> >>> >>> collectDistinct: aBlock >>> "Evaluate aBlock with each of the receiver's elements as the argument. >>> Collect the resulting values into a Set, thus repeated objects will not >>> be present. >>> Answer the new collection." >>> >>> | newSet | >>> newSet := self species new asSet. >>> self do: [:each | newSet add: (aBlock value: each)]. >>> ^ newSet >>> >>> >>> Is there a better way ? Do you think it make sense to put this in Pharo ? >>> >>> Cheers >>> >>> Mariano >>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> > > > > -- > Best regards, > Igor Stasenko AKA sig. > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Mariano Martinez Peck
>>>>> "Mariano" == Mariano Martinez Peck <[hidden email]> writes:
Mariano> Is there a better way ? Do you think it make sense to put this in Pharo ? I can envision collection types where collect: can create something that asSet can more easily turn into a set in bulk than adding each element individually like this. Have you already benchmarked your verbose code against the "simple" code under varying sizes? -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <[hidden email]> <URL:http://www.stonehenge.com/merlyn/> Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc. See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
>>>>> "Randal" == Randal L Schwartz <[hidden email]> writes:
Randal> I can envision collection types where collect: can create something Randal> that asSet can more easily turn into a set in bulk than adding each Randal> element individually like this. In particular, turning an array of 10,000 elements into a set can start with a particular hashing size, whereas starting with an empty set and adding 10,000 elements will likely involve rehasing a number of times. If 10,000 isn't troubling enough, make that a million. :) -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <[hidden email]> <URL:http://www.stonehenge.com/merlyn/> Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc. See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
On Wed, 10 Mar 2010, Randal L. Schwartz wrote:
>>>>>> "Randal" == Randal L Schwartz <[hidden email]> writes: > > Randal> I can envision collection types where collect: can create something > Randal> that asSet can more easily turn into a set in bulk than adding each > Randal> element individually like this. Keys of dictionaries, but it's hackish. > > In particular, turning an array of 10,000 elements into a set > can start with a particular hashing size, whereas starting with > an empty set and adding 10,000 elements will likely involve rehasing > a number of times. If 10,000 isn't troubling enough, make that > a million. :) > Lets see what we've got (in Squeak): Smalltalk garbageCollect. [ (1 to: 1000000) collect: #yourself as: Set ] timeToRun. "==> 2651." Smalltalk garbageCollect. [ (1 to: 1000000) asSet ] timeToRun. "==> 2522". | s | s := Set new. Smalltalk garbageCollect. [ 1 to: 1000000 do: [ :each | s add: each ] ] timeToRun. "==> 1192" | s | s := Set new: 1000000. Smalltalk garbageCollect. [ 1 to: 1000000 do: [ :each | s add: each ] ] timeToRun. "==> 472" Note that adding numbers from 1 to n has the lowest cost for addition, since there will be no hash collisions at all. So try it with randomized data: a := (1 to: 1000000) asArray shuffled. Smalltalk garbageCollect. [ a collect: #yourself as: Set ] timeToRun. "==> 4143" Smalltalk garbageCollect. [ a asSet ] timeToRun. "==> 3995". | s | s := Set new. Smalltalk garbageCollect. [ s addAll: a ] timeToRun. "==> 157304 oops weak hashes". | s | s := Set new: 1000000. Smalltalk garbageCollect. [ s addAll: a ] timeToRun. "==> 636". With better hashes: a := Set new: 1000000. [ a size < 1000000 ] whileTrue: [ a add: SmallInteger maxVal atRandom ]. a := a asArray. Smalltalk garbageCollect. [ a collect: #yourself as: Set ] timeToRun. "==> 3895" Smalltalk garbageCollect. [ a asSet ] timeToRun. "==> 3777". | s | s := Set new. Smalltalk garbageCollect. [ s addAll: a ] timeToRun. "==> 4729 much better". | s | s := Set new: 1000000. Smalltalk garbageCollect. [ s addAll: a ] timeToRun. "==> 774". Conclusions: - #collect:as: is (almost) as fast as #asSet - better hashes can make a huge difference - message sends cost a lot compared to bytecodes - if you can guess the size of your collection, do it (applies for all growing collections and collection streaming) - only optimize if you have to in your own code Levente > -- > Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 > <[hidden email]> <URL:http://www.stonehenge.com/merlyn/> > Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc. > See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Nicolas Cellier
Yes
I remember when I was against { . } :) On Mar 11, 2010, at 1:54 AM, Nicolas Cellier wrote: > 2010/3/11 Igor Stasenko <[hidden email]>: >> On 10 March 2010 15:07, Nicolas Cellier >> <[hidden email]> wrote: >>> self collect: aBlock as: Set >>> >> >> One more time, i found this very beatiful and useful extension to >> Collection protocol. >> And i am proud being a witness when this thing is born! :) >> > > Yes, > I remember saying I was not convinced some times ago... > http://lists.squeakfoundation.org/pipermail/squeak-dev/2008-June/129457.html > Being able to change own opinion is a good thing ! > > Nicolas > >>> 2010/3/10 Mariano Martinez Peck <[hidden email]>: >>>> Hi. I was needing something like the SQL select distinct, that doesn't take >>>> into account repeated objects. I didn't found anything useful in Collection, >>>> and thus, I have implemented this: >>>> >>>> Collection >> collectDistinct: aBlock >>>> >>>> >>>> collectDistinct: aBlock >>>> "Evaluate aBlock with each of the receiver's elements as the argument. >>>> Collect the resulting values into a Set, thus repeated objects will not >>>> be present. >>>> Answer the new collection." >>>> >>>> | newSet | >>>> newSet := self species new asSet. >>>> self do: [:each | newSet add: (aBlock value: each)]. >>>> ^ newSet >>>> >>>> >>>> Is there a better way ? Do you think it make sense to put this in Pharo ? >>>> >>>> Cheers >>>> >>>> Mariano >>>> >>>> _______________________________________________ >>>> Pharo-project mailing list >>>> [hidden email] >>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>> >>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >> >> >> >> -- >> Best regards, >> Igor Stasenko AKA sig. >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Levente Uzonyi-2
> > So try it with randomized data: > > a := (1 to: 1000000) asArray shuffled. > > Smalltalk garbageCollect. > [ a collect: #yourself as: Set ] timeToRun. "==> 4143" > > Smalltalk garbageCollect. > [ a asSet ] timeToRun. "==> 3995". > > | s | > s := Set new. > Smalltalk garbageCollect. > [ s addAll: a ] timeToRun. "==> 157304 oops weak hashes". > > | s | > s := Set new: 1000000. > Smalltalk garbageCollect. > [ s addAll: a ] timeToRun. "==> 636". > > > With better hashes: levente how do you determine that you use better hashes below > a := Set new: 1000000. > [ a size < 1000000 ] whileTrue: [ > a add: SmallInteger maxVal atRandom ]. > a := a asArray. > > Smalltalk garbageCollect. > [ a collect: #yourself as: Set ] timeToRun. "==> 3895" > > Smalltalk garbageCollect. > [ a asSet ] timeToRun. "==> 3777". > > | s | > s := Set new. > Smalltalk garbageCollect. > [ s addAll: a ] timeToRun. "==> 4729 much better". > > | s | > s := Set new: 1000000. > Smalltalk garbageCollect. > [ s addAll: a ] timeToRun. "==> 774". > > > Conclusions: > - #collect:as: is (almost) as fast as #asSet > - better hashes can make a huge difference > - message sends cost a lot compared to bytecodes > - if you can guess the size of your collection, do it (applies for all growing collections and collection streaming) > - only optimize if you have to in your own code > > > Levente > >> -- >> Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 >> <[hidden email]> <URL:http://www.stonehenge.com/merlyn/> >> Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc. >> See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
2010/3/11 Stéphane Ducasse <[hidden email]>:
> >> >> So try it with randomized data: >> >> a := (1 to: 1000000) asArray shuffled. >> >> Smalltalk garbageCollect. >> [ a collect: #yourself as: Set ] timeToRun. "==> 4143" >> >> Smalltalk garbageCollect. >> [ a asSet ] timeToRun. "==> 3995". >> >> | s | >> s := Set new. >> Smalltalk garbageCollect. >> [ s addAll: a ] timeToRun. "==> 157304 oops weak hashes". >> >> | s | >> s := Set new: 1000000. >> Smalltalk garbageCollect. >> [ s addAll: a ] timeToRun. "==> 636". >> >> >> With better hashes: > > levente how do you determine that you use better hashes below > SmallInteger are their own hash. Code below produces random hashes with uniform density. > >> a := Set new: 1000000. >> [ a size < 1000000 ] whileTrue: [ >> a add: SmallInteger maxVal atRandom ]. >> a := a asArray. >> >> Smalltalk garbageCollect. >> [ a collect: #yourself as: Set ] timeToRun. "==> 3895" >> >> Smalltalk garbageCollect. >> [ a asSet ] timeToRun. "==> 3777". >> >> | s | >> s := Set new. >> Smalltalk garbageCollect. >> [ s addAll: a ] timeToRun. "==> 4729 much better". >> >> | s | >> s := Set new: 1000000. >> Smalltalk garbageCollect. >> [ s addAll: a ] timeToRun. "==> 774". >> >> >> Conclusions: >> - #collect:as: is (almost) as fast as #asSet >> - better hashes can make a huge difference >> - message sends cost a lot compared to bytecodes >> - if you can guess the size of your collection, do it (applies for all growing collections and collection streaming) > yes!! >> - only optimize if you have to in your own code >> >> >> Levente >> >>> -- >>> Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 >>> <[hidden email]> <URL:http://www.stonehenge.com/merlyn/> >>> Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc. >>> See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion >>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Nicolas Cellier
Den 11.03.2010 01:54, skrev Nicolas Cellier: > 2010/3/11 Igor Stasenko <[hidden email]>: > >> On 10 March 2010 15:07, Nicolas Cellier >> <[hidden email]> wrote: >> >>> self collect: aBlock as: Set >>> >>> >> One more time, i found this very beatiful and useful extension to >> Collection protocol. >> And i am proud being a witness when this thing is born! :) >> >> > Yes, > I remember saying I was not convinced some times ago... > http://lists.squeakfoundation.org/pipermail/squeak-dev/2008-June/129457.html > Being able to change own opinion is a good thing ! > > Nicolas > 1. (self collect: aBlock) asSet or 2. mySet := Set new: self size. self do: aBlockAddingToMySetAtEnd. or (well, this one is IMO indeed a bit ugly) 3. self inject: (Set new:self size) into: [ :sub :next | sub add: next collectValue; yourself ] ... Sure, they're a bit verbose, and when you'd like to use 1 vs 2 or 3 varies abit on the collection contents, but they all work in all dialects, and aren't THAT much longer, unless your collect block is really short. Cheers, Henry > >>> 2010/3/10 Mariano Martinez Peck <[hidden email]>: >>> >>>> Hi. I was needing something like the SQL select distinct, that doesn't take >>>> into account repeated objects. I didn't found anything useful in Collection, >>>> and thus, I have implemented this: >>>> >>>> Collection >> collectDistinct: aBlock >>>> >>>> >>>> collectDistinct: aBlock >>>> "Evaluate aBlock with each of the receiver's elements as the argument. >>>> Collect the resulting values into a Set, thus repeated objects will not >>>> be present. >>>> Answer the new collection." >>>> >>>> | newSet | >>>> newSet := self species new asSet. >>>> self do: [:each | newSet add: (aBlock value: each)]. >>>> ^ newSet >>>> >>>> >>>> Is there a better way ? Do you think it make sense to put this in Pharo ? >>>> >>>> Cheers >>>> >>>> Mariano >>>> >>>> _______________________________________________ >>>> Pharo-project mailing list >>>> [hidden email] >>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>> >>>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >>> >> >> >> -- >> Best regards, >> Igor Stasenko AKA sig. >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > > > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Stéphane Ducasse
On Thu, 11 Mar 2010, Stéphane Ducasse wrote:
> >> >> So try it with randomized data: >> >> a := (1 to: 1000000) asArray shuffled. >> >> Smalltalk garbageCollect. >> [ a collect: #yourself as: Set ] timeToRun. "==> 4143" >> >> Smalltalk garbageCollect. >> [ a asSet ] timeToRun. "==> 3995". >> >> | s | >> s := Set new. >> Smalltalk garbageCollect. >> [ s addAll: a ] timeToRun. "==> 157304 oops weak hashes". >> >> | s | >> s := Set new: 1000000. >> Smalltalk garbageCollect. >> [ s addAll: a ] timeToRun. "==> 636". >> >> >> With better hashes: > > levente how do you determine that you use better hashes below hashes come from the better, because the distribution of (hash \\ array size) values is more uniform, therefore the chance of clustering and collisions is lower. Levente > > >> a := Set new: 1000000. >> [ a size < 1000000 ] whileTrue: [ >> a add: SmallInteger maxVal atRandom ]. >> a := a asArray. >> >> Smalltalk garbageCollect. >> [ a collect: #yourself as: Set ] timeToRun. "==> 3895" >> >> Smalltalk garbageCollect. >> [ a asSet ] timeToRun. "==> 3777". >> >> | s | >> s := Set new. >> Smalltalk garbageCollect. >> [ s addAll: a ] timeToRun. "==> 4729 much better". >> >> | s | >> s := Set new: 1000000. >> Smalltalk garbageCollect. >> [ s addAll: a ] timeToRun. "==> 774". >> >> >> Conclusions: >> - #collect:as: is (almost) as fast as #asSet >> - better hashes can make a huge difference >> - message sends cost a lot compared to bytecodes >> - if you can guess the size of your collection, do it (applies for all growing collections and collection streaming) > yes!! >> - only optimize if you have to in your own code >> >> >> Levente >> >>> -- >>> Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 >>> <[hidden email]> <URL:http://www.stonehenge.com/merlyn/> >>> Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc. >>> See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion >>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Henrik Sperre Johansen
On Thu, 11 Mar 2010, Henrik Johansen wrote:
> > > Den 11.03.2010 01:54, skrev Nicolas Cellier: >> 2010/3/11 Igor Stasenko <[hidden email]>: >> >>> On 10 March 2010 15:07, Nicolas Cellier >>> <[hidden email]> wrote: >>> >>>> self collect: aBlock as: Set >>>> >>>> >>> One more time, i found this very beatiful and useful extension to >>> Collection protocol. >>> And i am proud being a witness when this thing is born! :) >>> >>> >> Yes, >> I remember saying I was not convinced some times ago... >> http://lists.squeakfoundation.org/pipermail/squeak-dev/2008-June/129457.html >> Being able to change own opinion is a good thing ! >> >> Nicolas >> > I have to admit, I still don't quite see what's wrong with > 1. > (self collect: aBlock) asSet ('foo' collect: [ :each | each asciiValue ]) asSet "===> boom :)" > or > 2. > mySet := Set new: self size. > self do: aBlockAddingToMySetAtEnd. That's ok, but it's two lines + a temporary declaration. > > or (well, this one is IMO indeed a bit ugly) > 3. > self inject: (Set new:self size) into: > [ :sub :next | > sub add: next collectValue; yourself ] ... > > Sure, they're a bit verbose, and when you'd like to use 1 vs 2 or 3 > varies abit on the collection contents, but they all work in all > dialects, and aren't THAT much longer, unless your collect block is > really short. I did a quick sampling (not representative) and found that 1 out of 20 collect blocks is longer than a single line. #collect:as: is a solution to the problem about #species which limits the use of #collect: in some cases. You can find the proposal about #collect:as: here if you're interested: http://lists.squeakfoundation.org/pipermail/squeak-dev/2009-November/141245.html And who knows, maybe other dialects will implement these methods too, until then, don't use #collect:as: in code which should be portable across dialects. Levente > > Cheers, > Henry >> >>>> 2010/3/10 Mariano Martinez Peck <[hidden email]>: >>>> >>>>> Hi. I was needing something like the SQL select distinct, that doesn't take >>>>> into account repeated objects. I didn't found anything useful in Collection, >>>>> and thus, I have implemented this: >>>>> >>>>> Collection >> collectDistinct: aBlock >>>>> >>>>> >>>>> collectDistinct: aBlock >>>>> "Evaluate aBlock with each of the receiver's elements as the argument. >>>>> Collect the resulting values into a Set, thus repeated objects will not >>>>> be present. >>>>> Answer the new collection." >>>>> >>>>> | newSet | >>>>> newSet := self species new asSet. >>>>> self do: [:each | newSet add: (aBlock value: each)]. >>>>> ^ newSet >>>>> >>>>> >>>>> Is there a better way ? Do you think it make sense to put this in Pharo ? >>>>> >>>>> Cheers >>>>> >>>>> Mariano >>>>> >>>>> _______________________________________________ >>>>> Pharo-project mailing list >>>>> [hidden email] >>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>>> >>>>> >>>> _______________________________________________ >>>> Pharo-project mailing list >>>> [hidden email] >>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>> >>>> >>> >>> >>> -- >>> Best regards, >>> Igor Stasenko AKA sig. >>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> >> >> > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Mariano Martinez Peck
Em 10/03/2010 09:50, Mariano Martinez Peck < [hidden email] > escreveu:
> Hi. I was needing something like the SQL select distinct, that > doesn't take into account repeated objects. I didn't found anything > useful in Collection, and thus, I have implemented this: > Collection >> collectDistinct: aBlock > > collectDistinct: aBlock > "Evaluate aBlock with each of the receiver's elements as the > argument. Collect the resulting values into a Set, thus repeated > objects will not be present. Answer the new collection." > | newSet | > newSet := self species new asSet. > self do: [:each | > newSet add: (aBlock value: each)]. > ^newSet > > Is there a better way ? Do you think it make sense to put this in > Pharo ? > Cheers IIUC, your intent with this method is/was to get non repeated elements in a generic collection. However, the way is coded the result is Set which is returned instead of the collection from which the method had been called. I didn't miss something, the result of your proposed method is akin to calling Collection>>collect: and sending asSet to it. ITYM: | newSet | newSet := Set new . self do: [:each | newSet add: (aBlock value: each)]. ^newSet as: (self class) HTH -- Cesar Rabak _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Nicolas Cellier
Although the end result should be the same, I wonder if for a collection of items where there is a lot of duplicates the sequential nature of collecting and then converting to a Set wouldn't be less performant than Mariano's proposed method?
-- Cesar Rabak Em 10/03/2010 10:07, Nicolas Cellier < [hidden email] > escreveu: self collect: aBlock as: Set 2010/3/10 Mariano Martinez Peck : > Hi. I was needing something like the SQL select distinct, that doesn't take > into account repeated objects. I didn't found anything useful in Collection, > and thus, I have implemented this: > > Collection >> collectDistinct: aBlock > > > collectDistinct: aBlock > "Evaluate aBlock with each of the receiver's elements as the argument. > Collect the resulting values into a Set, thus repeated objects will not > be present. > Answer the new collection." > > | newSet | > newSet := self species new asSet. > self do: [:each | newSet add: (aBlock value: each)]. > ^ newSet > > > Is there a better way ? Do you think it make sense to put this in Pharo ? > > Cheers > > Mariano > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
On 13.03.2010 00:14, [hidden email] wrote: > Although the end result should be the same, I wonder if for a collection of items where there is a lot of duplicates the sequential nature of collecting and then converting to a Set wouldn't be less performant than Mariano's proposed method? > > -- > Cesar Rabak In my experience, yes, it would. The difference in practice is usually so small you wouldn't prefer one over the other unless it show up in a profile of a time-critical method though. Cheers, Henry Few distinct: [(1 to: 500000) collect: [:each | each odd] as: Set] timeToRun 1347 [|aSet | aSet := Set new. (1 to: 500000) do: [:each | aSet add: each odd] ] timeToRun 375 All distinct: [(1 to: 500000) collect: [:each | each +1] as: Set] timeToRun 1350 [|aSet | aSet := Set new: 500000. (1 to: 500000) do: [:each | aSet add: each +1] ] timeToRun 1740 > > > Em 10/03/2010 10:07, Nicolas Cellier< [hidden email]> escreveu: > self collect: aBlock as: Set > > 2010/3/10 Mariano Martinez Peck : >> Hi. I was needing something like the SQL select distinct, that doesn't take >> into account repeated objects. I didn't found anything useful in Collection, >> and thus, I have implemented this: >> >> Collection>> collectDistinct: aBlock >> >> >> collectDistinct: aBlock >> "Evaluate aBlock with each of the receiver's elements as the argument. >> Collect the resulting values into a Set, thus repeated objects will not >> be present. >> Answer the new collection." >> >> | newSet | >> newSet := self species new asSet. >> self do: [:each | newSet add: (aBlock value: each)]. >> ^ newSet >> >> >> Is there a better way ? Do you think it make sense to put this in Pharo ? >> >> Cheers >> >> Mariano >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
Free forum by Nabble | Edit this page |