Nicolas
I was discussing with lukas and we would be interested to get your changes to dictionary keys (Ihope that this icorrect since I'm dead) We would like to integrate your changes. Do you have a basis that we could work? Else we can just try in the image directly. Stef _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
Yes I can have a look.
Otherwise, the changes are quite simple: 1) track senders of keys and sometimes senders of senders 2) identify those sending a message not understandable by an Array (add: remove: etc...), insert keys asSet in this case 3) identify thise sending a potential inefficient message (includes:) insert keys asSet in this case (only if includes: is in a loop!) 4) change code for keys Then eventually change some chains: keys asSortedCollection asArray -> keys asArray sort Nicolas 2009/12/3 Stéphane Ducasse <[hidden email]>: > Nicolas > > I was discussing with lukas and we would be interested to get your changes to dictionary keys (Ihope that this icorrect since I'm dead) > We would like to integrate your changes. Do you have a basis that we could work? > Else we can just try in the image directly. > > Stef > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
tx
On Dec 3, 2009, at 10:13 PM, Nicolas Cellier wrote: > Yes I can have a look. > Otherwise, the changes are quite simple: > 1) track senders of keys and sometimes senders of senders > 2) identify those sending a message not understandable by an Array > (add: remove: etc...), insert keys asSet in this case > 3) identify thise sending a potential inefficient message (includes:) > insert keys asSet in this case (only if includes: is in a loop!) > 4) change code for keys > > Then eventually change some chains: > keys asSortedCollection asArray -> keys asArray sort > > Nicolas > > 2009/12/3 Stéphane Ducasse <[hidden email]>: >> Nicolas >> >> I was discussing with lukas and we would be interested to get your changes to dictionary keys (Ihope that this icorrect since I'm dead) >> We would like to integrate your changes. Do you have a basis that we could work? >> Else we can just try in the image directly. >> >> Stef >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Nicolas Cellier
A small addendum ;)
On 03.12.2009 22:13, Nicolas Cellier wrote: > 3) identify thise sending a potential inefficient message (includes:) > insert keys asSet in this case (only if includes: is in a loop!) > > 3b. If the keys collection isn't used for anything else, change to use includesKey: instead. Cheers, Henry _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
I wonder if ever something along the following lines was considered?
Dictionary>>keys "Answer a Set containing the receiver's keys." | result | result := Set basicNew. result setTally: tally array: (array collect: [ :each | each isNil ifFalse: [ each key ] ]). ^ result This returns a Set, so it wouldn't break any semantics. In my benchmark this is roughly 8x faster than the current implementation, and 2x faster than the optimized Squeak implementation. The drawback is that it makes some heavy assumptions on the internal structure of Dictionary, Association and Set. Lukas 2009/12/4 Henrik Sperre Johansen <[hidden email]>: > A small addendum ;) > On 03.12.2009 22:13, Nicolas Cellier wrote: >> 3) identify thise sending a potential inefficient message (includes:) >> insert keys asSet in this case (only if includes: is in a loop!) >> >> > 3b. If the keys collection isn't used for anything else, change to use > includesKey: instead. > > Cheers, > Henry > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > -- Lukas Renggli http://www.lukas-renggli.ch _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
Actually the code can be further optimized:
Dictionary>>keys "Answer a Set containing the receiver's keys." | result container | result := Set basicNew setTally: tally array: array copy. container := result array. 1 to: container size do: [ :index | (container at: index) ifNotNil: [ :assoc | container at: index put: assoc key ] ]. ^ result This is roughly 22 times faster than the current implementation, and 5 times faster than the squeak implementation. The system doesn't seem to break with this change :-) Lukas 2009/12/4 Lukas Renggli <[hidden email]>: > I wonder if ever something along the following lines was considered? > > Dictionary>>keys > "Answer a Set containing the receiver's keys." > > | result | > result := Set basicNew. > result setTally: tally array: (array collect: [ :each | > each isNil ifFalse: [ each key ] ]). > ^ result > > This returns a Set, so it wouldn't break any semantics. In my > benchmark this is roughly 8x faster than the current implementation, > and 2x faster than the optimized Squeak implementation. > > The drawback is that it makes some heavy assumptions on the internal > structure of Dictionary, Association and Set. > > Lukas > > 2009/12/4 Henrik Sperre Johansen <[hidden email]>: >> A small addendum ;) >> On 03.12.2009 22:13, Nicolas Cellier wrote: >>> 3) identify thise sending a potential inefficient message (includes:) >>> insert keys asSet in this case (only if includes: is in a loop!) >>> >>> >> 3b. If the keys collection isn't used for anything else, change to use >> includesKey: instead. >> >> Cheers, >> Henry >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> > > > > -- > Lukas Renggli > http://www.lukas-renggli.ch > -- Lukas Renggli http://www.lukas-renggli.ch _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
2009/12/4 Lukas Renggli <[hidden email]>:
> Actually the code can be further optimized: > > Dictionary>>keys > "Answer a Set containing the receiver's keys." > > | result container | > result := Set basicNew setTally: tally array: array copy. > container := result array. > 1 to: container size do: [ :index | > (container at: index) > ifNotNil: [ :assoc | container at: index put: assoc key ] ]. > ^ result > > This is roughly 22 times faster than the current implementation, and 5 > times faster than the squeak implementation. > > The system doesn't seem to break with this change :-) > Thus i vote -1 for this kind of optimization :) > Lukas > > 2009/12/4 Lukas Renggli <[hidden email]>: >> I wonder if ever something along the following lines was considered? >> >> Dictionary>>keys >> "Answer a Set containing the receiver's keys." >> >> | result | >> result := Set basicNew. >> result setTally: tally array: (array collect: [ :each | >> each isNil ifFalse: [ each key ] ]). >> ^ result >> >> This returns a Set, so it wouldn't break any semantics. In my >> benchmark this is roughly 8x faster than the current implementation, >> and 2x faster than the optimized Squeak implementation. >> >> The drawback is that it makes some heavy assumptions on the internal >> structure of Dictionary, Association and Set. >> >> Lukas >> >> 2009/12/4 Henrik Sperre Johansen <[hidden email]>: >>> A small addendum ;) >>> On 03.12.2009 22:13, Nicolas Cellier wrote: >>>> 3) identify thise sending a potential inefficient message (includes:) >>>> insert keys asSet in this case (only if includes: is in a loop!) >>>> >>>> >>> 3b. If the keys collection isn't used for anything else, change to use >>> includesKey: instead. >>> >>> Cheers, >>> Henry >>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >> >> >> >> -- >> Lukas Renggli >> http://www.lukas-renggli.ch >> > > > > -- > Lukas Renggli > http://www.lukas-renggli.ch > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > -- Best regards, Igor Stasenko AKA sig. _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Lukas Renggli
2009/12/4 Lukas Renggli <[hidden email]>:
> Actually the code can be further optimized: > > Dictionary>>keys > "Answer a Set containing the receiver's keys." > > | result container | > result := Set basicNew setTally: tally array: array copy. > container := result array. > 1 to: container size do: [ :index | > (container at: index) > ifNotNil: [ :assoc | container at: index put: assoc key ] ]. > ^ result > > This is roughly 22 times faster than the current implementation, and 5 > times faster than the squeak implementation. > > The system doesn't seem to break with this change :-) > > Lukas > Yes Levente considered this implementation as keysAsSet, but did not pushed it in trunk so far. You must also consider that each further usage of a Set will add a performance penalty vs an Array, especially do: Nicolas > 2009/12/4 Lukas Renggli <[hidden email]>: >> I wonder if ever something along the following lines was considered? >> >> Dictionary>>keys >> "Answer a Set containing the receiver's keys." >> >> | result | >> result := Set basicNew. >> result setTally: tally array: (array collect: [ :each | >> each isNil ifFalse: [ each key ] ]). >> ^ result >> >> This returns a Set, so it wouldn't break any semantics. In my >> benchmark this is roughly 8x faster than the current implementation, >> and 2x faster than the optimized Squeak implementation. >> >> The drawback is that it makes some heavy assumptions on the internal >> structure of Dictionary, Association and Set. >> >> Lukas >> >> 2009/12/4 Henrik Sperre Johansen <[hidden email]>: >>> A small addendum ;) >>> On 03.12.2009 22:13, Nicolas Cellier wrote: >>>> 3) identify thise sending a potential inefficient message (includes:) >>>> insert keys asSet in this case (only if includes: is in a loop!) >>>> >>>> >>> 3b. If the keys collection isn't used for anything else, change to use >>> includesKey: instead. >>> >>> Cheers, >>> Henry >>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >> >> >> >> -- >> Lukas Renggli >> http://www.lukas-renggli.ch >> > > > > -- > Lukas Renggli > http://www.lukas-renggli.ch > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Lukas Renggli
This merely shows there are no tests testing the keys method of an IdentityDictionary which contains objects for which hash != identityHash...
Cheers, Henry On Dec 4, 2009, at 9:06 25AM, Lukas Renggli wrote: > keys > "Answer a Set containing the receiver's keys." > > | result container | > result := Set basicNew setTally: tally array: array copy. > container := result array. > 1 to: container size do: [ :index | > (container at: index) > ifNotNil: [ :assoc | container at: index put: assoc key ] ]. > ^ result _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
Never mind, didn't notice IdentityDictionary reimplements keys...
:/ On Dec 4, 2009, at 10:10 46AM, Henrik Johansen wrote: > This merely shows there are no tests testing the keys method of an IdentityDictionary which contains objects for which hash != identityHash... > > Cheers, > Henry > > On Dec 4, 2009, at 9:06 25AM, Lukas Renggli wrote: > >> keys >> "Answer a Set containing the receiver's keys." >> >> | result container | >> result := Set basicNew setTally: tally array: array copy. >> container := result array. >> 1 to: container size do: [ :index | >> (container at: index) >> ifNotNil: [ :assoc | container at: index put: assoc key ] ]. >> ^ result > > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Igor Stasenko
On Dec 4, 2009, at 9:12 30AM, Igor Stasenko wrote: > 2009/12/4 Lukas Renggli <[hidden email]>: >> Actually the code can be further optimized: >> >> Dictionary>>keys >> "Answer a Set containing the receiver's keys." >> >> | result container | >> result := Set basicNew setTally: tally array: array copy. >> container := result array. >> 1 to: container size do: [ :index | >> (container at: index) >> ifNotNil: [ :assoc | container at: index put: assoc key ] ]. >> ^ result >> >> This is roughly 22 times faster than the current implementation, and 5 >> times faster than the squeak implementation. >> >> The system doesn't seem to break with this change :-) >> > except that it breaking encapsulation. > Thus i vote -1 for this kind of optimization :) You could also achieve the same performance by writing the method (returning an array) | result currentIX | result := Array new: tally. currentIX := 1. 1 to: array size do: [:ix | (array at: ix) ifNotNil: [:assoc | result at: currentIX put: assoc key. currentIX := currentIX +1]]. ^result Which is, unless you consider using at:put: and expecting an Array of size n to hold n elements, not breaking encapsulation. Though, the block creation overhead of using keysDo (which is the main reason for the performance difference seen between the two) should, if I've understood Eliot correctly, go away with a stack-based vm. While on the topic of the Set hierarchy, is it just me, or is the Keyed* versions a mistake? Giving a block that computes a key means the collection either needs to: - Know about what type of objects is put into it, in which case you break encapsulation. and it would likely be better to just implement hash/= for those types of objects. - Only use info available of Object, in which case you're not very likely to end up with much better hash functions than scaledIdentityHash anyways. Cheers, Henry _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Lukas Renggli
do you have an idea why people did not use it?
Stef On Dec 4, 2009, at 8:58 AM, Lukas Renggli wrote: > I wonder if ever something along the following lines was considered? > > Dictionary>>keys > "Answer a Set containing the receiver's keys." > > | result | > result := Set basicNew. > result setTally: tally array: (array collect: [ :each | > each isNil ifFalse: [ each key ] ]). > ^ result > > This returns a Set, so it wouldn't break any semantics. In my > benchmark this is roughly 8x faster than the current implementation, > and 2x faster than the optimized Squeak implementation. > > The drawback is that it makes some heavy assumptions on the internal > structure of Dictionary, Association and Set. > > Lukas > > 2009/12/4 Henrik Sperre Johansen <[hidden email]>: >> A small addendum ;) >> On 03.12.2009 22:13, Nicolas Cellier wrote: >>> 3) identify thise sending a potential inefficient message (includes:) >>> insert keys asSet in this case (only if includes: is in a loop!) >>> >>> >> 3b. If the keys collection isn't used for anything else, change to use >> includesKey: instead. >> >> Cheers, >> Henry >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> > > > > -- > Lukas Renggli > http://www.lukas-renggli.ch > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Henrik Sperre Johansen
I would really like to get the changes suggested by nicolas so
if one of you want to support by sending code it would be cool. We are trying to close some fixed issues (besides working on our little research activities). Stef On Dec 4, 2009, at 8:29 AM, Henrik Sperre Johansen wrote: > A small addendum ;) > On 03.12.2009 22:13, Nicolas Cellier wrote: >> 3) identify thise sending a potential inefficient message (includes:) >> insert keys asSet in this case (only if includes: is in a loop!) >> >> > 3b. If the keys collection isn't used for anything else, change to use > includesKey: instead. > > Cheers, > Henry > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Nicolas Cellier
I would be curious to see the average length of dictionary keys (except Smalltalk)
and see the performance penalty between set and array for the do: Stef > 2009/12/4 Lukas Renggli <[hidden email]>: >> Actually the code can be further optimized: >> >> Dictionary>>keys >> "Answer a Set containing the receiver's keys." >> >> | result container | >> result := Set basicNew setTally: tally array: array copy. >> container := result array. >> 1 to: container size do: [ :index | >> (container at: index) >> ifNotNil: [ :assoc | container at: index put: assoc key ] ]. >> ^ result >> >> This is roughly 22 times faster than the current implementation, and 5 >> times faster than the squeak implementation. >> >> The system doesn't seem to break with this change :-) >> >> Lukas >> > > Yes Levente considered this implementation as keysAsSet, but did not > pushed it in trunk so far. > You must also consider that each further usage of a Set will add a > performance penalty vs an Array, especially do: > > Nicolas > >> 2009/12/4 Lukas Renggli <[hidden email]>: >>> I wonder if ever something along the following lines was considered? >>> >>> Dictionary>>keys >>> "Answer a Set containing the receiver's keys." >>> >>> | result | >>> result := Set basicNew. >>> result setTally: tally array: (array collect: [ :each | >>> each isNil ifFalse: [ each key ] ]). >>> ^ result >>> >>> This returns a Set, so it wouldn't break any semantics. In my >>> benchmark this is roughly 8x faster than the current implementation, >>> and 2x faster than the optimized Squeak implementation. >>> >>> The drawback is that it makes some heavy assumptions on the internal >>> structure of Dictionary, Association and Set. >>> >>> Lukas >>> >>> 2009/12/4 Henrik Sperre Johansen <[hidden email]>: >>>> A small addendum ;) >>>> On 03.12.2009 22:13, Nicolas Cellier wrote: >>>>> 3) identify thise sending a potential inefficient message (includes:) >>>>> insert keys asSet in this case (only if includes: is in a loop!) >>>>> >>>>> >>>> 3b. If the keys collection isn't used for anything else, change to use >>>> includesKey: instead. >>>> >>>> Cheers, >>>> Henry >>>> >>>> _______________________________________________ >>>> Pharo-project mailing list >>>> [hidden email] >>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>> >>> >>> >>> >>> -- >>> Lukas Renggli >>> http://www.lukas-renggli.ch >>> >> >> >> >> -- >> Lukas Renggli >> http://www.lukas-renggli.ch >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
I posted these numbers a while ago.
In my image the average length of a Dictionary is 10, the average length of a Set is 1.2. Lukas 2009/12/4 Stéphane Ducasse <[hidden email]>: > I would be curious to see the average length of dictionary keys (except Smalltalk) > and see the performance penalty between set and array for the do: > > Stef > >> 2009/12/4 Lukas Renggli <[hidden email]>: >>> Actually the code can be further optimized: >>> >>> Dictionary>>keys >>> "Answer a Set containing the receiver's keys." >>> >>> | result container | >>> result := Set basicNew setTally: tally array: array copy. >>> container := result array. >>> 1 to: container size do: [ :index | >>> (container at: index) >>> ifNotNil: [ :assoc | container at: index put: assoc key ] ]. >>> ^ result >>> >>> This is roughly 22 times faster than the current implementation, and 5 >>> times faster than the squeak implementation. >>> >>> The system doesn't seem to break with this change :-) >>> >>> Lukas >>> >> >> Yes Levente considered this implementation as keysAsSet, but did not >> pushed it in trunk so far. >> You must also consider that each further usage of a Set will add a >> performance penalty vs an Array, especially do: >> >> Nicolas >> >>> 2009/12/4 Lukas Renggli <[hidden email]>: >>>> I wonder if ever something along the following lines was considered? >>>> >>>> Dictionary>>keys >>>> "Answer a Set containing the receiver's keys." >>>> >>>> | result | >>>> result := Set basicNew. >>>> result setTally: tally array: (array collect: [ :each | >>>> each isNil ifFalse: [ each key ] ]). >>>> ^ result >>>> >>>> This returns a Set, so it wouldn't break any semantics. In my >>>> benchmark this is roughly 8x faster than the current implementation, >>>> and 2x faster than the optimized Squeak implementation. >>>> >>>> The drawback is that it makes some heavy assumptions on the internal >>>> structure of Dictionary, Association and Set. >>>> >>>> Lukas >>>> >>>> 2009/12/4 Henrik Sperre Johansen <[hidden email]>: >>>>> A small addendum ;) >>>>> On 03.12.2009 22:13, Nicolas Cellier wrote: >>>>>> 3) identify thise sending a potential inefficient message (includes:) >>>>>> insert keys asSet in this case (only if includes: is in a loop!) >>>>>> >>>>>> >>>>> 3b. If the keys collection isn't used for anything else, change to use >>>>> includesKey: instead. >>>>> >>>>> Cheers, >>>>> Henry >>>>> >>>>> _______________________________________________ >>>>> Pharo-project mailing list >>>>> [hidden email] >>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>>> >>>> >>>> >>>> >>>> -- >>>> Lukas Renggli >>>> http://www.lukas-renggli.ch >>>> >>> >>> >>> >>> -- >>> Lukas Renggli >>> http://www.lukas-renggli.ch >>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > -- Lukas Renggli http://www.lukas-renggli.ch _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
Using this script:
(Dictionary allInstances inject: 0 into: [ :r :e | r + e size ]) / Dictionary allInstances size asFloat I get 14.23 in my image. Using this script: (Set allInstances inject: 0 into: [ :r :e | r + e size ]) / Set allInstances size asFloat I get 0.49 Cheers, Doru On 4 Dec 2009, at 11:49, Lukas Renggli wrote: > I posted these numbers a while ago. > > In my image the average length of a Dictionary is 10, the average > length of a Set is 1.2. > > Lukas > > > 2009/12/4 Stéphane Ducasse <[hidden email]>: >> I would be curious to see the average length of dictionary keys >> (except Smalltalk) >> and see the performance penalty between set and array for the do: >> >> Stef >> >>> 2009/12/4 Lukas Renggli <[hidden email]>: >>>> Actually the code can be further optimized: >>>> >>>> Dictionary>>keys >>>> "Answer a Set containing the receiver's keys." >>>> >>>> | result container | >>>> result := Set basicNew setTally: tally array: array copy. >>>> container := result array. >>>> 1 to: container size do: [ :index | >>>> (container at: index) >>>> ifNotNil: [ :assoc | container at: index >>>> put: assoc key ] ]. >>>> ^ result >>>> >>>> This is roughly 22 times faster than the current implementation, >>>> and 5 >>>> times faster than the squeak implementation. >>>> >>>> The system doesn't seem to break with this change :-) >>>> >>>> Lukas >>>> >>> >>> Yes Levente considered this implementation as keysAsSet, but did not >>> pushed it in trunk so far. >>> You must also consider that each further usage of a Set will add a >>> performance penalty vs an Array, especially do: >>> >>> Nicolas >>> >>>> 2009/12/4 Lukas Renggli <[hidden email]>: >>>>> I wonder if ever something along the following lines was >>>>> considered? >>>>> >>>>> Dictionary>>keys >>>>> "Answer a Set containing the receiver's keys." >>>>> >>>>> | result | >>>>> result := Set basicNew. >>>>> result setTally: tally array: (array collect: [ :each | >>>>> each isNil ifFalse: [ each key ] ]). >>>>> ^ result >>>>> >>>>> This returns a Set, so it wouldn't break any semantics. In my >>>>> benchmark this is roughly 8x faster than the current >>>>> implementation, >>>>> and 2x faster than the optimized Squeak implementation. >>>>> >>>>> The drawback is that it makes some heavy assumptions on the >>>>> internal >>>>> structure of Dictionary, Association and Set. >>>>> >>>>> Lukas >>>>> >>>>> 2009/12/4 Henrik Sperre Johansen <[hidden email]>: >>>>>> A small addendum ;) >>>>>> On 03.12.2009 22:13, Nicolas Cellier wrote: >>>>>>> 3) identify thise sending a potential inefficient message >>>>>>> (includes:) >>>>>>> insert keys asSet in this case (only if includes: is in a loop!) >>>>>>> >>>>>>> >>>>>> 3b. If the keys collection isn't used for anything else, change >>>>>> to use >>>>>> includesKey: instead. >>>>>> >>>>>> Cheers, >>>>>> Henry >>>>>> >>>>>> _______________________________________________ >>>>>> Pharo-project mailing list >>>>>> [hidden email] >>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Lukas Renggli >>>>> http://www.lukas-renggli.ch >>>>> >>>> >>>> >>>> >>>> -- >>>> Lukas Renggli >>>> http://www.lukas-renggli.ch >>>> >>>> _______________________________________________ >>>> Pharo-project mailing list >>>> [hidden email] >>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>> >>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> > > > > -- > Lukas Renggli > http://www.lukas-renggli.ch > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project -- www.tudorgirba.com "No matter how many recipes we know, we still value a chef." _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Lukas Renggli
this is interesting to see that this is still a guess game.
Even with these numbers I cannot evaluate what could be a slowdown/pseed up impact on the complete system. I imgain that the ietration slow down should be minimal and compensated by the creation speed up. May be in the future when system will have a better knowledge of themselves we may end up having better tools... > I posted these numbers a while ago. > > In my image the average length of a Dictionary is 10, the average > length of a Set is 1.2. > > Lukas > > > 2009/12/4 Stéphane Ducasse <[hidden email]>: >> I would be curious to see the average length of dictionary keys (except Smalltalk) >> and see the performance penalty between set and array for the do: >> >> Stef >> >>> 2009/12/4 Lukas Renggli <[hidden email]>: >>>> Actually the code can be further optimized: >>>> >>>> Dictionary>>keys >>>> "Answer a Set containing the receiver's keys." >>>> >>>> | result container | >>>> result := Set basicNew setTally: tally array: array copy. >>>> container := result array. >>>> 1 to: container size do: [ :index | >>>> (container at: index) >>>> ifNotNil: [ :assoc | container at: index put: assoc key ] ]. >>>> ^ result >>>> >>>> This is roughly 22 times faster than the current implementation, and 5 >>>> times faster than the squeak implementation. >>>> >>>> The system doesn't seem to break with this change :-) >>>> >>>> Lukas >>>> >>> >>> Yes Levente considered this implementation as keysAsSet, but did not >>> pushed it in trunk so far. >>> You must also consider that each further usage of a Set will add a >>> performance penalty vs an Array, especially do: >>> >>> Nicolas >>> >>>> 2009/12/4 Lukas Renggli <[hidden email]>: >>>>> I wonder if ever something along the following lines was considered? >>>>> >>>>> Dictionary>>keys >>>>> "Answer a Set containing the receiver's keys." >>>>> >>>>> | result | >>>>> result := Set basicNew. >>>>> result setTally: tally array: (array collect: [ :each | >>>>> each isNil ifFalse: [ each key ] ]). >>>>> ^ result >>>>> >>>>> This returns a Set, so it wouldn't break any semantics. In my >>>>> benchmark this is roughly 8x faster than the current implementation, >>>>> and 2x faster than the optimized Squeak implementation. >>>>> >>>>> The drawback is that it makes some heavy assumptions on the internal >>>>> structure of Dictionary, Association and Set. >>>>> >>>>> Lukas >>>>> >>>>> 2009/12/4 Henrik Sperre Johansen <[hidden email]>: >>>>>> A small addendum ;) >>>>>> On 03.12.2009 22:13, Nicolas Cellier wrote: >>>>>>> 3) identify thise sending a potential inefficient message (includes:) >>>>>>> insert keys asSet in this case (only if includes: is in a loop!) >>>>>>> >>>>>>> >>>>>> 3b. If the keys collection isn't used for anything else, change to use >>>>>> includesKey: instead. >>>>>> >>>>>> Cheers, >>>>>> Henry >>>>>> >>>>>> _______________________________________________ >>>>>> Pharo-project mailing list >>>>>> [hidden email] >>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Lukas Renggli >>>>> http://www.lukas-renggli.ch >>>>> >>>> >>>> >>>> >>>> -- >>>> Lukas Renggli >>>> http://www.lukas-renggli.ch >>>> >>>> _______________________________________________ >>>> Pharo-project mailing list >>>> [hidden email] >>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>> >>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> > > > > -- > Lukas Renggli > http://www.lukas-renggli.ch > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
> Even with these numbers I cannot evaluate what could be a slowdown/pseed up
> impact on the complete system. I imgain that the ietration slow down should be minimal > and compensated by the creation speed up. Yeah, the iteration slowdown should be minimal of using Set vs. Array. What I think is more critical that you change the semantics of the returned data-structure. That could not only break clients, but also their expectations towards performance. I find interesting that most sets are almost empty. Lukas -- Lukas Renggli http://www.lukas-renggli.ch _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Lukas Renggli
Em 04/12/2009 05:58, Lukas Renggli < [hidden email] > escreveu:
> I wonder if ever something along the following lines was considered? > > Dictionary>>keys > "Answer a Set containing the receiver's keys." > > | result | > result := Set basicNew. > result setTally: tally array: (array collect: [ :each | > each isNil ifFalse: [ each key ] ]). > ^ result Why not: | result | result := Set basicNew. result setTally: tally array: (array collect: [ :each | each ifNil: [ each key ] ]). ^ result A bit more OO ;-) Reading Andrés' book is making more aware of those subtleties :-D My .019999... - Cesar Rabak _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
Yeah, this was my first attempt, but this implementation is ways
slower. The block activations with #collect: matters. My implemebtation only calls primitives or inlined constructs. Not nice to look at, but as fast as it can get. Lukas On Friday, December 4, 2009, <[hidden email]> wrote: > Em 04/12/2009 05:58, Lukas Renggli < [hidden email] > escreveu: > >> I wonder if ever something along the following lines was considered? >> >> Dictionary>>keys >> "Answer a Set containing the receiver's keys." >> >> | result | >> result := Set basicNew. >> result setTally: tally array: (array collect: [ :each | >> each isNil ifFalse: [ each key ] ]). >> ^ result > > Why not: > > | result | > result := Set basicNew. > result setTally: tally array: (array collect: [ :each | > each ifNil: [ each key ] ]). > ^ result > > A bit more OO ;-) > > Reading Andrés' book is making more aware of those subtleties :-D > > My .019999... > > - > Cesar Rabak > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project -- Lukas Renggli http://www.lukas-renggli.ch _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
Free forum by Nabble | Edit this page |