about keys returning a set

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
26 messages Options
12
Reply | Threaded
Open this post in threaded view
|

about keys returning a set

Stéphane Ducasse
Nicolas

I was discussing with lukas and we would be interested to get your changes to dictionary keys (Ihope that this icorrect since I'm dead)
We would like to integrate your changes. Do you have a basis that we could work?
Else we can just try in the image directly.

Stef
_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: about keys returning a set

Nicolas Cellier
Yes I can have a look.
Otherwise, the changes are quite simple:
1) track senders of keys and sometimes senders of senders
2) identify those sending a message not understandable by an Array
(add: remove: etc...), insert keys asSet in this case
3) identify thise sending a potential inefficient message (includes:)
insert keys asSet in this case (only if includes: is in a loop!)
4) change code for keys

Then eventually change some chains:
keys asSortedCollection asArray -> keys asArray sort

Nicolas

2009/12/3 Stéphane Ducasse <[hidden email]>:

> Nicolas
>
> I was discussing with lukas and we would be interested to get your changes to dictionary keys (Ihope that this icorrect since I'm dead)
> We would like to integrate your changes. Do you have a basis that we could work?
> Else we can just try in the image directly.
>
> Stef
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: about keys returning a set

Stéphane Ducasse
tx

On Dec 3, 2009, at 10:13 PM, Nicolas Cellier wrote:

> Yes I can have a look.
> Otherwise, the changes are quite simple:
> 1) track senders of keys and sometimes senders of senders
> 2) identify those sending a message not understandable by an Array
> (add: remove: etc...), insert keys asSet in this case
> 3) identify thise sending a potential inefficient message (includes:)
> insert keys asSet in this case (only if includes: is in a loop!)
> 4) change code for keys
>
> Then eventually change some chains:
> keys asSortedCollection asArray -> keys asArray sort
>
> Nicolas
>
> 2009/12/3 Stéphane Ducasse <[hidden email]>:
>> Nicolas
>>
>> I was discussing with lukas and we would be interested to get your changes to dictionary keys (Ihope that this icorrect since I'm dead)
>> We would like to integrate your changes. Do you have a basis that we could work?
>> Else we can just try in the image directly.
>>
>> Stef
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: about keys returning a set

Henrik Sperre Johansen
In reply to this post by Nicolas Cellier
A small addendum ;)
On 03.12.2009 22:13, Nicolas Cellier wrote:
> 3) identify thise sending a potential inefficient message (includes:)
> insert keys asSet in this case (only if includes: is in a loop!)
>
>    
3b. If the keys collection isn't used for anything else, change to use
includesKey: instead.

Cheers,
Henry

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: about keys returning a set

Lukas Renggli
I wonder if ever something along the following lines was considered?

Dictionary>>keys
        "Answer a Set containing the receiver's keys."
       
        | result |
        result := Set basicNew.
        result setTally: tally array: (array collect: [ :each |
                each isNil ifFalse: [ each key ] ]).
        ^ result

This returns a Set, so it wouldn't break any semantics. In my
benchmark this is roughly 8x faster than the current implementation,
and 2x faster than the optimized Squeak implementation.

The drawback is that it makes some heavy assumptions on the internal
structure of Dictionary, Association and Set.

Lukas

2009/12/4 Henrik Sperre Johansen <[hidden email]>:

> A small addendum ;)
> On 03.12.2009 22:13, Nicolas Cellier wrote:
>> 3) identify thise sending a potential inefficient message (includes:)
>> insert keys asSet in this case (only if includes: is in a loop!)
>>
>>
> 3b. If the keys collection isn't used for anything else, change to use
> includesKey: instead.
>
> Cheers,
> Henry
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>



--
Lukas Renggli
http://www.lukas-renggli.ch

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: about keys returning a set

Lukas Renggli
Actually the code can be further optimized:

Dictionary>>keys
        "Answer a Set containing the receiver's keys."
       
        | result container |
        result := Set basicNew setTally: tally array: array copy.
        container := result array.
        1 to: container size do: [ :index |
                (container at: index)
                        ifNotNil: [ :assoc | container at: index put: assoc key ] ].
        ^ result

This is roughly 22 times faster than the current implementation, and 5
times faster than the squeak implementation.

The system doesn't seem to break with this change :-)

Lukas

2009/12/4 Lukas Renggli <[hidden email]>:

> I wonder if ever something along the following lines was considered?
>
> Dictionary>>keys
>        "Answer a Set containing the receiver's keys."
>
>        | result |
>        result := Set basicNew.
>        result setTally: tally array: (array collect: [ :each |
>                each isNil ifFalse: [ each key ] ]).
>        ^ result
>
> This returns a Set, so it wouldn't break any semantics. In my
> benchmark this is roughly 8x faster than the current implementation,
> and 2x faster than the optimized Squeak implementation.
>
> The drawback is that it makes some heavy assumptions on the internal
> structure of Dictionary, Association and Set.
>
> Lukas
>
> 2009/12/4 Henrik Sperre Johansen <[hidden email]>:
>> A small addendum ;)
>> On 03.12.2009 22:13, Nicolas Cellier wrote:
>>> 3) identify thise sending a potential inefficient message (includes:)
>>> insert keys asSet in this case (only if includes: is in a loop!)
>>>
>>>
>> 3b. If the keys collection isn't used for anything else, change to use
>> includesKey: instead.
>>
>> Cheers,
>> Henry
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>
>
>
> --
> Lukas Renggli
> http://www.lukas-renggli.ch
>



--
Lukas Renggli
http://www.lukas-renggli.ch

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: about keys returning a set

Igor Stasenko
2009/12/4 Lukas Renggli <[hidden email]>:

> Actually the code can be further optimized:
>
> Dictionary>>keys
>        "Answer a Set containing the receiver's keys."
>
>        | result container |
>        result := Set basicNew setTally: tally array: array copy.
>        container := result array.
>        1 to: container size do: [ :index |
>                (container at: index)
>                        ifNotNil: [ :assoc | container at: index put: assoc key ] ].
>        ^ result
>
> This is roughly 22 times faster than the current implementation, and 5
> times faster than the squeak implementation.
>
> The system doesn't seem to break with this change :-)
>
except that it breaking encapsulation.
Thus i vote -1 for this kind of optimization :)

> Lukas
>
> 2009/12/4 Lukas Renggli <[hidden email]>:
>> I wonder if ever something along the following lines was considered?
>>
>> Dictionary>>keys
>>        "Answer a Set containing the receiver's keys."
>>
>>        | result |
>>        result := Set basicNew.
>>        result setTally: tally array: (array collect: [ :each |
>>                each isNil ifFalse: [ each key ] ]).
>>        ^ result
>>
>> This returns a Set, so it wouldn't break any semantics. In my
>> benchmark this is roughly 8x faster than the current implementation,
>> and 2x faster than the optimized Squeak implementation.
>>
>> The drawback is that it makes some heavy assumptions on the internal
>> structure of Dictionary, Association and Set.
>>
>> Lukas
>>
>> 2009/12/4 Henrik Sperre Johansen <[hidden email]>:
>>> A small addendum ;)
>>> On 03.12.2009 22:13, Nicolas Cellier wrote:
>>>> 3) identify thise sending a potential inefficient message (includes:)
>>>> insert keys asSet in this case (only if includes: is in a loop!)
>>>>
>>>>
>>> 3b. If the keys collection isn't used for anything else, change to use
>>> includesKey: instead.
>>>
>>> Cheers,
>>> Henry
>>>
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>
>>
>>
>>
>> --
>> Lukas Renggli
>> http://www.lukas-renggli.ch
>>
>
>
>
> --
> Lukas Renggli
> http://www.lukas-renggli.ch
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>



--
Best regards,
Igor Stasenko AKA sig.

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: about keys returning a set

Nicolas Cellier
In reply to this post by Lukas Renggli
2009/12/4 Lukas Renggli <[hidden email]>:

> Actually the code can be further optimized:
>
> Dictionary>>keys
>        "Answer a Set containing the receiver's keys."
>
>        | result container |
>        result := Set basicNew setTally: tally array: array copy.
>        container := result array.
>        1 to: container size do: [ :index |
>                (container at: index)
>                        ifNotNil: [ :assoc | container at: index put: assoc key ] ].
>        ^ result
>
> This is roughly 22 times faster than the current implementation, and 5
> times faster than the squeak implementation.
>
> The system doesn't seem to break with this change :-)
>
> Lukas
>

Yes Levente considered this implementation as keysAsSet, but did not
pushed it in trunk so far.
You must also consider that each further usage of a Set will add a
performance penalty vs an Array, especially do:

Nicolas

> 2009/12/4 Lukas Renggli <[hidden email]>:
>> I wonder if ever something along the following lines was considered?
>>
>> Dictionary>>keys
>>        "Answer a Set containing the receiver's keys."
>>
>>        | result |
>>        result := Set basicNew.
>>        result setTally: tally array: (array collect: [ :each |
>>                each isNil ifFalse: [ each key ] ]).
>>        ^ result
>>
>> This returns a Set, so it wouldn't break any semantics. In my
>> benchmark this is roughly 8x faster than the current implementation,
>> and 2x faster than the optimized Squeak implementation.
>>
>> The drawback is that it makes some heavy assumptions on the internal
>> structure of Dictionary, Association and Set.
>>
>> Lukas
>>
>> 2009/12/4 Henrik Sperre Johansen <[hidden email]>:
>>> A small addendum ;)
>>> On 03.12.2009 22:13, Nicolas Cellier wrote:
>>>> 3) identify thise sending a potential inefficient message (includes:)
>>>> insert keys asSet in this case (only if includes: is in a loop!)
>>>>
>>>>
>>> 3b. If the keys collection isn't used for anything else, change to use
>>> includesKey: instead.
>>>
>>> Cheers,
>>> Henry
>>>
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>
>>
>>
>>
>> --
>> Lukas Renggli
>> http://www.lukas-renggli.ch
>>
>
>
>
> --
> Lukas Renggli
> http://www.lukas-renggli.ch
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: about keys returning a set

Henrik Sperre Johansen
In reply to this post by Lukas Renggli
This merely shows there are no tests testing the keys method of an IdentityDictionary which contains objects for which hash != identityHash...

Cheers,
Henry

On Dec 4, 2009, at 9:06 25AM, Lukas Renggli wrote:

> keys
> "Answer a Set containing the receiver's keys."
>
> | result container |
> result := Set basicNew setTally: tally array: array copy.
> container := result array.
> 1 to: container size do: [ :index |
> (container at: index)
> ifNotNil: [ :assoc | container at: index put: assoc key ] ].
> ^ result


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: about keys returning a set

Henrik Sperre Johansen
Never mind, didn't notice IdentityDictionary reimplements keys...

:/


On Dec 4, 2009, at 10:10 46AM, Henrik Johansen wrote:

> This merely shows there are no tests testing the keys method of an IdentityDictionary which contains objects for which hash != identityHash...
>
> Cheers,
> Henry
>
> On Dec 4, 2009, at 9:06 25AM, Lukas Renggli wrote:
>
>> keys
>> "Answer a Set containing the receiver's keys."
>>
>> | result container |
>> result := Set basicNew setTally: tally array: array copy.
>> container := result array.
>> 1 to: container size do: [ :index |
>> (container at: index)
>> ifNotNil: [ :assoc | container at: index put: assoc key ] ].
>> ^ result
>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: about keys returning a set

Henrik Sperre Johansen
In reply to this post by Igor Stasenko

On Dec 4, 2009, at 9:12 30AM, Igor Stasenko wrote:

> 2009/12/4 Lukas Renggli <[hidden email]>:
>> Actually the code can be further optimized:
>>
>> Dictionary>>keys
>>        "Answer a Set containing the receiver's keys."
>>
>>        | result container |
>>        result := Set basicNew setTally: tally array: array copy.
>>        container := result array.
>>        1 to: container size do: [ :index |
>>                (container at: index)
>>                        ifNotNil: [ :assoc | container at: index put: assoc key ] ].
>>        ^ result
>>
>> This is roughly 22 times faster than the current implementation, and 5
>> times faster than the squeak implementation.
>>
>> The system doesn't seem to break with this change :-)
>>
> except that it breaking encapsulation.
> Thus i vote -1 for this kind of optimization :)

You could also achieve the same performance by writing the method (returning an array)

| result currentIX |
        result := Array new: tally.
        currentIX := 1.
        1 to: array size do: [:ix |
                (array at: ix) ifNotNil: [:assoc |
                        result at: currentIX put: assoc key.
                        currentIX := currentIX +1]].
        ^result

Which is, unless you consider using at:put: and expecting an Array of size n to hold n elements, not breaking encapsulation.

Though, the block creation overhead of using keysDo (which is the main reason for the performance difference seen between the two) should, if I've understood Eliot correctly, go away with a stack-based vm.

While on the topic of the Set hierarchy, is it just me, or is the Keyed* versions a mistake?
Giving a block that computes a key means the collection either needs to:
- Know about what type of objects is put into it, in which case you break encapsulation. and it would likely be better to just implement hash/= for those types of objects.
- Only use info available of Object, in which case you're not very likely to end up with much better hash functions than scaledIdentityHash anyways.

Cheers,
Henry
_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: about keys returning a set

Stéphane Ducasse
In reply to this post by Lukas Renggli
do you have an idea why people did not use it?

Stef

On Dec 4, 2009, at 8:58 AM, Lukas Renggli wrote:

> I wonder if ever something along the following lines was considered?
>
> Dictionary>>keys
> "Answer a Set containing the receiver's keys."
>
> | result |
> result := Set basicNew.
> result setTally: tally array: (array collect: [ :each |
> each isNil ifFalse: [ each key ] ]).
> ^ result
>
> This returns a Set, so it wouldn't break any semantics. In my
> benchmark this is roughly 8x faster than the current implementation,
> and 2x faster than the optimized Squeak implementation.
>
> The drawback is that it makes some heavy assumptions on the internal
> structure of Dictionary, Association and Set.
>
> Lukas
>
> 2009/12/4 Henrik Sperre Johansen <[hidden email]>:
>> A small addendum ;)
>> On 03.12.2009 22:13, Nicolas Cellier wrote:
>>> 3) identify thise sending a potential inefficient message (includes:)
>>> insert keys asSet in this case (only if includes: is in a loop!)
>>>
>>>
>> 3b. If the keys collection isn't used for anything else, change to use
>> includesKey: instead.
>>
>> Cheers,
>> Henry
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>
>
>
> --
> Lukas Renggli
> http://www.lukas-renggli.ch
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: about keys returning a set

Stéphane Ducasse
In reply to this post by Henrik Sperre Johansen
I would really like to get the changes suggested by nicolas so
if one of you want to support by sending code it would be cool.
We are trying to close some fixed issues (besides working on our little research activities).

Stef

On Dec 4, 2009, at 8:29 AM, Henrik Sperre Johansen wrote:

> A small addendum ;)
> On 03.12.2009 22:13, Nicolas Cellier wrote:
>> 3) identify thise sending a potential inefficient message (includes:)
>> insert keys asSet in this case (only if includes: is in a loop!)
>>
>>
> 3b. If the keys collection isn't used for anything else, change to use
> includesKey: instead.
>
> Cheers,
> Henry
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: about keys returning a set

Stéphane Ducasse
In reply to this post by Nicolas Cellier
I would be curious to see the average length of dictionary keys  (except Smalltalk)
and see the performance penalty between set and array for the do:

Stef

> 2009/12/4 Lukas Renggli <[hidden email]>:
>> Actually the code can be further optimized:
>>
>> Dictionary>>keys
>>        "Answer a Set containing the receiver's keys."
>>
>>        | result container |
>>        result := Set basicNew setTally: tally array: array copy.
>>        container := result array.
>>        1 to: container size do: [ :index |
>>                (container at: index)
>>                        ifNotNil: [ :assoc | container at: index put: assoc key ] ].
>>        ^ result
>>
>> This is roughly 22 times faster than the current implementation, and 5
>> times faster than the squeak implementation.
>>
>> The system doesn't seem to break with this change :-)
>>
>> Lukas
>>
>
> Yes Levente considered this implementation as keysAsSet, but did not
> pushed it in trunk so far.
> You must also consider that each further usage of a Set will add a
> performance penalty vs an Array, especially do:
>
> Nicolas
>
>> 2009/12/4 Lukas Renggli <[hidden email]>:
>>> I wonder if ever something along the following lines was considered?
>>>
>>> Dictionary>>keys
>>>        "Answer a Set containing the receiver's keys."
>>>
>>>        | result |
>>>        result := Set basicNew.
>>>        result setTally: tally array: (array collect: [ :each |
>>>                each isNil ifFalse: [ each key ] ]).
>>>        ^ result
>>>
>>> This returns a Set, so it wouldn't break any semantics. In my
>>> benchmark this is roughly 8x faster than the current implementation,
>>> and 2x faster than the optimized Squeak implementation.
>>>
>>> The drawback is that it makes some heavy assumptions on the internal
>>> structure of Dictionary, Association and Set.
>>>
>>> Lukas
>>>
>>> 2009/12/4 Henrik Sperre Johansen <[hidden email]>:
>>>> A small addendum ;)
>>>> On 03.12.2009 22:13, Nicolas Cellier wrote:
>>>>> 3) identify thise sending a potential inefficient message (includes:)
>>>>> insert keys asSet in this case (only if includes: is in a loop!)
>>>>>
>>>>>
>>>> 3b. If the keys collection isn't used for anything else, change to use
>>>> includesKey: instead.
>>>>
>>>> Cheers,
>>>> Henry
>>>>
>>>> _______________________________________________
>>>> Pharo-project mailing list
>>>> [hidden email]
>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>
>>>
>>>
>>>
>>> --
>>> Lukas Renggli
>>> http://www.lukas-renggli.ch
>>>
>>
>>
>>
>> --
>> Lukas Renggli
>> http://www.lukas-renggli.ch
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: about keys returning a set

Lukas Renggli
I posted these numbers a while ago.

In my image the average length of a Dictionary is 10, the average
length of a Set is 1.2.

Lukas


2009/12/4 Stéphane Ducasse <[hidden email]>:

> I would be curious to see the average length of dictionary keys  (except Smalltalk)
> and see the performance penalty between set and array for the do:
>
> Stef
>
>> 2009/12/4 Lukas Renggli <[hidden email]>:
>>> Actually the code can be further optimized:
>>>
>>> Dictionary>>keys
>>>        "Answer a Set containing the receiver's keys."
>>>
>>>        | result container |
>>>        result := Set basicNew setTally: tally array: array copy.
>>>        container := result array.
>>>        1 to: container size do: [ :index |
>>>                (container at: index)
>>>                        ifNotNil: [ :assoc | container at: index put: assoc key ] ].
>>>        ^ result
>>>
>>> This is roughly 22 times faster than the current implementation, and 5
>>> times faster than the squeak implementation.
>>>
>>> The system doesn't seem to break with this change :-)
>>>
>>> Lukas
>>>
>>
>> Yes Levente considered this implementation as keysAsSet, but did not
>> pushed it in trunk so far.
>> You must also consider that each further usage of a Set will add a
>> performance penalty vs an Array, especially do:
>>
>> Nicolas
>>
>>> 2009/12/4 Lukas Renggli <[hidden email]>:
>>>> I wonder if ever something along the following lines was considered?
>>>>
>>>> Dictionary>>keys
>>>>        "Answer a Set containing the receiver's keys."
>>>>
>>>>        | result |
>>>>        result := Set basicNew.
>>>>        result setTally: tally array: (array collect: [ :each |
>>>>                each isNil ifFalse: [ each key ] ]).
>>>>        ^ result
>>>>
>>>> This returns a Set, so it wouldn't break any semantics. In my
>>>> benchmark this is roughly 8x faster than the current implementation,
>>>> and 2x faster than the optimized Squeak implementation.
>>>>
>>>> The drawback is that it makes some heavy assumptions on the internal
>>>> structure of Dictionary, Association and Set.
>>>>
>>>> Lukas
>>>>
>>>> 2009/12/4 Henrik Sperre Johansen <[hidden email]>:
>>>>> A small addendum ;)
>>>>> On 03.12.2009 22:13, Nicolas Cellier wrote:
>>>>>> 3) identify thise sending a potential inefficient message (includes:)
>>>>>> insert keys asSet in this case (only if includes: is in a loop!)
>>>>>>
>>>>>>
>>>>> 3b. If the keys collection isn't used for anything else, change to use
>>>>> includesKey: instead.
>>>>>
>>>>> Cheers,
>>>>> Henry
>>>>>
>>>>> _______________________________________________
>>>>> Pharo-project mailing list
>>>>> [hidden email]
>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Lukas Renggli
>>>> http://www.lukas-renggli.ch
>>>>
>>>
>>>
>>>
>>> --
>>> Lukas Renggli
>>> http://www.lukas-renggli.ch
>>>
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>



--
Lukas Renggli
http://www.lukas-renggli.ch

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: about keys returning a set

Tudor Girba
Using this script:
(Dictionary allInstances inject: 0 into: [ :r :e | r + e size ]) /  
Dictionary allInstances size asFloat

I get 14.23 in my image.

Using this script:
(Set allInstances inject: 0 into: [ :r :e | r + e size ]) / Set  
allInstances size asFloat

I get 0.49

Cheers,
Doru

On 4 Dec 2009, at 11:49, Lukas Renggli wrote:

> I posted these numbers a while ago.
>
> In my image the average length of a Dictionary is 10, the average
> length of a Set is 1.2.
>
> Lukas
>
>
> 2009/12/4 Stéphane Ducasse <[hidden email]>:
>> I would be curious to see the average length of dictionary keys  
>> (except Smalltalk)
>> and see the performance penalty between set and array for the do:
>>
>> Stef
>>
>>> 2009/12/4 Lukas Renggli <[hidden email]>:
>>>> Actually the code can be further optimized:
>>>>
>>>> Dictionary>>keys
>>>>        "Answer a Set containing the receiver's keys."
>>>>
>>>>        | result container |
>>>>        result := Set basicNew setTally: tally array: array copy.
>>>>        container := result array.
>>>>        1 to: container size do: [ :index |
>>>>                (container at: index)
>>>>                        ifNotNil: [ :assoc | container at: index  
>>>> put: assoc key ] ].
>>>>        ^ result
>>>>
>>>> This is roughly 22 times faster than the current implementation,  
>>>> and 5
>>>> times faster than the squeak implementation.
>>>>
>>>> The system doesn't seem to break with this change :-)
>>>>
>>>> Lukas
>>>>
>>>
>>> Yes Levente considered this implementation as keysAsSet, but did not
>>> pushed it in trunk so far.
>>> You must also consider that each further usage of a Set will add a
>>> performance penalty vs an Array, especially do:
>>>
>>> Nicolas
>>>
>>>> 2009/12/4 Lukas Renggli <[hidden email]>:
>>>>> I wonder if ever something along the following lines was  
>>>>> considered?
>>>>>
>>>>> Dictionary>>keys
>>>>>        "Answer a Set containing the receiver's keys."
>>>>>
>>>>>        | result |
>>>>>        result := Set basicNew.
>>>>>        result setTally: tally array: (array collect: [ :each |
>>>>>                each isNil ifFalse: [ each key ] ]).
>>>>>        ^ result
>>>>>
>>>>> This returns a Set, so it wouldn't break any semantics. In my
>>>>> benchmark this is roughly 8x faster than the current  
>>>>> implementation,
>>>>> and 2x faster than the optimized Squeak implementation.
>>>>>
>>>>> The drawback is that it makes some heavy assumptions on the  
>>>>> internal
>>>>> structure of Dictionary, Association and Set.
>>>>>
>>>>> Lukas
>>>>>
>>>>> 2009/12/4 Henrik Sperre Johansen <[hidden email]>:
>>>>>> A small addendum ;)
>>>>>> On 03.12.2009 22:13, Nicolas Cellier wrote:
>>>>>>> 3) identify thise sending a potential inefficient message  
>>>>>>> (includes:)
>>>>>>> insert keys asSet in this case (only if includes: is in a loop!)
>>>>>>>
>>>>>>>
>>>>>> 3b. If the keys collection isn't used for anything else, change  
>>>>>> to use
>>>>>> includesKey: instead.
>>>>>>
>>>>>> Cheers,
>>>>>> Henry
>>>>>>
>>>>>> _______________________________________________
>>>>>> Pharo-project mailing list
>>>>>> [hidden email]
>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Lukas Renggli
>>>>> http://www.lukas-renggli.ch
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Lukas Renggli
>>>> http://www.lukas-renggli.ch
>>>>
>>>> _______________________________________________
>>>> Pharo-project mailing list
>>>> [hidden email]
>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>
>>>
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>
>
>
> --
> Lukas Renggli
> http://www.lukas-renggli.ch
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

--
www.tudorgirba.com

"No matter how many recipes we know, we still value a chef."







_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: about keys returning a set

Stéphane Ducasse
In reply to this post by Lukas Renggli
this is interesting to see that this is still a guess game.
Even with these numbers I cannot evaluate what could be a slowdown/pseed up
impact on the complete system. I imgain that the ietration slow down should be minimal
and compensated by the creation speed up.
May be in the future when system will have a better knowledge of themselves
we may end up having better tools...


> I posted these numbers a while ago.
>
> In my image the average length of a Dictionary is 10, the average
> length of a Set is 1.2.
>
> Lukas
>
>
> 2009/12/4 Stéphane Ducasse <[hidden email]>:
>> I would be curious to see the average length of dictionary keys  (except Smalltalk)
>> and see the performance penalty between set and array for the do:
>>
>> Stef
>>
>>> 2009/12/4 Lukas Renggli <[hidden email]>:
>>>> Actually the code can be further optimized:
>>>>
>>>> Dictionary>>keys
>>>>        "Answer a Set containing the receiver's keys."
>>>>
>>>>        | result container |
>>>>        result := Set basicNew setTally: tally array: array copy.
>>>>        container := result array.
>>>>        1 to: container size do: [ :index |
>>>>                (container at: index)
>>>>                        ifNotNil: [ :assoc | container at: index put: assoc key ] ].
>>>>        ^ result
>>>>
>>>> This is roughly 22 times faster than the current implementation, and 5
>>>> times faster than the squeak implementation.
>>>>
>>>> The system doesn't seem to break with this change :-)
>>>>
>>>> Lukas
>>>>
>>>
>>> Yes Levente considered this implementation as keysAsSet, but did not
>>> pushed it in trunk so far.
>>> You must also consider that each further usage of a Set will add a
>>> performance penalty vs an Array, especially do:
>>>
>>> Nicolas
>>>
>>>> 2009/12/4 Lukas Renggli <[hidden email]>:
>>>>> I wonder if ever something along the following lines was considered?
>>>>>
>>>>> Dictionary>>keys
>>>>>        "Answer a Set containing the receiver's keys."
>>>>>
>>>>>        | result |
>>>>>        result := Set basicNew.
>>>>>        result setTally: tally array: (array collect: [ :each |
>>>>>                each isNil ifFalse: [ each key ] ]).
>>>>>        ^ result
>>>>>
>>>>> This returns a Set, so it wouldn't break any semantics. In my
>>>>> benchmark this is roughly 8x faster than the current implementation,
>>>>> and 2x faster than the optimized Squeak implementation.
>>>>>
>>>>> The drawback is that it makes some heavy assumptions on the internal
>>>>> structure of Dictionary, Association and Set.
>>>>>
>>>>> Lukas
>>>>>
>>>>> 2009/12/4 Henrik Sperre Johansen <[hidden email]>:
>>>>>> A small addendum ;)
>>>>>> On 03.12.2009 22:13, Nicolas Cellier wrote:
>>>>>>> 3) identify thise sending a potential inefficient message (includes:)
>>>>>>> insert keys asSet in this case (only if includes: is in a loop!)
>>>>>>>
>>>>>>>
>>>>>> 3b. If the keys collection isn't used for anything else, change to use
>>>>>> includesKey: instead.
>>>>>>
>>>>>> Cheers,
>>>>>> Henry
>>>>>>
>>>>>> _______________________________________________
>>>>>> Pharo-project mailing list
>>>>>> [hidden email]
>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Lukas Renggli
>>>>> http://www.lukas-renggli.ch
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Lukas Renggli
>>>> http://www.lukas-renggli.ch
>>>>
>>>> _______________________________________________
>>>> Pharo-project mailing list
>>>> [hidden email]
>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>
>>>
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>
>
>
> --
> Lukas Renggli
> http://www.lukas-renggli.ch
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: about keys returning a set

Lukas Renggli
> Even with these numbers I cannot evaluate what could be a slowdown/pseed up
> impact on the complete system. I imgain that the ietration slow down should be minimal
> and compensated by the creation speed up.

Yeah, the iteration slowdown should be minimal of using Set vs. Array.

What I think is more critical that you change the semantics of the
returned data-structure. That could not only break clients, but also
their expectations towards performance.

I find interesting that most sets are almost empty.

Lukas

--
Lukas Renggli
http://www.lukas-renggli.ch

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: about keys returning a set

csrabak
In reply to this post by Lukas Renggli
Em 04/12/2009 05:58, Lukas Renggli < [hidden email] > escreveu:

> I wonder if ever something along the following lines was considered?
>
> Dictionary>>keys
> "Answer a Set containing the receiver's keys."
>
> | result |
> result := Set basicNew.
> result setTally: tally array: (array collect: [ :each |
> each isNil ifFalse: [ each key ] ]).
> ^ result

Why not:

| result |
result := Set basicNew.
result setTally: tally array: (array collect: [ :each |
each ifNil: [ each key ] ]).
^ result

A bit more OO ;-)

Reading Andrés' book is making more aware of those subtleties :-D
 
My .019999...

-
Cesar Rabak

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: about keys returning a set

Lukas Renggli
Yeah, this was my first attempt, but this implementation is ways
slower. The block activations with #collect: matters. My
implemebtation only calls primitives or inlined constructs. Not nice
to look at, but as fast as it can get.

Lukas

On Friday, December 4, 2009,  <[hidden email]> wrote:

> Em 04/12/2009 05:58, Lukas Renggli < [hidden email] > escreveu:
>
>> I wonder if ever something along the following lines was considered?
>>
>> Dictionary>>keys
>> "Answer a Set containing the receiver's keys."
>>
>> | result |
>> result := Set basicNew.
>> result setTally: tally array: (array collect: [ :each |
>> each isNil ifFalse: [ each key ] ]).
>> ^ result
>
> Why not:
>
> | result |
> result := Set basicNew.
> result setTally: tally array: (array collect: [ :each |
> each ifNil: [ each key ] ]).
> ^ result
>
> A bit more OO ;-)
>
> Reading Andrés' book is making more aware of those subtleties :-D
>
> My .019999...
>
> -
> Cesar Rabak
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

--
Lukas Renggli
http://www.lukas-renggli.ch

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
12