dictionary value is an array?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
15 messages Options
Reply | Threaded
Open this post in threaded view
|

dictionary value is an array?

Stéphane Ducasse

Hi nicolas

If I remember you proposed fixes to have dictionary values -> array?
Is it correct?

I was wondering why?
Because a set makes more sense to me since
        - the order of the elements in values is not useful (because they  
come from a dict)
        - the occurrences not really either
        - we cannot add to the results so we should create another collection

May be I'm totally wrong with the changes but I think that values  
should be set while keys an array.
I would be interested in discussion on that.

Stef





_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: dictionary value is an array?

Nicolas Cellier
2009/10/31 Stéphane Ducasse <[hidden email]>:

>
> Hi nicolas
>
> If I remember you proposed fixes to have dictionary values -> array?
> Is it correct?
>
> I was wondering why?
> Because a set makes more sense to me since
>        - the order of the elements in values is not useful (because they
> come from a dict)
>        - the occurrences not really either
>        - we cannot add to the results so we should create another collection
>
> May be I'm totally wrong with the changes but I think that values
> should be set while keys an array.
> I would be interested in discussion on that.
>
> Stef
>
>

Dictionary values is already an Array...
I proposed to change keys accordingly.

Historically, they were a Bag and a Set because they were unordered.

But very few or no sender expect a behavior specific to a Bag or a Set.
Most just do: select: collect: asArray asSortedCollection etc...
That's where an Array outperforms a Set or a Bag.

So we are paying the price of a Set or a Bag almost for nothing...
And that is why you can see a #fasterKeys.

The drawback is that a few senders will add: to or remove: from keys.
My estimation is around 5%.
Inside the image, no problem, it is easy to fix, just write
(someDictionary keys asSet).
But that put a load on package maintainers.

Nicolas

>
>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: dictionary value is an array?

Igor Stasenko
2009/10/31 Nicolas Cellier <[hidden email]>:

> 2009/10/31 Stéphane Ducasse <[hidden email]>:
>>
>> Hi nicolas
>>
>> If I remember you proposed fixes to have dictionary values -> array?
>> Is it correct?
>>
>> I was wondering why?
>> Because a set makes more sense to me since
>>        - the order of the elements in values is not useful (because they
>> come from a dict)
>>        - the occurrences not really either
>>        - we cannot add to the results so we should create another collection
>>
>> May be I'm totally wrong with the changes but I think that values
>> should be set while keys an array.
>> I would be interested in discussion on that.
>>
>> Stef
>>
>>
>
> Dictionary values is already an Array...
> I proposed to change keys accordingly.
>
> Historically, they were a Bag and a Set because they were unordered.
>
> But very few or no sender expect a behavior specific to a Bag or a Set.
> Most just do: select: collect: asArray asSortedCollection etc...
> That's where an Array outperforms a Set or a Bag.
>
> So we are paying the price of a Set or a Bag almost for nothing...
> And that is why you can see a #fasterKeys.
>
> The drawback is that a few senders will add: to or remove: from keys.
> My estimation is around 5%.
> Inside the image, no problem, it is easy to fix, just write
> (someDictionary keys asSet).
> But that put a load on package maintainers.
>

+1 Nicolas. Personally, i think that modifying a collection which not
constructed by your own code
is bad programming style, because you never know where this collection
is came from and you can't be sure that there is no other users of it,
which in own turn may modify it at any moment, while often you
assuming that only you own the collection and don't expecting any
modifications to it except in own code.
So, IMO this approach is error prone, and you should never use
#remove: or #add: with collections which constructed by separate
package/layer, and instead, use #asSet, #asOrderedCollection and, of
course #copy, before starting manipulating its elements.


> Nicolas
>
>>
>>
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>



--
Best regards,
Igor Stasenko AKA sig.

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: dictionary value is an array?

Nicolas Cellier
2009/10/31 Igor Stasenko <[hidden email]>:

> 2009/10/31 Nicolas Cellier <[hidden email]>:
>> 2009/10/31 Stéphane Ducasse <[hidden email]>:
>>>
>>> Hi nicolas
>>>
>>> If I remember you proposed fixes to have dictionary values -> array?
>>> Is it correct?
>>>
>>> I was wondering why?
>>> Because a set makes more sense to me since
>>>        - the order of the elements in values is not useful (because they
>>> come from a dict)
>>>        - the occurrences not really either
>>>        - we cannot add to the results so we should create another collection
>>>
>>> May be I'm totally wrong with the changes but I think that values
>>> should be set while keys an array.
>>> I would be interested in discussion on that.
>>>
>>> Stef
>>>
>>>
>>
>> Dictionary values is already an Array...
>> I proposed to change keys accordingly.
>>
>> Historically, they were a Bag and a Set because they were unordered.
>>
>> But very few or no sender expect a behavior specific to a Bag or a Set.
>> Most just do: select: collect: asArray asSortedCollection etc...
>> That's where an Array outperforms a Set or a Bag.
>>
>> So we are paying the price of a Set or a Bag almost for nothing...
>> And that is why you can see a #fasterKeys.
>>
>> The drawback is that a few senders will add: to or remove: from keys.
>> My estimation is around 5%.
>> Inside the image, no problem, it is easy to fix, just write
>> (someDictionary keys asSet).
>> But that put a load on package maintainers.
>>
>
> +1 Nicolas. Personally, i think that modifying a collection which not
> constructed by your own code
> is bad programming style, because you never know where this collection
> is came from and you can't be sure that there is no other users of it,
> which in own turn may modify it at any moment, while often you
> assuming that only you own the collection and don't expecting any
> modifications to it except in own code.
> So, IMO this approach is error prone, and you should never use
> #remove: or #add: with collections which constructed by separate
> package/layer, and instead, use #asSet, #asOrderedCollection and, of
> course #copy, before starting manipulating its elements.
>

I agree. removing from another one's collection is not that well behaved !
It's putting expectations higher than necessary.

I see this as a sort of optimization breaking encapsulation...
I want to come with a better optimization.
Unfortunately, historical uses will be found here and there, #keys
being a well-known case.
We have to put pressure on maintainers...
Who ever takes this path shall be ready to sell the idea, and prove
that gains > costs.

A team modifying kernel in uncompatible manner (and we have to, or
just stick to Squeak 3.7) shall provide support for a smooth
transiiton.
Sometimes I feel it would be convenient to inquire a code base larger
than just an image, so that we can emit proposals and sell the idea
rather than just waiting for complaints passively....

Nicolas

>
>> Nicolas
>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>
>
>
> --
> Best regards,
> Igor Stasenko AKA sig.
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: dictionary value is an array?

Lukas Renggli
It always annoyed me that #keys and #values create a new collection
object. Form an object-oriented stand-point of view, what would rather
make sense is to return a collection that delegates to the dictionary.

For example, DictionaryKeys would be a subclass of Collection, behave
like a Set and be implemented along ...

    Dictionary>>keys
        ^ DictionaryKeys on: self

    DictionaryKeys>>do: aBlock
        dictionary keysDo: aBlock

    DictionaryKeys>>includes: anObject
        ^ dictionary includesKey: anObject

    DictionaryKeys>>species
        ^ Set

I bet that this would not only result in a major performance
improvement, but also greatly reduce memory consumption. Furthermore
it would solve the problem that #keys suddenly has totally different
performance characteristics as we decide to return an Array instead of
a Set. For example, it wouldn't matter anymore if you wrote

    aDictionary includesKey: #foo

or

    aDictionary keys includes: #foo

I see only obvious problem for such an approach and this is when the
dictionary changes and the keys are stored somewhere. Some users might
not expect the keys to change too.

Cheers,
Lukas

--
Lukas Renggli
http://www.lukas-renggli.ch

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: dictionary value is an array?

Schwab,Wilhelm K
In reply to this post by Nicolas Cellier
Hello all,

FWIW, Dolphin "lookup tables" use parallel arrays for keys and values.  OA's claim (I do not dispute it, I'm simply reporting what they said) is that the result is faster than a dictionary based on associations.

Bill



-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Nicolas Cellier
Sent: Saturday, October 31, 2009 9:23 AM
To: [hidden email]
Subject: Re: [Pharo-project] dictionary value is an array?

2009/10/31 Igor Stasenko <[hidden email]>:

> 2009/10/31 Nicolas Cellier <[hidden email]>:
>> 2009/10/31 Stéphane Ducasse <[hidden email]>:
>>>
>>> Hi nicolas
>>>
>>> If I remember you proposed fixes to have dictionary values -> array?
>>> Is it correct?
>>>
>>> I was wondering why?
>>> Because a set makes more sense to me since
>>>        - the order of the elements in values is not useful (because
>>> they come from a dict)
>>>        - the occurrences not really either
>>>        - we cannot add to the results so we should create another
>>> collection
>>>
>>> May be I'm totally wrong with the changes but I think that values
>>> should be set while keys an array.
>>> I would be interested in discussion on that.
>>>
>>> Stef
>>>
>>>
>>
>> Dictionary values is already an Array...
>> I proposed to change keys accordingly.
>>
>> Historically, they were a Bag and a Set because they were unordered.
>>
>> But very few or no sender expect a behavior specific to a Bag or a Set.
>> Most just do: select: collect: asArray asSortedCollection etc...
>> That's where an Array outperforms a Set or a Bag.
>>
>> So we are paying the price of a Set or a Bag almost for nothing...
>> And that is why you can see a #fasterKeys.
>>
>> The drawback is that a few senders will add: to or remove: from keys.
>> My estimation is around 5%.
>> Inside the image, no problem, it is easy to fix, just write
>> (someDictionary keys asSet).
>> But that put a load on package maintainers.
>>
>
> +1 Nicolas. Personally, i think that modifying a collection which not
> constructed by your own code
> is bad programming style, because you never know where this collection
> is came from and you can't be sure that there is no other users of it,
> which in own turn may modify it at any moment, while often you
> assuming that only you own the collection and don't expecting any
> modifications to it except in own code.
> So, IMO this approach is error prone, and you should never use
> #remove: or #add: with collections which constructed by separate
> package/layer, and instead, use #asSet, #asOrderedCollection and, of
> course #copy, before starting manipulating its elements.
>

I agree. removing from another one's collection is not that well behaved !
It's putting expectations higher than necessary.

I see this as a sort of optimization breaking encapsulation...
I want to come with a better optimization.
Unfortunately, historical uses will be found here and there, #keys being a well-known case.
We have to put pressure on maintainers...
Who ever takes this path shall be ready to sell the idea, and prove that gains > costs.

A team modifying kernel in uncompatible manner (and we have to, or just stick to Squeak 3.7) shall provide support for a smooth transiiton.
Sometimes I feel it would be convenient to inquire a code base larger than just an image, so that we can emit proposals and sell the idea rather than just waiting for complaints passively....

Nicolas

>
>> Nicolas
>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>
>
>
> --
> Best regards,
> Igor Stasenko AKA sig.
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: dictionary value is an array?

Nicolas Cellier
In reply to this post by Lukas Renggli
2009/10/31 Lukas Renggli <[hidden email]>:

> It always annoyed me that #keys and #values create a new collection
> object. Form an object-oriented stand-point of view, what would rather
> make sense is to return a collection that delegates to the dictionary.
>
> For example, DictionaryKeys would be a subclass of Collection, behave
> like a Set and be implemented along ...
>
>    Dictionary>>keys
>        ^ DictionaryKeys on: self
>
>    DictionaryKeys>>do: aBlock
>        dictionary keysDo: aBlock
>
>    DictionaryKeys>>includes: anObject
>        ^ dictionary includesKey: anObject
>
>    DictionaryKeys>>species
>        ^ Set
>
> I bet that this would not only result in a major performance
> improvement, but also greatly reduce memory consumption. Furthermore
> it would solve the problem that #keys suddenly has totally different
> performance characteristics as we decide to return an Array instead of
> a Set. For example, it wouldn't matter anymore if you wrote
>
>    aDictionary includesKey: #foo
>
> or
>
>    aDictionary keys includes: #foo
>

Yes, the proxy/wrapper pattern can be an elegant and efficient
solution (a good example is ublas from BOOST library).
It depends how many time you will iterate the collection though.
Iterating a Set is often +2x the cost of iterating an Array.

> I see only obvious problem for such an approach and this is when the
> dictionary changes and the keys are stored somewhere. Some users might
> not expect the keys to change too.
>

Yes, an immutable proxy with copy on write behaviour is challenging to
implement...

> Cheers,
> Lukas
>
> --
> Lukas Renggli
> http://www.lukas-renggli.ch
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: dictionary value is an array?

Nicolas Cellier
In reply to this post by Schwab,Wilhelm K
2009/10/31 Schwab,Wilhelm K <[hidden email]>:
> Hello all,
>
> FWIW, Dolphin "lookup tables" use parallel arrays for keys and values.  OA's claim (I do not dispute it, I'm simply reporting what they said) is that the result is faster than a dictionary based on associations.
>
> Bill
>
>

Squeak.MethodDictionary is an example of this pattern too

>
> -----Original Message-----
> From: [hidden email] [mailto:[hidden email]] On Behalf Of Nicolas Cellier
> Sent: Saturday, October 31, 2009 9:23 AM
> To: [hidden email]
> Subject: Re: [Pharo-project] dictionary value is an array?
>
> 2009/10/31 Igor Stasenko <[hidden email]>:
>> 2009/10/31 Nicolas Cellier <[hidden email]>:
>>> 2009/10/31 Stéphane Ducasse <[hidden email]>:
>>>>
>>>> Hi nicolas
>>>>
>>>> If I remember you proposed fixes to have dictionary values -> array?
>>>> Is it correct?
>>>>
>>>> I was wondering why?
>>>> Because a set makes more sense to me since
>>>>        - the order of the elements in values is not useful (because
>>>> they come from a dict)
>>>>        - the occurrences not really either
>>>>        - we cannot add to the results so we should create another
>>>> collection
>>>>
>>>> May be I'm totally wrong with the changes but I think that values
>>>> should be set while keys an array.
>>>> I would be interested in discussion on that.
>>>>
>>>> Stef
>>>>
>>>>
>>>
>>> Dictionary values is already an Array...
>>> I proposed to change keys accordingly.
>>>
>>> Historically, they were a Bag and a Set because they were unordered.
>>>
>>> But very few or no sender expect a behavior specific to a Bag or a Set.
>>> Most just do: select: collect: asArray asSortedCollection etc...
>>> That's where an Array outperforms a Set or a Bag.
>>>
>>> So we are paying the price of a Set or a Bag almost for nothing...
>>> And that is why you can see a #fasterKeys.
>>>
>>> The drawback is that a few senders will add: to or remove: from keys.
>>> My estimation is around 5%.
>>> Inside the image, no problem, it is easy to fix, just write
>>> (someDictionary keys asSet).
>>> But that put a load on package maintainers.
>>>
>>
>> +1 Nicolas. Personally, i think that modifying a collection which not
>> constructed by your own code
>> is bad programming style, because you never know where this collection
>> is came from and you can't be sure that there is no other users of it,
>> which in own turn may modify it at any moment, while often you
>> assuming that only you own the collection and don't expecting any
>> modifications to it except in own code.
>> So, IMO this approach is error prone, and you should never use
>> #remove: or #add: with collections which constructed by separate
>> package/layer, and instead, use #asSet, #asOrderedCollection and, of
>> course #copy, before starting manipulating its elements.
>>
>
> I agree. removing from another one's collection is not that well behaved !
> It's putting expectations higher than necessary.
>
> I see this as a sort of optimization breaking encapsulation...
> I want to come with a better optimization.
> Unfortunately, historical uses will be found here and there, #keys being a well-known case.
> We have to put pressure on maintainers...
> Who ever takes this path shall be ready to sell the idea, and prove that gains > costs.
>
> A team modifying kernel in uncompatible manner (and we have to, or just stick to Squeak 3.7) shall provide support for a smooth transiiton.
> Sometimes I feel it would be convenient to inquire a code base larger than just an image, so that we can emit proposals and sell the idea rather than just waiting for complaints passively....
>
> Nicolas
>
>>
>>> Nicolas
>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Pharo-project mailing list
>>>> [hidden email]
>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>
>>>
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>
>>
>>
>>
>> --
>> Best regards,
>> Igor Stasenko AKA sig.
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: dictionary value is an array?

Martin McClure-2
In reply to this post by Lukas Renggli
Lukas Renggli wrote:
> It always annoyed me that #keys and #values create a new collection
> object. Form an object-oriented stand-point of view, what would rather
> make sense is to return a collection that delegates to the dictionary.

+1, I've had similar thoughts.

I'd think DictionaryKeys would need to be a read-only collection.
Otherwise, what would be the effect of DictionaryKeys>>add: on the
underlying Dictionary? And even worse effects of trying to add to a
DictionaryValues...

Regards,

-Martin

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: dictionary value is an array?

Martin McClure-2
In reply to this post by Schwab,Wilhelm K
Schwab,Wilhelm K wrote:
> Hello all,
>
> FWIW, Dolphin "lookup tables" use parallel arrays for keys and values.  OA's claim (I do not dispute it, I'm simply reporting what they said) is that the result is faster than a dictionary based on associations.

Parallel arrays should indeed be faster than Associations. The overhead
at add: time of creating a new Association object is an appreciable
fraction of the overall add: time.

But lookups should be even faster with a single array that alternates
keys and values, an approach I often use. Why is this faster? Modern RAM
isn't really "random access"; accesses to locations in the CPU cache can
be up to two orders of magnitude faster than accesses to locations in
main memory. If the key and value are next to each other, they are often
in the same cache line, so the number of reads from main memory is halved.

Regards,

-Martin

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: dictionary value is an array?

Igor Stasenko
In reply to this post by Martin McClure-2
2009/10/31 Martin McClure <[hidden email]>:

> Lukas Renggli wrote:
>> It always annoyed me that #keys and #values create a new collection
>> object. Form an object-oriented stand-point of view, what would rather
>> make sense is to return a collection that delegates to the dictionary.
>
> +1, I've had similar thoughts.
>
> I'd think DictionaryKeys would need to be a read-only collection.
> Otherwise, what would be the effect of DictionaryKeys>>add: on the
> underlying Dictionary? And even worse effects of trying to add to a
> DictionaryValues...
>
Yeah, i talked about this earlier in this topic.
I think that given approach can be generalized to serve for multiple purposes.

Imagine a proxy object, which takes a collection and block with two arguments:

| dictKeys |
dictKeys := CollectionFilter new collection: (Object methodDict)
filter: [:coll :aBlock |
        coll keysDo: [:each | aBlock value: each ]
].
dictKeys asSet

There are many ways how this proxy can be used.
For instance, a #reject: and #select: , in parallel we could have
#rejectedBy: and #selectedBy:
which creating a filter proxy instead of creating new collection:

CollectionFilter class>>collection: aCollection rejectedBy: rejectBlock
  ^ self new collection: aCollection filter: [:coll :aBlock |
      coll do: [:each | (rejectBlock value: each) ifFalse: [ aBlock
value: each ]
  ]]


> Regards,
>
> -Martin
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>



--
Best regards,
Igor Stasenko AKA sig.

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

CollectionFilter.st (788 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: dictionary value is an array?

Nicolas Cellier
2009/10/31 Igor Stasenko <[hidden email]>:

> 2009/10/31 Martin McClure <[hidden email]>:
>> Lukas Renggli wrote:
>>> It always annoyed me that #keys and #values create a new collection
>>> object. Form an object-oriented stand-point of view, what would rather
>>> make sense is to return a collection that delegates to the dictionary.
>>
>> +1, I've had similar thoughts.
>>
>> I'd think DictionaryKeys would need to be a read-only collection.
>> Otherwise, what would be the effect of DictionaryKeys>>add: on the
>> underlying Dictionary? And even worse effects of trying to add to a
>> DictionaryValues...
>>
>
> Yeah, i talked about this earlier in this topic.
> I think that given approach can be generalized to serve for multiple purposes.
>
> Imagine a proxy object, which takes a collection and block with two arguments:
>
> | dictKeys |
> dictKeys := CollectionFilter new collection: (Object methodDict)
> filter: [:coll :aBlock |
>        coll keysDo: [:each | aBlock value: each ]
> ].
> dictKeys asSet
>
> There are many ways how this proxy can be used.
> For instance, a #reject: and #select: , in parallel we could have
> #rejectedBy: and #selectedBy:
> which creating a filter proxy instead of creating new collection:
>
> CollectionFilter class>>collection: aCollection rejectedBy: rejectBlock
>  ^ self new collection: aCollection filter: [:coll :aBlock |
>      coll do: [:each | (rejectBlock value: each) ifFalse: [ aBlock
> value: each ]
>  ]]
>

That also has been generalized to stream.
See Nile or gnu smalltalk for example.

>
>> Regards,
>>
>> -Martin
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>
>
>
> --
> Best regards,
> Igor Stasenko AKA sig.
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: dictionary value is an array?

Stéphane Ducasse
In reply to this post by Nicolas Cellier
what is a parallel array?

> 2009/10/31 Schwab,Wilhelm K <[hidden email]>:
>> Hello all,
>>
>> FWIW, Dolphin "lookup tables" use parallel arrays for keys and  
>> values.  OA's claim (I do not dispute it, I'm simply reporting what  
>> they said) is that the result is faster than a dictionary based on  
>> associations.
>>
>> Bill
>>
>>
>
> Squeak.MethodDictionary is an example of this pattern too
>
>>
>> -----Original Message-----
>> From: [hidden email] [mailto:pharo-
>> [hidden email]] On Behalf Of Nicolas Cellier
>> Sent: Saturday, October 31, 2009 9:23 AM
>> To: [hidden email]
>> Subject: Re: [Pharo-project] dictionary value is an array?
>>
>> 2009/10/31 Igor Stasenko <[hidden email]>:
>>> 2009/10/31 Nicolas Cellier <[hidden email]>:
>>>> 2009/10/31 Stéphane Ducasse <[hidden email]>:
>>>>>
>>>>> Hi nicolas
>>>>>
>>>>> If I remember you proposed fixes to have dictionary values ->  
>>>>> array?
>>>>> Is it correct?
>>>>>
>>>>> I was wondering why?
>>>>> Because a set makes more sense to me since
>>>>>        - the order of the elements in values is not useful  
>>>>> (because
>>>>> they come from a dict)
>>>>>        - the occurrences not really either
>>>>>        - we cannot add to the results so we should create another
>>>>> collection
>>>>>
>>>>> May be I'm totally wrong with the changes but I think that values
>>>>> should be set while keys an array.
>>>>> I would be interested in discussion on that.
>>>>>
>>>>> Stef
>>>>>
>>>>>
>>>>
>>>> Dictionary values is already an Array...
>>>> I proposed to change keys accordingly.
>>>>
>>>> Historically, they were a Bag and a Set because they were  
>>>> unordered.
>>>>
>>>> But very few or no sender expect a behavior specific to a Bag or  
>>>> a Set.
>>>> Most just do: select: collect: asArray asSortedCollection etc...
>>>> That's where an Array outperforms a Set or a Bag.
>>>>
>>>> So we are paying the price of a Set or a Bag almost for nothing...
>>>> And that is why you can see a #fasterKeys.
>>>>
>>>> The drawback is that a few senders will add: to or remove: from  
>>>> keys.
>>>> My estimation is around 5%.
>>>> Inside the image, no problem, it is easy to fix, just write
>>>> (someDictionary keys asSet).
>>>> But that put a load on package maintainers.
>>>>
>>>
>>> +1 Nicolas. Personally, i think that modifying a collection which  
>>> not
>>> constructed by your own code
>>> is bad programming style, because you never know where this  
>>> collection
>>> is came from and you can't be sure that there is no other users of  
>>> it,
>>> which in own turn may modify it at any moment, while often you
>>> assuming that only you own the collection and don't expecting any
>>> modifications to it except in own code.
>>> So, IMO this approach is error prone, and you should never use
>>> #remove: or #add: with collections which constructed by separate
>>> package/layer, and instead, use #asSet, #asOrderedCollection and, of
>>> course #copy, before starting manipulating its elements.
>>>
>>
>> I agree. removing from another one's collection is not that well  
>> behaved !
>> It's putting expectations higher than necessary.
>>
>> I see this as a sort of optimization breaking encapsulation...
>> I want to come with a better optimization.
>> Unfortunately, historical uses will be found here and there, #keys  
>> being a well-known case.
>> We have to put pressure on maintainers...
>> Who ever takes this path shall be ready to sell the idea, and prove  
>> that gains > costs.
>>
>> A team modifying kernel in uncompatible manner (and we have to, or  
>> just stick to Squeak 3.7) shall provide support for a smooth  
>> transiiton.
>> Sometimes I feel it would be convenient to inquire a code base  
>> larger than just an image, so that we can emit proposals and sell  
>> the idea rather than just waiting for complaints passively....
>>
>> Nicolas
>>
>>>
>>>> Nicolas
>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Pharo-project mailing list
>>>>> [hidden email]
>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo- 
>>>>> project
>>>>>
>>>>
>>>> _______________________________________________
>>>> Pharo-project mailing list
>>>> [hidden email]
>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>
>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Igor Stasenko AKA sig.
>>>
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: dictionary value is an array?

Stéphane Ducasse
I got it after sending it :)

On Oct 31, 2009, at 5:31 PM, Stéphane Ducasse wrote:

> what is a parallel array?
>
>> 2009/10/31 Schwab,Wilhelm K <[hidden email]>:
>>> Hello all,
>>>
>>> FWIW, Dolphin "lookup tables" use parallel arrays for keys and
>>> values.  OA's claim (I do not dispute it, I'm simply reporting what
>>> they said) is that the result is faster than a dictionary based on
>>> associations.
>>>
>>> Bill
>>>
>>>
>>
>> Squeak.MethodDictionary is an example of this pattern too
>>
>>>
>>> -----Original Message-----
>>> From: [hidden email] [mailto:pharo-
>>> [hidden email]] On Behalf Of Nicolas Cellier
>>> Sent: Saturday, October 31, 2009 9:23 AM
>>> To: [hidden email]
>>> Subject: Re: [Pharo-project] dictionary value is an array?
>>>
>>> 2009/10/31 Igor Stasenko <[hidden email]>:
>>>> 2009/10/31 Nicolas Cellier <[hidden email]>:
>>>>> 2009/10/31 Stéphane Ducasse <[hidden email]>:
>>>>>>
>>>>>> Hi nicolas
>>>>>>
>>>>>> If I remember you proposed fixes to have dictionary values ->
>>>>>> array?
>>>>>> Is it correct?
>>>>>>
>>>>>> I was wondering why?
>>>>>> Because a set makes more sense to me since
>>>>>>       - the order of the elements in values is not useful
>>>>>> (because
>>>>>> they come from a dict)
>>>>>>       - the occurrences not really either
>>>>>>       - we cannot add to the results so we should create another
>>>>>> collection
>>>>>>
>>>>>> May be I'm totally wrong with the changes but I think that values
>>>>>> should be set while keys an array.
>>>>>> I would be interested in discussion on that.
>>>>>>
>>>>>> Stef
>>>>>>
>>>>>>
>>>>>
>>>>> Dictionary values is already an Array...
>>>>> I proposed to change keys accordingly.
>>>>>
>>>>> Historically, they were a Bag and a Set because they were
>>>>> unordered.
>>>>>
>>>>> But very few or no sender expect a behavior specific to a Bag or
>>>>> a Set.
>>>>> Most just do: select: collect: asArray asSortedCollection etc...
>>>>> That's where an Array outperforms a Set or a Bag.
>>>>>
>>>>> So we are paying the price of a Set or a Bag almost for nothing...
>>>>> And that is why you can see a #fasterKeys.
>>>>>
>>>>> The drawback is that a few senders will add: to or remove: from
>>>>> keys.
>>>>> My estimation is around 5%.
>>>>> Inside the image, no problem, it is easy to fix, just write
>>>>> (someDictionary keys asSet).
>>>>> But that put a load on package maintainers.
>>>>>
>>>>
>>>> +1 Nicolas. Personally, i think that modifying a collection which
>>>> not
>>>> constructed by your own code
>>>> is bad programming style, because you never know where this
>>>> collection
>>>> is came from and you can't be sure that there is no other users of
>>>> it,
>>>> which in own turn may modify it at any moment, while often you
>>>> assuming that only you own the collection and don't expecting any
>>>> modifications to it except in own code.
>>>> So, IMO this approach is error prone, and you should never use
>>>> #remove: or #add: with collections which constructed by separate
>>>> package/layer, and instead, use #asSet, #asOrderedCollection and,  
>>>> of
>>>> course #copy, before starting manipulating its elements.
>>>>
>>>
>>> I agree. removing from another one's collection is not that well
>>> behaved !
>>> It's putting expectations higher than necessary.
>>>
>>> I see this as a sort of optimization breaking encapsulation...
>>> I want to come with a better optimization.
>>> Unfortunately, historical uses will be found here and there, #keys
>>> being a well-known case.
>>> We have to put pressure on maintainers...
>>> Who ever takes this path shall be ready to sell the idea, and prove
>>> that gains > costs.
>>>
>>> A team modifying kernel in uncompatible manner (and we have to, or
>>> just stick to Squeak 3.7) shall provide support for a smooth
>>> transiiton.
>>> Sometimes I feel it would be convenient to inquire a code base
>>> larger than just an image, so that we can emit proposals and sell
>>> the idea rather than just waiting for complaints passively....
>>>
>>> Nicolas
>>>
>>>>
>>>>> Nicolas
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Pharo-project mailing list
>>>>>> [hidden email]
>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-
>>>>>> project
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Pharo-project mailing list
>>>>> [hidden email]
>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo- 
>>>>> project
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Igor Stasenko AKA sig.
>>>>
>>>> _______________________________________________
>>>> Pharo-project mailing list
>>>> [hidden email]
>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: dictionary value is an array?

Andres Valloud-4
In reply to this post by Lukas Renggli
Watch out, because code like this may or may not work:

someDictionary keys do:
  [:each | each isNotInteresting ifTrue: [someDictionary removeKey: each]]

Thus, I think keys should answer a new collection.  If the intent is to
iterate over the keys, then I'd send the message directly to the dictionary.

Andres.


Lukas Renggli wrote:

> It always annoyed me that #keys and #values create a new collection
> object. Form an object-oriented stand-point of view, what would rather
> make sense is to return a collection that delegates to the dictionary.
>
> For example, DictionaryKeys would be a subclass of Collection, behave
> like a Set and be implemented along ...
>
>     Dictionary>>keys
>         ^ DictionaryKeys on: self
>
>     DictionaryKeys>>do: aBlock
>         dictionary keysDo: aBlock
>
>     DictionaryKeys>>includes: anObject
>         ^ dictionary includesKey: anObject
>
>     DictionaryKeys>>species
>         ^ Set
>
> I bet that this would not only result in a major performance
> improvement, but also greatly reduce memory consumption. Furthermore
> it would solve the problem that #keys suddenly has totally different
> performance characteristics as we decide to return an Array instead of
> a Set. For example, it wouldn't matter anymore if you wrote
>
>     aDictionary includesKey: #foo
>
> or
>
>     aDictionary keys includes: #foo
>
> I see only obvious problem for such an approach and this is when the
> dictionary changes and the keys are stored somewhere. Some users might
> not expect the keys to change too.
>
> Cheers,
> Lukas
>
>  

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project