Hi nicolas If I remember you proposed fixes to have dictionary values -> array? Is it correct? I was wondering why? Because a set makes more sense to me since - the order of the elements in values is not useful (because they come from a dict) - the occurrences not really either - we cannot add to the results so we should create another collection May be I'm totally wrong with the changes but I think that values should be set while keys an array. I would be interested in discussion on that. Stef _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
2009/10/31 Stéphane Ducasse <[hidden email]>:
> > Hi nicolas > > If I remember you proposed fixes to have dictionary values -> array? > Is it correct? > > I was wondering why? > Because a set makes more sense to me since > - the order of the elements in values is not useful (because they > come from a dict) > - the occurrences not really either > - we cannot add to the results so we should create another collection > > May be I'm totally wrong with the changes but I think that values > should be set while keys an array. > I would be interested in discussion on that. > > Stef > > Dictionary values is already an Array... I proposed to change keys accordingly. Historically, they were a Bag and a Set because they were unordered. But very few or no sender expect a behavior specific to a Bag or a Set. Most just do: select: collect: asArray asSortedCollection etc... That's where an Array outperforms a Set or a Bag. So we are paying the price of a Set or a Bag almost for nothing... And that is why you can see a #fasterKeys. The drawback is that a few senders will add: to or remove: from keys. My estimation is around 5%. Inside the image, no problem, it is easy to fix, just write (someDictionary keys asSet). But that put a load on package maintainers. Nicolas > > > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
2009/10/31 Nicolas Cellier <[hidden email]>:
> 2009/10/31 Stéphane Ducasse <[hidden email]>: >> >> Hi nicolas >> >> If I remember you proposed fixes to have dictionary values -> array? >> Is it correct? >> >> I was wondering why? >> Because a set makes more sense to me since >> - the order of the elements in values is not useful (because they >> come from a dict) >> - the occurrences not really either >> - we cannot add to the results so we should create another collection >> >> May be I'm totally wrong with the changes but I think that values >> should be set while keys an array. >> I would be interested in discussion on that. >> >> Stef >> >> > > Dictionary values is already an Array... > I proposed to change keys accordingly. > > Historically, they were a Bag and a Set because they were unordered. > > But very few or no sender expect a behavior specific to a Bag or a Set. > Most just do: select: collect: asArray asSortedCollection etc... > That's where an Array outperforms a Set or a Bag. > > So we are paying the price of a Set or a Bag almost for nothing... > And that is why you can see a #fasterKeys. > > The drawback is that a few senders will add: to or remove: from keys. > My estimation is around 5%. > Inside the image, no problem, it is easy to fix, just write > (someDictionary keys asSet). > But that put a load on package maintainers. > +1 Nicolas. Personally, i think that modifying a collection which not constructed by your own code is bad programming style, because you never know where this collection is came from and you can't be sure that there is no other users of it, which in own turn may modify it at any moment, while often you assuming that only you own the collection and don't expecting any modifications to it except in own code. So, IMO this approach is error prone, and you should never use #remove: or #add: with collections which constructed by separate package/layer, and instead, use #asSet, #asOrderedCollection and, of course #copy, before starting manipulating its elements. > Nicolas > >> >> >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > -- Best regards, Igor Stasenko AKA sig. _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
2009/10/31 Igor Stasenko <[hidden email]>:
> 2009/10/31 Nicolas Cellier <[hidden email]>: >> 2009/10/31 Stéphane Ducasse <[hidden email]>: >>> >>> Hi nicolas >>> >>> If I remember you proposed fixes to have dictionary values -> array? >>> Is it correct? >>> >>> I was wondering why? >>> Because a set makes more sense to me since >>> - the order of the elements in values is not useful (because they >>> come from a dict) >>> - the occurrences not really either >>> - we cannot add to the results so we should create another collection >>> >>> May be I'm totally wrong with the changes but I think that values >>> should be set while keys an array. >>> I would be interested in discussion on that. >>> >>> Stef >>> >>> >> >> Dictionary values is already an Array... >> I proposed to change keys accordingly. >> >> Historically, they were a Bag and a Set because they were unordered. >> >> But very few or no sender expect a behavior specific to a Bag or a Set. >> Most just do: select: collect: asArray asSortedCollection etc... >> That's where an Array outperforms a Set or a Bag. >> >> So we are paying the price of a Set or a Bag almost for nothing... >> And that is why you can see a #fasterKeys. >> >> The drawback is that a few senders will add: to or remove: from keys. >> My estimation is around 5%. >> Inside the image, no problem, it is easy to fix, just write >> (someDictionary keys asSet). >> But that put a load on package maintainers. >> > > +1 Nicolas. Personally, i think that modifying a collection which not > constructed by your own code > is bad programming style, because you never know where this collection > is came from and you can't be sure that there is no other users of it, > which in own turn may modify it at any moment, while often you > assuming that only you own the collection and don't expecting any > modifications to it except in own code. > So, IMO this approach is error prone, and you should never use > #remove: or #add: with collections which constructed by separate > package/layer, and instead, use #asSet, #asOrderedCollection and, of > course #copy, before starting manipulating its elements. > I agree. removing from another one's collection is not that well behaved ! It's putting expectations higher than necessary. I see this as a sort of optimization breaking encapsulation... I want to come with a better optimization. Unfortunately, historical uses will be found here and there, #keys being a well-known case. We have to put pressure on maintainers... Who ever takes this path shall be ready to sell the idea, and prove that gains > costs. A team modifying kernel in uncompatible manner (and we have to, or just stick to Squeak 3.7) shall provide support for a smooth transiiton. Sometimes I feel it would be convenient to inquire a code base larger than just an image, so that we can emit proposals and sell the idea rather than just waiting for complaints passively.... Nicolas > >> Nicolas >> >>> >>> >>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> > > > > -- > Best regards, > Igor Stasenko AKA sig. > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
It always annoyed me that #keys and #values create a new collection
object. Form an object-oriented stand-point of view, what would rather make sense is to return a collection that delegates to the dictionary. For example, DictionaryKeys would be a subclass of Collection, behave like a Set and be implemented along ... Dictionary>>keys ^ DictionaryKeys on: self DictionaryKeys>>do: aBlock dictionary keysDo: aBlock DictionaryKeys>>includes: anObject ^ dictionary includesKey: anObject DictionaryKeys>>species ^ Set I bet that this would not only result in a major performance improvement, but also greatly reduce memory consumption. Furthermore it would solve the problem that #keys suddenly has totally different performance characteristics as we decide to return an Array instead of a Set. For example, it wouldn't matter anymore if you wrote aDictionary includesKey: #foo or aDictionary keys includes: #foo I see only obvious problem for such an approach and this is when the dictionary changes and the keys are stored somewhere. Some users might not expect the keys to change too. Cheers, Lukas -- Lukas Renggli http://www.lukas-renggli.ch _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Nicolas Cellier
Hello all,
FWIW, Dolphin "lookup tables" use parallel arrays for keys and values. OA's claim (I do not dispute it, I'm simply reporting what they said) is that the result is faster than a dictionary based on associations. Bill -----Original Message----- From: [hidden email] [mailto:[hidden email]] On Behalf Of Nicolas Cellier Sent: Saturday, October 31, 2009 9:23 AM To: [hidden email] Subject: Re: [Pharo-project] dictionary value is an array? 2009/10/31 Igor Stasenko <[hidden email]>: > 2009/10/31 Nicolas Cellier <[hidden email]>: >> 2009/10/31 Stéphane Ducasse <[hidden email]>: >>> >>> Hi nicolas >>> >>> If I remember you proposed fixes to have dictionary values -> array? >>> Is it correct? >>> >>> I was wondering why? >>> Because a set makes more sense to me since >>> - the order of the elements in values is not useful (because >>> they come from a dict) >>> - the occurrences not really either >>> - we cannot add to the results so we should create another >>> collection >>> >>> May be I'm totally wrong with the changes but I think that values >>> should be set while keys an array. >>> I would be interested in discussion on that. >>> >>> Stef >>> >>> >> >> Dictionary values is already an Array... >> I proposed to change keys accordingly. >> >> Historically, they were a Bag and a Set because they were unordered. >> >> But very few or no sender expect a behavior specific to a Bag or a Set. >> Most just do: select: collect: asArray asSortedCollection etc... >> That's where an Array outperforms a Set or a Bag. >> >> So we are paying the price of a Set or a Bag almost for nothing... >> And that is why you can see a #fasterKeys. >> >> The drawback is that a few senders will add: to or remove: from keys. >> My estimation is around 5%. >> Inside the image, no problem, it is easy to fix, just write >> (someDictionary keys asSet). >> But that put a load on package maintainers. >> > > +1 Nicolas. Personally, i think that modifying a collection which not > constructed by your own code > is bad programming style, because you never know where this collection > is came from and you can't be sure that there is no other users of it, > which in own turn may modify it at any moment, while often you > assuming that only you own the collection and don't expecting any > modifications to it except in own code. > So, IMO this approach is error prone, and you should never use > #remove: or #add: with collections which constructed by separate > package/layer, and instead, use #asSet, #asOrderedCollection and, of > course #copy, before starting manipulating its elements. > I agree. removing from another one's collection is not that well behaved ! It's putting expectations higher than necessary. I see this as a sort of optimization breaking encapsulation... I want to come with a better optimization. Unfortunately, historical uses will be found here and there, #keys being a well-known case. We have to put pressure on maintainers... Who ever takes this path shall be ready to sell the idea, and prove that gains > costs. A team modifying kernel in uncompatible manner (and we have to, or just stick to Squeak 3.7) shall provide support for a smooth transiiton. Sometimes I feel it would be convenient to inquire a code base larger than just an image, so that we can emit proposals and sell the idea rather than just waiting for complaints passively.... Nicolas > >> Nicolas >> >>> >>> >>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> > > > > -- > Best regards, > Igor Stasenko AKA sig. > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Lukas Renggli
2009/10/31 Lukas Renggli <[hidden email]>:
> It always annoyed me that #keys and #values create a new collection > object. Form an object-oriented stand-point of view, what would rather > make sense is to return a collection that delegates to the dictionary. > > For example, DictionaryKeys would be a subclass of Collection, behave > like a Set and be implemented along ... > > Dictionary>>keys > ^ DictionaryKeys on: self > > DictionaryKeys>>do: aBlock > dictionary keysDo: aBlock > > DictionaryKeys>>includes: anObject > ^ dictionary includesKey: anObject > > DictionaryKeys>>species > ^ Set > > I bet that this would not only result in a major performance > improvement, but also greatly reduce memory consumption. Furthermore > it would solve the problem that #keys suddenly has totally different > performance characteristics as we decide to return an Array instead of > a Set. For example, it wouldn't matter anymore if you wrote > > aDictionary includesKey: #foo > > or > > aDictionary keys includes: #foo > Yes, the proxy/wrapper pattern can be an elegant and efficient solution (a good example is ublas from BOOST library). It depends how many time you will iterate the collection though. Iterating a Set is often +2x the cost of iterating an Array. > I see only obvious problem for such an approach and this is when the > dictionary changes and the keys are stored somewhere. Some users might > not expect the keys to change too. > Yes, an immutable proxy with copy on write behaviour is challenging to implement... > Cheers, > Lukas > > -- > Lukas Renggli > http://www.lukas-renggli.ch > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Schwab,Wilhelm K
2009/10/31 Schwab,Wilhelm K <[hidden email]>:
> Hello all, > > FWIW, Dolphin "lookup tables" use parallel arrays for keys and values. OA's claim (I do not dispute it, I'm simply reporting what they said) is that the result is faster than a dictionary based on associations. > > Bill > > Squeak.MethodDictionary is an example of this pattern too > > -----Original Message----- > From: [hidden email] [mailto:[hidden email]] On Behalf Of Nicolas Cellier > Sent: Saturday, October 31, 2009 9:23 AM > To: [hidden email] > Subject: Re: [Pharo-project] dictionary value is an array? > > 2009/10/31 Igor Stasenko <[hidden email]>: >> 2009/10/31 Nicolas Cellier <[hidden email]>: >>> 2009/10/31 Stéphane Ducasse <[hidden email]>: >>>> >>>> Hi nicolas >>>> >>>> If I remember you proposed fixes to have dictionary values -> array? >>>> Is it correct? >>>> >>>> I was wondering why? >>>> Because a set makes more sense to me since >>>> - the order of the elements in values is not useful (because >>>> they come from a dict) >>>> - the occurrences not really either >>>> - we cannot add to the results so we should create another >>>> collection >>>> >>>> May be I'm totally wrong with the changes but I think that values >>>> should be set while keys an array. >>>> I would be interested in discussion on that. >>>> >>>> Stef >>>> >>>> >>> >>> Dictionary values is already an Array... >>> I proposed to change keys accordingly. >>> >>> Historically, they were a Bag and a Set because they were unordered. >>> >>> But very few or no sender expect a behavior specific to a Bag or a Set. >>> Most just do: select: collect: asArray asSortedCollection etc... >>> That's where an Array outperforms a Set or a Bag. >>> >>> So we are paying the price of a Set or a Bag almost for nothing... >>> And that is why you can see a #fasterKeys. >>> >>> The drawback is that a few senders will add: to or remove: from keys. >>> My estimation is around 5%. >>> Inside the image, no problem, it is easy to fix, just write >>> (someDictionary keys asSet). >>> But that put a load on package maintainers. >>> >> >> +1 Nicolas. Personally, i think that modifying a collection which not >> constructed by your own code >> is bad programming style, because you never know where this collection >> is came from and you can't be sure that there is no other users of it, >> which in own turn may modify it at any moment, while often you >> assuming that only you own the collection and don't expecting any >> modifications to it except in own code. >> So, IMO this approach is error prone, and you should never use >> #remove: or #add: with collections which constructed by separate >> package/layer, and instead, use #asSet, #asOrderedCollection and, of >> course #copy, before starting manipulating its elements. >> > > I agree. removing from another one's collection is not that well behaved ! > It's putting expectations higher than necessary. > > I see this as a sort of optimization breaking encapsulation... > I want to come with a better optimization. > Unfortunately, historical uses will be found here and there, #keys being a well-known case. > We have to put pressure on maintainers... > Who ever takes this path shall be ready to sell the idea, and prove that gains > costs. > > A team modifying kernel in uncompatible manner (and we have to, or just stick to Squeak 3.7) shall provide support for a smooth transiiton. > Sometimes I feel it would be convenient to inquire a code base larger than just an image, so that we can emit proposals and sell the idea rather than just waiting for complaints passively.... > > Nicolas > >> >>> Nicolas >>> >>>> >>>> >>>> >>>> _______________________________________________ >>>> Pharo-project mailing list >>>> [hidden email] >>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>> >>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >> >> >> >> -- >> Best regards, >> Igor Stasenko AKA sig. >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Lukas Renggli
Lukas Renggli wrote:
> It always annoyed me that #keys and #values create a new collection > object. Form an object-oriented stand-point of view, what would rather > make sense is to return a collection that delegates to the dictionary. +1, I've had similar thoughts. I'd think DictionaryKeys would need to be a read-only collection. Otherwise, what would be the effect of DictionaryKeys>>add: on the underlying Dictionary? And even worse effects of trying to add to a DictionaryValues... Regards, -Martin _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Schwab,Wilhelm K
Schwab,Wilhelm K wrote:
> Hello all, > > FWIW, Dolphin "lookup tables" use parallel arrays for keys and values. OA's claim (I do not dispute it, I'm simply reporting what they said) is that the result is faster than a dictionary based on associations. Parallel arrays should indeed be faster than Associations. The overhead at add: time of creating a new Association object is an appreciable fraction of the overall add: time. But lookups should be even faster with a single array that alternates keys and values, an approach I often use. Why is this faster? Modern RAM isn't really "random access"; accesses to locations in the CPU cache can be up to two orders of magnitude faster than accesses to locations in main memory. If the key and value are next to each other, they are often in the same cache line, so the number of reads from main memory is halved. Regards, -Martin _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Martin McClure-2
2009/10/31 Martin McClure <[hidden email]>:
> Lukas Renggli wrote: >> It always annoyed me that #keys and #values create a new collection >> object. Form an object-oriented stand-point of view, what would rather >> make sense is to return a collection that delegates to the dictionary. > > +1, I've had similar thoughts. > > I'd think DictionaryKeys would need to be a read-only collection. > Otherwise, what would be the effect of DictionaryKeys>>add: on the > underlying Dictionary? And even worse effects of trying to add to a > DictionaryValues... > I think that given approach can be generalized to serve for multiple purposes. Imagine a proxy object, which takes a collection and block with two arguments: | dictKeys | dictKeys := CollectionFilter new collection: (Object methodDict) filter: [:coll :aBlock | coll keysDo: [:each | aBlock value: each ] ]. dictKeys asSet There are many ways how this proxy can be used. For instance, a #reject: and #select: , in parallel we could have #rejectedBy: and #selectedBy: which creating a filter proxy instead of creating new collection: CollectionFilter class>>collection: aCollection rejectedBy: rejectBlock ^ self new collection: aCollection filter: [:coll :aBlock | coll do: [:each | (rejectBlock value: each) ifFalse: [ aBlock value: each ] ]] > Regards, > > -Martin > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > -- Best regards, Igor Stasenko AKA sig. _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project CollectionFilter.st (788 bytes) Download Attachment |
2009/10/31 Igor Stasenko <[hidden email]>:
> 2009/10/31 Martin McClure <[hidden email]>: >> Lukas Renggli wrote: >>> It always annoyed me that #keys and #values create a new collection >>> object. Form an object-oriented stand-point of view, what would rather >>> make sense is to return a collection that delegates to the dictionary. >> >> +1, I've had similar thoughts. >> >> I'd think DictionaryKeys would need to be a read-only collection. >> Otherwise, what would be the effect of DictionaryKeys>>add: on the >> underlying Dictionary? And even worse effects of trying to add to a >> DictionaryValues... >> > > Yeah, i talked about this earlier in this topic. > I think that given approach can be generalized to serve for multiple purposes. > > Imagine a proxy object, which takes a collection and block with two arguments: > > | dictKeys | > dictKeys := CollectionFilter new collection: (Object methodDict) > filter: [:coll :aBlock | > coll keysDo: [:each | aBlock value: each ] > ]. > dictKeys asSet > > There are many ways how this proxy can be used. > For instance, a #reject: and #select: , in parallel we could have > #rejectedBy: and #selectedBy: > which creating a filter proxy instead of creating new collection: > > CollectionFilter class>>collection: aCollection rejectedBy: rejectBlock > ^ self new collection: aCollection filter: [:coll :aBlock | > coll do: [:each | (rejectBlock value: each) ifFalse: [ aBlock > value: each ] > ]] > That also has been generalized to stream. See Nile or gnu smalltalk for example. > >> Regards, >> >> -Martin >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> > > > > -- > Best regards, > Igor Stasenko AKA sig. > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Nicolas Cellier
what is a parallel array?
> 2009/10/31 Schwab,Wilhelm K <[hidden email]>: >> Hello all, >> >> FWIW, Dolphin "lookup tables" use parallel arrays for keys and >> values. OA's claim (I do not dispute it, I'm simply reporting what >> they said) is that the result is faster than a dictionary based on >> associations. >> >> Bill >> >> > > Squeak.MethodDictionary is an example of this pattern too > >> >> -----Original Message----- >> From: [hidden email] [mailto:pharo- >> [hidden email]] On Behalf Of Nicolas Cellier >> Sent: Saturday, October 31, 2009 9:23 AM >> To: [hidden email] >> Subject: Re: [Pharo-project] dictionary value is an array? >> >> 2009/10/31 Igor Stasenko <[hidden email]>: >>> 2009/10/31 Nicolas Cellier <[hidden email]>: >>>> 2009/10/31 Stéphane Ducasse <[hidden email]>: >>>>> >>>>> Hi nicolas >>>>> >>>>> If I remember you proposed fixes to have dictionary values -> >>>>> array? >>>>> Is it correct? >>>>> >>>>> I was wondering why? >>>>> Because a set makes more sense to me since >>>>> - the order of the elements in values is not useful >>>>> (because >>>>> they come from a dict) >>>>> - the occurrences not really either >>>>> - we cannot add to the results so we should create another >>>>> collection >>>>> >>>>> May be I'm totally wrong with the changes but I think that values >>>>> should be set while keys an array. >>>>> I would be interested in discussion on that. >>>>> >>>>> Stef >>>>> >>>>> >>>> >>>> Dictionary values is already an Array... >>>> I proposed to change keys accordingly. >>>> >>>> Historically, they were a Bag and a Set because they were >>>> unordered. >>>> >>>> But very few or no sender expect a behavior specific to a Bag or >>>> a Set. >>>> Most just do: select: collect: asArray asSortedCollection etc... >>>> That's where an Array outperforms a Set or a Bag. >>>> >>>> So we are paying the price of a Set or a Bag almost for nothing... >>>> And that is why you can see a #fasterKeys. >>>> >>>> The drawback is that a few senders will add: to or remove: from >>>> keys. >>>> My estimation is around 5%. >>>> Inside the image, no problem, it is easy to fix, just write >>>> (someDictionary keys asSet). >>>> But that put a load on package maintainers. >>>> >>> >>> +1 Nicolas. Personally, i think that modifying a collection which >>> not >>> constructed by your own code >>> is bad programming style, because you never know where this >>> collection >>> is came from and you can't be sure that there is no other users of >>> it, >>> which in own turn may modify it at any moment, while often you >>> assuming that only you own the collection and don't expecting any >>> modifications to it except in own code. >>> So, IMO this approach is error prone, and you should never use >>> #remove: or #add: with collections which constructed by separate >>> package/layer, and instead, use #asSet, #asOrderedCollection and, of >>> course #copy, before starting manipulating its elements. >>> >> >> I agree. removing from another one's collection is not that well >> behaved ! >> It's putting expectations higher than necessary. >> >> I see this as a sort of optimization breaking encapsulation... >> I want to come with a better optimization. >> Unfortunately, historical uses will be found here and there, #keys >> being a well-known case. >> We have to put pressure on maintainers... >> Who ever takes this path shall be ready to sell the idea, and prove >> that gains > costs. >> >> A team modifying kernel in uncompatible manner (and we have to, or >> just stick to Squeak 3.7) shall provide support for a smooth >> transiiton. >> Sometimes I feel it would be convenient to inquire a code base >> larger than just an image, so that we can emit proposals and sell >> the idea rather than just waiting for complaints passively.... >> >> Nicolas >> >>> >>>> Nicolas >>>> >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Pharo-project mailing list >>>>> [hidden email] >>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo- >>>>> project >>>>> >>>> >>>> _______________________________________________ >>>> Pharo-project mailing list >>>> [hidden email] >>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>> >>> >>> >>> >>> -- >>> Best regards, >>> Igor Stasenko AKA sig. >>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
I got it after sending it :)
On Oct 31, 2009, at 5:31 PM, Stéphane Ducasse wrote: > what is a parallel array? > >> 2009/10/31 Schwab,Wilhelm K <[hidden email]>: >>> Hello all, >>> >>> FWIW, Dolphin "lookup tables" use parallel arrays for keys and >>> values. OA's claim (I do not dispute it, I'm simply reporting what >>> they said) is that the result is faster than a dictionary based on >>> associations. >>> >>> Bill >>> >>> >> >> Squeak.MethodDictionary is an example of this pattern too >> >>> >>> -----Original Message----- >>> From: [hidden email] [mailto:pharo- >>> [hidden email]] On Behalf Of Nicolas Cellier >>> Sent: Saturday, October 31, 2009 9:23 AM >>> To: [hidden email] >>> Subject: Re: [Pharo-project] dictionary value is an array? >>> >>> 2009/10/31 Igor Stasenko <[hidden email]>: >>>> 2009/10/31 Nicolas Cellier <[hidden email]>: >>>>> 2009/10/31 Stéphane Ducasse <[hidden email]>: >>>>>> >>>>>> Hi nicolas >>>>>> >>>>>> If I remember you proposed fixes to have dictionary values -> >>>>>> array? >>>>>> Is it correct? >>>>>> >>>>>> I was wondering why? >>>>>> Because a set makes more sense to me since >>>>>> - the order of the elements in values is not useful >>>>>> (because >>>>>> they come from a dict) >>>>>> - the occurrences not really either >>>>>> - we cannot add to the results so we should create another >>>>>> collection >>>>>> >>>>>> May be I'm totally wrong with the changes but I think that values >>>>>> should be set while keys an array. >>>>>> I would be interested in discussion on that. >>>>>> >>>>>> Stef >>>>>> >>>>>> >>>>> >>>>> Dictionary values is already an Array... >>>>> I proposed to change keys accordingly. >>>>> >>>>> Historically, they were a Bag and a Set because they were >>>>> unordered. >>>>> >>>>> But very few or no sender expect a behavior specific to a Bag or >>>>> a Set. >>>>> Most just do: select: collect: asArray asSortedCollection etc... >>>>> That's where an Array outperforms a Set or a Bag. >>>>> >>>>> So we are paying the price of a Set or a Bag almost for nothing... >>>>> And that is why you can see a #fasterKeys. >>>>> >>>>> The drawback is that a few senders will add: to or remove: from >>>>> keys. >>>>> My estimation is around 5%. >>>>> Inside the image, no problem, it is easy to fix, just write >>>>> (someDictionary keys asSet). >>>>> But that put a load on package maintainers. >>>>> >>>> >>>> +1 Nicolas. Personally, i think that modifying a collection which >>>> not >>>> constructed by your own code >>>> is bad programming style, because you never know where this >>>> collection >>>> is came from and you can't be sure that there is no other users of >>>> it, >>>> which in own turn may modify it at any moment, while often you >>>> assuming that only you own the collection and don't expecting any >>>> modifications to it except in own code. >>>> So, IMO this approach is error prone, and you should never use >>>> #remove: or #add: with collections which constructed by separate >>>> package/layer, and instead, use #asSet, #asOrderedCollection and, >>>> of >>>> course #copy, before starting manipulating its elements. >>>> >>> >>> I agree. removing from another one's collection is not that well >>> behaved ! >>> It's putting expectations higher than necessary. >>> >>> I see this as a sort of optimization breaking encapsulation... >>> I want to come with a better optimization. >>> Unfortunately, historical uses will be found here and there, #keys >>> being a well-known case. >>> We have to put pressure on maintainers... >>> Who ever takes this path shall be ready to sell the idea, and prove >>> that gains > costs. >>> >>> A team modifying kernel in uncompatible manner (and we have to, or >>> just stick to Squeak 3.7) shall provide support for a smooth >>> transiiton. >>> Sometimes I feel it would be convenient to inquire a code base >>> larger than just an image, so that we can emit proposals and sell >>> the idea rather than just waiting for complaints passively.... >>> >>> Nicolas >>> >>>> >>>>> Nicolas >>>>> >>>>>> >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> Pharo-project mailing list >>>>>> [hidden email] >>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo- >>>>>> project >>>>>> >>>>> >>>>> _______________________________________________ >>>>> Pharo-project mailing list >>>>> [hidden email] >>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo- >>>>> project >>>>> >>>> >>>> >>>> >>>> -- >>>> Best regards, >>>> Igor Stasenko AKA sig. >>>> >>>> _______________________________________________ >>>> Pharo-project mailing list >>>> [hidden email] >>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Lukas Renggli
Watch out, because code like this may or may not work:
someDictionary keys do: [:each | each isNotInteresting ifTrue: [someDictionary removeKey: each]] Thus, I think keys should answer a new collection. If the intent is to iterate over the keys, then I'd send the message directly to the dictionary. Andres. Lukas Renggli wrote: > It always annoyed me that #keys and #values create a new collection > object. Form an object-oriented stand-point of view, what would rather > make sense is to return a collection that delegates to the dictionary. > > For example, DictionaryKeys would be a subclass of Collection, behave > like a Set and be implemented along ... > > Dictionary>>keys > ^ DictionaryKeys on: self > > DictionaryKeys>>do: aBlock > dictionary keysDo: aBlock > > DictionaryKeys>>includes: anObject > ^ dictionary includesKey: anObject > > DictionaryKeys>>species > ^ Set > > I bet that this would not only result in a major performance > improvement, but also greatly reduce memory consumption. Furthermore > it would solve the problem that #keys suddenly has totally different > performance characteristics as we decide to return an Array instead of > a Set. For example, it wouldn't matter anymore if you wrote > > aDictionary includesKey: #foo > > or > > aDictionary keys includes: #foo > > I see only obvious problem for such an approach and this is when the > dictionary changes and the keys are stored somewhere. Some users might > not expect the keys to change too. > > Cheers, > Lukas > > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
Free forum by Nabble | Edit this page |