Levente Uzonyi uploaded a new version of Collections to project The Trunk:
http://source.squeak.org/trunk/Collections-ul.915.mcz ==================== Summary ==================== Name: Collections-ul.915 Author: ul Time: 5 October 2020, 12:30:33.597525 am UUID: 93341c2a-ebb2-4d2b-abee-e3f42f509836 Ancestors: Collections-eem.913 HashedCollection changes: - make #capacity return the actual capacity of the collection instead of the size of the internal array. This change is obviously not backwards compatible. - improve the performance of #isEmpty when tally is 0 OrderedDictionary changes: - make it a subclass of PluggableDictionary. This lets one create e.g. an ordered identity dictionary without creating a subclass with duplicated behavior - simplify #initialize and #growTo: now that #capacity is accurate =============== Diff against Collections-eem.913 =============== Item was changed: ----- Method: HashedCollection>>capacity (in category 'accessing') ----- capacity + "Answer the current capacity of the receiver - aka the number of elements the receiver can hold without growing." - "Answer the current capacity of the receiver." + ^ array size * 3 // 4! - ^ array size! Item was changed: ----- Method: HashedCollection>>isEmpty (in category 'testing') ----- isEmpty "For non-weak collections, we can use the tally to speed up the empty check. For weak collections, we must use the traditional way because the tally is unreliable. Also see #size vs. #slowSize." + tally = 0 ifTrue: [ ^true ]. + ^array class isWeak and: [ super isEmpty ]! - ^ array class isWeak - ifFalse: [ tally = 0 ] - ifTrue: [ super isEmpty ]! Item was changed: ----- Method: HashedCollection>>removeAll (in category 'removing') ----- removeAll "remove all elements from this collection. Preserve the capacity" + self initialize: array size! - self initialize: self capacity! Item was changed: + PluggableDictionary subclass: #OrderedDictionary - Dictionary subclass: #OrderedDictionary instanceVariableNames: 'order' classVariableNames: '' poolDictionaries: '' category: 'Collections-Sequenceable'! !OrderedDictionary commentStamp: 'mt 1/16/2015 10:42' prior: 0! I am an ordered dictionary. I have an additional index (called 'order') to keep track of the insertion order of my associations. The read access is not affected by the additional index. The index is updated in O(1) [time] when inserting new keys. For present keys, that insertion involves actions in O(n) to move the respective element to the end of the order. The growth operation compacts the index and takes O(n) additional time. NOTE: This is still no instance of SequenceableCollection. Having this, some protocols are missing and may require working on #associations, which is an Array and thus sequenceable.! Item was changed: ----- Method: OrderedDictionary>>growTo: (in category 'private') ----- growTo: anInteger - | oldOrder | super growTo: anInteger. + order := order grownBy: self capacity - order size! - oldOrder := order. - "Grow only to 75%. See #atNewIndex:put: in HashedCollection." - order := self arrayType new: anInteger + 1 * 3 // 4. - order - replaceFrom: 1 - to: tally - with: oldOrder - startingAt: 1! Item was changed: ----- Method: OrderedDictionary>>initialize: (in category 'private') ----- initialize: n super initialize: n. + order := self arrayType new: self capacity! - order := self arrayType new: n + 1 * 3 // 4! |
> OrderedDictionary changes: > - make it a subclass of PluggableDictionary. This lets one create e.g. an ordered identity dictionary without creating a subclass with duplicated behavior
Just a stupid question: Why do we have separate Pluggable/IdentityDictionary subclasses of Dictionary at all and did not directly implement Dictionary with a pluggable block? Is this a historic decision or an optimization thing? :-)
Best,
Christoph
Von: Squeak-dev <[hidden email]> im Auftrag von [hidden email] <[hidden email]>
Gesendet: Montag, 5. Oktober 2020 00:31:44 An: [hidden email]; [hidden email] Betreff: [squeak-dev] The Trunk: Collections-ul.915.mcz Levente Uzonyi uploaded a new version of Collections to project The Trunk:
http://source.squeak.org/trunk/Collections-ul.915.mcz ==================== Summary ==================== Name: Collections-ul.915 Author: ul Time: 5 October 2020, 12:30:33.597525 am UUID: 93341c2a-ebb2-4d2b-abee-e3f42f509836 Ancestors: Collections-eem.913 HashedCollection changes: - make #capacity return the actual capacity of the collection instead of the size of the internal array. This change is obviously not backwards compatible. - improve the performance of #isEmpty when tally is 0 OrderedDictionary changes: - make it a subclass of PluggableDictionary. This lets one create e.g. an ordered identity dictionary without creating a subclass with duplicated behavior - simplify #initialize and #growTo: now that #capacity is accurate =============== Diff against Collections-eem.913 =============== Item was changed: ----- Method: HashedCollection>>capacity (in category 'accessing') ----- capacity + "Answer the current capacity of the receiver - aka the number of elements the receiver can hold without growing." - "Answer the current capacity of the receiver." + ^ array size * 3 // 4! - ^ array size! Item was changed: ----- Method: HashedCollection>>isEmpty (in category 'testing') ----- isEmpty "For non-weak collections, we can use the tally to speed up the empty check. For weak collections, we must use the traditional way because the tally is unreliable. Also see #size vs. #slowSize." + tally = 0 ifTrue: [ ^true ]. + ^array class isWeak and: [ super isEmpty ]! - ^ array class isWeak - ifFalse: [ tally = 0 ] - ifTrue: [ super isEmpty ]! Item was changed: ----- Method: HashedCollection>>removeAll (in category 'removing') ----- removeAll "remove all elements from this collection. Preserve the capacity" + self initialize: array size! - self initialize: self capacity! Item was changed: + PluggableDictionary subclass: #OrderedDictionary - Dictionary subclass: #OrderedDictionary instanceVariableNames: 'order' classVariableNames: '' poolDictionaries: '' category: 'Collections-Sequenceable'! !OrderedDictionary commentStamp: 'mt 1/16/2015 10:42' prior: 0! I am an ordered dictionary. I have an additional index (called 'order') to keep track of the insertion order of my associations. The read access is not affected by the additional index. The index is updated in O(1) [time] when inserting new keys. For present keys, that insertion involves actions in O(n) to move the respective element to the end of the order. The growth operation compacts the index and takes O(n) additional time. NOTE: This is still no instance of SequenceableCollection. Having this, some protocols are missing and may require working on #associations, which is an Array and thus sequenceable.! Item was changed: ----- Method: OrderedDictionary>>growTo: (in category 'private') ----- growTo: anInteger - | oldOrder | super growTo: anInteger. + order := order grownBy: self capacity - order size! - oldOrder := order. - "Grow only to 75%. See #atNewIndex:put: in HashedCollection." - order := self arrayType new: anInteger + 1 * 3 // 4. - order - replaceFrom: 1 - to: tally - with: oldOrder - startingAt: 1! Item was changed: ----- Method: OrderedDictionary>>initialize: (in category 'private') ----- initialize: n super initialize: n. + order := self arrayType new: self capacity! - order := self arrayType new: n + 1 * 3 // 4!
Carpe Squeak!
|
Hi Christoph,
On Tue, 6 Oct 2020, Thiede, Christoph wrote: > > > OrderedDictionary changes: > > > - make it a subclass of PluggableDictionary. This lets one create e.g. an ordered identity dictionary without creating a subclass with duplicated behavior > > Just a stupid question: Why do we have separate Pluggable/IdentityDictionary subclasses of Dictionary at all and did not directly implement Dictionary with a pluggable block? Is this a historic decision or an optimization > thing? :-) AFAIK PluggableDictionary was not part of Smalltalk-80, but Dictionary and IdentityDictionary were. Dictionary and IdentityDictionary are slightly faster than their PluggableDictionary equivalents. In theory, we could simplify the HashedCollection hierarchy by making Set and Dictionary pluggable. But in practice, too many things use IdentitySet and IdentityDictionary directly. Perhaps we could replace those classes with stubs having the right "constructor" but we couldn't remove those classes. From the users' point of view, it's probably better that we have separate classes. (And there are all sorts of interesting questsions raising from making Set and Dictionary pluggable. E.g. how would that affect other subclasses of Set/Dictionary? Levente > > Best, > Christoph > > _________________________________________________________________________________________________________________________________________________________________________________________________________________________________ > Von: Squeak-dev <[hidden email]> im Auftrag von [hidden email] <[hidden email]> > Gesendet: Montag, 5. Oktober 2020 00:31:44 > An: [hidden email]; [hidden email] > Betreff: [squeak-dev] The Trunk: Collections-ul.915.mcz > Levente Uzonyi uploaded a new version of Collections to project The Trunk: > http://source.squeak.org/trunk/Collections-ul.915.mcz > > ==================== Summary ==================== > > Name: Collections-ul.915 > Author: ul > Time: 5 October 2020, 12:30:33.597525 am > UUID: 93341c2a-ebb2-4d2b-abee-e3f42f509836 > Ancestors: Collections-eem.913 > > HashedCollection changes: > - make #capacity return the actual capacity of the collection instead of the size of the internal array. This change is obviously not backwards compatible. > - improve the performance of #isEmpty when tally is 0 > > OrderedDictionary changes: > - make it a subclass of PluggableDictionary. This lets one create e.g. an ordered identity dictionary without creating a subclass with duplicated behavior > - simplify #initialize and #growTo: now that #capacity is accurate > > =============== Diff against Collections-eem.913 =============== > > Item was changed: > ----- Method: HashedCollection>>capacity (in category 'accessing') ----- > capacity > + "Answer the current capacity of the receiver - aka the number of elements the receiver can hold without growing." > - "Answer the current capacity of the receiver." > > + ^ array size * 3 // 4! > - ^ array size! > > Item was changed: > ----- Method: HashedCollection>>isEmpty (in category 'testing') ----- > isEmpty > "For non-weak collections, we can use the tally to speed up the empty check. For weak collections, we must use the traditional way because the tally is unreliable. Also see #size vs. #slowSize." > > + tally = 0 ifTrue: [ ^true ]. > + ^array class isWeak and: [ super isEmpty ]! > - ^ array class isWeak > - ifFalse: [ tally = 0 ] > - ifTrue: [ super isEmpty ]! > > Item was changed: > ----- Method: HashedCollection>>removeAll (in category 'removing') ----- > removeAll > "remove all elements from this collection. > Preserve the capacity" > > + self initialize: array size! > - self initialize: self capacity! > > Item was changed: > + PluggableDictionary subclass: #OrderedDictionary > - Dictionary subclass: #OrderedDictionary > instanceVariableNames: 'order' > classVariableNames: '' > poolDictionaries: '' > category: 'Collections-Sequenceable'! > > !OrderedDictionary commentStamp: 'mt 1/16/2015 10:42' prior: 0! > I am an ordered dictionary. I have an additional index (called 'order') to keep track of the insertion order of my associations. > > The read access is not affected by the additional index. > > The index is updated in O(1) [time] when inserting new keys. For present keys, that insertion involves actions in O(n) to move the respective element to the end of the order. > > The growth operation compacts the index and takes O(n) additional time. > > NOTE: This is still no instance of SequenceableCollection. Having this, some protocols are missing and may require working on #associations, which is an Array and thus sequenceable.! > > Item was changed: > ----- Method: OrderedDictionary>>growTo: (in category 'private') ----- > growTo: anInteger > > - | oldOrder | > super growTo: anInteger. > + order := order grownBy: self capacity - order size! > - oldOrder := order. > - "Grow only to 75%. See #atNewIndex:put: in HashedCollection." > - order := self arrayType new: anInteger + 1 * 3 // 4. > - order > - replaceFrom: 1 > - to: tally > - with: oldOrder > - startingAt: 1! > > Item was changed: > ----- Method: OrderedDictionary>>initialize: (in category 'private') ----- > initialize: n > > super initialize: n. > + order := self arrayType new: self capacity! > - order := self arrayType new: n + 1 * 3 // 4! > > > > |
> On 2020-10-06, at 5:17 AM, Levente Uzonyi <[hidden email]> wrote: > > > AFAIK PluggableDictionary was not part of Smalltalk-80, but Dictionary and IdentityDictionary were. Correct. > > Dictionary and IdentityDictionary are slightly faster than their PluggableDictionary equivalents. Yah; the #scanFor.... methods will be slightly slower because of the test for the relevant block. However, for anything much more complicated than a SmallInteger #= test I suspect that the actual total time will be very close. After all the object being added has to evaluate its #= against whatever is already in the slot and comparing two Wooblies with 100 instvars (and instvar related sub-tests) to try will likely swamp the #scanFor... loop. I don't think I've had any reason to use a PluggableDictionary before but I can see that having a way to use an application specific equivalence check instead of relying on a system-wide and inevitably limiting definition has its uses. I can imagine cases where one might need two dictionaries of nominally the same contents but where the equivalence test for inclusion needs to be different in order to make comparisons of some interesting state. tim -- tim Rowledge; [hidden email]; http://www.rowledge.org/tim Strange OpCodes: DC: Divide and Conquer |
Free forum by Nabble | Edit this page |