Maybe Levente will help me clarifying this...
I encountered several different implementations: MessageTally, Semaphore, FutureMaker, LocatedMethod, TextAnchor hash ^self identityHash (or ^someInstVar identityHash, which is equivalent) Object hash ^self scaledIdentityHash The first does not scale well for large Set/Dictionary, and the second costs a little overhed on small (Set/Dictionary). My questions are: - are these different implementations deliberate ? For example knowing there won't be any large Set... In which case a comment would help - or should we always use the scaledIdentityHash to hash ? (and let identityHash for printing and a few well commented exceptional cases) Beside, apart printing usage, I note these two usages of IdentityHash as hash: MethodDictionary scanFor: (I guess we don't expect more than 4096 messages too often). Player recaptureUniqueCostumes I also noted that MethodDictionary does not override scanForEmptySlotFor: which is defined in term either of hash or scaledIdentityHash, rather than IdentityHash This is currently OK, because the unique path I found was: MethodDictionary>>#at:put -> HashCollection>>atNewIndex:put: -> MethodDictionary>>#grow OK, grow overrides super, otherwise: -> HashCollection>>#grow -> HashCollection>>#growTo: -> Dictionary>>#noCheckNoGrowFillFrom: -> HashCollection>>#scanForEmptySlotFor: -> Symbol>>#hash But it's a potential tricky trap for future. Nicolas |
On Thu, 24 Dec 2009, Nicolas Cellier wrote:
> Maybe Levente will help me clarifying this... > I encountered several different implementations: > > MessageTally, Semaphore, FutureMaker, LocatedMethod, TextAnchor > hash > ^self identityHash > (or ^someInstVar identityHash, which is equivalent) > > Object > hash > ^self scaledIdentityHash > > The first does not scale well for large Set/Dictionary, and the second > costs a little overhed on small (Set/Dictionary). > My questions are: > - are these different implementations deliberate ? For example knowing > there won't be any large Set... In which case a comment would help No. > - or should we always use the scaledIdentityHash to hash ? (and let > identityHash for printing and a few well commented exceptional cases) Exactly. I quickly checked the classes and I think that: - all of the above should be #scaledIdentityHash - optionally MessageTally >> #hash should use the process's hash, since #= also compares the process variable > > Beside, apart printing usage, I note these two usages of IdentityHash as hash: > MethodDictionary scanFor: (I guess we don't expect more than 4096 > messages too often). MethodDictionary's array's size is a power of two, and #scaledIdentityHash is just #identityHash bitShift: 18. Using #scaledIdentityHash would degrade the performance to the minimum. The use of #hash instead of #identityHash could help with large MethodDictionaries, since the keys are Symbols, but the current code assumes that the size will never reach 4096. And that's acceptable IMO. > Player recaptureUniqueCostumes > That should use #scaledIdentityHash, since the set defined there mimics an IdentitySet. > I also noted that MethodDictionary does not override > scanForEmptySlotFor: which is defined in term either of hash or > scaledIdentityHash, rather than IdentityHash > This is currently OK, because the unique path I found was: > > MethodDictionary>>#at:put > -> HashCollection>>atNewIndex:put: > -> MethodDictionary>>#grow > OK, grow overrides super, otherwise: > -> HashCollection>>#grow > -> HashCollection>>#growTo: > -> Dictionary>>#noCheckNoGrowFillFrom: > -> HashCollection>>#scanForEmptySlotFor: > -> Symbol>>#hash > > But it's a potential tricky trap for future. > Well, maybe. MethodDictionary has a different structure than other HashedCollections and because of this it has to reimplement #grow and #rehash. #scanForEmptySlotFor: is only used by those two methods in general (WeakKeyDictionary is the only exception). Levente > Nicolas > > |
2009/12/24 Levente Uzonyi <[hidden email]>:
> On Thu, 24 Dec 2009, Nicolas Cellier wrote: > >> Maybe Levente will help me clarifying this... >> I encountered several different implementations: >> >> MessageTally, Semaphore, FutureMaker, LocatedMethod, TextAnchor >> hash >> ^self identityHash >> (or ^someInstVar identityHash, which is equivalent) >> >> Object >> hash >> ^self scaledIdentityHash >> >> The first does not scale well for large Set/Dictionary, and the second >> costs a little overhed on small (Set/Dictionary). >> My questions are: >> - are these different implementations deliberate ? For example knowing >> there won't be any large Set... In which case a comment would help > > No. > >> - or should we always use the scaledIdentityHash to hash ? (and let >> identityHash for printing and a few well commented exceptional cases) > > Exactly. > > I quickly checked the classes and I think that: > - all of the above should be #scaledIdentityHash > - optionally MessageTally >> #hash should use the process's hash, > since #= also compares the process variable > >> >> Beside, apart printing usage, I note these two usages of IdentityHash as >> hash: >> MethodDictionary scanFor: (I guess we don't expect more than 4096 >> messages too often). > > MethodDictionary's array's size is a power of two, and #scaledIdentityHash > is just #identityHash bitShift: 18. Using #scaledIdentityHash would degrade > the performance to the minimum. The use of #hash instead of #identityHash > could help with large MethodDictionaries, since the keys are Symbols, but > the current code assumes that the size will never reach 4096. And that's > acceptable IMO. > Indeed, current trunk monster is far: (ProtoObject withAllSubclasses asArray collect: [:e | e selectors size -> e]) sort last. 1167->Morph We should accept only refactorings that reduce this number :) >> Player recaptureUniqueCostumes >> > > That should use #scaledIdentityHash, since the set defined there mimics an > IdentitySet. > >> I also noted that MethodDictionary does not override >> scanForEmptySlotFor: which is defined in term either of hash or >> scaledIdentityHash, rather than IdentityHash >> This is currently OK, because the unique path I found was: >> >> MethodDictionary>>#at:put >> -> HashCollection>>atNewIndex:put: >> -> MethodDictionary>>#grow >> OK, grow overrides super, otherwise: >> -> HashCollection>>#grow >> -> HashCollection>>#growTo: >> -> Dictionary>>#noCheckNoGrowFillFrom: >> -> HashCollection>>#scanForEmptySlotFor: >> -> Symbol>>#hash >> >> But it's a potential tricky trap for future. >> > > Well, maybe. MethodDictionary has a different structure than other > HashedCollections and because of this it has to reimplement #grow and > #rehash. #scanForEmptySlotFor: is only used by those two methods in general > (WeakKeyDictionary is the only exception). > You have a better view (conceptual) and maybe my fear was exagerated. But it was not that easy to follow the senders the reverse-engineering way... > > Levente > Thanks for explanations. >> Nicolas >> >> > > |
Free forum by Nabble | Edit this page |