Hi All,
In VisualWorks Message implements #= & #hash naturally; two messages whose selectors and arguments are #= are also equal. But in Cuis, Squeak and Pharo Message inherits #= and #hash from Object, i.e. uses identity comparison. This is, to say the least, annoying. Any objections to implementing comparing in Message to match VisualWorks? _,,,^..^,,,_ best, Eliot |
Hi David,
On Mon, Nov 19, 2018 at 9:52 AM David T. Lewis <[hidden email]> wrote: On Mon, Nov 19, 2018 at 09:32:17AM -0800, Eliot Miranda wrote: For me it is relevant. Two messages with different lookupClasses, e.g. one with nil and one with a specific class, represent different messages, one a normal send one a super send. So my changes in waiting include lookupClass in both hash and =. I don't think it makes much difference, but the incompatibility with VisualWorks, while regrettable, feels correct to me. Dave _,,,^..^,,,_ best, Eliot |
To me this looks like a good change, yes. Marcus |
In reply to this post by Eliot Miranda-2
On Mon, 19 Nov 2018, Eliot Miranda wrote:
> Hi David, > > On Mon, Nov 19, 2018 at 9:52 AM David T. Lewis <[hidden email]> wrote: > On Mon, Nov 19, 2018 at 09:32:17AM -0800, Eliot Miranda wrote: > > Hi All, > > > > In VisualWorks Message implements #= & #hash naturally; two messages > > whose selectors and arguments are #= are also equal. But in Cuis, Squeak > > and Pharo Message inherits #= and #hash from Object, i.e. uses identity > > comparison. This is, to say the least, annoying. Any objections to > > implementing comparing in Message to match VisualWorks? > > > > That sounds like an obviously good thing to do :-) > > Is the lookupClass instance variable relevant for comparisons? I am > guessing not, since we already have #analogousCodeTo: for that type of > comparison. > > > For me it is relevant. Two messages with different lookupClasses, e.g. one with nil and one with a specific class, represent different messages, one a normal send one a super send. So my changes in > waiting include lookupClass in both hash and =. I don't think it makes much difference, but the incompatibility with VisualWorks, while regrettable, feels correct to me. as long as you don't rename your classes while you store them in hashed collections. Levente > > Dave > > > _,,,^..^,,,_ > best, Eliot > > |
In reply to this post by Eliot Miranda-2
Hi Eliot,
(below) On 19/11/2018 02:32 p.m., Eliot Miranda via Cuis-dev wrote:
Just added these implementations to Cuis: = aMessage "Any object is equal to itself" self == aMessage ifTrue: [ ^ true ]. self class == aMessage class ifFalse: [ ^false ]. selector = aMessage selector ifFalse: [ ^false ]. lookupClass = aMessage lookupClass ifFalse: [ ^false ]. ^args = aMessage arguments hash "Hash is reimplemented because = is implemented." ^selector hash bitXor: args hash Do they look ok? (I considered lookupClass for #=, but not for hash, as collisions because of that seem unlikely, and in any case, grouping them that way could be useful). Cheers, -- Juan Vuletich www.cuis-smalltalk.org https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev https://github.com/jvuletich https://www.linkedin.com/in/juan-vuletich-75611b3 @JuanVuletich |
In reply to this post by Levente Uzonyi
It sounds good, but an Inbox submission would go a long way to
understanding exactly what you mean esp. re: understanding this special condition with the lookupClass being nil... and whether it deserves a first-class name or not... On Tue, Nov 20, 2018 at 4:10 AM Levente Uzonyi <[hidden email]> wrote: > > On Mon, 19 Nov 2018, Eliot Miranda wrote: > > > Hi David, > > > > On Mon, Nov 19, 2018 at 9:52 AM David T. Lewis <[hidden email]> wrote: > > On Mon, Nov 19, 2018 at 09:32:17AM -0800, Eliot Miranda wrote: > > > Hi All, > > > > > > In VisualWorks Message implements #= & #hash naturally; two messages > > > whose selectors and arguments are #= are also equal. But in Cuis, Squeak > > > and Pharo Message inherits #= and #hash from Object, i.e. uses identity > > > comparison. This is, to say the least, annoying. Any objections to > > > implementing comparing in Message to match VisualWorks? > > > > > > > That sounds like an obviously good thing to do :-) > > > > Is the lookupClass instance variable relevant for comparisons? I am > > guessing not, since we already have #analogousCodeTo: for that type of > > comparison. > > > > > > For me it is relevant. Two messages with different lookupClasses, e.g. one with nil and one with a specific class, represent different messages, one a normal send one a super send. So my changes in > > waiting include lookupClass in both hash and =. I don't think it makes much difference, but the incompatibility with VisualWorks, while regrettable, feels correct to me. > > What about Behavior >> #hash? It uses the name of the class, which is fine > as long as you don't rename your classes while you store them in hashed > collections. > > Levente > > > > > Dave > > > > > > _,,,^..^,,,_ > > best, Eliot > > > > |
It is interesting. In base Squeak, the only time that lookupClass is set is, well, never(!). There is a setter method, but it isn't called is base Squeak. It is checked a couple of times, but unless there is hidden machinery in the VM that sets it, it isn't actually ever used. Unless someone decided to use it in their code. -cbc On Tue, Nov 20, 2018 at 9:06 AM Chris Muller <[hidden email]> wrote: It sounds good, but an Inbox submission would go a long way to |
In reply to this post by Eliot Miranda-2
Wait, Message is the one used internally by the system. It doesn't
have #= and #hash because it didn't need it. *MessageSend* is the one for external use, and it already defines #= and #hash. On Mon, Nov 19, 2018 at 11:32 AM Eliot Miranda <[hidden email]> wrote: > > Hi All, > > In VisualWorks Message implements #= & #hash naturally; two messages whose selectors and arguments are #= are also equal. But in Cuis, Squeak and Pharo Message inherits #= and #hash from Object, i.e. uses identity comparison. This is, to say the least, annoying. Any objections to implementing comparing in Message to match VisualWorks? > > _,,,^..^,,,_ > best, Eliot > |
In reply to this post by Juan Vuletich-4
Hi Juan, On Tue, Nov 20, 2018 at 8:00 AM Juan Vuletich <[hidden email]> wrote:
this last line should be ^args literalEqual: aMessage arguments and this is very important :-)
_,,,^..^,,,_ best, Eliot |
In reply to this post by Levente Uzonyi
To make things more clear, the current implementation of Behavior >> #hash
has two negative side effects: - behaviors stored in collections relying on the hash value (e.g. Set, Dictionary) will have to be rehashed whenever a behavior is renamed - objects using Behavior >> #hash to implement their own #hash, like what Eliot just did to Message will suffer from the same issue. So Sets and Dictionaries holding those kind of objects will have to be rehashed as well upon the rename of the behavior. My questions related to this: - why does Behavior >> #hash rely on the name instead of identity? - do we want to fix those issues mentioned above or do we just say that one should not rename classes and expect things to work? Levente On Tue, 20 Nov 2018, Levente Uzonyi wrote: > On Mon, 19 Nov 2018, Eliot Miranda wrote: > >> Hi David, >> >> On Mon, Nov 19, 2018 at 9:52 AM David T. Lewis <[hidden email]> wrote: >> On Mon, Nov 19, 2018 at 09:32:17AM -0800, Eliot Miranda wrote: >> > Hi All, >> > >> > In VisualWorks Message implements #= & #hash naturally; two >> messages >> > whose selectors and arguments are #= are also equal. But in Cuis, >> Squeak >> > and Pharo Message inherits #= and #hash from Object, i.e. uses >> identity >> > comparison. This is, to say the least, annoying. Any objections >> to >> > implementing comparing in Message to match VisualWorks? >> > >> >> That sounds like an obviously good thing to do :-) >> >> Is the lookupClass instance variable relevant for comparisons? I am >> guessing not, since we already have #analogousCodeTo: for that type >> of >> comparison. >> >> >> For me it is relevant. Two messages with different lookupClasses, e.g. one >> with nil and one with a specific class, represent different messages, one a >> normal send one a super send. So my changes in >> waiting include lookupClass in both hash and =. I don't think it makes >> much difference, but the incompatibility with VisualWorks, while >> regrettable, feels correct to me. > > What about Behavior >> #hash? It uses the name of the class, which is fine as > long as you don't rename your classes while you store them in hashed > collections. > > Levente > >> >> Dave >> >> >> _,,,^..^,,,_ >> best, Eliot >> > |
On 11/20/18 5:33 PM, Levente Uzonyi
wrote:
- why does Behavior >> #hash rely on the name instead of identity?FWIW, I found this...
|
In reply to this post by Levente Uzonyi
> To make things more clear, the current implementation of Behavior >> #hash
> has two negative side effects: > - behaviors stored in collections relying on the hash value (e.g. Set, > Dictionary) will have to be rehashed whenever a behavior is renamed > - objects using Behavior >> #hash to implement their own #hash, like what > Eliot just did to Message will suffer from the same issue. So Sets and > Dictionaries holding those kind of objects will have to be rehashed as > well upon the rename of the behavior. > > My questions related to this: > - why does Behavior >> #hash rely on the name instead of identity? If you mean #identityHash, then its because involving an unstable value in a #hash calculation is never a good idea. #identityHash can be different for the same class between two different images, or if the class was ever becomed or reloaded into a new image, etc. > - do we want to fix those issues mentioned above or do we just say that > one should not rename classes and expect things to work? Neither. We just say that when one renames a class to rehash all relevant HashedCollections. - Chris |
On Tue, 20 Nov 2018, Chris Muller wrote:
>> To make things more clear, the current implementation of Behavior >> #hash >> has two negative side effects: >> - behaviors stored in collections relying on the hash value (e.g. Set, >> Dictionary) will have to be rehashed whenever a behavior is renamed >> - objects using Behavior >> #hash to implement their own #hash, like what >> Eliot just did to Message will suffer from the same issue. So Sets and >> Dictionaries holding those kind of objects will have to be rehashed as >> well upon the rename of the behavior. >> >> My questions related to this: >> - why does Behavior >> #hash rely on the name instead of identity? > > If you mean #identityHash, then its because involving an unstable > value in a #hash calculation is never a good idea. #identityHash can > be different for the same class between two different images, or if > the class was ever becomed or reloaded into a new image, etc. Is there an actual user of that feature? Bob found out that #hash had been changed during the developement of Squeak 3.9. Therefore this issue is not present in Cuis (forked from 3.7). And I just checked Pharo and found that Behavior >> #hash had been removed from there. So, I suggest we remove it as well unless there's a really good reason to keep it. > >> - do we want to fix those issues mentioned above or do we just say that >> one should not rename classes and expect things to work? > > Neither. We just say that when one renames a class to rehash all > relevant HashedCollections. That's "one should not rename classes and expect things to work", isn't it? Levente > > - Chris |
I'm neutral on this, but here is a bit more
(and from earlier it would seem)
'From Squeak3.7beta of ''1 April
2004'' [latest update: #5963] on 22 June 2004 at 8:51:57 pm'
"Change
Set: BehaviorHashEnh v1.2
Date: 22 June 2004, 16.02.2006 Author: Stephan Rudlof, md md: added a line to the poscript to uncompactify the MethodProperties class. We want to add an instVar for the selector. Improves the default Object>>hash for Behaviors by installing Behavior>>hash. String>>hash has been changed a little to avoid infinite recursion (without changing its semantics). All is done in the postscript. Important ----------- This is a special changeset: Do not export and import this changeset again after importing it the first time! Then the methods are not installed alone in the postscript anymore, leading to serious problems! ----------- Rationale: Object>>hash calling ProtoObject>>identityHash gives poor results for Behaviors. Therefore a new Behavior>>hash using Symbol>>hash or String>>hash (the latter slightly changed to avoide infinite recursion) will be installed. Consequences: - It speeds up Set/Dictionary operations with Behaviors a lot (see below). - The main consequence for other objects as Behaviors seems to be a changed hash if they use self species hash as a start value for computing their hash. But AFAICS this doesn't hurt, since in most cases (non meta classes as species) it maps to Symbol>>hash, which is fast. On 11/21/18 7:31 AM, Levente Uzonyi
wrote:
On Tue, 20 Nov 2018, Chris Muller wrote: |
In reply to this post by Eliot Miranda-2
On 11/20/2018 6:59 PM, Eliot Miranda wrote:
Thanks Eliot. Fix is now at Cuis GitHub repo. Cheers, -- Juan Vuletich www.cuis-smalltalk.org https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev https://github.com/jvuletich https://www.linkedin.com/in/juan-vuletich-75611b3 @JuanVuletich |
In reply to this post by Bob Arning-2
Nice find. So the reason for the change was to improve performance, but
now that change actually makes things slower. We have 22-bit identity hashes and #hashMultiply is quicker too, so a new hash function based on those could be 6x-10x quicker. Levente On Wed, 21 Nov 2018, Bob Arning wrote: > I'm neutral on this, but here is a bit more (and from earlier it would seem) > > > 'From Squeak3.7beta of ''1 April 2004'' [latest update: #5963] on 22 June 2004 at 8:51:57 pm' > "Change Set: BehaviorHashEnh v1.2 > Date: 22 June 2004, 16.02.2006 > Author: Stephan Rudlof, md > md: added a line to the poscript to uncompactify the MethodProperties class. We want to add an instVar for the selector. > Improves the default Object>>hash for Behaviors by installing Behavior>>hash. String>>hash has been changed a little to avoid infinite recursion (without changing its semantics). > All is done in the postscript. > > Important > ----------- > This is a special changeset: Do not export and import this changeset again after importing it the first time! Then the methods are not installed alone in the postscript anymore, leading to serious > problems! > ----------- > > Rationale: Object>>hash calling ProtoObject>>identityHash gives poor results for Behaviors. Therefore a new Behavior>>hash using Symbol>>hash or String>>hash (the latter slightly changed to avoide > infinite recursion) will be installed. > > Consequences: > - It speeds up Set/Dictionary operations with Behaviors a lot (see below). > - The main consequence for other objects as Behaviors seems to be a changed hash if they use > self species hash > as a start value for computing their hash. > But AFAICS this doesn't hurt, since in most cases (non meta classes as species) it maps to Symbol>>hash, which is fast. > On 11/21/18 7:31 AM, Levente Uzonyi wrote: > On Tue, 20 Nov 2018, Chris Muller wrote: > > To make things more clear, the current implementation of Behavior >> #hash > has two negative side effects: > - behaviors stored in collections relying on the hash value (e.g. Set, > Dictionary) will have to be rehashed whenever a behavior is renamed > - objects using Behavior >> #hash to implement their own #hash, like what > Eliot just did to Message will suffer from the same issue. So Sets and > Dictionaries holding those kind of objects will have to be rehashed as > well upon the rename of the behavior. > > My questions related to this: > - why does Behavior >> #hash rely on the name instead of identity? > > > If you mean #identityHash, then its because involving an unstable > value in a #hash calculation is never a good idea. #identityHash can > be different for the same class between two different images, or if > the class was ever becomed or reloaded into a new image, etc. > > > Is there an actual user of that feature? > > Bob found out that #hash had been changed during the developement of Squeak 3.9. Therefore this issue is not present in Cuis (forked from 3.7). And I just checked Pharo and found that > Behavior >> #hash had been removed from there. > So, I suggest we remove it as well unless there's a really good reason to keep it. > > > - do we want to fix those issues mentioned above or do we just say that > one should not rename classes and expect things to work? > > > Neither. We just say that when one renames a class to rehash all > relevant HashedCollections. > > > That's "one should not rename classes and expect things to work", isn't it? > > Levente > > > - Chris > > > > > |
In reply to this post by Levente Uzonyi
Hi Levente,
> >> My questions related to this: > >> - why does Behavior >> #hash rely on the name instead of identity? > > > > If you mean #identityHash, then its because involving an unstable > > value in a #hash calculation is never a good idea. #identityHash can > > be different for the same class between two different images, or if > > the class was ever becomed or reloaded into a new image, etc. > > Is there an actual user of that feature? Several actually. If your primary motivation is improving performance then we could cache the hash value, possibly even straight in the CompiledMethod itself (hard-coded). But we _must not_ destroy stable and deterministic hashes. It would be a huge blow to existing users and even future potential possibilities of future implementations of things. Anything that would try to share #hash calculations between images... Possibly Croquet. It's easy to imagine some things! MagmaDictionary is just a Dictionary except the internal array grows on disk providing size scalability while sparing RAM. The #hash of the objects put into it are calculated by the clients -- it must be consistent between users. source.squeak.org uses a MagmaDictionary for the revision history. It was the best way to introduce the capability without invading Monticello's code at all. > Bob found out that #hash had been changed during the developement of > Squeak 3.9. Therefore this issue is not present in Cuis (forked from 3.7). > And I just checked Pharo and found that Behavior >> #hash had been removed > from there. I think that's a big mistake. There's no need for us to cross the line of a stable hash to a volatile one. We can cache it. |
In reply to this post by Levente Uzonyi
> That's "one should not rename classes and expect things to work", isn't
> it? Not really, since that is already true anyway. If you rename a class, all references to that class will be broken. |
Free forum by Nabble | Edit this page |