The subject of this mail is exactly my question.
I came to this question by looking at Object>>#= message implementation.

"""
= anObject
    "Answer whether the receiver and the argument represent the same
    object. If = is redefined in any subclass, consider also redefining the
    message hash."

    ^self == anObject
"""

When do we need to redefine #hash message?

Is it the right way to implement equality between two objects or is there another message that I should override?

Regards,
Julien
I think you redefine #hash so that a Set can tell whether two objects are the same or not. So it is not an equality test in itself, but a way for a Set to know whether an equivalent element is already in the Set.

Correct me if I'm wrong.

On 26 May 2015 at 20:45, Julien Delplanque <[hidden email]> wrote:
> When do we need to redefine #hash message?

--
Cheers
Cyril Ferlicot
In reply to this post by Julien Delplanque
2015-05-26 15:45 GMT-03:00 Julien Delplanque <[hidden email]>:
> When do we need to redefine #hash message?

Whenever you redefine #=.

> Is it the right way to implement equality between two objects or is there
> another message that I should override?

#hash, as the comment in #= suggests.

Hashed collections (mainly Set, but there are others) index and look up objects by their hash value.

Esteban A. Maringolo

ps: A proper implementation of #hash that avoids collisions (two different objects sharing the same hash) can be plain simple or a little more complex (mathematically challenging).
Mmmh,
Let's assume I have a Letter object. This object has a body (String) and a title (also a String).

I want to define that two Letter objects are equal if and only if their bodies are equal and their titles are equal.

How do I implement it? I don't understand why I should override #hash in this case.

Julien

On 26/05/15 20:54, Esteban A. Maringolo wrote:
>> When do we need to redefine #hash message?
> Whenever you redefine #=.
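For the Letter case described above, a minimal sketch could look like the following. It assumes a class Letter with instance variables title and body and plain accessors #title and #body (these names are illustrative, not taken from any existing package). The "Letter >> selector" headers are the usual way of showing which class a method belongs to; they are not part of the method source. Combining the field hashes with bitXor: is one common choice, not the only one.

```smalltalk
Letter >> = anObject
    "Two letters are equal when they are of the same class
     and their titles and bodies are equal."
    self == anObject ifTrue: [ ^ true ].
    self class = anObject class ifFalse: [ ^ false ].
    ^ title = anObject title and: [ body = anObject body ]

Letter >> hash
    "Combine the hashes of exactly the fields compared in #=,
     so that equal letters always answer the same hash."
    ^ title hash bitXor: body hash
```

The point of the design is that #hash only looks at the values that #= compares. Anything else (for example the identity hash) would let two equal letters answer different hashes.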
In reply to this post by Julien Delplanque
Hi
Any two objects that are #= must answer the same #hash value.
The hash is used by every HashedCollection to pick the storage slot, and #= is then used to find the exact element.
If two objects are #= but their hash values differ, the HashedCollection will not find the correct storage slot, and lookups will behave unpredictably.

Out of interest, Andres Valloud wrote a whole book on 'Hashing in Smalltalk' (http://www.lulu.com/content/1455536)

The Javadoc comments are also pretty good at explaining the usage:

> The general contract of hashCode is:
> Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
> If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
> It is not required that if two objects are unequal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
> As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects.

Cheers
Carlo
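To make the failure mode concrete, here is a small Playground-style sketch. It assumes a hypothetical Letter class that overrides #= to compare title and body but does NOT override #hash, with hypothetical setters #title: and #body:. The results in the comments are what one would normally expect, not guaranteed values, since two identity hashes can land in the same slot by chance.

```smalltalk
"Assumption: Letter overrides #= but inherits the identity-based #hash."
| a b letters |
a := Letter new title: 'Hi'; body: 'Hello'; yourself.
b := Letter new title: 'Hi'; body: 'Hello'; yourself.

a = b.               "true, by the overridden #="
a hash = b hash.     "almost certainly false: the inherited hash is identity based"

letters := Set new.
letters add: a.
letters includes: b. "typically false, although a = b"
```

Once #hash is overridden consistently with #=, the Set starts its search in the same slot for both letters and the last expression answers true.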
Thanks, I understand now :)
On 26/05/15 21:06, Carlo wrote:
> Any two objects that are = must answer the same #hash value.
Also implied, but seldom stated, is that all values used in = and hash must only be set at creation, never after.
The reason is the same, keeping hashed collection usage sane:

p := Point x: 3 y: 2.
set := Set new.
set add: p.
"Imaginary method"
p setX: 2.
set includes: p "-> false"

Cheers,
Henry

> On 26 May 2015, at 9:11 , Julien Delplanque <[hidden email]> wrote:
>
> Thanks, I understand now :)
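When a value used by #= and #hash really has to change after the object has been added, one way out is to rehash the collection afterwards (or to remove the object before the change and add it again after). The sketch below assumes a hypothetical Letter class whose #= and #hash are both based on title and body, with hypothetical setters #title: and #body:. The "typically" in the comment is deliberate: the old and new hash can map to the same slot by chance.

```smalltalk
"Assumption: Letter defines #= and #hash from title and body."
| letter letters |
letter := Letter new title: 'Hi'; body: 'Hello'; yourself.
letters := Set new.
letters add: letter.

letter title: 'Bye'.       "the hash changes while the letter sits in the Set"
letters includes: letter.  "typically false: the lookup starts at the wrong slot"

letters rehash.            "re-insert every element under its current hash"
letters includes: letter.  "true again"
```

Rehashing touches every element, so it is a repair rather than a habit; treating the compared values as fixed after creation avoids the problem altogether.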
Wow, compelling example!
Alexandre

> On May 29, 2015, at 12:33 PM, Henrik Johansen <[hidden email]> wrote:
>
> Also implied, but seldom stated, is that all values used in = and hash must only be set at creation, never after.

--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.