Where do true, false and nil obtain their hash value? They inherit
#hash from Object, so it is their identityHash, but I noticed this is consistent between images -- how? It's great, but is there any danger of that value ever changing? That would be bad..

On 09.05.2012, at 21:28, Chris Muller wrote:

> Where do true, false and nil obtain their hash value? They inherit
> #hash from Object, so it is their identityHash, but I noticed this is
> consistent between images -- how? It's great, but is there any danger
> of that value ever changing? That would be bad..

The identity hash bits are stored in each object's header. And since true, false, and nil are the same decades-old instances, their hash did not change.

Depending on what the SystemTracer does, it may be different in an image derived by that though. E.g. you may want to check a 64-bit image.

- Bert -

> The identity hash bits are stored in each object's header. And since true, false, nil are the same decades old instances, their hash did not change.
>
> Depending on what the SystemTracer does, it may be different in an image derived by that though. E.g. you may want to check a 64 bit image.

So, just in thinking about it -- that is a VERY distant dependency that could manifest as a bug way up at the app level in production. Because no SUnit test would be able to catch it, and when trying to debug it in production, invariably someone with the "classic" hashes would not be able to reproduce the problem.. Hopefully it would only be a "lookup" problem, but what if it was in the context of an #at:ifAbsentPut:? What a nightmare!

So, it would seem a good idea to override #hash to return their current value as a fixed constant.

Thoughts?

On 09.05.2012, at 22:19, Chris Muller wrote:

> So, just in thinking about it -- that is a VERY distant dependency
> that could manifest as a bug way up at the app level in production.
> Because no SUnit test would be able to catch it, and when trying to
> debug it in production, invariably someone with the "classic" hashes
> would not be able to reproduce the problem.. Hopefully it would only be a
> "lookup" problem but what if it was in the context of an
> #at:ifAbsentPut:? What a nightmare!
>
> So, it would seem a good idea to override #hash to return their
> current value as a fixed constant.
>
> Thoughts?

-1

Why would you depend on the exact value of the identity hash?

If the identity hash changed when writing a new image, then obviously all dictionaries would have to be rehashed. Seems like a non-issue to me.

Besides, YAGNI. If and when you need that you could still add these methods. Though you won't ;^)

- Bert -

> Why would you depend on the exact value of the identity hash?

Not the value of the identityHash, the value of the hash.

> When writing a new image and the identity hash changed, then obviously all dictionaries would have to be rehashed. Seems like a non-issue to me.

That's the exact problem I want to avoid. Think in the context of a multiuser client-server app accessing the same large persistent domain model. The domain model includes a complex domain object used as a key in a dictionary, and one of the attributes used to determine its hash is one of its boolean attributes.

Now, some of the clients suddenly happen to get a new #identityHash for true, which in turn changes its #hash. Now they cannot access the domain objects whose hash depends on true or false. Worse, they might #at:put: into the dictionary, and then the model is "corrupted" because there is no universally-consistent notion of true's hash across all client images.

You're right, I could add my own private extensions, but we should consider having "universal" atomics given the insidiousness of the above situation..?

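
The failure mode described above can be reproduced inside a single image by analogy: instead of true's hash differing between images, the sketch below changes a key's hash after the key has been stored. It is only an illustration. The OrderedCollection key is a stand-in chosen because its hash depends on its elements (an assumption, not part of the scenario above), and the variable names are made up.

```smalltalk
| key dict |
"An OrderedCollection hashes on its elements, so adding an element changes its hash."
key := OrderedCollection with: 1.
dict := Dictionary new.
dict at: key put: #stored.

"Mutate the key: the entry still sits in a slot chosen from the old hash."
key add: 2.

"This lookup starts probing at the slot for the new hash, so it will usually
 (not always) miss the entry, even though the key object is in the dictionary."
dict at: key ifAbsent: [#missing].

"Rehashing recomputes every slot from the current hashes; the entry is reachable again."
dict rehash.
dict at: key ifAbsent: [#missing]
```

The same repair applies to the cross-image case: a dictionary loaded into an image whose hashes differ has to be rehashed before it can be trusted.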
On Wed, May 9, 2012 at 1:46 PM, Chris Muller wrote:

> Why would you depend on the exact value of the identity hash?

How so? Given that within an image the hashes and the dictionaries hashed thereon are consistent, how does it matter if two different images have different hashes and different hash values? How do these clients suddenly acquire new hashes for nil, true and false? Smalltalk systems have been like this for many decades and have not found this to be a significant problem in practice. There can be performance advantages with consistent hashes (e.g. of symbols, avoiding having to rehash method dictionaries on code load). But no fundamental problem.

BTW, there are even Lisp systems that use an object's identity as its hash /and/ have moving garbage collectors, so that all identity-hashed collections are rehashed after GC. These systems keep working too, even though an object's hash changes through its lifetime, let alone differs between systems.

best, Eliot

In reply to this post by Chris Muller-3
2012/5/9 Chris Muller:
>> Why would you depend on the exact value of the identity hash?
>
> Not the value of the identityHash, the value of the hash.
>
>> When writing a new image and the identity hash changed, then obviously all dictionaries would have to be rehashed. Seems like a non-issue to me.
>
> That's the exact problem I want to avoid. Think in the context of a
> multiuser client-server app accessing the same large persistent domain
> model. The domain model includes a complex domain object used as a
> key in a dictionary, and one of the attributes used to determine its
> hash is one of its boolean attributes.
>
> Now, some of the clients suddenly happen to get a new #identityHash
> for true, which in turn changes its #hash. Now they cannot access the
> domain's whose hash depends on true or false. Worse, they might
> #at:put: into it and then the model is "corrupted" because there is no
> universally-consistent notion of true's hash across all client images.
>
> You're right, I could add my own private extensions, but we should
> consider having "universal" atomics given the the aforementioned
> insidiousness of the above situation..?

If this can only occur in this persistence scheme, then you obviously require a #persistentHash.

    Object>>persistentHash
        ^self hash

    True,False,UndefinedObject>>persistentHash
        ^some specific constant...

It is true that any other literal, arrays of such, or any hash-caring object built of such would be cross-image hash-persistent... So we are very close to it.

But YAGNI, your hash sounds hackish... Maybe your own image tracer could implement the hack too?

Nicolas

In reply to this post by Bert Freudenberg
On Wed, May 09, 2012 at 09:57:16PM +0200, Bert Freudenberg wrote:
> The identity hash bits are stored in each object's header. And since true, false, nil are the same decades old instances, their hash did not change.
>
> Depending on what the SystemTracer does, it may be different in an image derived by that though. E.g. you may want to check a 64 bit image.
>
> - Bert -

FWIW, on a 32-bit image:

    ImageFormat thisImageFileFormat asInteger ==> 6504
    Smalltalk wordSize ==> 4
    nil identityHash ==> 3840
    true identityHash ==> 2950
    false identityHash ==> 3152

And on a 64-bit image:

    ImageFormat thisImageFileFormat asInteger ==> 68002
    Smalltalk wordSize ==> 8
    nil identityHash ==> 3840
    true identityHash ==> 2950
    false identityHash ==> 3152

Dave

In reply to this post by Eliot Miranda-2
> How so? Given that within an image the hash and dictionaries hashed
> there-on are consistent how does it matter if two different images have
> different hashes and different hash values?

As I tried to explain, it matters in the case where the two different images are accessing a persistent, legacy domain model which has the true object involved in the calculation of a hash value. It wouldn't even necessarily have to be a client-server app -- maybe the "persistent model" is just a serialized object file that the "new" (inconsistent) image wanted to load.

The way it is now, the system is dependent solely on the identityHash of true and false and nil, even though I want them to be treated as equivalent "value" objects just like Integers would be..

In reply to this post by Nicolas Cellier
> If this can only occur in this persistence scheme, then you obviously
> require a #persistentHash.

No, just the standard "value" #hash is sufficient. A #persistentHash wouldn't work in the use-case I described, because what if they used a standard Dictionary, which is based on #hash? The client with the "new" true still wouldn't be able to access it.

> It is true that any other literal, arrays of such, or any hash-caring
> object built of such would be cross-image hash-persistent...
> So we are very close to it.

So why be opposed to including true, false, and nil then?

> But YAGNI, your hash sounds hackish...
> Maybe your own image tracer could implement the hack too ?

If anything, I see what we have now as a hack -- because the correctness of #hash for the **universal *value* of true** is dependent on the *implementation* of that true -- that it is the same one ever created in all prior Squeaks..

- Chris

In reply to this post by David T. Lewis
> FWIW, on a 32-bit image:
>
>   ImageFormat thisImageFileFormat asInteger ==> 6504
>   Smalltalk wordSize ==> 4
>   nil identityHash ==> 3840
>   true identityHash ==> 2950
>   false identityHash ==> 3152
>
> And on a 64-bit image:
>
>   ImageFormat thisImageFileFormat asInteger ==> 68002
>   Smalltalk wordSize ==> 8
>   nil identityHash ==> 3840
>   true identityHash ==> 2950
>   false identityHash ==> 3152

This is very good news -- still, I see no harm in my proposal. Why won't someone find some fault with it or at least acknowledge how horrible the failure-case scenario would be to debug.. :)

In reply to this post by Eliot Miranda-2
> How do these clients suddenly
> acquire new hashes for nil, true and false?

I should add, it's already happened. In 2009 Levente changed Object>>#identityHash to answer the scaledIdentityHash.

In reply to this post by Chris Muller-3
On Wed, May 09, 2012 at 06:33:41PM -0500, Chris Muller wrote:
> This is very good news -- still, I see no harm in my proposal. Why
> won't someone find some fault with it or at least acknowledge how
> horrible the failure-case scenario would be to debug.. :)

Nothing horrible is going to happen any time soon, but that does not make it a good idea. true refers to an object like any other, and there is no particular reason to expect that the object that represents "true" in one image should have the same identityHash as the object that represents "true" in another image.

Consider your multi-user client-server application example. Suppose that it becomes fabulously successful and scales effortlessly to support thousands of clients, and you later become interested in permitting VisualWorks client images to join the party. Oops.

Dave

In reply to this post by Chris Muller-3
2012/5/10 Chris Muller:
>> How do these clients suddenly
>> acquire new hashes for nil, true and false?
>
> I should add, it's already happened. In 2009 Levente changed
> Object>>#identityHash to answer the scaledIdentityHash.

If I had to choose arbitrary constants, that would be something stupid like

    ^36r0true
    ^36r0false
    ^36r0nil

but it would cost you a rehash of persistent databases...

In reply to this post by Chris Muller-3
On 10.05.2012, at 01:40, Chris Muller wrote:

>> How do these clients suddenly
>> acquire new hashes for nil, true and false?
>
> I should add, it's already happened. In 2009 Levente changed
> Object>>#identityHash to answer the scaledIdentityHash.

Not in Squeak. Our IdentityDictionary uses scaledIdentityHash nowadays, but identityHash itself is left alone, answering the primitive value directly.

You may be thinking of Pharo, where you now need to use basicIdentityHash to get the primitive hash, and identityHash answers the more useful scaled value. Be aware though that Pharo also manipulates SmallInteger hashes where Squeak doesn't (yet, anyway).

Did I mention it's a bad idea to depend on actual hash values? ;)

- Bert -

2012/5/10 Bert Freudenberg:
> Not in Squeak. Our IdentityDictionary uses scaledIdentityHash nowadays, but identityHash itself is left alone, answering the primitive value directly.
>
> You may be thinking of Pharo, where you now need to use basicIdentityHash to get the primitive hash, and identityHash answers the more useful scaled value. Be aware though that Pharo also manipulates SmallInteger hashes where Squeak doesn't (yet, anyway).
>
> Did I mention it's a bad idea to depend on actual hash values? ;)
>
> - Bert -

Sure, I already changed various Number>>hash and could as well change Point hash to follow the recommendations from Andres Valloud's book Hashing in Smalltalk...

Nicolas

In reply to this post by Bert Freudenberg
On Thu, May 10, 2012 at 12:41:13PM +0200, Bert Freudenberg wrote:
> Not in Squeak. Our IdentityDictionary uses scaledIdentityHash nowadays, but identityHash itself is left alone, answering the primitive value directly.
>
> You may be thinking of Pharo, where you now need to use basicIdentityHash to get the primitive hash, and identityHash answers the more useful scaled value. Be aware though that Pharo also manipulates SmallInteger hashes where Squeak doesn't (yet, anyway).
>
> Did I mention it's a bad idea to depend on actual hash values? ;)
>
> - Bert -

Squeak:

    nil identityHash ==> 3840
    true identityHash ==> 2950
    false identityHash ==> 3152

Pharo:

    nil identityHash ==> 1006632960
    true identityHash ==> 773324800
    false identityHash ==> 826277888
    nil basicIdentityHash ==> 3840
    true basicIdentityHash ==> 2950
    false basicIdentityHash ==> 3152

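
The figures above suggest a simple relationship: each Pharo identityHash value is the corresponding basicIdentityHash shifted left by 18 bits, i.e. multiplied by 262144. That factor is inferred from the numbers quoted here rather than taken from the Pharo sources, so treat it as an observation about these three values.

```smalltalk
"Primitive hashes of nil, true and false, as reported above."
#(3840 2950 3152) collect: [:primitiveHash | primitiveHash bitShift: 18]
"printIt => #(1006632960 773324800 826277888)"
```

So the two dialects report the same underlying header bits, but code that stores or compares the answer of identityHash sees different numbers.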
In reply to this post by Nicolas Cellier
On 05/10/2012 04:21 AM, Nicolas Cellier wrote:
> Sure, I already changed various Number>>hash and could as well change
> Point hash to follow recommendations from Andres valloud book hashing
> in smalltalk...
>
> Nicolas

DateAndTime>>#hash could be changed to:

    hash
        ^ (jdn hashMultiply bitXor: seconds + offset asSeconds) bitXor: nanos

which is 130x faster than what's currently in the image:

    hash
        ^ self asUTC ticks hash

The collision rate of the proposed hash function is 0.04% (4 per 10,000).

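
A collision rate like the one quoted could be estimated along the following lines. This is only a sketch and not necessarily how the figure above was obtained: the sample (10,000 timestamps one second apart) is an arbitrary assumption, and the snippet measures whichever DateAndTime>>hash is installed in the image, so the result depends on both.

```smalltalk
| start samples distinctHashes |
start := DateAndTime now.
"10,000 distinct timestamps, one second apart (an arbitrary sample)."
samples := (1 to: 10000) collect: [:i | start + (Duration seconds: i)].
"Distinct timestamps that share a hash value shrink this set."
distinctHashes := (samples collect: [:each | each hash]) asSet.
"Fraction of samples whose hash value was already taken by another sample."
(samples size - distinctHashes size) / samples size
```

Running the same snippet before and after installing a new #hash method gives a like-for-like comparison on that sample.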
On Thu, May 10, 2012 at 6:21 AM, Paul DeBruicker wrote:
On 05/10/2012 04:21 AM, Nicolas Cellier wrote:

Just doit. Make that change. If you don't have commit rights, commit it to inbox. please!

best, Eliot

In reply to this post by Bert Freudenberg
>> I should add, it's already happened. In 2009 Levente changed
>> Object>>#identityHash to answer the scaledIdentityHash.
>
> Not in Squeak. Our IdentityDictionary uses scaledIdentityHash nowadays, but identityHash itself is left alone, answering the primitive value directly.

I meant to say Object>>#hash, not #identityHash. So, before 12/1/2009:

    true hash "2950"

but after 12/1/2009:

    true hash "773324800"

So, any saved persistent EToys ReferenceStream object-model files with true involved in the calculation of #hash prior to 2009 will now be goofed up unless you remember to rehash all regular Dictionaries after loading them.

The properties of this bug are:

  - it is hidden; you had no idea it was there because no SUnit test can possibly catch it. It didn't show until production.
  - it is image-specific -- you load the file in an image from before Levente's change and everything seems fine. What's going on?
  - it is "intermittent" because there's a small possibility that, if the Dictionary were small, you might get lucky with a "hit" anyway when calculating the slot to start searching at
  - it could lead to a corrupt data model, because perhaps the app does something like #at:ifAbsentPut:, and maybe even on an otherwise-equivalent object, so you end up with TWO of the "same" object in the dictionary.

What a disaster! Now does it make sense?

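
The "intermittent" property can be made concrete with a little arithmetic. Assuming the usual open-addressing scheme, in which probing starts at hash \\ capacity + 1, and a hypothetical table capacity of 11 (both are assumptions for illustration, not values taken from any particular image), the two answers of true hash quoted above start the search in different slots:

```smalltalk
| capacity |
capacity := 11. "hypothetical table size"
"Starting slot for the old and the new value of 'true hash'."
{ 2950 \\ capacity + 1. 773324800 \\ capacity + 1 }
"printIt => #(3 7)"
```

An entry stored when true hash was 2950 would sit at or after slot 3, while a lookup using 773324800 starts at slot 7 and can reach an empty slot first, reporting the key as absent. With a capacity for which both values land on the same starting slot, the lookup happens to succeed, which is the "lucky hit" case.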