Following the recent UTC updates to Chronology-Core, the hash for DateAndTime is:
DateAndTime>>hash ^utcMicroseconds hash This implies the need for a couple of follow up actions. I should have done "HashedCollection rehashAll" somewhere in the latest updates. This would be necessary for any hashed collections keyed by DateAndTime. I also need to update two unit tests that document the hashing, and that now fail. Before I do anything about these, does anyone have any comments or concerns regarding how equality testing and hashing are being done for DateAndTime? Thanks, Dave |
I don't think that this has a practical impact, but I will mention it
in case I am overlooking something. The hash for a DateAndTime in a 64-bit image will be different from the hash for the same DateAndTime in a 32-bit image, because the utcMicroseconds value in a 64-bit Spur image is a SmallInteger, and in a 32-bit image it is a LargePositiveInteger. This might be a concern if there is some scenario in which two images of different word size are referring to the same hashed collection (possibly via Magma or Gemstone?). Dave On Sat, Jan 05, 2019 at 03:30:23PM -0500, David T. Lewis wrote: > Following the recent UTC updates to Chronology-Core, the hash for DateAndTime is: > > DateAndTime>>hash > ^utcMicroseconds hash > > This implies the need for a couple of follow up actions. I should have done > "HashedCollection rehashAll" somewhere in the latest updates. This would be > necessary for any hashed collections keyed by DateAndTime. I also need to > update two unit tests that document the hashing, and that now fail. > > Before I do anything about these, does anyone have any comments or concerns > regarding how equality testing and hashing are being done for DateAndTime? > > Thanks, > Dave > |
Hi David,
> On Jan 5, 2019, at 8:17 PM, David T. Lewis <[hidden email]> wrote: > > I don't think that this has a practical impact, but I will mention it > in case I am overlooking something. The hash for a DateAndTime in a > 64-bit image will be different from the hash for the same DateAndTime > in a 32-bit image, because the utcMicroseconds value in a 64-bit Spur > image is a SmallInteger, and in a 32-bit image it is a LargePositiveInteger. This was taken care of for this kind of scenario in Name: Kernel-eem.1198 Author: eem Time: 24 November 2018, 1:44:47.526422 pm UUID: 100137c4-2514-4b7f-9064-3dcdfe7d8cc9 Ancestors: Kernel-eem.1197 Redefine LargePositiveInteger hash for compatibility between 32-bit and 64-bit systems. > > This might be a concern if there is some scenario in which two images of > different word size are referring to the same hashed collection (possibly > via Magma or Gemstone?). > > Dave _,,,^..^,,,_ (phone) > >> On Sat, Jan 05, 2019 at 03:30:23PM -0500, David T. Lewis wrote: >> Following the recent UTC updates to Chronology-Core, the hash for DateAndTime is: >> >> DateAndTime>>hash >> ^utcMicroseconds hash >> >> This implies the need for a couple of follow up actions. I should have done >> "HashedCollection rehashAll" somewhere in the latest updates. This would be >> necessary for any hashed collections keyed by DateAndTime. I also need to >> update two unit tests that document the hashing, and that now fail. >> >> Before I do anything about these, does anyone have any comments or concerns >> regarding how equality testing and hashing are being done for DateAndTime? >> >> Thanks, >> Dave >> > |
On Sat, Jan 05, 2019 at 08:53:52PM -0800, Eliot Miranda wrote:
> Hi David, > > > On Jan 5, 2019, at 8:17 PM, David T. Lewis <[hidden email]> wrote: > > > > I don't think that this has a practical impact, but I will mention it > > in case I am overlooking something. The hash for a DateAndTime in a > > 64-bit image will be different from the hash for the same DateAndTime > > in a 32-bit image, because the utcMicroseconds value in a 64-bit Spur > > image is a SmallInteger, and in a 32-bit image it is a LargePositiveInteger. > > This was taken care of for this kind of scenario in > > Name: Kernel-eem.1198 > Author: eem > Time: 24 November 2018, 1:44:47.526422 pm > UUID: 100137c4-2514-4b7f-9064-3dcdfe7d8cc9 > Ancestors: Kernel-eem.1197 > > Redefine LargePositiveInteger hash for compatibility between 32-bit > and 64-bit systems. Ah, right, I forgot about that update. I was looking at an outdated 32-bit image. Thanks! Dave |
> On Sat, Jan 05, 2019 at 08:53:52PM -0800, Eliot Miranda wrote:
> > Hi David, > > > > > On Jan 5, 2019, at 8:17 PM, David T. Lewis <[hidden email]> wrote: > > > > > > I don't think that this has a practical impact, but I will mention it > > > in case I am overlooking something. The hash for a DateAndTime in a > > > 64-bit image will be different from the hash for the same DateAndTime > > > in a 32-bit image, because the utcMicroseconds value in a 64-bit Spur > > > image is a SmallInteger, and in a 32-bit image it is a LargePositiveInteger. > > > > This was taken care of for this kind of scenario in > > > > Name: Kernel-eem.1198 > > Author: eem > > Time: 24 November 2018, 1:44:47.526422 pm > > UUID: 100137c4-2514-4b7f-9064-3dcdfe7d8cc9 > > Ancestors: Kernel-eem.1197 > > > > Redefine LargePositiveInteger hash for compatibility between 32-bit > > and 64-bit systems. Shouldn't the condition should be ^self digitLength <= 4, instead of 8? See? 32-bit image 1073741824 hash = 230045764 64-bit image 1073741824 hash = 1073741824 I must be missing something. I can't believe we didn't notice this and went into trunk before asking this question... > Ah, right, I forgot about that update. I was looking at an outdated 32-bit image. > > Thanks! > Dave > > |
On Sun, Jan 06, 2019 at 05:53:23PM -0600, Chris Muller wrote:
> > On Sat, Jan 05, 2019 at 08:53:52PM -0800, Eliot Miranda wrote: > > > Hi David, > > > > > > > On Jan 5, 2019, at 8:17 PM, David T. Lewis <[hidden email]> wrote: > > > > > > > > I don't think that this has a practical impact, but I will mention it > > > > in case I am overlooking something. The hash for a DateAndTime in a > > > > 64-bit image will be different from the hash for the same DateAndTime > > > > in a 32-bit image, because the utcMicroseconds value in a 64-bit Spur > > > > image is a SmallInteger, and in a 32-bit image it is a LargePositiveInteger. > > > > > > This was taken care of for this kind of scenario in > > > > > > Name: Kernel-eem.1198 > > > Author: eem > > > Time: 24 November 2018, 1:44:47.526422 pm > > > UUID: 100137c4-2514-4b7f-9064-3dcdfe7d8cc9 > > > Ancestors: Kernel-eem.1197 > > > > > > Redefine LargePositiveInteger hash for compatibility between 32-bit > > > and 64-bit systems. > > Shouldn't the condition should be ^self digitLength <= 4, instead of 8? > > See? > > 32-bit image > 1073741824 hash = 230045764 > > 64-bit image > 1073741824 hash = 1073741824 > > I must be missing something. I can't believe we didn't notice this > and went into trunk before asking this question... > Eeek! This is crying out for a new unit test called IntegerTest>>testHash that documents the expected hash values over various boundary conditions, including maxVal 32/64, minVal 32/64, and digitLength 4 or 8. The test needs to pass on 32 and 64 bit images. Dave > > > > > Ah, right, I forgot about that update. I was looking at an outdated 32-bit image. > > > > Thanks! > > Dave > > > > > |
In reply to this post by Chris Muller-3
Hi Chris, On Sun, Jan 6, 2019 at 3:54 PM Chris Muller <[hidden email]> wrote: > On Sat, Jan 05, 2019 at 08:53:52PM -0800, Eliot Miranda wrote: Of c course not. It should be that all integers of 64-bits or less are their rowen hashes.
No. You must be using an un-updated image. Here's what I get: Smalltalk wordSize 4 (2 raisedTo: 63) hash = (2 raisedTo: 63) true (SmallInteger maxVal + 1) hash = (SmallInteger maxVal + 1) true Smalltalk wordSize 8 (2 raisedTo: 63) hash = (2 raisedTo: 63) true (SmallInteger maxVal + 1) hash = (SmallInteger maxVal + 1) true
Um, please. I /did/ ask the question, I /did/ test. I /do know/ that the idea is for integers of 64-bits or less be their own hashes. Your result clearly indicates that you're using some erroneous configuration.
_,,,^..^,,,_ best, Eliot |
>> > > Name: Kernel-eem.1198
>> > > Author: eem >> > > Time: 24 November 2018, 1:44:47.526422 pm >> > > UUID: 100137c4-2514-4b7f-9064-3dcdfe7d8cc9 >> > > Ancestors: Kernel-eem.1197 >> > > >> > > Redefine LargePositiveInteger hash for compatibility between 32-bit >> > > and 64-bit systems. >> >> Shouldn't the condition should be ^self digitLength <= 4, instead of 8? > > > Of c course not. It should be that all integers of 64-bits or less are their rowen hashes. >> >> >> See? >> >> 32-bit image >> 1073741824 hash = 230045764 >> >> 64-bit image >> 1073741824 hash = 1073741824 > > > No. You must be using an un-updated image. Here's what I get: > > Smalltalk wordSize 4 > (2 raisedTo: 63) hash = (2 raisedTo: 63) true > (SmallInteger maxVal + 1) hash = (SmallInteger maxVal + 1) true > > > Smalltalk wordSize 8 > (2 raisedTo: 63) hash = (2 raisedTo: 63) true > (SmallInteger maxVal + 1) hash = (SmallInteger maxVal + 1) true Whew! You're right, and I knew I wasn't running an updated image, I thought I could read and understand such little code while watching some NFL playoff game, but I misread it. :/ Sorry. >> I must be missing something. I can't believe we didn't notice this >> and went into trunk before asking this question... > > > Um, please. I /did/ ask the question, I /did/ test. I /do know/ that the idea is for integers of 64-bits or less be their own hashes. Your result clearly indicates that you're using some erroneous configuration. My question was directed at Dave and his question, not you (e.g., going in trunk and THEN asking about hash). But, I can see how you thought that by my not inlining properly, sorry. I put a TON of hours into testing UTCDateAndtime over the holidays. I had to. Best, Chris |
On Mon, Jan 07, 2019 at 01:36:10PM -0600, Chris Muller wrote:
> >> > > Name: Kernel-eem.1198 > >> > > Author: eem > >> > > Time: 24 November 2018, 1:44:47.526422 pm > >> > > UUID: 100137c4-2514-4b7f-9064-3dcdfe7d8cc9 > >> > > Ancestors: Kernel-eem.1197 > >> > > > >> > > Redefine LargePositiveInteger hash for compatibility between 32-bit > >> > > and 64-bit systems. > >> > >> Shouldn't the condition should be ^self digitLength <= 4, instead of 8? > > > > > > Of c course not. It should be that all integers of 64-bits or less are their rowen hashes. > >> > >> > >> See? > >> > >> 32-bit image > >> 1073741824 hash = 230045764 > >> > >> 64-bit image > >> 1073741824 hash = 1073741824 > > > > > > No. You must be using an un-updated image. Here's what I get: > > > > Smalltalk wordSize 4 > > (2 raisedTo: 63) hash = (2 raisedTo: 63) true > > (SmallInteger maxVal + 1) hash = (SmallInteger maxVal + 1) true > > > > > > Smalltalk wordSize 8 > > (2 raisedTo: 63) hash = (2 raisedTo: 63) true > > (SmallInteger maxVal + 1) hash = (SmallInteger maxVal + 1) true > > Whew! You're right, and I knew I wasn't running an updated image, I > thought I could read and understand such little code while watching > some NFL playoff game, but I misread it. :/ Sorry. > > >> I must be missing something. I can't believe we didn't notice this > >> and went into trunk before asking this question... > > > > > > Um, please. I /did/ ask the question, I /did/ test. I /do know/ that the idea is for integers of 64-bits or less be their own hashes. Your result clearly indicates that you're using some erroneous configuration. > > My question was directed at Dave and his question, not you (e.g., > going in trunk and THEN asking about hash). But, I can see how you > thought that by my not inlining properly, sorry. I put a TON of hours > into testing UTCDateAndtime over the holidays. I had to. Thanks Chris, the testing has been very helpful. Here is what I said a couple of weeks ago prior to moving the updates to trunk (http://lists.squeakfoundation.org/pipermail/squeak-dev/2018-December/200994.html): > Once moved to trunk, there will be eight failing tests in Chronology-Tests. > Each of these represents an issue that I think should be addressed in > discussion on the squeak-dev list. Of the eight failing tests that I mentioned, two were for the hash function, and that is what I was following up on in this thread. I wanted to make sure that I was not overlooking anything before I updated the tests to match the new hash function. As far as I can there is no problem with the hashing now, aside from the fact that I neglected to do a HashedCollection rehashAll (this still needs to be done). So the unit tests are updated to match the hash changes, and those two tests are now green. But there are still five test failures left to discuss, so more to follow :-) Dave |
Free forum by Nabble | Edit this page |