Hi Chris,
interesting!

On Wed, Nov 12, 2014 at 1:40 PM, Chris Muller <[hidden email]> wrote:
> I finally tracked down why the keys of the #knownEnvironments

Remember that Spur has a common header format for both 32-bit and 64-bit versions, so in both there is a 22-bit identityHash and hence the identityHashes of all objects in a 64-bit Spur image bootstrapped from a 32-bit Spur image will be _unchanged_. Convenient. So no need to worry. And it should be the case that a freshly bootstrapped 64-bit Spur image does not need to be rehashed to function properly.

But while we're on the subject, one thing we could do is arrange that Symbols have an identityHash based on their value. So when interning a string we'd compute its string hash and derive and assign the identityHash of the Symbol from the string hash. That would mean that when unpickling classes in e.g. Fuel we would not have to rehash method dictionaries, which would be very nice indeed.

best,
Eliot
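
Eliot's suggestion of value-derived Symbol hashes can be sketched roughly as follows. This is a minimal illustration in Python, not Squeak or VM code: `string_hash` is a hypothetical stand-in (FNV-1a) for whatever String hash the image really uses, and `method_dict_slots` is an invented helper that only models where probing would start in a method dictionary.

```python
HASH_BITS = 22                       # identityHash width in Spur, per the message above
HASH_MASK = (1 << HASH_BITS) - 1

def string_hash(text: str) -> int:
    """Hypothetical stand-in for the image's String hash (32-bit FNV-1a)."""
    h = 0x811C9DC5
    for byte in text.encode("utf-8"):
        h = ((h ^ byte) * 0x01000193) & 0xFFFFFFFF
    return h

def symbol_identity_hash(text: str) -> int:
    """Derive the Symbol's identityHash from its characters, truncated to 22 bits."""
    return string_hash(text) & HASH_MASK

def method_dict_slots(selectors, capacity=64):
    """Map each selector to the slot where probing would start (assumed layout)."""
    return {s: symbol_identity_hash(s) % capacity for s in selectors}

selectors = ["at:put:", "printOn:", "do:"]
# The image that pickles the class and the image that unpickles it may intern
# the selectors in a different order; value-derived hashes do not care.
writer_slots = method_dict_slots(selectors)
reader_slots = method_dict_slots(reversed(selectors))
same_layout = writer_slots == reader_slots
```

Because the hash depends only on the selector's characters, both sides compute the same slots, which is why no rehash would be needed after loading under this assumption.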
>> Anyway, something to be aware of -- anywhere we have true, false or
>> nil used in a hash calculation, now has a different hash in Spur vs.
>> Cog. Maybe we should think about separating those objects' logical
>> "value" hash from their identityHash in trunk..? That could be useful
>> when we move to 64-bit someday..
>
> Remember that Spur has a common header format for both 32-bit and 64-bit
> versions, so in both there is a 22-bit identityHash and hence the
> identityHashes of all objects in a 64-bit Spur image bootstrapped from a
> 32-bit Spur image will be _unchanged_. Convenient. So no need to worry.
> And it should be the case that a freshly bootstrapped 64-bit Spur image does
> not need to be rehashed to function properly.

Surprising that their identityHash needs to change for Spur but not to go to 64-bit..

Wait, I thought one of the benefits of 64-bit was to finally increase that small identityHash?
On 14.11.2014, at 17:56, Chris Muller <[hidden email]> wrote:
>>> Anyway, something to be aware of -- anywhere we have true, false or
>>> nil used in a hash calculation, now has a different hash in Spur vs.
>>> Cog. Maybe we should think about separating those objects' logical
>>> "value" hash from their identityHash in trunk..? That could be useful
>>> when we move to 64-bit someday..
>>
>> Remember that Spur has a common header format for both 32-bit and 64-bit
>> versions, so in both there is a 22-bit identityHash and hence the
>> identityHashes of all objects in a 64-bit Spur image bootstrapped from a
>> 32-bit Spur image will be _unchanged_. Convenient. So no need to worry.
>> And it should be the case that a freshly bootstrapped 64-bit Spur image does
>> not need to be rehashed to function properly.
>
> Surprising that their identityHash needs to change for Spur
> but not to go to 64-bit..

Spur already increases the number of bits to 22. It does not increase it again for 64 bits. 4 M different hashes should be enough, just like 4 M possible classes should be enough ;)

> Wait, I thought one of the benefits of 64-bit was to finally increase
> that small identityHash?

22 > 10

- Bert -
In reply to this post by Chris Muller-3
On Fri, 14 Nov 2014, Chris Muller wrote:
>>> Anyway, something to be aware of -- anywhere we have true, false or
>>> nil used in a hash calculation, now has a different hash in Spur vs.
>>> Cog. Maybe we should think about separating those objects' logical
>>> "value" hash from their identityHash in trunk..? That could be useful
>>> when we move to 64-bit someday..
>>
>> Remember that Spur has a common header format for both 32-bit and 64-bit
>> versions, so in both there is a 22-bit identityHash and hence the
>> identityHashes of all objects in a 64-bit Spur image bootstrapped from a
>> 32-bit Spur image will be _unchanged_. Convenient. So no need to worry.
>> And it should be the case that a freshly bootstrapped 64-bit Spur image does
>> not need to be rehashed to function properly.
>
> Surprising that their identityHash needs to change for Spur but not to
> go to 64-bit..
>
> Wait, I thought one of the benefits of 64-bit was to finally increase
> that small identityHash?

22 is already a lot more than the current 12. Current hashed collections should give excellent performance up to 4 million elements with a 22-bit identity hash. Insertion and lookup performance should be good up to 60 million elements, and removal performance should be good up to 20 million. And you'll still be able to use hashed collections optimized for large sizes[1][2] if you want to store more objects.

Levente

[1] http://leves.web.elte.hu/LargeIdentityDictionary/
[2] http://leves.web.elte.hu/LargeIdentityDictionary/LargeIdentityDictionary2.png

P.S.: With a primitive I suggested long ago, the blue line on the picture (#at:) could be as flat as the red (#includesKey:).
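
The arithmetic behind these figures can be checked with a small sketch (Python, for illustration only). It assumes identityHashes are spread uniformly, which is a simplification, and it evaluates both old widths mentioned in this thread (10 and 12 bits) next to Spur's 22. It does not reproduce the 60 million and 20 million thresholds, which depend on the collection implementation.

```python
def distinct_hashes(bits: int) -> int:
    """Number of different identityHash values an n-bit field can hold."""
    return 1 << bits

def mean_objects_per_hash(num_objects: int, bits: int) -> float:
    """Average number of objects sharing one hash value, assuming a uniform spread."""
    return num_objects / distinct_hashes(bits)

if __name__ == "__main__":
    for bits in (10, 12, 22):
        for population in (4_000_000, 20_000_000, 60_000_000):
            load = mean_objects_per_hash(population, bits)
            print(f"{bits:2d} bits, {population:>10,d} objects: "
                  f"{distinct_hashes(bits):>9,d} hashes, {load:10.2f} objects per hash")
```

With 22 bits there are 4,194,304 distinct values, so roughly 4 million objects still average about one object per hash value, which matches the "4 M" shorthand used above.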
In reply to this post by Bert Freudenberg
On Fri, Nov 14, 2014 at 9:20 AM, Bert Freudenberg <[hidden email]> wrote:
Right. I thought since identityHashes would be changing (class identityHashes must change for Spur's class table, and there are more than 2^10 objects in an image) I would assign new hashes to all objects in the image that needed them and start with 1, 2 & 3 as the hashes for the first objects, nil, false & true. Any system which relies on identityHashes not changing from V3 to Spur will be broken anyway, so why keep the hashes for those objects?

> but not to go to 64-bit..

Right (see both Bert's & Levente's responses). Spur lifts the number of identityHashes from 2^10 to 2^22. There's no room for more in a 64-bit system. Spur is designed to go beyond 32-bits, but it isn't designed for terabyte heaps. One step at a time ;-)

> - Bert -

best,
Eliot
On Fri, Nov 14, 2014 at 10:17 AM, Eliot Miranda <[hidden email]> wrote:

I should say no room for more hash bits in *this* 64-bit system. Here's the Spur object header:

headerForSlots: numSlots format: formatField classIndex: classIndex
    <api>
    "The header format in LSB is
     MSB: | 8: numSlots      | (on a byte boundary)
          | 2 bits           | (msb,lsb = {isMarked,?})
          | 22: identityHash | (on a word boundary)
          | 3 bits           | (msb <-> lsb = {isGrey,isPinned,isRemembered})
          | 5: format        | (on a byte boundary)
          | 2 bits           | (msb,lsb = {isImmutable,?})
          | 22: classIndex   | (on a word boundary) : LSB
     The remaining bits (7) are used for
          isImmutable  (bit 23)
          isRemembered (bit 29)
          isPinned     (bit 30)
          isGrey       (bit 31)
          isMarked     (bit 55)
     leaving 2 unused bits, each next to a 22-bit field, allowing those fields to be
     expanded to 23 bits.. The three bit field { isGrey, isPinned, isRemembered }
     is for bits that are never set in young objects. This allows the remembered
     table to be pruned when full by using these bits as a reference count of
     newSpace objects from the remembered table. Objects with a high count
     should be tenured to prune the remembered table."
    <returnTypeC: #usqLong>
    <inline: true>
    ^ ((self cCoerceSimple: numSlots to: #usqLong) << self numSlotsFullShift)
        + (formatField << self formatShift)
        + classIndex

I hope this makes sense...

best,
Eliot
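
The layout in that comment can be modelled with a few shifts and masks. The following is a Python sketch for illustration, not the VM's Slang: the shift constants are derived from the bit positions listed in the comment (isImmutable at 23, isRemembered at 29, and so on) rather than read from the VM sources, and `with_identity_hash` and `unpack` are hypothetical helpers added for the example.

```python
CLASS_INDEX_BITS = 22                    # bits 0..21
FORMAT_SHIFT, FORMAT_BITS = 24, 5        # bits 24..28
HASH_SHIFT, HASH_BITS = 32, 22           # bits 32..53
NUM_SLOTS_SHIFT, NUM_SLOTS_BITS = 56, 8  # bits 56..63
FLAG_BITS = {"isImmutable": 23, "isRemembered": 29, "isPinned": 30,
             "isGrey": 31, "isMarked": 55}

def _check(value: int, bits: int, name: str) -> int:
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{name} does not fit in {bits} bits")
    return value

def header_for_slots(num_slots: int, format_field: int, class_index: int) -> int:
    """Mirror of the quoted method: combine numSlots, format and classIndex."""
    return ((_check(num_slots, NUM_SLOTS_BITS, "numSlots") << NUM_SLOTS_SHIFT)
            + (_check(format_field, FORMAT_BITS, "format") << FORMAT_SHIFT)
            + _check(class_index, CLASS_INDEX_BITS, "classIndex"))

def with_identity_hash(header: int, identity_hash: int) -> int:
    """Hypothetical helper: store a 22-bit identityHash in bits 32..53."""
    mask = ((1 << HASH_BITS) - 1) << HASH_SHIFT
    return (header & ~mask) | (_check(identity_hash, HASH_BITS, "identityHash") << HASH_SHIFT)

def unpack(header: int) -> dict:
    """Hypothetical helper: split a 64-bit header back into its fields and flags."""
    fields = {
        "numSlots": (header >> NUM_SLOTS_SHIFT) & ((1 << NUM_SLOTS_BITS) - 1),
        "identityHash": (header >> HASH_SHIFT) & ((1 << HASH_BITS) - 1),
        "format": (header >> FORMAT_SHIFT) & ((1 << FORMAT_BITS) - 1),
        "classIndex": header & ((1 << CLASS_INDEX_BITS) - 1),
    }
    fields.update({name: bool((header >> bit) & 1) for name, bit in FLAG_BITS.items()})
    return fields

def used_bit_mask() -> int:
    """All bits claimed by a field or a flag; bits 22 and 54 stay free."""
    mask = ((1 << CLASS_INDEX_BITS) - 1)
    mask |= ((1 << FORMAT_BITS) - 1) << FORMAT_SHIFT
    mask |= ((1 << HASH_BITS) - 1) << HASH_SHIFT
    mask |= ((1 << NUM_SLOTS_BITS) - 1) << NUM_SLOTS_SHIFT
    for bit in FLAG_BITS.values():
        mask |= 1 << bit
    return mask
```

Under these assumptions 62 of the 64 bits are claimed, and the two free bits (22 and 54) sit directly above classIndex and identityHash, which is what would let either field grow to 23 bits.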
In reply to this post by Levente Uzonyi-2
>> Surprising that their identityHash needs to change for Spur but not to
>> go to 64-bit..
>>
>> Wait, I thought one of the benefits of 64-bit was to finally increase
>> that small identityHash?
>
> 22 is already a lot more than the current 12.

Ah, indeed, I forgot how utterly small the current identityHash is! I of course would like to go even bigger than 22. I've been using MaIdentityDictionary instances, developed by Igor, which provide linked lists at each of the 4096 slots so that collisions are less onerous. That same strategy under 22-bit will scale a LOT further.. Ohh, I should put a #isRunningSpur check in there..

> Current hashed collections
> should give excellent performance up to 4 million elements with a 22-bit
> identity hash. Insertion and lookup performance should be good up to 60
> million elements, and removal performance should be good up to 20 million.
> And you'll still be able to use hashed collections optimized for large
> sizes[1][2] if you want to store more objects.

I did try your LargeIdentityDictionary a few years ago but for some reason the Magma test suite couldn't pass with it, and since I had Igor's I didn't have quite enough urgency to debug why.

> Levente
>
> [1] http://leves.web.elte.hu/LargeIdentityDictionary/
> [2]
> http://leves.web.elte.hu/LargeIdentityDictionary/LargeIdentityDictionary2.png
>
> P.S.: With a primitive I suggested long ago, the blue line on the picture
> (#at:) could be as flat as the red (#includesKey:).
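
The "one chain per identityHash value" strategy described here might look roughly like the sketch below. It is written in Python for illustration and is not Igor's MaIdentityDictionary code: the class name, its methods, and the default hash function (derived from CPython's `id`) are all assumptions made for the example.

```python
class BucketedIdentityDictionary:
    """Identity-keyed dictionary with one chain per identity-hash value."""

    def __init__(self, hash_bits=22, identity_hash=None):
        self._mask = (1 << hash_bits) - 1
        # Stand-in for the VM's identityHash; any function returning an int works.
        self._identity_hash = identity_hash or (lambda obj: id(obj) >> 4)
        self._buckets = {}   # slot number -> list of [key, value] pairs, created lazily
        self._size = 0

    def _slot(self, key) -> int:
        return self._identity_hash(key) & self._mask

    def put(self, key, value):
        chain = self._buckets.setdefault(self._slot(key), [])
        for entry in chain:
            if entry[0] is key:          # identity comparison, not equality
                entry[1] = value
                return
        chain.append([key, value])
        self._size += 1

    def get(self, key, default=None):
        for candidate, value in self._buckets.get(self._slot(key), []):
            if candidate is key:
                return value
        return default

    def __len__(self):
        return self._size
```

With 12 hash bits there are at most 4096 chains, so chain length grows with every 4096 extra objects; with 22 bits the same structure has up to 4,194,304 chains before chains start to lengthen on average.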