Hi folks. I am trying to understand a little how the identity hash is assigned and how it works in the SqueakVM. I have a couple of questions:

1) There are 12 bits in the object header for the hash. I saw that the value of this hash comes from #newObjectHash. But this method answers a 16-bit hash:

newObjectHash
	"Answer a new 16-bit pseudo-random number for use as an identity hash."
	lastHash := 13849 + (27181 * lastHash) bitAnd: 65535.
	^lastHash

Since the object header only has 12 bits, every place this method is used has to trim the result, e.g.:

	header1 := (classFormat bitAnd: 16r1FF00) bitOr: (hash << HashBitsOffset bitAnd: HashBits).

or

	hash := self newObjectHash.
	header := (self longAt: newOop) bitAnd: 16r1FFFF. "use old ccIndex, format, size, and header-type fields"
	header := header bitOr: ((hash << 17) bitAnd: 16r1FFE0000).

So we always need to trim it to 12 bits using (hash << HashBitsOffset bitAnd: HashBits). Now, I wonder: why doesn't #newObjectHash answer a 12-bit hash directly? Is it because of the performance of the hash function?

2) It looks to me like the answer is no, but can the hash of an object change? If so, why and where?

3) If #basicIdentityHash is ONLY used for Sets and friends, why not make the hash assignment as late/lazy as possible? I mean, instead of creating and assigning a hash at object creation, what if we did it in #hashBitsOf: oop, where we could check whether the hash is empty and, if so, assign one? Would this make sense, or would it be worse?

Thanks

Mariano
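To make the trimming concrete, here is a minimal C sketch of the 16-bit LCG from #newObjectHash together with the 12-bit header trim, using the shift (17) and mask (16r1FFE0000) from the second snippet quoted above. This is an illustration, not actual VM code; the seed value is an assumption.

```c
/* Sketch of the 16-bit pseudo-random generator from #newObjectHash,
   plus the 12-bit trim applied when the hash is packed into the
   object header. Constants come from the snippets in the question;
   the initial seed is an assumption. */

static unsigned lastHash = 1; /* assumed seed */

static unsigned newObjectHash(void) {
    /* 13849 + (27181 * lastHash) bitAnd: 65535 -> 16-bit result */
    lastHash = (13849u + 27181u * lastHash) & 65535u;
    return lastHash;
}

static unsigned headerHashBits(unsigned hash) {
    /* (hash << 17) bitAnd: 16r1FFE0000: shift into place, then mask.
       Only the low 12 bits of the 16-bit hash survive in the header. */
    return (hash << 17) & 0x1FFE0000u;
}
```

Note that the mask is what actually discards the top 4 bits of the 16-bit value: `headerHashBits(h) >> 17` always equals `h & 0xFFF`.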
On Thu, Jan 6, 2011 at 6:23 AM, Mariano Martinez Peck <[hidden email]> wrote:
IMO lazy hash assignment is far better, and I will implement it in the new object representation. VW has always had lazy hash creation.
Note that in Cog the hash is derived from the allocation pointer, not from the pseudo-random hash generator, avoiding a read-modify-write cycle in object allocation.

best
Eliot
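The lazy scheme discussed here can be sketched as follows. This is a hypothetical illustration of the idea from question 3, not actual VM code: all names (HASH_SHIFT, HASH_MASK, headerOf, nextHash, the toy object table) are invented for the sketch, and it assumes hash value 0 is reserved to mean "not yet hashed".

```c
/* Sketch of lazy identity-hash assignment: leave the header hash
   field zero at allocation, and only assign a hash the first time
   hashBitsOf() is called. All names here are hypothetical. */

#define HASH_SHIFT 17
#define HASH_MASK  0x1FFE0000u   /* 12 hash bits in the header */

static unsigned headers[8];       /* toy object table: header words */
static unsigned headerOf(int oop)              { return headers[oop]; }
static void setHeaderOf(int oop, unsigned h)   { headers[oop] = h; }

static unsigned hashSeed = 1;     /* assumed seed */
static unsigned nextHash(void) {  /* any generator works; LCG here */
    hashSeed = (13849u + 27181u * hashSeed) & 0xFFFu;
    return hashSeed ? hashSeed : 1u; /* reserve 0 for "no hash yet" */
}

unsigned hashBitsOf(int oop) {
    unsigned bits = (headerOf(oop) & HASH_MASK) >> HASH_SHIFT;
    if (bits == 0) {              /* never hashed: assign lazily */
        bits = nextHash();
        setHeaderOf(oop, headerOf(oop) | (bits << HASH_SHIFT));
    }
    return bits;
}
```

The cost moves from every allocation to the first identity-hash request, so objects that never go into an identity collection never pay it. One trade-off: reserving 0 as "unassigned" shrinks the usable hash range by one value.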
On Thu, Jan 6, 2011 at 6:33 PM, Eliot Miranda <[hidden email]> wrote:
Excellent. Marcus told me about this :)
Yeah, I saw it:

newObjectHash
	"Derive the new object hash from the allocation pointer. This is less costly than using lastHash because it avoids the read-modify-write cycle to update lastHash. Since the size of eden is a power of two and larger than the hash range this provides a well-distributed and fairly random set of values."
	<inline: true>
	^freeStart >> BytesPerWord
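A C sketch of the same idea: the hash is just the current allocation pointer shifted right, so allocation never touches a lastHash variable. Note the Slang `freeStart >> BytesPerWord` shifts by BytesPerWord *bits* (4 on a 32-bit VM), and that is what the sketch mirrors. The allocate() driver and the eden base address are invented for illustration; the 12-bit trim would still happen when the hash is stored in the header.

```c
/* Sketch of Cog's allocation-pointer-derived hash from the quoted
   #newObjectHash. freeStart models the eden allocation pointer;
   its initial value here is arbitrary. */

#define BYTES_PER_WORD 4

static unsigned long freeStart = 0x100000; /* toy eden pointer */

static unsigned long newObjectHash(void) {
    /* freeStart >> BytesPerWord: shift by BytesPerWord bits,
       exactly as the Slang expression does. No read-modify-write. */
    return freeStart >> BYTES_PER_WORD;
}

static unsigned long allocate(unsigned long nBytes) {
    unsigned long newOop = freeStart;
    unsigned long hash = newObjectHash(); /* hash of the new object */
    (void)hash; /* ...would be masked to 12 bits, stored in header */
    freeStart += nBytes;                  /* bump allocation pointer */
    return newOop;
}
```

Because eden's size is a power of two larger than the hash range, successive allocation pointers walk through the hash range and give well-distributed values, as the method comment says.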
In reply to this post by Eliot Miranda-2
Well, 10 years back I built a VM that precomputed all the random numbers, since it's a *small* amount. You'll note the code already stores the last seed, so storing an index wasn't hard; hash assignment then became a lookup in a static array plus an index increment. There was a few percent performance gain if you did Object new 100 million times, but I'm afraid nothing you could see in real life. This was on hardware where executing instructions cost computable microseconds.

The other adventure was tuning the allocation ("new") logic: not a complete rewrite, but refining the if-tree logic. That also gave a few percent, but again nothing in real life that would justify screwing with the code. Still, that was long ago; maybe things have changed. I still have the change sets, of course. Let me attach the email from July 19, 2003 12:52:37 PM PDT, "[VM] HashBits, a lazy way".

On 2011-01-06, at 9:33 AM, Eliot Miranda wrote:
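The precomputed-table idea John describes can be sketched like this. It is a hypothetical reconstruction, not his actual change set: the table size, seed, and pre-trimming to 12 bits are all assumptions (a 12-bit hash has at most 4096 distinct values, so the whole sequence is indeed small).

```c
/* Sketch of precomputing the pseudo-random hash sequence once at
   startup, so each allocation is just table[index++]. TABLE_SIZE,
   the seed, and the 12-bit pre-trim are assumptions. */

#define TABLE_SIZE 4096   /* power of two, so wrap is a cheap mask */

static unsigned hashTable[TABLE_SIZE];
static unsigned hashIndex;

void initHashTable(void) {
    unsigned lastHash = 1; /* assumed seed */
    for (int i = 0; i < TABLE_SIZE; i++) {
        lastHash = (13849u + 27181u * lastHash) & 65535u;
        hashTable[i] = lastHash & 0xFFFu; /* pre-trimmed to 12 bits */
    }
    hashIndex = 0;
}

unsigned newObjectHash(void) {
    unsigned h = hashTable[hashIndex];
    hashIndex = (hashIndex + 1) & (TABLE_SIZE - 1); /* wrap around */
    return h;
}
```

The allocation-time work drops to one load and one masked increment, which matches John's observation: measurable when you do nothing but allocate, invisible in real workloads.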
--
===========================================================================
John M. McIntosh <[hidden email]>   Twitter: squeaker68882
Corporate Smalltalk Consulting Ltd.   http://www.smalltalkconsulting.com
===========================================================================

Things are a bit quiet this morning, so I wandered over to my FreeBSD Intel box, downloaded the unix 3.5-development tar file, and built an Intel Squeak VM. I used an interp.c I had (from June 30th) which was a bit newer, but followed the same pattern as the one Ian is shipping: gnuified, use of global structure foo, use of the AHC changeset to localize variables in the interpreter.

For the 3.6.1bx interp.c I used:
- BitOfGCTuning-JMM, which removes the +1 in the headerTypeBytes lookup, plus some localization for the mark/sweep
- InterpreterFixes-3Point6 from Andreas
- CodeGenEnh-3Point6 from Andreas
- MakePrimPointXInternal-JMM, which makes point x/y internal in the interp.c
- allocateNoFillHashTable-JMM, which has improved allocation/no-fill & hash table lookup

Not quite the same; however, running tinyBenchmarks I noted:

	june30thVM  40,738,383 bytecodes/sec    972,076 sends/sec
	3.6.1bxVM   40,429,564 bytecodes/sec    984,169 sends/sec

Similar performance, but the 3.6.1bxVM has better sends/second, due to CodeGenEnh I think. Certainly one is not extraordinarily faster than the other. However, when I looked at allocation rates for 1 asFloat, 2@3, and Array new: 5, I saw:

	june30thVM  37,654   40,198   31,037
	3.6.1bxVM   75,952   82,524   39,325

Ah, much nicer, and quite remarkable compared to the PowerPC benchmarks. These changes did translate into a savings in the macroBenchmarks: 371,654 versus 376,568. It's hard to say whether the better performance is due to allocation or to faster sends/second; however, cutting small-object allocation time by 50% should show up somewhere. Someone will need to do a better degree of testing, to better isolate just the hash-table / old-hash differences?
Versus my wholesale before/after testing. And of course try this under Windows.

PS The Intel box is a dual Pentium 350MHz, which ran 49.5% busy during the testing. gcc version 2.95.4 was used; gcc 3.3.x might do a better job?