jtuchel wrote on Friday, 6 November 2020 at 07:51:46 UTC+1:
Hi again,
I gave up waiting for the results Browser on my tracing results yesterday. The image grew to more than 4 GB in size and the Browser still hadn't opened after ~3 hrs. So I tried sampling at a rate of 5 ms instead.
The results are a bit surprising. If the main problem were inefficient hash algorithms or an inefficient IdentityDictionary, I would expect IdentityDictionary methods like #at:ifAbsentPut: at the top of the list of methods sorted by time spent in them. That is not the case in a sample of 12 runs. The methods most of the time is spent in are #isRegistered: and #registeredObjectsDo:, as well as Collection>>#includes:. IdentityDictionary and IdentitySet are on the list, but only with low percentages of the overall execution time.
The top of the list in my Workbench looks like this:
(50,4%) UnitOfWork>>#registeredObjectsDo:
(16,6%) Collection>>#includes:
(2,2%) IdentityDictionary>>#includesKey:
(2,0%) IdentityDictionary>>#at:ifAbsentPut:
(1,9%) IdentitySet>>#includes:
...
These methods do use #= extensively, of course, but I am not sure this is related to hashing. The main job of these methods is to iterate over #registeredObjects, which, if I understand correctly, also does not rely on hashing: all they do is walk through a long list of pointers, visiting each object. So I am almost sure this is not a hashing issue, but simply a case of too much work due to too many objects in the #registeredObjects collection.
@Alan: would you agree with this thesis?
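To illustrate the hypothesis: #includes: on a plain collection is a linear scan, so its cost grows with the number of elements, independent of any hashing. Here is a minimal, self-contained workspace snippet (not Glorp code; it assumes Time class>>#millisecondsToRun: is available, as in most Smalltalk dialects):

    | small large probe |
    "probe is never in either collection, so every #includes: has to walk the whole collection"
    probe := Object new.
    small := OrderedCollection new.
    1 to: 1000 do: [:i | small add: Object new].
    large := OrderedCollection new.
    1 to: 100000 do: [:i | large add: Object new].
    Transcript
        show: 'small: ', (Time millisecondsToRun: [100 timesRepeat: [small includes: probe]]) printString, ' ms'; cr;
        show: 'large: ', (Time millisecondsToRun: [100 timesRepeat: [large includes: probe]]) printString, ' ms'; cr

The second number should grow roughly in proportion to the collection size, which is exactly the kind of behaviour the profile above suggests for #registeredObjectsDo: and #isRegistered:.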
Just to see if I could improve things with a different hash function, I tried implementing #hash on the two classes that make up the majority of the registered objects as

    hash
        ^id hash "the send of #hash is probably not necessary, since id is an Integer anyway, but it might one day in a century or so become a LargeInteger..."
The performance wasn't affected at all; it neither improved nor got worse. There are a few questions about hashing in this context which may be important for the purpose of changing hashing for persistent objects, like
- since registeredObjects and undoMaps are IdentityDictionaries, I guess they don't rely on #hash at all, but on #basicHash instead. #basicHash is a VM primitive in VAST, so there is probably not much point in overriding it. I am most likely not more clever than what the VM guys do for hashing...
- If I wanted to implement another hashing algorithm, should the class be part of the hash? Most of our persistent objects have a sequence number as id, generated by the database per table with each sequence starting at 1. So if all persistent objects just return their id as their hash value, and if Glorp manages instances of different classes in Dictionaries, there are probably lots of collisions. So maybe the class's hash should be part of an object's hash? Something like
    self class hash * id hash
  maybe?
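For what it's worth, here is a minimal sketch of what such a class-aware #hash could look like, assuming the same integer id instance variable as above (#bitXor: is just one common way of mixing the two values, nothing Glorp prescribes):

    hash
        "Mix the class into the hash so that instances of different classes
         with the same database id don't all get the same hash value."
        ^self class hash bitXor: id hash

As the first point above already says, though, an IdentityDictionary would never send this #hash at all, so such an override can only matter where the objects end up as keys or elements of plain Dictionaries or Sets.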
But I am not so sure hashing is relevant in my case. My gut feeling is that I simply have a problem of too many registered objects in the session. This is most likely a consequence of the way we handle our transactions (see my other question about best practices in this group).
So the next thing I'll try is to change the transaction handling for this specific dialog first and see if this has an effect.
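Just to make concrete what I mean, a minimal sketch, assuming GlorpSession's #inUnitOfWorkDo: and #register: and a hypothetical anInvoice domain object (the real code will of course depend on how the dialog is structured):

    "Scope the unit of work to the actual change instead of keeping one
     long-running unit of work that keeps accumulating registered objects."
    session inUnitOfWorkDo:
        [session register: anInvoice.
        anInvoice amount: 42]

The point is simply to keep the set of registered objects small and short-lived, so that #registeredObjectsDo: and #isRegistered: have far less to walk through.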
Thanks for reading, and also lots of thanks for any
comments on my thinking out loud here...
Joachim