There are days when I think I still don't understand how to use Glorp right. We are looking into performance issues where the insert of 4 rows into 3 tables takes up to a minute instead of a few milliseconds. The time is spent in Glorp, not in the database. We added a bunch of logging statements to GlorpSession and UnitOfWork, and we already know most of the time is spent in Glorp before any SQL is issued to the database.

This extreme slowdown appears only for users who have loaded lots of objects from the DB. In our current case, there are a bit more than 10,000 entries in the undoMap of the currentUnitOfWork. 10,000 seems to be a magic number here: a few weeks ago, when less data was in play, the performance was okay for this user. Users with just a few hundred objects get very nice performance.

I want to find out whether this is a VAST-specific problem. Glorp uses an IdentityDictionary for the undoMap on both VAST and VW (and I guess in Pharo as well). This may or may not be a problem, I simply don't know. Is there anybody here on this list *not* on VA Smalltalk who has such big transactions (remember: not the number of updates, just the number of objects loaded from the DB!)?

I wonder how I can go on from here. Response times of one minute and more are not acceptable... Any ideas?
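The logging around the commit boils down to something like this (just a sketch with simplified names; it assumes the Glorp session answers its unit of work via #currentUnitOfWork and that the unit of work exposes its undoMap through an accessor):

    | uow millis |
    uow := session currentUnitOfWork.
    Transcript show: 'objects in undoMap before commit: ', uow undoMap size printString; cr.
    millis := Time millisecondsToRun: [session commitUnitOfWork].
    Transcript show: 'commitUnitOfWork took ', millis printString, ' ms'; cr.

For users with only a few hundred registered objects the commit time stays in the range of a few hundred milliseconds; for the user described above it goes up to a minute.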
My first thought when seeing a non-linear slowdown on something large is hashing performance. It also might not be in the undoMap, but in the generated row maps, or somewhere else. But if the time is spent in Glorp, then a profile should help. Or even the quick and dirty profiler of pausing execution. If something is spending 90% of its time doing something, then a random pause will probably stop in that something. On Thu, Nov 5, 2020 at 2:06 AM jtuchel <[hidden email]> wrote:
Hi Alan,
this is what is currently blocking my machine. I'm tracing a single commit of that slow kind, and the tracing alone took almost 4 hours ;-) I didn't expect it to take that long, otherwise I'd have started with sampling. I have a lot of time to answer my mails now ;-) Next time I'll sample first (which is what the manual says, btw, but who reads manuals ;-) ). The machine is now working on opening the Performance Workbench... I'll be back with more info and very likely questions on what to do about this.
I already added some logging to the server application. The very same commit is fast for users with only a few hundred objects in their object net. The commitUOW takes between 150 and 400 msec for those users.
I am glad you suspect something similar to what I do. Shows me I am learning. Maybe it is time to look up a chapter or two in my copy of Andres' Hashing book while I wait for the Workbench to open...
Joachim
On 05.11.20 at 14:40, Alan Knight wrote:
--
-----------------------------------------------------------------------
Objektfabrik Joachim Tuchel          [hidden email]
Fliederweg 1                         http://www.objektfabrik.de
D-71640 Ludwigsburg                  http://joachimtuchel.wordpress.com
Telefon: +49 7141 56 10 86 0         Fax: +49 7141 56 10 86 1
Hi again,

I gave up waiting for the results browser on my tracing results yesterday. The image grew to more than 4 GB in size and the browser still hadn't opened after ~3 hrs. So I tried sampling at a rate of 5 ms instead. The results are a bit surprising. If the main problem were inefficient hash algorithms or an inefficient IdentityDictionary, I would expect IdentityDictionary methods like at:ifAbsentPut: at the top of the sorted list of methods most time is spent in. That is not the case in a sample of 12 runs. The methods most time is spent in are isRegistered: and registeredObjectsDo:, as well as Collection>>#includes:. The top of the list in my Workbench looks like this:

(50.4%) UnitOfWork>>#registeredObjectsDo:
(16.6%) Collection>>#includes:
(2.2%) IdentityDictionary>>#includesKey:
(2.0%) IdentityDictionary>>#at:ifAbsentPut:
(1.9%) IdentitySet>>#includes:
...

These methods do use #= extensively, of course, but I am not so sure this is related to hashing, right? The main job of these methods is to iterate over #registeredObjects, which, iiuc, does not rely on hashing either, because all they do is walk through a long list of pointers, visiting each object. So I am almost sure this is not a hashing issue, but just a simple case of too much work due to too many objects in the #registeredObjects collection. @Alan: would you agree with this thesis?

Just to see if I can improve things with another hashing function, I tried implementing hash functions on the two classes that make up the majority of the registered objects as

    hash
        ^id hash
            "the send of #hash is probably not necessary, since id is an Integer anyway,
             but it might one day in a century or so be a LargeInteger..."

The performance wasn't affected at all; it neither improved nor got worse.

There are a few questions about hashing in this context which may be very important if we want to change hashing for persistent objects, for example: if I wanted to implement another hashing algorithm, should the class be part of the hash? Most of our persistent objects have a sequence number as id, each of them created by the database, starting with 1.
But I am not so sure hashing is relevant in my case. My gut feeling is that I simply have a problem of too many registered objects in the session. This is most likely a consequence of the way we handle our transaction (see my other question about best practices on this group). So the next thing I'll try is to change the transaction handling for this specific dialog first and see whether this has an effect.

Thanks for reading, and also lots of thanks for any comments on my thinking out loud here...

Joachim
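P.S.: To see which classes make up the bulk of the registered objects, they can be tallied along these lines (a sketch; it reuses the #registeredObjectsDo: enumeration that shows up in the profile above):

    | tally |
    tally := Bag new.
    session currentUnitOfWork registeredObjectsDo: [:each | tally add: each class name asString].
    tally asSet do: [:name |
        Transcript show: name, ' -> ', (tally occurrencesOf: name) printString; cr].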
A little correction:

> If I wanted to implement another hashing algorithm, should the class be part of the hash? Most of our persistent objects have a sequence number as id, each of them
> created by the database for each table individually, *starting* with 1.

jtuchel wrote on Friday, November 6, 2020 at 07:51:46 UTC+1:
I can already answer parts of my questions ;-) I can easily make this perform a whole lot slower by overriding #basicHash in my persistent classes. That way I can easily move IdentitySet>>#includes: and IdentityDictionary>>#at:ifAbsent: to the top of the list of worst performers ;-) So basicHash clearly has an influence on the overall performance. There is only this little remaining riddle: can I use this knowledge to achieve the opposite effect? ;-))

I am a bit sceptical. In my attempts to play with #basicHash, it always showed up in the list, but it had obviously never been there with the default implementation (because sampling won't measure VM primitives, I guess). So either I chose very slow hashing algorithms, or the hashing algorithms I chose were bad. I suspect a combination of both ;-)

I chose to include the class in order to make the hash of an instance of ClassA with id 17 distinguishable from an instance of ClassB with the same id (17). My observation is that the hashes of all classes in the image are in the range between 1 and 32767. So I went to Andres' book and found his chapter on VisualWorks' #hash implementation in Date. I thought a class' hash is somewhat similar to a Date's year, except that class hashes take 15 bits instead of 9. So I tried

    self class hash * 32768 + id hash

    (self class hash bitShift: 15) bitXor: id hash

and a few even slower and less clever ideas. But they all just made things worse. So, what do I do with this new knowledge? I don't know, tbh.

jtuchel wrote on Friday, November 6, 2020 at 08:21:12 UTC+1:
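A quick way to check whether such a candidate hash actually spreads the registered objects would be to count the distinct hash values (a sketch, again reusing #registeredObjectsDo:; far fewer distinct values than objects would indicate collisions):

    | hashes |
    hashes := Bag new.
    session currentUnitOfWork registeredObjectsDo: [:each | hashes add: each hash].
    Transcript
        show: hashes asSet size printString, ' distinct hash values for ',
            hashes size printString, ' registered objects';
        cr.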
Hi Joachim,
Is VAST available with a 64-bit VM? If so, it might be interesting to see what happens to the performance there. The problem with Smalltalk hashes in a 32-bit implementation with lots of memory (and objects) is that you get lots of duplicate hash values, and lookups can start to slow down because of sequential searching. Or are you already using a 64-bit version?

On 11/6/2020 3:10 AM, jtuchel wrote:
Tom,

we are on the 64-bit VM already. One thing I didn't always see during sampling and testing: we had quite a few GCs going on in the middle of transaction commits. So I am now looking into the effects of increasing old and new space. The first few experiments show at least some effect.

Joachim

Tom Robinson wrote on Friday, November 6, 2020 at 13:12:09 UTC+1:
This sounds to me like maybe the problem isn't hashing. I assume those top lines are a hierarchical view - that is, they include the calls underneath them in the total time. So, if the 50% of time iterating registered objects includes the stuff in the do: block, that seems reasonable. That's kind of what the whole operation does. But I'm a bit suspicious of the Collection>>#includes:. Does that mean it's falling back to a superclass linear search somewhere?

Also, that you're having a lot of GCs and the image is getting very large seems suspicious. RowMaps are heavy - we're iterating every registered object and every registered object's backup copy, and creating a Row object for each of them. But they're not *that* heavy with only 10K objects.

Not having so many objects registered is definitely the best long-term answer. But there's something funny going on. Maybe an allocation profile would be interesting.

On Fri, Nov 6, 2020 at 1:51 AM jtuchel <[hidden email]> wrote:
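To illustrate why a fallback to a linear #includes: matters at this size, here is a toy comparison (plain Smalltalk, not Glorp code; absolute timings will vary by dialect and machine): probing a 10,000-element collection linearly a thousand times versus the same probes against a hashed IdentitySet.

    | linear hashed probe |
    linear := OrderedCollection new.
    hashed := IdentitySet new.
    1 to: 10000 do: [:i | | o | o := Object new. linear add: o. hashed add: o].
    probe := linear last.    "worst case for the linear search"
    Transcript show: 'linear #includes: ',
        (Time millisecondsToRun: [1000 timesRepeat: [linear includes: probe]]) printString, ' ms'; cr.
    Transcript show: 'hashed #includes: ',
        (Time millisecondsToRun: [1000 timesRepeat: [hashed includes: probe]]) printString, ' ms'; cr.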
Alan,

I've spent quite a while now trying to understand this better. The problem is definitely not hashing. In the specific situation I am looking at, memory consumption grows by 70 MB in #createRowsForPartialWrites; the creation of RowMaps from registeredObjects is what makes this slow. This is due to the fact that #registeredObjects is a collection of more than 20,000 objects. So clearly I need to find ways to reduce the number of registeredObjects (as you say). There is, of course, not much use in managing 20,000 objects when all you do is insert 7 rows into 3 tables. The tricky question is: how to do that in a clever way that doesn't break the program (see the sketch below)...

Joachim

alan.knight wrote on Friday, November 6, 2020 at 14:57:23 UTC+1:
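One possible shape for that, sketched with the Glorp unit-of-work selectors #beginUnitOfWork, #register: and #commitUnitOfWork (newObjects is a hypothetical collection holding just the handful of objects to be inserted; whether the long-lived reads can really stay outside the unit of work is exactly the open transaction-handling question):

    "read outside any unit of work, then write inside a short-lived one"
    session beginUnitOfWork.
    newObjects do: [:each | session register: each].
    session commitUnitOfWork.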