Hi group,
One of our customers is about to migrate from RHEL 6 (Kernel 2.6) to RHEL 7 (Kernel 3.3). In their tests on RHEL 6, a long-running calculation in our application took about 500 - 800 seconds per calculation step, depending on the exact configuration; all in all it needs about an hour. In the same situation on RHEL 7, the application now takes between 8000 and 13000 seconds per calculation step (15 - 20 hours for the whole calculation). Even though we don't have the exact configuration, we can reproduce these numbers.

Our application does some database reading at the beginning of the calculation; after that it only calculates (with some networking). Right now I'm trying to profile parts of our application.

Has anybody experienced similar behaviour? Or is this a known issue? Are some library functions now emulated instead of executed directly? Any ideas? Some good advice on profiling would also be appreciated. The Benchmark Workshop is showing strange numbers (e.g. -1,600.9 % execution time).

Regards,
Hermann Ottens

You received this message because you are subscribed to the Google Groups "VA Smalltalk" group. To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email]. To view this discussion on the web visit https://groups.google.com/d/msgid/va-smalltalk/8010c5dc-5b8a-4df9-975d-a419b2091e29%40googlegroups.com.
Hi Hermann,

not sure if our case matches yours, but we had something similar: a couple of initial DB reads took only 20 seconds in a Workspace on Windows, but up to 50 minutes on our production machines on Linux. We had to look around for possible causes for quite a while. It was an interesting journey.

The end result was that the difference between the production and the development machines was not Linux vs. Windows or any OS version. And this has been happening in both VAST 8.6 and VAST 9.0. It also wasn't anything in our code or in different DB2 versions or whatever.

The difference was in Smalltalk process usage. Evaluating a few methods in a Transcript runs in the highly prioritized UI thread, whereas our production image is an SST-based web server, where "normal" processes during startup of the server machinery run at much lower priority. So after a lot of speculation and searching we simply tried to fork the code snippet that does that initial loading in a highly prioritized background process. And we were back down to 20 seconds. We couldn't believe it!

I only mention this because you said your code is also reading data from the DB. So it might be worth looking at this as well. I may be completely wrong, though.

HTH
Joachim

(In reply to Hermann Ottens's message of Wednesday, 16 October 2019, 11:04:14 UTC+2.)
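A minimal sketch of the experiment Joachim describes, for anyone who wants to try it: the initial load is forked at an elevated priority and timed. MyStartup and #loadInitialData are hypothetical placeholders, and the choice of userInterruptPriority is an assumption, since the post does not say which priority was used. forkAt:, the Processor priority accessors and Time millisecondsToRun: are standard Smalltalk process and timing protocol, but check them against your VAST version before relying on this.

```smalltalk
"Sketch only. MyStartup and #loadInitialData are hypothetical placeholders
 for whatever code performs the initial database reads."
| elapsed |
[elapsed := Time millisecondsToRun: [MyStartup loadInitialData].
 Transcript show: 'Initial load took ', elapsed printString, ' ms'; cr]
    forkAt: Processor userInterruptPriority.   "assumed priority, not from the post"
```

Running the same block with forkAt: Processor userBackgroundPriority and comparing the two timings would show whether process priority matters in a given image.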
Hi Joachim,
Thank you for your answer. We are logging every SQL statement (including execution times and number of rows returned/affected) and therefore know that it is not the database that is causing the increase in time. Also, we are comparing runtime behaviour with runtime behaviour and development behaviour with development behaviour on RHEL 6 and RHEL 7 respectively. Everything else is the same: the same application running on the same hardware, with the same VM configuration and the same DB client against the same database. Right now I think the only remaining difference is the OS. Besides, when I look at top, the Smalltalk process is at 100% CPU usage.

Nevertheless, thank you for your insight. It reminds me that it is often the things beyond any suspicion that cause the trouble.

Yours,
Hermann
Hi group,

I tried to narrow down the problem. Using the perf top tool, I examined the processes while they were running.

On RHEL 7 (VAST 8.5.2): the Smalltalk process seems to spend a lot of time in two places called L226 and L679 (about 24% and 18%). In third place is mergeRelocationFunction at about 5%. (These symbols can be looked up with nm; they exist in esvmnx40.so.)

On RHEL 6 (VAST 8.5.2): no single place seems to exceed 5%, and I see a lot of different function names in the topmost places of perf (VMprMBAGetInstVarWP, methodNativeExecuteNextBytecode, nativeExecuteNextBytecode, VMprPointersWithoutAt, sendSpecial, etc.). L226 and L679 also occur, but only with values below 1%. My impression is that here the VM is actually doing real work.

On RHEL 7 (VAST 9.1.0): again similar behaviour as with VAST 8.5.2, but the names are different: EsGGC 38%, markAndLinkAll 31%, mergeRelocationFunction 5%.

It seems that on RHEL 7 VAST is very busy doing garbage collection instead of "real" work. Is it possible that the OS libraries have changed so much that the way of allocating/deallocating memory is now extremely inefficient? (Just guessing.) My colleague suggested that it could possibly be the result of some kernel patch (such as the one for Meltdown). Any other suggestions?

Regards,
Hermann Ottens
Greetings Hermann,
I would suggest sending us a support case on this. We have some ideas to try based on your configuration.

Seth
Hi Seth,

I just sent a mail to Instantiations Support.

Yours,
Hermann
Received...Thanks for sending this in.
We are actively looking into it.

- Seth
Hi group,
Instantiations solved the case. For anyone who is interested: the following setting in our application's ini file was the only one that concerned VAST:

    [VM Options]
    newSpaceSize=2097152

Everything else was application-specific. This worked fine on RHEL 6 (and on SLES 11). On RHEL 7 (and on SLES 12) we need to add oldSpaceSize=... Why that is necessary is not yet known. When we leave it out, our application
Thanks to Seth for his advice.

Regards,
Hermann Ottens