I am trying to do a very heavy lifting task with Squeak, but it is
taking longer than I think it should. So of course, I used the profiler, but I just can't make any sense of it. I profiled a huge operation that took a couple of hours. To do the profile I used: MessageTally spyAllOn: [ ... my big operation... ] According to the profile, only 46% of the time was spent doing anything related to MY stuff, but with 41% of it being in "primitives"! 46.0% {9799162ms} MauiWorld class>>maui: 46.0% {9799162ms} MauiFamily>>newViewNamed:for: <--- snip ----> | 41.2% {8776640ms} MaFixedObjectBuffer(MaObjectBuffer)>>oid | 41.2% {8776640ms} primitives The snipped portion is all my code, zooming down a single tree branch straight to "primitives". Does this mean the Smalltalk is optimized to consuming only 5% of the total of everything? The other 53.9% was spent, here: -------------------------------- Process: (60) 10116: [] in Delay>>wait -------------------------------- 53.9% {11482061ms} EventSensor>>eventTickler 50.5% {10757775ms} Delay>>wait |50.5% {10757775ms} primitives 2.5% {532563ms} primitives What is this?? Is it possible there is something wrong with the profiler? From the section of the report about garbage-collection, it would appear so: --------------------------------------------- **Leaves** 50.6% {10779078ms} Delay>>wait 41.2% {8776640ms} MaByteObjectBuffer(MaObjectBuffer)>>oid 2.5% {532563ms} EventSensor>>eventTickler **Memory** old +105,090,944 bytes young +141,708 bytes used +105,232,652 bytes free +48,992 bytes **GCs** full 34 totalling 20,612,579ms (97.0% uptime), avg 606252.0ms incr 703603 totalling 18446744070288121880 milliSeconds ms (8.6594166983905e13% uptime), avg 2.6217546073977e13ms tenures 2,429 (avg 289 GCs/tenure) root table 0 overflows ---------------------------------------------- As interesting as Check out the time for incremental GC's.. Yes, that really is eighteen quintillion, four hundred forty-six quadrillion, seven hundred forty-four trillion, seventy billion, two hundred eighty-eight million, one hundred twenty-one thousand, eight hundred eighty milliseconds spent doing incremental GC's. That can't be right... :) Wait, there's more. After profiling my operation, the image is just sitting there with the Spy Results, but the vm was consuming nearly 100% of the CPU. I do not think this was the sound issue because I have that preference set to stop playing, and also that the utilization was not so pegged but dipping down to 95% a bit, I think. My question: What the hell is Squeak doing? Again, I try to call upon MessageTally for the answer: MessageTally spyAllOn: [ (Delay forSeconds: 30) wait ] What do I get? Nada! (pasted below). I have a deadline pressing and I'm really stuck on this! The profiler is an absolutely _essential_ tool. Is my code the problem or is there a problem with Squeak? Thank you for any insights! - Chris "Nada" - 29927 tallies, 30016 msec. **Tree** -------------------------------- Process: (60) 23304: [] in Delay>>wait -------------------------------- 5.4% {1621ms} EventSensor>>eventTickler 5.4% {1621ms} Delay>>wait 5.4% {1621ms} primitives -------------------------------- Process: (10) 20578: ProcessorScheduler class>>idleProcess -------------------------------- 94.4% {28335ms} ProcessorScheduler class>>idleProcess **Leaves** 94.4% {28335ms} ProcessorScheduler class>>idleProcess 5.5% {1651ms} Delay>>wait **Memory** old +0 bytes young -331,640 bytes used -331,640 bytes free +331,640 bytes **GCs** full 0 totalling 0ms (0.0% uptime) incr 8 totalling 10,843ms (36.0% uptime), avg 1355.0ms tenures 0 root table 0 overflows |
See the discussion here:
http://lists.squeakfoundation.org/pipermail/squeak-dev/2009-February/134405.html What you're seeing is issue #3 in the above post. Cheers, - Andreas On 4/22/2010 6:29 PM, Chris Muller wrote: > I am trying to do a very heavy lifting task with Squeak, but it is > taking longer than I think it should. So of course, I used the > profiler, but I just can't make any sense of it. I profiled a huge > operation that took a couple of hours. To do the profile I used: > > MessageTally spyAllOn: [ ... my big operation... ] > > According to the profile, only 46% of the time was spent doing > anything related to MY stuff, but with 41% of it being in > "primitives"! > > 46.0% {9799162ms} MauiWorld class>>maui: > 46.0% {9799162ms} MauiFamily>>newViewNamed:for: > <--- snip ----> > > | 41.2% {8776640ms} > MaFixedObjectBuffer(MaObjectBuffer)>>oid > > | 41.2% {8776640ms} primitives > > The snipped portion is all my code, zooming down a single tree branch > straight to "primitives". Does this mean the Smalltalk is optimized > to consuming only 5% of the total of everything? > > The other 53.9% was spent, here: > > -------------------------------- > Process: (60) 10116: [] in Delay>>wait > -------------------------------- > 53.9% {11482061ms} EventSensor>>eventTickler > 50.5% {10757775ms} Delay>>wait > |50.5% {10757775ms} primitives > 2.5% {532563ms} primitives > > What is this?? > > Is it possible there is something wrong with the profiler? From the > section of the report about garbage-collection, it would appear so: > > --------------------------------------------- > **Leaves** > 50.6% {10779078ms} Delay>>wait > 41.2% {8776640ms} MaByteObjectBuffer(MaObjectBuffer)>>oid > 2.5% {532563ms} EventSensor>>eventTickler > > **Memory** > old +105,090,944 bytes > young +141,708 bytes > used +105,232,652 bytes > free +48,992 bytes > > **GCs** > full 34 totalling 20,612,579ms (97.0% uptime), avg 606252.0ms > incr 703603 totalling 18446744070288121880 milliSeconds ms > (8.6594166983905e13% uptime), avg 2.6217546073977e13ms > tenures 2,429 (avg 289 GCs/tenure) > root table 0 overflows > ---------------------------------------------- > > As interesting as > > Check out the time for incremental GC's.. Yes, that really is > eighteen quintillion, four hundred forty-six quadrillion, seven > hundred forty-four trillion, seventy billion, two hundred eighty-eight > million, one hundred twenty-one thousand, eight hundred eighty > milliseconds spent doing incremental GC's. That can't be right... > :) > > Wait, there's more. > > After profiling my operation, the image is just sitting there with the > Spy Results, but the vm was consuming nearly 100% of the CPU. I do > not think this was the sound issue because I have that preference set > to stop playing, and also that the utilization was not so pegged but > dipping down to 95% a bit, I think. > > My question: What the hell is Squeak doing? > > Again, I try to call upon MessageTally for the answer: > > MessageTally spyAllOn: [ (Delay forSeconds: 30) wait ] > > What do I get? Nada! (pasted below). > > I have a deadline pressing and I'm really stuck on this! The profiler > is an absolutely _essential_ tool. Is my code the problem or is there > a problem with Squeak? > > Thank you for any insights! > - Chris > > "Nada" > - 29927 tallies, 30016 msec. > > **Tree** > -------------------------------- > Process: (60) 23304: [] in Delay>>wait > -------------------------------- > 5.4% {1621ms} EventSensor>>eventTickler > 5.4% {1621ms} Delay>>wait > 5.4% {1621ms} primitives > -------------------------------- > Process: (10) 20578: ProcessorScheduler class>>idleProcess > -------------------------------- > 94.4% {28335ms} ProcessorScheduler class>>idleProcess > > **Leaves** > 94.4% {28335ms} ProcessorScheduler class>>idleProcess > 5.5% {1651ms} Delay>>wait > > **Memory** > old +0 bytes > young -331,640 bytes > used -331,640 bytes > free +331,640 bytes > > **GCs** > full 0 totalling 0ms (0.0% uptime) > incr 8 totalling 10,843ms (36.0% uptime), avg 1355.0ms > tenures 0 > root table 0 overflows > > |
In reply to this post by Chris Muller-3
On 2010-04-22, at 6:29 PM, Chris Muller wrote: > Check out the time for incremental GC's.. Yes, that really is > eighteen quintillion, four hundred forty-six quadrillion, seven > hundred forty-four trillion, seventy billion, two hundred eighty-eight > million, one hundred twenty-one thousand, eight hundred eighty > milliseconds spent doing incremental GC's. That can't be right... > :) Well I've not looked but I guess someone did convert the tally logic to use the NEW microsecond clock so the 11,482,061ms is 11 seconds 482,061 micro seconds. Also I stick the 64 bit value for the microsecond clock into a large positive integer. positive64BitIntegerFor so the 18446744070288121880 is actually negative if you convert back to a 64bit integer and do some math to arrive at what the signed value is. -- =========================================================================== John M. McIntosh <[hidden email]> Twitter: squeaker68882 Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com =========================================================================== smime.p7s (3K) Download Attachment |
In reply to this post by Andreas.Raab
On 23 April 2010 04:42, Andreas Raab <[hidden email]> wrote:
> See the discussion here: > > http://lists.squeakfoundation.org/pipermail/squeak-dev/2009-February/134405.html > > What you're seeing is issue #3 in the above post. > Andreas, for me, it feels like this whole post deserves to be found in HelpSystem! :) > Cheers, > - Andreas > > On 4/22/2010 6:29 PM, Chris Muller wrote: >> >> I am trying to do a very heavy lifting task with Squeak, but it is >> taking longer than I think it should. So of course, I used the >> profiler, but I just can't make any sense of it. I profiled a huge >> operation that took a couple of hours. To do the profile I used: >> >> MessageTally spyAllOn: [ ... my big operation... ] >> >> According to the profile, only 46% of the time was spent doing >> anything related to MY stuff, but with 41% of it being in >> "primitives"! >> >> 46.0% {9799162ms} MauiWorld class>>maui: >> 46.0% {9799162ms} MauiFamily>>newViewNamed:for: >> <--- snip ----> >> >> | 41.2% {8776640ms} >> MaFixedObjectBuffer(MaObjectBuffer)>>oid >> >> | 41.2% {8776640ms} primitives >> >> The snipped portion is all my code, zooming down a single tree branch >> straight to "primitives". Does this mean the Smalltalk is optimized >> to consuming only 5% of the total of everything? >> >> The other 53.9% was spent, here: >> >> -------------------------------- >> Process: (60) 10116: [] in Delay>>wait >> -------------------------------- >> 53.9% {11482061ms} EventSensor>>eventTickler >> 50.5% {10757775ms} Delay>>wait >> |50.5% {10757775ms} primitives >> 2.5% {532563ms} primitives >> >> What is this?? >> >> Is it possible there is something wrong with the profiler? From the >> section of the report about garbage-collection, it would appear so: >> >> --------------------------------------------- >> **Leaves** >> 50.6% {10779078ms} Delay>>wait >> 41.2% {8776640ms} MaByteObjectBuffer(MaObjectBuffer)>>oid >> 2.5% {532563ms} EventSensor>>eventTickler >> >> **Memory** >> old +105,090,944 bytes >> young +141,708 bytes >> used +105,232,652 bytes >> free +48,992 bytes >> >> **GCs** >> full 34 totalling 20,612,579ms (97.0% uptime), >> avg 606252.0ms >> incr 703603 totalling 18446744070288121880 milliSeconds >> ms >> (8.6594166983905e13% uptime), avg 2.6217546073977e13ms >> tenures 2,429 (avg 289 GCs/tenure) >> root table 0 overflows >> ---------------------------------------------- >> >> As interesting as >> >> Check out the time for incremental GC's.. Yes, that really is >> eighteen quintillion, four hundred forty-six quadrillion, seven >> hundred forty-four trillion, seventy billion, two hundred eighty-eight >> million, one hundred twenty-one thousand, eight hundred eighty >> milliseconds spent doing incremental GC's. That can't be right... >> :) >> >> Wait, there's more. >> >> After profiling my operation, the image is just sitting there with the >> Spy Results, but the vm was consuming nearly 100% of the CPU. I do >> not think this was the sound issue because I have that preference set >> to stop playing, and also that the utilization was not so pegged but >> dipping down to 95% a bit, I think. >> >> My question: What the hell is Squeak doing? >> >> Again, I try to call upon MessageTally for the answer: >> >> MessageTally spyAllOn: [ (Delay forSeconds: 30) wait ] >> >> What do I get? Nada! (pasted below). >> >> I have a deadline pressing and I'm really stuck on this! The profiler >> is an absolutely _essential_ tool. Is my code the problem or is there >> a problem with Squeak? >> >> Thank you for any insights! >> - Chris >> >> "Nada" >> - 29927 tallies, 30016 msec. >> >> **Tree** >> -------------------------------- >> Process: (60) 23304: [] in Delay>>wait >> -------------------------------- >> 5.4% {1621ms} EventSensor>>eventTickler >> 5.4% {1621ms} Delay>>wait >> 5.4% {1621ms} primitives >> -------------------------------- >> Process: (10) 20578: ProcessorScheduler class>>idleProcess >> -------------------------------- >> 94.4% {28335ms} ProcessorScheduler class>>idleProcess >> **Leaves** >> 94.4% {28335ms} ProcessorScheduler class>>idleProcess >> 5.5% {1651ms} Delay>>wait >> >> **Memory** >> old +0 bytes >> young -331,640 bytes >> used -331,640 bytes >> free +331,640 bytes >> >> **GCs** >> full 0 totalling 0ms (0.0% uptime) >> incr 8 totalling 10,843ms (36.0% uptime), avg 1355.0ms >> tenures 0 >> root table 0 overflows >> >> > > > -- Best regards, Igor Stasenko AKA sig. |
In reply to this post by Andreas.Raab
> http://lists.squeakfoundation.org/pipermail/squeak-dev/2009-February/134405.html Any ideas on why the primitive clock-check was removed from the VM ? Brent |
On 4/23/2010 7:52 AM, Brent Pinkney wrote:
> >> http://lists.squeakfoundation.org/pipermail/squeak-dev/2009-February/134405.html > > Any ideas on why the primitive clock-check was removed from the VM ? Performance. It somewhat slowed primitive dispatch since a timer check had to be done for each primitive. Cheers, - Andreas |
Hi,
> Performance. It somewhat slowed primitive dispatch since a timer check had to be done for each primitive. I would really like to help Chris out on this. I have downloaded http://www.squeakvm.org/unix/release/Squeak-3.6-0.src.tar.gz and am comparing interp.c against that in 3.10-4. Could you be able to point me to the code that has been removed and any advise on whether I can patch 3.10-4 to build Chris a VM so that he can profile? Thanks Brent |
On 4/23/2010 11:59 AM, Brent Pinkney wrote:
> Hi, >> Performance. It somewhat slowed primitive dispatch since a timer check had to be done for each primitive. > > I would really like to help Chris out on this. > > I have downloaded http://www.squeakvm.org/unix/release/Squeak-3.6-0.src.tar.gz and am comparing interp.c against that in 3.10-4. > > Could you be able to point me to the code that has been removed and any advise on whether I can patch 3.10-4 to build Chris a VM > so that he can profile? Look at primitiveResponse. You need to make sure that ioMSecs() > nextWakeupTick ifTrue:[self checkForInterrupts]. Cheers, - Andreas |
Hi Andreas,
> Look at primitiveResponse. You need to make sure that > > ioMSecs() > nextWakeupTick ifTrue:[self checkForInterrupts]. Ok, I have compared the 3.6-0 interp.c against the 3.10-4 interp.c (both from www.squeakvm.org). I have made a patch to Squeak-3.10-4/platforms/unix/src/vm/interp.c to (hopefully) restore the 3.6 profiling behaviour. My changes are... #define DoProfilerTimerCheck 1 sqInt dispatchFunctionPointer(void * aFunctionPointer) { int timerPending; int startTime; /* this is required if the MessageTally is to function correctly */ if (DoProfilerTimerCheck) { timerPending = foo->nextWakeupTick != 0; if (timerPending) { startTime = ioLowResMSecs(); } } ((void (*)(void))aFunctionPointer)(); if (DoProfilerTimerCheck) { if (timerPending) { if ((ioLowResMSecs()) != startTime) { if (((ioMSecs()) & MillisecondClockMask) >= foo->nextWakeupTick) { if (foo->successFlag) { checkForInterrupts(); } else { foo->interruptCheckCounter = 0; } } } } } } I would be grateful for any insight as to whether this is the correct change - it seems to profile correctly on my Kubuntu 8.04 box. I am willing to pursue this change back into VMMaker once blessed. Brent |
(CC to vm-dev list)
I added Mantis 7515 to track this issue. http://bugs.squeak.org/view.php?id=7515 "Need way to add timer checks to #dispathFunctionPointer: without performance hit" Dave On Fri, Apr 23, 2010 at 11:11:29PM +0200, Brent Pinkney wrote: > Hi Andreas, > > > Look at primitiveResponse. You need to make sure that > > > > ioMSecs() > nextWakeupTick ifTrue:[self checkForInterrupts]. > > Ok, I have compared the 3.6-0 interp.c against the 3.10-4 interp.c (both from www.squeakvm.org). > > I have made a patch to Squeak-3.10-4/platforms/unix/src/vm/interp.c to (hopefully) restore the 3.6 profiling behaviour. > My changes are... > > > #define DoProfilerTimerCheck 1 > > sqInt dispatchFunctionPointer(void * aFunctionPointer) { > > int timerPending; > int startTime; > > /* this is required if the MessageTally is to function correctly */ > if (DoProfilerTimerCheck) { > timerPending = foo->nextWakeupTick != 0; > if (timerPending) { > startTime = ioLowResMSecs(); > } > } > ((void (*)(void))aFunctionPointer)(); > if (DoProfilerTimerCheck) { > if (timerPending) { > if ((ioLowResMSecs()) != startTime) { > if (((ioMSecs()) & MillisecondClockMask) >= foo->nextWakeupTick) { > if (foo->successFlag) { > checkForInterrupts(); > } else { > foo->interruptCheckCounter = 0; > } > } > } > } > } > } > > > I would be grateful for any insight as to whether this is the correct change - it seems to profile correctly on my Kubuntu 8.04 > box. > > I am willing to pursue this change back into VMMaker once blessed. > > Brent > > |
Free forum by Nabble | Edit this page |