I'm using Ian's "IDB Profiler" package to analyze and improve the
performance of one of my apps (45% reduction in run time already - thanks, Ian :-), and I'm seeing some information in the profile browser that I don't understand. In the "Selectors" tab on the profile browser I have a method that is shown as having received 35% of the tallys in the "Tally Active %" column, but only shows up with 2% of the runtime in the "Time Active %" column. This seems contradictory to me. Can someone help me understand what I'm looking at a bit better? Which set of numbers ("tally active" or "time active") should I trust more? Thanks. |
Bob,
> I'm using Ian's "IDB Profiler" package to analyze and improve the > performance of one of my apps (45% reduction in run time already - > thanks, Ian :-), and I'm seeing some information in the profile > browser that I don't understand. In the "Selectors" tab on the > profile browser I have a method that is shown as having received 35% > of the tallys in the "Tally Active %" column, but only shows up with > 2% of the runtime in the "Time Active %" column. This seems > contradictory to me. Can someone help me understand what I'm looking > at a bit better? Which set of numbers ("tally active" or "time > active") should I trust more? Thanks. You are correct in thinking that Tally and Time should normally be very similar, in fact there is probably a good argument for removing Tally as I don't think in adds much. There are a few reasons why you might be seeing a discrepancy though ... 1) A bug. 2) The profiler works by taking a sample every so many milliseconds. It can however only take the sample when the Dolphin VM allows it to. If the code being evaluated when a sample becomes due is in the VM, a DLL or within a heavily optimised Smalltalk method then the sample will have to wait until Dolphin permits the profiler to interrupt. This can cause some strange effects, especially if more than one sampling interval occurs between samples actually being taken. - Tally might only be increased by one but Time could add the actual time between samples. It can also create "attractors" where the first interruptible point after a section of uninterruptible code gets more hits than would be expected. 3) Callbacks. The profiler tree is created using a normal stack trace, the same one as you see in the debugger. An external callback can cause two Smalltalk methods to be incorrectly seen as parent/child nodes when they are actually only connected via an external callback. 4) Recursion. Very difficult to profile in a meaningful way. 5) A bug. There are probably other reasons but the above are the ones that stick in my mind, bearing in mind that it's a long time since I looked at the profiler in detail and I am now very hazy about how the statistical part of it works. Was it an image method that showed the problem, if so which one?. If not I'd appreciate it if you could mail me (or post here) a copy of the method, or indicate what it was doing. Regards Ian |
In reply to this post by Bob Jarvis
Bob,
> I'm using Ian's "IDB Profiler" package to analyze and improve the > performance of one of my apps (45% reduction in run time already - > thanks, Ian :-), and I'm seeing some information in the profile > browser that I don't understand. In the "Selectors" tab on the > profile browser I have a method that is shown as having received 35% > of the tallys in the "Tally Active %" column, but only shows up with > 2% of the runtime in the "Time Active %" column. This seems > contradictory to me. Can someone help me understand what I'm looking > at a bit better? Which set of numbers ("tally active" or "time > active") should I trust more? Thanks. You are correct in thinking that Tally and Time should normally be very similar, in fact there is probably a good argument for removing Tally as I don't think in adds much. There are a few reasons why you might be seeing a discrepancy though ... 1) A bug. 2) The profiler works by taking a sample every so many milliseconds. It can however only take the sample when the Dolphin VM allows it to. If the code being evaluated when a sample becomes due is in the VM, a DLL or within a heavily optimised Smalltalk method then the sample will have to wait until Dolphin permits the profiler to interrupt. This can cause some strange effects, especially if more than one sampling interval occurs between samples actually being taken. - Tally might only be increased by one but Time could add the actual time between samples. It can also create "attractors" where the first interruptible point after a section of uninterruptible code gets more hits than would be expected. 3) Callbacks. The profiler tree is created using a normal stack trace, the same one as you see in the debugger. An external callback can cause two Smalltalk methods to be incorrectly seen as parent/child nodes when they are actually only connected via an external callback. 4) Recursion. Very difficult to profile in a meaningful way. 5) A bug. There are probably other reasons but the above are the ones that stick in my mind, bearing in mind that it's a long time since I looked at the profiler in detail and I am now very hazy about how the statistical part of it works. Was it an image method that showed the problem, if so which one?. If not I'd appreciate it if you could mail me (or post here) a copy of the method, or indicate what it was doing. Regards Ian |
In reply to this post by Bob Jarvis
Bob,
> I'm using Ian's "IDB Profiler" package to analyze and improve the > performance of one of my apps (45% reduction in run time already - > thanks, Ian :-), and I'm seeing some information in the profile > browser that I don't understand. In the "Selectors" tab on the > profile browser I have a method that is shown as having received 35% > of the tallys in the "Tally Active %" column, but only shows up with > 2% of the runtime in the "Time Active %" column. This seems > contradictory to me. Can someone help me understand what I'm looking > at a bit better? Which set of numbers ("tally active" or "time > active") should I trust more? Thanks. I'll defer to Ian for the details of the profiler. If the 2% figure is accurate, then even radical improvements in per call time won't give you much benefit; reducing the number of calls probably won't help much either. I'm no expert in profiling, but, I generally look for the expensive parts and then find better algorithms for them. Usually that's been to identify and create/udpate only the invalid items/area/etc. rather than slash/burn/replace-all; that's especially true with graphics. If you loop over a wide range of possibilities and detect a small subset that's of interest, you might try to replace that with something that directly enumerates the items of interest. Look for wasted work in inner loops; can something move up a loop? Could things move further up if you reversed the inner and outter loops? I spend a fair amount of time trying to produce graphics more efficiently, where skipping needless GDI calls is a big winner (obviously); some people make the mistake of thinking "GDI will clip it" - try to skip the call if that's true. Also, keep in mind that finalization can hide expensive calls. If you leave lots of garbage for last rites, then your user suffers through it (assuming Win9x doesn't crash first as a result) later, and your profile won't reflect the work. If you **know** you're done with a finalizable object, you might do well to clean it up at the time - obvious caveats about not freeing something that's still in use apply. BTW, can/should you keep that object around for later use? Another thing (probably obvious but worth mentioning JIC) is that you might want to arrange your profiled block to repeat a particular task many times; that's helpful for things that happen quickly, since otherwise you end up profing the overhead more than the task. Is any of that even close to relevant? Have a good one, Bill --- Wilhelm K. Schwab, Ph.D. [hidden email] |
In reply to this post by Ian Bartholomew-7
"Ian Bartholomew" <[hidden email]> wrote in message news:<Y5or8.249071$[hidden email]>...
> Bob, > > > I'm using Ian's "IDB Profiler" package to analyze and improve the > > performance of one of my apps (45% reduction in run time already - > > thanks, Ian :-), and I'm seeing some information in the profile > > browser that I don't understand. In the "Selectors" tab on the > > profile browser I have a method that is shown as having received 35% > > of the tallys in the "Tally Active %" column, but only shows up with > > 2% of the runtime in the "Time Active %" column. This seems > > contradictory to me. Can someone help me understand what I'm looking > > at a bit better? Which set of numbers ("tally active" or "time > > active") should I trust more? Thanks. > > You are correct in thinking that Tally and Time should normally be very > similar, in fact there is probably a good argument for removing Tally as I > don't think in adds much. There are a few reasons why you might be seeing a > discrepancy though ... > > 1) A bug. > > 2) The profiler works by taking a sample every so many milliseconds. It can > however only take the sample when the Dolphin VM allows it to. If the code > being evaluated when a sample becomes due is in the VM, a DLL or within a > heavily optimised Smalltalk method then the sample will have to wait until > Dolphin permits the profiler to interrupt. This can cause some strange > effects, especially if more than one sampling interval occurs between > samples actually being taken. - Tally might only be increased by one but > Time could add the actual time between samples. It can also create > "attractors" where the first interruptible point after a section of > uninterruptible code gets more hits than would be expected. > > 3) Callbacks. The profiler tree is created using a normal stack trace, the > same one as you see in the debugger. An external callback can cause two > Smalltalk methods to be incorrectly seen as parent/child nodes when they are > actually only connected via an external callback. > > 4) Recursion. Very difficult to profile in a meaningful way. > > 5) A bug. > > There are probably other reasons but the above are the ones that stick in my > mind, bearing in mind that it's a long time since I looked at the profiler > in detail and I am now very hazy about how the statistical part of it works. > > Was it an image method that showed the problem, if so which one?. If not I'd > appreciate it if you could mail me (or post here) a copy of the method, or > indicate what it was doing. What's happening is that I'm using my Rio database package (part of my goodies stuff) to suck about six thousand rows of data out of a database. As part of the process I have to eliminate a bunch of rows I don't want, and it's faster to do this in the application than to have the database do it. What I'm seeing is that the Tally reports that 26% of the tallies of the query-and-eliminate-rows process is occupied looking at one field in each returned row. Time reports that only 1% of the runtime is spent on this method. This morning I had a bright idea. I eliminated the field lookup, replacing it with a literal assignment (which is OK for testing purposes). Actual measured runtime dropped from 11.8 seconds to 10.8 seconds (about an 8% time reduction). Restoring the code that performs the field lookup instead of the literal assignment increased the runtime again. My interpretation of this is: A) I've got to figure out how to speed up the field lookup code! B) The Tally display can be very useful. Please don't eliminate the Tally display. C) The Profiler is one really great tool! I'll point out that before I started using the Profiler on this code it took about 19 seconds to perform this same query against the same data. Now I'm just trying to grind out those last few percentage points. A thousand thanks for this, Ian! :-) Bob Jarvis |
Free forum by Nabble | Edit this page |