Smalltalk › Usenets › Dolphin Smalltalk

Help interpreting profiler information

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

5 messages Options

Bob Jarvis

Help interpreting profiler information

I'm using Ian's "IDB Profiler" package to analyze and improve the
performance of one of my apps (45% reduction in run time already -
thanks, Ian :-), and I'm seeing some information in the profile
browser that I don't understand. In the "Selectors" tab on the
profile browser I have a method that is shown as having received 35%
of the tallys in the "Tally Active %" column, but only shows up with
2% of the runtime in the "Time Active %" column. This seems
contradictory to me. Can someone help me understand what I'm looking
at a bit better? Which set of numbers ("tally active" or "time
active") should I trust more? Thanks.

Ian Bartholomew-7

Re: Help interpreting profiler information

Bob,

> I'm using Ian's "IDB Profiler" package to analyze and improve the
> performance of one of my apps (45% reduction in run time already -
> thanks, Ian :-), and I'm seeing some information in the profile
> browser that I don't understand. In the "Selectors" tab on the
> profile browser I have a method that is shown as having received 35%
> of the tallys in the "Tally Active %" column, but only shows up with
> 2% of the runtime in the "Time Active %" column. This seems
> contradictory to me. Can someone help me understand what I'm looking
> at a bit better? Which set of numbers ("tally active" or "time
> active") should I trust more? Thanks.

You are correct in thinking that Tally and Time should normally be very
similar, in fact there is probably a good argument for removing Tally as I
don't think in adds much. There are a few reasons why you might be seeing a
discrepancy though ...

1) A bug.

2) The profiler works by taking a sample every so many milliseconds. It can
however only take the sample when the Dolphin VM allows it to. If the code
being evaluated when a sample becomes due is in the VM, a DLL or within a
heavily optimised Smalltalk method then the sample will have to wait until
Dolphin permits the profiler to interrupt. This can cause some strange
effects, especially if more than one sampling interval occurs between
samples actually being taken. - Tally might only be increased by one but
Time could add the actual time between samples. It can also create
"attractors" where the first interruptible point after a section of
uninterruptible code gets more hits than would be expected.

3) Callbacks. The profiler tree is created using a normal stack trace, the
same one as you see in the debugger. An external callback can cause two
Smalltalk methods to be incorrectly seen as parent/child nodes when they are
actually only connected via an external callback.

4) Recursion. Very difficult to profile in a meaningful way.

5) A bug.

There are probably other reasons but the above are the ones that stick in my
mind, bearing in mind that it's a long time since I looked at the profiler
in detail and I am now very hazy about how the statistical part of it works.

Was it an image method that showed the problem, if so which one?. If not I'd
appreciate it if you could mail me (or post here) a copy of the method, or
indicate what it was doing.

Regards
Ian

Ian Bartholomew-7

Re: Help interpreting profiler information

In reply to this post by Bob Jarvis

Bob,

Bill Schwab-2

Re: Help interpreting profiler information

In reply to this post by Bob Jarvis

Bob,

I'll defer to Ian for the details of the profiler.

If the 2% figure is accurate, then even radical improvements in per call
time won't give you much benefit; reducing the number of calls probably
won't help much either. I'm no expert in profiling, but, I generally look
for the expensive parts and then find better algorithms for them. Usually
that's been to identify and create/udpate only the invalid items/area/etc.
rather than slash/burn/replace-all; that's especially true with graphics.
If you loop over a wide range of possibilities and detect a small subset
that's of interest, you might try to replace that with something that
directly enumerates the items of interest. Look for wasted work in inner
loops; can something move up a loop? Could things move further up if you
reversed the inner and outter loops? I spend a fair amount of time trying
to produce graphics more efficiently, where skipping needless GDI calls is a
big winner (obviously); some people make the mistake of thinking "GDI will
clip it" - try to skip the call if that's true.

Also, keep in mind that finalization can hide expensive calls. If you leave
lots of garbage for last rites, then your user suffers through it (assuming
Win9x doesn't crash first as a result) later, and your profile won't reflect
the work. If you **know** you're done with a finalizable object, you might
do well to clean it up at the time - obvious caveats about not freeing
something that's still in use apply. BTW, can/should you keep that object
around for later use?

Another thing (probably obvious but worth mentioning JIC) is that you might
want to arrange your profiled block to repeat a particular task many times;
that's helpful for things that happen quickly, since otherwise you end up
profing the overhead more than the task.

Is any of that even close to relevant?

Have a good one,

Bill

---
Wilhelm K. Schwab, Ph.D.
[hidden email]

Bob Jarvis

Re: Help interpreting profiler information

In reply to this post by Ian Bartholomew-7

"Ian Bartholomew" <[hidden email]> wrote in message news:<Y5or8.249071$[hidden email]>...

> Bob,
>
> > I'm using Ian's "IDB Profiler" package to analyze and improve the
> > performance of one of my apps (45% reduction in run time already -
> > thanks, Ian :-), and I'm seeing some information in the profile
> > browser that I don't understand. In the "Selectors" tab on the
> > profile browser I have a method that is shown as having received 35%
> > of the tallys in the "Tally Active %" column, but only shows up with
> > 2% of the runtime in the "Time Active %" column. This seems
> > contradictory to me. Can someone help me understand what I'm looking
> > at a bit better? Which set of numbers ("tally active" or "time
> > active") should I trust more? Thanks.
>
> You are correct in thinking that Tally and Time should normally be very
> similar, in fact there is probably a good argument for removing Tally as I
> don't think in adds much. There are a few reasons why you might be seeing a
> discrepancy though ...
>
> 1) A bug.
>
> 2) The profiler works by taking a sample every so many milliseconds. It can
> however only take the sample when the Dolphin VM allows it to. If the code
> being evaluated when a sample becomes due is in the VM, a DLL or within a
> heavily optimised Smalltalk method then the sample will have to wait until
> Dolphin permits the profiler to interrupt. This can cause some strange
> effects, especially if more than one sampling interval occurs between
> samples actually being taken. - Tally might only be increased by one but
> Time could add the actual time between samples. It can also create
> "attractors" where the first interruptible point after a section of
> uninterruptible code gets more hits than would be expected.
>
> 3) Callbacks. The profiler tree is created using a normal stack trace, the
> same one as you see in the debugger. An external callback can cause two
> Smalltalk methods to be incorrectly seen as parent/child nodes when they are
> actually only connected via an external callback.
>
> 4) Recursion. Very difficult to profile in a meaningful way.
>
> 5) A bug.
>
> There are probably other reasons but the above are the ones that stick in my
> mind, bearing in mind that it's a long time since I looked at the profiler
> in detail and I am now very hazy about how the statistical part of it works.
>
> Was it an image method that showed the problem, if so which one?. If not I'd
> appreciate it if you could mail me (or post here) a copy of the method, or
> indicate what it was doing.

What's happening is that I'm using my Rio database package (part of my
goodies stuff) to suck about six thousand rows of data out of a
database. As part of the process I have to eliminate a bunch of rows
I don't want, and it's faster to do this in the application than to
have the database do it. What I'm seeing is that the Tally reports
that 26% of the tallies of the query-and-eliminate-rows process is
occupied looking at one field in each returned row. Time reports that
only 1% of the runtime is spent on this method.

This morning I had a bright idea. I eliminated the field lookup,
replacing it with a literal assignment (which is OK for testing
purposes). Actual measured runtime dropped from 11.8 seconds to 10.8
seconds (about an 8% time reduction). Restoring the code that
performs the field lookup instead of the literal assignment increased
the runtime again.

My interpretation of this is:
A) I've got to figure out how to speed up the field lookup code!
B) The Tally display can be very useful. Please don't eliminate
the Tally display.
C) The Profiler is one really great tool!

I'll point out that before I started using the Profiler on this code
it took about 19 seconds to perform this same query against the same
data. Now I'm just trying to grind out those last few percentage
points. A thousand thanks for this, Ian! :-)

Bob Jarvis