External calls and UI responsiveness

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

External calls and UI responsiveness

Bill Dargel
I've run into some problems with the responsiveness of the user
interface in Dolphin when a background process is making external calls.
The issue came up with having a background process access  a database
(over a slow connection). The calls to #sqlExecute: average close to
100ms each. While the background process is executing the UI is
practically frozen. I investigated and was able to reproduce it easily
for testing purposes using a blocking call to the OS sleep function. I
added:

    !KernelLibrary methodsFor!
    blockedSleep: anInteger
        <stdcall: void Sleep dword>
        ^self invalidCall! !

I did the following tests...
First I worked out a repeat count such that a compute block would keep
busy for ~100ms on my machine (Dolphin 5.0.3, Win2K, 670MHz).

    countFor100ms := 312000.
    Time millisecondsToRun: [countFor100ms timesRepeat: [2 min: 3]].
    countFor10ms := countFor100ms // 10.
    countFor1ms := countFor100ms // 100.

One can fork the following background process (for 30 sec total) and not
suffer in the responsiveness of the UI:

    "(1) half computing, half in external calls - responsive UI"
    [150 timesRepeat:
        [countFor100ms timesRepeat: [2 min: 3].
        KernelLibrary default blockedSleep: 100]
    ] forkAt: Processor userBackgroundPriority.

Spending a larger fraction of the time blocked on the external call, and
things start to slow done a bit.

    "(2) ~90% in external calls - UI a bit sluggish"
    [150 timesRepeat:
        [countFor10ms timesRepeat: [2 min: 3].
        KernelLibrary default blockedSleep: 100]
    ] forkAt: Processor userBackgroundPriority.

Finally, at a level which seems to be on par with what my background
database access process was doing, the UI is getting pretty unusable.

    "(3) ~99% in external calls - UI takes about 3 seconds to respond to
a click"
    [150 timesRepeat:
        [countFor1ms timesRepeat: [2 min: 3].
        KernelLibrary default blockedSleep: 100]
    ] forkAt: Processor userBackgroundPriority.

An interesting work around that I discovered was to put a short sleep
into the background process. If I hack a "sleep: 1" into the caller of
#sqlExecute:, the UI stays the most responsive of all of these tests,
even though 99% of the time the entire image is blocked in the external
call.

    "(4) 99% in external calls and 1% sleeping - UI quite responsive"
   [150 timesRepeat:
        [Processor sleep: 1.
        KernelLibrary default blockedSleep: 100]
    ] forkAt: Processor userBackgroundPriority.

I'd be interested to hear how and when the VM signals the inputSemaphore
that indicates that windows events are pending, and where the Smalltalk
process switches may occur as it relates to external calls.

As I understand the worst case (3), the entire image/VM OS thread is
blocked on the external call 99% of the time. The other 1% is being used
to compute in the background (priority 3) process and to service the
main UI (priority 5) process. Apparently many of those cycles are
devoted to the background process rather than the higher priority UI
process that needs attention. In case (4) where the background process
sleeps rather than computes for the 1%, presumably the idle process may
run, and ends up giving the UI process all the time that it wants.

Is some change needed in the VM so that case (3) would act more
reasonably?

As a second avenue of attack, I tried hacking the #sqlExecute: method to
use 'overlap'. This, as one might expect, handles keeping the UI
responsive, but introduced another problem.

The background process in question often gets terminated (in response to
some UI event) and is then usually re-instantiated. When I've made the
#sqlExecute: call overlapped, there are problems in terminating the
background process. When the background process is first started it
shows in the ProcessMonitor as 'ready' in DBParameterizedStatement>>exec
(the caller of #sqlExecute:). After terminating it and starting it over
again, it shows up as 'waiting' in DBParameterizedStatement>>exec.
Trying to terminate another time shows it as 'waiting' in [] in
ProcessScheduler>>vmi:list:no:with:. And by the time I reach this point,
the only way out of Dolphin is to kill it from the Windows Task Manager.
I'm guessing that there's something in the ODBC interface that isn't
taking kindly to have the Smalltalk process terminated? Not sure though,
since terminating and restarting works just fine when using the
equivalent of (4) above and the blocking #sqlExecute: call.

I tried to explore the issue by using KernelLibrary>>sleep:. But ran
into this other strange anomaly. I'm not sure that it actually relates
to any of my real issues. But it made me curious.

    #(5 5 10 10 20 20 40 40 50 50 100 100) collect: [:startupDelay |
        | process state time |
        state := 1.
        process := BlockingCallMonitor new
            callBlock: [[KernelLibrary default sleep: 5000] ensure:
[state := 2]];
            priority: 4;
            monitorOnce.
        time := Time millisecondsToRun:
            [Processor sleep: startupDelay.
            process terminate.
            [state = 2] whileFalse: [Processor sleep: 10]].
        startupDelay -> time].

gives results like this on Win2K:
#(5 -> 236 5 -> 55 10 -> 60 10 -> 60 20 -> 60 20 -> 60 40 -> 61 40 -> 60
50 -> 5063 50 -> 5055 100 -> 5059 100 -> 5052)

For some reason it seems to take on the order of 50ms before the kernel
sleep really gets started. If you terminate it before it's gone that
long, it apparently stops before it gets started. If you wait longer
than that before terminating the process, then the termination doesn't
really occur until after the 5s sleep is over. My guess is that the
external call isn't interrupted in any way, and that there is some
overhead in coming up with the OS thread to use such that it doesn't
start immediately? Also the first time it takes 220ms - 240ms to
terminate (on Win2K) rather than 50ms or 60ms for some reason. Trying
the same thing on XP doesn't show that particular anomaly on the first
run, though sometimes the first one would not terminate until after the
5s sleep had expired.

In summary, any insights into how the Smalltalk UI process and external
calls (whether overlapped or not) for other processes interact, or
perhaps more to the point, any ideas or experience reports on accessing
a database in a background process would be appreciated.

thanks,
-Bill

-------------------------------------------
Bill Dargel            [hidden email]
Shoshana Technologies
100 West Joy Road, Ann Arbor, MI 48105  USA


Reply | Threaded
Open this post in threaded view
|

Re: External calls and UI responsiveness

Chris Uppal-3
Bill Dargel wrote:

> I'd be interested to hear how and when the VM signals the
> inputSemaphore that indicates that windows events are pending, and
> where the Smalltalk process switches may occur as it relates to
> external calls.

I can't answer your questions (though I'd be interested to know too).  However,
does sprinkling a few:

    Processor yield.

Solve/alleviate your problems ?

I tried it on my machine and it *seemed* to help with your example, but I can't
be sure because even your worst case didn't slow my machine (nearly identical
to yours: D5.0.3, 650Mz, W2K) down very much.  The UI became sluggish, but I
was only getting fractional second delays in response to mouse clicks, rather
than the multisecond pauses that you mentioned.

    -- chris


Reply | Threaded
Open this post in threaded view
|

Re: External calls and UI responsiveness

Blair McGlashan
In reply to this post by Bill Dargel
Bill

You wrote in message news:[hidden email]...
> I've run into some problems with the responsiveness of the user
> interface in Dolphin when a background process is making external calls.
> The issue came up with having a background process access  a database
> (over a slow connection). The calls to #sqlExecute: average close to
> 100ms each. While the background process is executing the UI is
> practically frozen.

Dolphin Processes are "green" threads, and all run on a single native thread
with task switching managed by the Dolphin VM in conjunction with some code.
This means that any time a normal (non-overlapped) external call is
executing, all Smalltalk processing is effectively suspended until the
external call completes.

>...I investigated and was able to reproduce it easily
> for testing purposes using a blocking call to the OS sleep function.
>.... [snip]...
> Spending a larger fraction of the time blocked on the external call, and
> things start to slow done a bit.
>
>     "(2) ~90% in external calls - UI a bit sluggish"
>     [150 timesRepeat:
>         [countFor10ms timesRepeat: [2 min: 3].
>         KernelLibrary default blockedSleep: 100]
>     ] forkAt: Processor userBackgroundPriority.
>
> Finally, at a level which seems to be on par with what my background
> database access process was doing, the UI is getting pretty unusable.
>
>     "(3) ~99% in external calls - UI takes about 3 seconds to respond to
> a click"
>     [150 timesRepeat:
>         [countFor1ms timesRepeat: [2 min: 3].
>         KernelLibrary default blockedSleep: 100]
>     ] forkAt: Processor userBackgroundPriority.
>

This is probably because of the way the VM has to poll for Windows messages
when running background processes. Unfortunately there is no way on Windows
(at least not that I have been able to discover) to receive an interrupt
when new Windows messages are available in the message queue. Furthermore it
isn't possible, AFAIK, to use another native thread to wait on the message
queue (e.g. with WaitMessage()), because message queues are per-thread. Even
the AttachThreadInput() API does not seem to help here. Therefore, when
running Smalltalk code, Dolphin has to regularly check the Windows message
queue for input. Normally this is done by main UI Process' message loop,
running in Smalltalk, and the VM does not get involved. If, however, the
main UI process has quiesced, but background processes are still running,
then the VM will periodically check the input queue status. Since this is
expensive, it only does this very infrequently, based on a "message send"
counter. Maintaining and checking the message send counter is very cheap,
and this can also be used for other purposes to interrupt the VM's normal
processing (e.g. debugging, Process pre-emption, etc).

The VM will check the message queue pretty frequently if running normal
Smalltalk code, at least in human terms. The interval between checks is
measured in the number of method activations (or certain unconditional jumps
used in loops) that can be executed in about 40 milliseconds, as I recall.
This was the amount of time it took a Pentium 90 to execute 24576 message
sends in some early version of Dolphin. A modern machine is considerably
faster of course, so the number of message sends is considerably higher,
983040 on the machine I'm using at the moment. The sampling interval is
calculated on startup (see InputState>>primaryStartup), with a minimum of
2^14 and a maximum of 2^20.

Anyway, because the mechanism used to measure time between samples is based
on real message sends, it is possible to create cases which make it
ineffective simply by performing insufficient Smalltalk processing
interleaved with lengthy external calls (but see below).

> An interesting work around that I discovered was to put a short sleep
> into the background process. If I hack a "sleep: 1" into the caller of
> #sqlExecute:, the UI stays the most responsive of all of these tests,
> even though 99% of the time the entire image is blocked in the external
> call.

The precise reason for that is not immediately obvious, but it could be due
to a number of things. Firstly you are sending more Smalltalk messages, so
the polling mechanism may come into play. Secondly the timing mechanism used
by Delays is based on the Windows multi-media high resolution timers. These
are managed by a thread in the VM, and when one fires it sets a Windows
Event that is being waited on by the MsgWaitForMultipleObjects call in
InputState>>idleNT/idle9X. Although in 5.0 the idle Process does not check
the input queue when it wakes up because of the timer Event being signalled
(which is not true of the forthcoming release where this mechanism has been
altered slightly to prevent short-cycling of the idle process when the UI
process is blocked on another Semaphore), this will also cause more
Smalltalk code to run in order to process the idle loop.

>
>     "(4) 99% in external calls and 1% sleeping - UI quite responsive"
>    [150 timesRepeat:
>         [Processor sleep: 1.
>         KernelLibrary default blockedSleep: 100]
>     ] forkAt: Processor userBackgroundPriority.
>
> I'd be interested to hear how and when the VM signals the inputSemaphore
> that indicates that windows events are pending, and where the Smalltalk
> process switches may occur as it relates to external calls.

The signalling mechanism is explained above. The input semaphore is also
signalled internally by the idle process, and perhaps in other dark places
in the image that I can't remember off the top of my head.

In Dolphin a process switch can theoretically occur between bytecodes,
however in practice interrupts are not recognised except on method
activations and unconditional jumps (as part of the same mechanism used to
poll the message queue). Any processor synchronisation primitive such as
signalling a Semaphore could also cause a process switch.

>
> As I understand the worst case (3), the entire image/VM OS thread is
> blocked on the external call 99% of the time. The other 1% is being used
> to compute in the background (priority 3) process and to service the
> main UI (priority 5) process. Apparently many of those cycles are
> devoted to the background process rather than the higher priority UI
> process that needs attention. In case (4) where the background process
> sleeps rather than computes for the 1%, presumably the idle process may
> run, and ends up giving the UI process all the time that it wants.
>
> Is some change needed in the VM so that case (3) would act more
> reasonably?

Its possible that the mechanism could be improved to use a measure of
elapsed time that is more accurate for these cases, but an overridding
objective of the design was to minimise the overhead of the polling. In fact
the overhead is so low, that on modern machines one can increase the
sampling rate substantially before it starts to have any measurable effect
on benchmarks (so perhaps the algorithm used to calculate the sampling
interval should not just scale up linearly from a P90).

In D5 there is no public interface for setting the sampling interval, but it
is held in the 'SamplingInterval' class variable of InputState, so you can
always evaluate an expression such as:

    InputState classPool at: 'SamplingInterval' put: 16384

Or you could subclass InputState and replace the default instance:

    SessionManager current inputState: (MyInputState new
initializeVeryCarefullyOrDolphinWillSurelyCrash; yourself)

Another technique would be to simply signal the user input semaphore
yourself on completion of your long running external call from the
background process. It doesn't do any harm to accumulate excess signals,
this will just cause the UI proc to cycle round its loop a few times before
it quiesces.

But overlapped calls are probably the real solution to this problem....

>
> As a second avenue of attack, I tried hacking the #sqlExecute: method to
> use 'overlap'. This, as one might expect, handles keeping the UI
> responsive, but introduced another problem.
>
> The background process in question often gets terminated (in response to
> some UI event) and is then usually re-instantiated. When I've made the
> #sqlExecute: call overlapped, there are problems in terminating the
> background process. ....

When you terminate a Process currently executing an overlapped call, the VM
raises a Win32 exception in the overlapped call thread in order to terminate
it cleanly. However this exception cannot be serviced until the thread
enters an "alertable" state, which may not be until the long running
operation has actually finished. The reason for doing this, rather than just
calling TerminateThread(), is that we want to give the thread every
opportunity to clean up properly when it is shut down. The MSDN
documentation for TerminateThread() contains rather dire warnings about its
misuse:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/bas
e/terminatethread.asp

Even raising an exception is potentially dangerous, in that the code may not
be exception aware. In many cases, however, the external call never enters
an "alertable" state (it has to call certain blocking system APIs to do
that), so the exception is not actually received until the external call
returns into the Dolphin code anyway.

>...When the background process is first started it
> shows in the ProcessMonitor as 'ready' in DBParameterizedStatement>>exec
> (the caller of #sqlExecute:). After terminating it and starting it over
> again, it shows up as 'waiting' in DBParameterizedStatement>>exec.

Its waiting for the exec call to finish.

> Trying to terminate another time shows it as 'waiting' in [] in
> ProcessScheduler>>vmi:list:no:with:. And by the time I reach this point,
> the only way out of Dolphin is to kill it from the Windows Task Manager.
> I'm guessing that there's something in the ODBC interface that isn't
> taking kindly to have the Smalltalk process terminated? Not sure though,
> since terminating and restarting works just fine when using the
> equivalent of (4) above and the blocking #sqlExecute: call.

Its possible that even the remote exception mechanism I mentioned above is
indeed not safe for the ODBC driver if it is not exception aware, in which
case the best method is to have a flag you can test in the Smalltalk code
that allows you to initiate shutdown when the external call returns
normally.

>
> I tried to explore the issue by using KernelLibrary>>sleep:. But ran
> into this other strange anomaly. I'm not sure that it actually relates
> to any of my real issues. But it made me curious.
>
>     #(5 5 10 10 20 20 40 40 50 50 100 100) collect: [:startupDelay |
>         | process state time |
>         state := 1.
>         process := BlockingCallMonitor new
>             callBlock: [[KernelLibrary default sleep: 5000] ensure:
> [state := 2]];
>             priority: 4;
>             monitorOnce.
>         time := Time millisecondsToRun:
>             [Processor sleep: startupDelay.
>             process terminate.
>             [state = 2] whileFalse: [Processor sleep: 10]].
>         startupDelay -> time].
>
> gives results like this on Win2K:
> #(5 -> 236 5 -> 55 10 -> 60 10 -> 60 20 -> 60 20 -> 60 40 -> 61 40 -> 60
> 50 -> 5063 50 -> 5055 100 -> 5059 100 -> 5052)
>
> For some reason it seems to take on the order of 50ms before the kernel
> sleep really gets started.

Here you're getting into the interaction between the overlapped call
mechanism and the host OS. There certainly isn't a fixed overhead of 50mS
for an overlapped call, which you can easily prove by timing the overlapped
and non-overlapped versions of some consistent operation. The precise
performance characteristics will vary by the version of Windows and the
scheduling algorithm, and there will be quite a lot of variability. There is
a certain amount of context switching involved in both initiating and
completing the overlapped call, and that carries some overhead and
introduces a slight delay.

>... If you terminate it before it's gone that
> long, it apparently stops before it gets started. If you wait longer
> than that before terminating the process, then the termination doesn't
> really occur until after the 5s sleep is over. My guess is that the
> external call isn't interrupted in any way, and that there is some
> overhead in coming up with the OS thread to use such that it doesn't
> start immediately? Also the first time it takes 220ms - 240ms to
> terminate (on Win2K) rather than 50ms or 60ms for some reason.

For your first call the VM has to create a thread, a relatively expensive
process. Subsequent to that it is able to reuse one (the same one in fact)
from the pool.

The overlapped call thread servicing the call is running at the same
priority as the main Dolphin thread, therefore it is quite possible
(depending on the OSs scheduler) that the main Dolphin thread will be able
to terminate the call before it has even been initiated. Sleep() is not an
alertable function, and so once it has started, it will run to completion.
If you use SleepEx(), then I'd wager the behaviour will be different.

>...Trying
> the same thing on XP doesn't show that particular anomaly on the first
> run, though sometimes the first one would not terminate until after the
> 5s sleep had expired.

Assuming that is on the same machine (and not a faster one), it seems that
MS have been able to make the creation and scheduling of threads more
efficient on XP. Certainly my own results on your test were more consistent:

#(5 -> 5012 5 -> 5014 10 -> 5016 10 -> 5019 20 -> 5015 20 -> 5017 40 -> 5014
40 -> 5014 50 -> 5014 50 -> 5016 100 -> 5020 100 -> 5011)

The actual overhead of an overlapped call is much smaller than this for a
normal operation (the above including the 10mS sleep, and the additional
context switching introduced by the timer thread and polling loop), for
example if I try:

(1 to: 10) collect: [:each | Time microsecondsToRun: [KernelLibrary default
sleep: each]]

Then the amount of time over the sleep period seems to average about 700-800
microseconds on my XP machine. Even this isn't a very reliable timeing,
since Sleep() is itself a process synchronisation primitive and so carries a
lot of OS overhead, and the sleep delay is only a minimum requested delay
which the system makes a best effort to stick to, not a hard real-time
delay. By adding an overlapped version of a Win32 call which I assume
doesn't contain any process synchronisation primitives (lstrlenW), I got the
following timings per call of lstrlenW on 'abc':

non-overlapped:     0.15uS
overlapped:            53.8uS

So the overhead inherent in the actual overlapped call mechanism seems to be
of the order of 50uS on this dual Athlon 1900+ system with WinXP. Most of
this is probably down to OS context switching time, the overlapped call
mechanism being pretty lightweight, so expect a great deal of variability in
the results depending on the machine configuration. Also a single processor
machine might well give very different figures.

>
> In summary, any insights into how the Smalltalk UI process and external
> calls (whether overlapped or not) for other processes interact, or
> perhaps more to the point, any ideas or experience reports on accessing
> a database in a background process would be appreciated.

Hope the above helps.

Another useful thing to know about is the two "queues" in the Processor
instance. 'overlappedCalls' holds all Processes running uncompleted
overlapped calls. 'pendingTerminations' holds those processes that have been
#terminated, but which are waiting for the overlapped call to either
complete, or reach an alert point that allows the exception to get through.

Regards

Blair


Reply | Threaded
Open this post in threaded view
|

Re: External calls and UI responsiveness

Bill Dargel
In reply to this post by Chris Uppal-3
Chris Uppal wrote:

> I can't answer your questions (though I'd be interested to know too).  However,
> does sprinkling a few:
>
>     Processor yield.
>
> Solve/alleviate your problems ?

I gave a quick try putting it in the test case with the
KernelLibrary>>blockedSleep: call, and didn't see any difference. As I understand
things now from Blair's excellent post, the main UI process is waiting on the
inputSemaphore not knowing there are Windows events in the message queue. The yield
in the background process doesn't do anything as there are no equal/higher priority
processes ready to run.

> I tried it on my machine and it *seemed* to help with your example, but I can't
> be sure because even your worst case didn't slow my machine (nearly identical
> to yours: D5.0.3, 650Mz, W2K) down very much.  The UI became sluggish, but I
> was only getting fractional second delays in response to mouse clicks, rather
> than the multisecond pauses that you mentioned.

Hmm. I wonder what the difference is. You sure that the "countFor1ms timesRepeat:
[2 min: 3]" is taking about 1ms on your machine? I suppose you could always
decrease the count or fraction of time that it spends sending Smalltalk messages
until things get real sluggish. You don't have something else running that ends up
signaling the inputSemaphore, do you?

regards,
-Bill

-------------------------------------------
Bill Dargel            [hidden email]
Shoshana Technologies
100 West Joy Road, Ann Arbor, MI 48105  USA


Reply | Threaded
Open this post in threaded view
|

Re: External calls and UI responsiveness

Bill Dargel
In reply to this post by Blair McGlashan
Blair McGlashan wrote:

Thanks for the excellent post, which explains a lot about the design and
implementation. It's really helped my understanding considerably. Hopefully
others found it useful as well. I've included some other follow-on comments
below.

> Bill Dargel wrote:
> > An interesting work around that I discovered was to put a short sleep
> > into the background process. If I hack a "sleep: 1" into the caller of
> > #sqlExecute:, the UI stays the most responsive of all of these tests,
> > even though 99% of the time the entire image is blocked in the external
> > call.
>
> The precise reason for that is not immediately obvious, but it could be due
> to a number of things.

In working it out based on everything I've learned here, it now seems pretty
clear why this works. It's not a good general solution, if there's more than one
background process, but in the situation I was investigating, there's just the
one background process. When it gets put to sleep, the idle process runs, checks
MsgWaitForMultipleObjects, finds there's an event waiting (mouse click, or
whatever) and signals the main UI process to handle it. Result - good
responsiveness (at least relatively speaking) as the UI can potentially run
every 100ms.[snip]

> > Is some change needed in the VM so that case (3) would act more
> > reasonably?
>
> Its possible that the mechanism could be improved to use a measure of
> elapsed time that is more accurate for these cases, but an overridding
> objective of the design was to minimise the overhead of the polling.

Just thinking off the top of my head, seems like a timed mechanism could be put
together that wouldn't be any more overhead than the current scheme. I envision
using some sort of Window's timer (perhaps on its own thread?) that could just
set a flag that would be checked by the VM loop (similar to how the current
message counting is checked) to determine when to poll for Windows messages.

[snip]

> Its possible that even the remote exception mechanism I mentioned above is
> indeed not safe for the ODBC driver if it is not exception aware, in which
> case the best method is to have a flag you can test in the Smalltalk code
> that allows you to initiate shutdown when the external call returns
> normally.

I'll probably end up exploring this option. Given that eventually I'll need the
possibility of more than one background process at a time, the trick of throwing
in a "sleep: 1", won't really take care of the problem. I'm using MyODBC v2.50,
and noticed that they just moved v3.51 to production status a few weeks ago. I
guess first thing to do is try the newer version to see if it's better equipped
to terminate cleanly in a overlapped call.

[snip]

> > I tried to explore the issue by using KernelLibrary>>sleep:. But ran
> > into this other strange anomaly.
>    [snip]
> For your first call the VM has to create a thread, a relatively expensive
> process. Subsequent to that it is able to reuse one (the same one in fact)
> from the pool.

Hmm. Is the pool of threads then emptied? If I run the loop in a workspace just
a second or two later, the first one in the loop still takes 100ms - 200ms
longer, including your example below.

[snip]

> (1 to: 10) collect: [:each | Time microsecondsToRun: [KernelLibrary default
> sleep: each]]

I displayed this in a workspace on my 670MHz Win2K machine a dozen times with
the following results:

1: #(210980 1892 2976 3980 4982 5987 6988 7992 8996 9999)
2: #(211596 1906 2978 3980 4981 5985 6988 7991 8995 9999)
3: #(212612 1914 2975 3981 4982 5984 6992 7990 8998 9998)
4: #(152 1947 2964 3977 4983 5986 6989 7994 8996 9997)
5: #(99881 1874 2977 3976 4983 5985 80273 7937 8994 9999)
6: #(108034 1927 2973 3979 4985 6028 80178 7994 8996 9998)
7: #(92201 1881 2973 3980 4986 5978 79240 7969 8996 9999)
8: #(212863 1900 2976 3977 4982 5988 6990 7990 8998 9999)
9: #(95331 1844 2975 3981 4982 5986 80274 7938 8995 9999)
10: #(212355 1900 2975 3980 4983 5986 6990 7993 9034 9960)
11: #(94085 1881 2976 3980 4983 5984 80275 7937 8997 9999)
12: #(96424 1920 2959 3980 4984 5985 6988 7993 8996 10001)

Notice that except for #4 (which was way under what it should be), the first
time through the loop took an additional 100ms-200ms. And contrary to your
reported 700-800microseconds over, my numbers are typically a few microseconds
under. Also occasionally (#5,6,7,9&11) the 7ms sleep took 80ms for some reason.

I repeated this on a WinXP 800MHz machine and got:

1: #(8808 1925 2984 3983 4989 34090 6988 7997 9006 9998)
2: #(967 45076 2978 3983 4992 5989 6995 15022 9003 10003)
3: #(701 1931 2981 35089 4987 5992 7000 7994 9003 65204)
4: #(599 1931 2983 4026 4948 57157 6994 7997 9003 10008)
5: #(332 27013 2978 3985 4989 5996 6995 15020 9007 10000)
6: #(8834 1937 2989 3979 4993 35084 6992 7999 9006 9996)
7: #(1093 58118 2971 3987 4987 6004 6987 15021 9003 10006)
8: #(148 1949 2964 45121 4985 5994 6992 7999 9006 65177)
9: #(795 1927 2982 3987 4989 57160 6986 8002 8998 10008)
10: #(672 27011 3003 3962 5031 5954 6994 15022 9000 10009)
11: #(4474 1940 2981 3987 4988 35085 6996 8042 8977 9974)
12: #(593 45073 2983 3980 4990 5993 6998 15019 9000 10004)

Here there isn't the first time through the loop phenomenon of an extra
100ms-200ms, but the first time through is definitely a lot more variable. Also
there seem to be a lot more instances where it takes 15ms, 27ms, 35ms, 45ms,
58ms or 65ms to complete, rather than 1ms-10ms.

At this point I'm inclined to just chalk it up to the variable (or flaky?)
behavior of a complex system, since I don't really see the direct relevance to
what I need to be doing.[snip]

> Hope the above helps.

I sure does! :-)
thanks again,
-Bill

-------------------------------------------
Bill Dargel            [hidden email]
Shoshana Technologies
100 West Joy Road, Ann Arbor, MI 48105  USA


Reply | Threaded
Open this post in threaded view
|

Re: External calls and UI responsiveness

Chris Uppal-3
In reply to this post by Bill Dargel
Bill Dargel wrote:

>> I tried it on my machine and it *seemed* to help with your example,
>> but I can't be sure because even your worst case didn't slow my
>> machine (nearly identical to yours: D5.0.3, 650Mz, W2K) down very
>> much.  The UI became sluggish, but I was only getting fractional
>> second delays in response to mouse clicks, rather than the
>> multisecond pauses that you mentioned.
>
> Hmm. I wonder what the difference is. You sure that the "countFor1ms
> timesRepeat: [2 min: 3]" is taking about 1ms on your machine? I
> suppose you could always decrease the count or fraction of time that
> it spends sending Smalltalk messages until things get real sluggish.
> You don't have something else running that ends up signaling the
> inputSemaphore, do you?

Turns out that there was (is) a demon Process that wakes up once per second,
does a bit of work, then does a #queueDeferredAction: to send the results to a
GUI monitor.  #queueDeferredAction: signals the semaphore, hence there was an
average of 0.5 seconds between waking up the Main Process.  If I kill the demon
I see the same unresponsiveness as you.

Easy to reproduce the effect:
    s := SessionManager current inputState inputSemaphore.
    kl := KernelLibrary default.
    p := [[(Delay forMilliseconds: 250) wait. s signal] repeat] fork.
    "p terminate"
With that loop running the dramatic pauses "vanish".  Not that I'm recommending
it as a solution, mind you.

    -- chris


Reply | Threaded
Open this post in threaded view
|

Re: External calls and UI responsiveness

Chris Uppal-3
In reply to this post by Blair McGlashan
Blair McGlashan wrote:

>> sleep: 1
> [...]
> The precise reason for that is not immediately obvious, but it could
> be due to a number of things.

Won't the idle loop spinning bug come into play here ?  If it goes into a dizzy
then it'll get through 40 milliseconds worth of message sends in about 40
1-millisecond chunks.

    -- chris


Reply | Threaded
Open this post in threaded view
|

Re: External calls and UI responsiveness

Blair McGlashan
"Chris Uppal" <[hidden email]> wrote in message
news:3e797e3a$0$59851$[hidden email]...
> Blair McGlashan wrote:
>
> >> sleep: 1
> > [...]
> > The precise reason for that is not immediately obvious, but it could
> > be due to a number of things.
>
> Won't the idle loop spinning bug come into play here ?  If it goes into a
dizzy
> then it'll get through 40 milliseconds worth of message sends in about 40
> 1-millisecond chunks.

That is a possibility, yes.

Regards

Blair


Reply | Threaded
Open this post in threaded view
|

Re: External calls and UI responsiveness

Blair McGlashan
In reply to this post by Bill Dargel
"Bill Dargel" <[hidden email]> wrote in message
news:[hidden email]...
> Blair McGlashan wrote:
> ...
> > Its possible that the mechanism could be improved to use a measure of
> > elapsed time that is more accurate for these cases, but an overridding
> > objective of the design was to minimise the overhead of the polling.
>
> Just thinking off the top of my head, seems like a timed mechanism could
be put
> together that wouldn't be any more overhead than the current scheme. I
envision
> using some sort of Window's timer (perhaps on its own thread?) that could
just
> set a flag that would be checked by the VM loop (similar to how the
current
> message counting is checked) to determine when to poll for Windows
messages.

That is a possibility. Thanks for the suggestion.

> [snip]
>
> > Its possible that even the remote exception mechanism I mentioned above
is
> > indeed not safe for the ODBC driver if it is not exception aware, in
which
> > case the best method is to have a flag you can test in the Smalltalk
code
> > that allows you to initiate shutdown when the external call returns
> > normally.
>
> I'll probably end up exploring this option. Given that eventually I'll
need the
> possibility of more than one background process at a time, the trick of
throwing
> in a "sleep: 1", won't really take care of the problem. I'm using MyODBC
v2.50,
> and noticed that they just moved v3.51 to production status a few weeks
ago. I
> guess first thing to do is try the newer version to see if it's better
equipped
> to terminate cleanly in a overlapped call.

If not, then don't forget the technique of having a "terminate" flag you
explicitly check, which is the standard technique for exiting native threads
where one isn't in full control of all the code that might be run.

> [snip]
>
> > > I tried to explore the issue by using KernelLibrary>>sleep:. But ran
> > > into this other strange anomaly.
> >    [snip]
> > For your first call the VM has to create a thread, a relatively
expensive
> > process. Subsequent to that it is able to reuse one (the same one in
fact)
> > from the pool.
>
> Hmm. Is the pool of threads then emptied? If I run the loop in a workspace
just
> a second or two later, the first one in the loop still takes 100ms - 200ms
> longer, including your example below.

The threads leave the pool and exit when they get "bored". They get bored
after waiting around for about 10-15 seconds. However, if they are
explicitly terminated (by Process>>terminate), then they won't return to the
pool.

[ snip sleep timings]
> ...
> At this point I'm inclined to just chalk it up to the variable (or flaky?)
> behavior of a complex system, since I don't really see the direct
relevance to
> what I need to be doing.[snip]

Since NT is not a real-time system, the amount of time spent in a Sleep is
not very precise, and sometimes it may want to do other work introducing
significant variability into the results. Its worth bearing this in mind
when taking any timings that potentially involve context switching between
threads, but a bit of a red herring in relation to your current issue.

Regards

Blair