I use SessionManager>>trace: or Transcript show: in our code (for debugging purposes) and also use background processes which gather information about various objects. Now we get deadlocks quite often. One simple way to reproduce this is to evaluate:

    p := [1 to: 100000000000000 do: [:i | Transcript show: i printString; cr]] forkAt: 3.

Then try to clear the Transcript => deadlock. (You can interrupt it with Ctrl-Break.)

I have already spent some hours trying to understand the problem, and it seems that there is an interlock of callbacks on the VM stack, like this (time runs from top to bottom):

1: background process: locks the transcript semaphore and calls sendMessage to update the Transcript.
2: background process: callback from sendMessage to update the Transcript.
3: main process: during the processing of the callback in the background process in step 2, the main process comes to life because the VM signalled the input semaphore and it has a higher priority than the background process. The main process calls DispatchMessage with the wmCommand message it received.
4: main process: callback from DispatchMessage to process the wmCommand message. The processing tries to lock the transcript semaphore in order to clear the Transcript. Since the background process locked the mutex in step 1 => MAIN PROCESS BLOCKED.
5: background process: the processing of the callback in step 2 is finished (actually it was only default window processing). The process tries to return but cannot, because callbacks must return in nested order (most recent first), i.e. the callback from the main process has to return first but cannot because it is blocked by the transcript semaphore => BACKGROUND PROCESS BLOCKED.
6: DEADLOCK

I don't know whether this is really the correct explanation, because the background process actually has two open callbacks and unfortunately it is not possible to see the callback stack in the VM. I tried to look at the cookie values they provide, but they seem to have no order (Blair?).

Also, in order to understand the problem I tried to understand how the VM processes callbacks and window messages, but I don't know whether I have the right picture. For example:

1) How did the main process in step 3 come to life? Is it really the VM which signals the input semaphore and does a hard process switch from the background process to the main process?
2) What are the entry points from the VM into the Smalltalk system? Is it only InputState>>wndProc:message:wParam:lParam:cookie: or are there other callbacks from the VM?
3) If the VM sets the input semaphore and watches the message queue for input, why is there also the idle process?

Regards

Carsten

"Carsten Haerle" <[hidden email]> wrote in message
news:burr1s$ojv$02$[hidden email]...
> I use SessionManager>>trace: or Transcript show: in our code (for debugging
> purposes) and also use background processes which gather information about
> various objects. Now we get deadlocks quite often. One simple method to
> reproduce this is to evaluate:
>
> p := [1 to: 100000000000000 do: [:i | Transcript show: i printString; cr] ]
> forkAt: 3.
>
> Then try to clear the Transcript => Deadlock. (You can interrupt it with
> Ctrl-Break).
>
> I have already spent some hours understanding the problem and it seems that
> there is an interlock of callbacks on the VM stack like (time from top to
> bottom):
>

Hmmm yes, I can see how that might happen. The Transcript's current implementation is infringing the rule of not performing UI updates from a background process. It should either not be sending Windows messages from inside its mutex, or it should be using an unsubclassed control so that the callbacks do not transit through Dolphin's message dispatching mechanism.

> 1: background process: lock the transcript semaphore and sendMessage to
> update the Transcript
> 2: background process: callback from sendMessage to update the transcript.
> 3: main process: During the processing of the callback in the background
> process in step 2 the main process comes to life because the VM signalled
> the input semaphore and it has a higher priority than the background
> process. The main process calls DispatchMessage with the wmCommand message
> it received.

Probably, but it might help to know that the VM only signals the input semaphore when the system appears to be performing CPU-intensive processing such that the image has not checked the Windows message queue for a pre-defined period. This period is defined in terms of the number of method activations, and is known as the "sampling interval". See InputState>>setSamplingInterval:. Sampling the Windows message queue is not without cost, so the interval has to be reasonably large.
However, if it is too large, then the system will become sluggish to respond to UI activity when a background computation is being performed. The sampling can be turned off altogether too - read the comment of InputState>>primSampleInterval: for further information.

> 4: main process: callback from DispatchMessage to process the wmCommand
> message. The processing tries to lock the transcript semaphore in order to
> clear the transcript. Since the background process has locked the mutex in
> step 1 => MAIN PROCESS BLOCKED.
> 5: background process: the processing of the callback in step 2 is finished
> (actually it was only default window processing). The process tries to
> return but cannot because callbacks must return in nested order (most
> recent first), i.e. the callback from the main process has to return first
> but cannot because it is blocked by the transcript semaphore => BACKGROUND
> PROCESS BLOCKED.
> 6. DEADLOCK
>
> I don't know exactly if this is really the correct explanation because
> actually the background process has two open callbacks and unfortunately it
> is not possible to see the callback stack in the VM. I tried to look at the
> cookie values they provide, but they seem to have no order (Blair?).

It sounds like a reasonable explanation to me. The callbacks must be exited in order, and the VM guarantees this by queueing up attempts by the image to return out of order. This is the solution that Dolphin adopts to running multiple green threads within a single native thread.

A less satisfactory alternative is to dequeue most incoming Windows messages and place them into a separate queue managed either in the VM, or in Smalltalk. From the point of view of the host OS, this means that those messages are effectively handled asynchronously, and so it is only appropriate for cases where this doesn't upset either Windows or the controls and where the return value is ignored. Other cases (i.e. those that must be handled synchronously) have to be handled by disabling process switching while the message is handled. This approach is unsatisfactory because:

1) There aren't many cases where the Windows messages can really be handled asynchronously, and even where this appears to be the case there is potential for timing-related issues to cause unexpected behaviour or instability in either Windows or the controls, and
2) Disabling process switching means that, in effect, it is not possible to debug through many Windows calls, including any inbound COM calls or other forms of synchronous callback.

>
> Also, in order to understand the problem I tried to understand how the VM
> processes callbacks and window messages, but I don't know whether I have the
> right picture. For example:
>
> 1) How did the main process in step 3 come to life? Is it really the VM
> which signals the input semaphore and does a hard process switch from the
> background process to the main process?

Probably (as a result of the sampling mechanism I described above), although more normally it is signalled from the idle process. There are also various other places it can be signalled from, which you can easily find by browsing from InputState.

> 2) What are the entry points from the VM into the Smalltalk system? Is it
> only InputState>>wndProc:message:wParam:lParam:cookie: or are there other
> callbacks from the VM?

No, there are a few others. Browse the 'vm entry points' category to see them all.

> 3) If the VM sets the input semaphore and watches the message queue for
> input, why is there also the idle process?

Normally it is the idle process which detects new input - it is responsible for quiescing the system in a Windows-friendly manner (by calling MsgWaitForMultipleObjects). This means the system is, by preference, event driven and sleeps except when processing input from the Windows message queue.
However, a mechanism to cheaply poll the queue is also needed for the case where background processing is consuming CPU time, and this is the sampling mechanism I described above. Unfortunately it is not possible for one Win32 thread to wait for input on another thread's message queue; if it were, this would be a better way for the VM to detect available input.

Regards

Blair
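
One possible way to respect the rule mentioned in the reply (no UI updates from a background process) is to hand each trace line to the main UI process instead of calling Transcript show: from the background process. The workspace snippet below is only a sketch and has not been tested against a real image. It assumes that InputState>>queueDeferredAction: is available and that deferred actions are evaluated by the main UI process; the block named trace is a hypothetical helper introduced for illustration, and the loop bound is reduced so the deferred queue is not flooded.

```smalltalk
"Sketch only. Assumes InputState>>queueDeferredAction: exists in your image and
 that the queued block is later evaluated in the main UI process."
| trace p |
trace := [:text |
	"Hypothetical helper: the calling process never touches the Transcript
	 (or its mutex); it only queues the update for the UI process."
	SessionManager inputState queueDeferredAction: [Transcript show: text; cr]].

"Bounded variant of the reproduction loop from the original post."
p := [1 to: 1000 do: [:i | trace value: i printString]] forkAt: 3.
```

If the assumption holds, the background process no longer holds the Transcript mutex while a callback is open, so clearing the Transcript from the main process should not block on it. The cost is that output appears asynchronously, some time after the trace call returns.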