I want to create a process doing some work and call #changed on a Morph. I want to start/suspend/resume or stop this process. But sometimes, suspending the process locks the UI-Process, and I don't know why. Did I miss something or do I have to care when to call suspend? Wrapping the "morph changed" call in UIManager default defer:[ morph changed]. Does not change anything. Here is an example to reproduce it. Create the process, call resume, call supsend. It works, most of the time, but sometimes, calling suspend locks the ui. p:=[[true] whileTrue:[ Transcript crShow: (DateAndTime now asString). 30 milliSeconds asDelay wait]] newProcess. p resume. p suspend. Nicolai |
Hi Nicolai,
On Tue, Jul 8, 2014 at 7:19 AM, Nicolai Hess <[hidden email]> wrote:
If you simply suspend this process at random form a user-priority process you'll never be able to damage the Delay machinery you're using, but chances are you'll suspend the process inside the critical section that Transcript uses to make itself thread-safe, and that'll lock up the Transcript.
ThreadSafeTranscript>>nextPutAll: value accessSemaphore critical: [stream nextPutAll: value].
^value So instead you need to use a semaphore. e.g. | p s wait | s := Semaphore new. p:=[[true] whileTrue:[wait ifTrue: [s wait]. Transcript crShow: (DateAndTime now asString). 30 milliSeconds asDelay wait]] newProcess.
wait := true. 30 milliSeconds asDelay wait. wait := false. s signal etc... HTH
best, Eliot
|
2014-07-09 2:07 GMT+02:00 Eliot Miranda <[hidden email]>:
Thank you Eliot yes I guessed it locks up the critical section, but I hoped with would not happen if I the use UIManager defer call.
Is this a common pattern I can find in pharos classes. Or I need some help understanding this. The semaphore wait/signal is used instead of process resume/suspend? What I want is a process doing repeatly some computation, calls or triggers an update on a morph, and I want to suspend and resume this process. I would stop this discussion if someone tells me, "No your are doing it wrong, go this way ..", BUT what strikes me: in this example, that reproduces my problem more closely: |p m s running| running:=false. m:=Morph new color:Color red. s:= StringMorph new. m addMorphBack:s. p:=[[true]whileTrue:[20 milliSeconds asDelay wait. s contents:(DateAndTime now asString). m changed]] newProcess. m on:#mouseUp send:#value to:[ running ifTrue:[p suspend. m color:Color red.] ifFalse:[p resume.m color:Color green.]. running := running not]. m extent:(300@50). m openInWorld clicking on the morph will stop or resume the process, if it locks up I can still press alt+dot -> - a Debugger opens but the UI is still not responsive. I can click with the mouse on the debuggers close icon. - nothing happens, as the UI is still blocked. - pressing alt+Dot again, the mouse click on the close icon is processed and the first debugger window closes - maybe other debuggers open. Repeating this steps, at some time the system is *fully* responsive again! And miraculously, it works after that without further blockages.
What's happening here? Nicolai
|
I am still struggling with it. Any ideas?2014-07-09 11:19 GMT+02:00 Nicolai Hess <[hidden email]>:
|
First I thought I'd try moving the update of StringMorph outside the worker-process using a Morph's #step method as follows... Morph subclass: #BackgroundWorkDisplayMorph instanceVariableNames: 'interProcessString stringMorph' classVariableNames: '' category: 'BenPlay' "---------" BackgroundWorkDisplayMorph>>initializeMorph self color: Color red. stringMorph := StringMorph new. self addMorphBack: stringMorph. self extent:(300@50). "---------" BackgroundWorkDisplayMorph>>newWorkerProcess ^[ | work | work := 0. [ 20 milliSeconds asDelay wait. work := work + 1. interProcessString := work asString. ] repeat. ] newProcess. "---------" BackgroundWorkDisplayMorph>>step stringMorph contents: interProcessString. "---------" BackgroundWorkDisplayMorph>>stepTime ^50 "---------" BackgroundWorkDisplayMorph>>initialize | workerProcess running | super initialize. self initializeMorph. workerProcess := self newWorkerProcess. running := false. self on: #mouseUp send: #value to: [ (running := running not) ifTrue: [ workerProcess resume. self color: Color green. ] ifFalse: [ workerProcess suspend. self color: Color red. ] ] "---------" But evaluating "BackgroundWorkDisplayMorph new openInWorld" found this exhibited the same problematic behavior you reported... Clicking on the morph worked a few times and then froze the UI until Cmd-. pressed a few times. However I found the following never locked the GUI. BackgroundWorkDisplayMorph>>initialize "BackgroundWorkDisplayMorph new openInWorld" | workerProcess running | super initialize. self initializeMorph. workerProcess := self newWorkerProcess. running := false. [ [ (running := running not) ifTrue: [ workerProcess resume. self color: Color green ] ifFalse: [ workerProcess suspend. self color: Color red ]. 10 milliSeconds asDelay wait. ] repeat ] fork. "---------" So the problem seemed to not be with #suspend/#resume or with the shared variable /interProcessString/. Indeed, since in the worker thread /interProcessString/ is atomically assigned a copy via #asString, and the String never updated, I think there is no need to surround use of it with a critical section. The solution then was to move the "#resume/#suspend" away from the "#on: #mouseUp send: #value to:" as follows... BackgroundWorkDisplayMorph>>initialize "BackgroundWorkDisplayMorph new openInWorld" | workerProcess running lastRunning | super initialize. self initializeMorph. workerProcess := self newWorkerProcess. lastRunning := running := false. [ [ lastRunning = running ifFalse: [ running ifTrue: [ workerProcess resume ] ifFalse: [ workerProcess suspend ]. lastRunning := running. ]. 10 milliSeconds asDelay wait. ] repeat ] fork. self on: #mouseUp send: #value to: [ (running := running not) ifTrue: [ self color: Color green. ] ifFalse: [ self color: Color red. ] ] "---------" And finally remove the busy loop. BackgroundWorkDisplayMorph>>initialize "BackgroundWorkDisplayMorph new openInWorld" | workerProcess running lastRunning semaphore | super initialize. self initializeMorph. workerProcess := self newWorkerProcess. lastRunning := running := false. semaphore := Semaphore new. [ [ semaphore wait. running ifTrue: [ workerProcess resume ] ifFalse: [ workerProcess suspend ]. ] repeat ] fork. self on: #mouseUp send: #value to: [ (running := running not) ifTrue: [ self color: Color green. ] ifFalse: [ self color: Color red. ]. semaphore signal. ] "---------" Now I can't say how close that is to how it "should" be done. Its the first time I used sempahores and just what I discovered hacking around. But hey! it works :) cheers -ben Nicolai Hess wrote:
|
Actually, thinking further this morning, you don't need that extra process running... BackgroundWorkDisplayMorph>>initialize | running workerStart | super initialize. self initializeMorph. running := false. workerStart := Semaphore new. [ | work | work := 0. [ running ifFalse: [ workerStart wait ]. 20 milliSeconds asDelay wait. work := work + 1. interProcessString := work asString. ] repeat. ] fork. self on: #mouseUp send: #value to: [ (running := running not) ifTrue: [ self color: Color green. workerStart signal ] ifFalse: [ self color: Color red ]. ] "---------"
|
In reply to this post by Ben Coman
Hi ben, thank you for looking at this. 2014-07-22 20:17 GMT+02:00 <[hidden email]>:
This locks the UI as well. Not every timet hough. I did this 5 times, every time in a freshly loaded image and it happens two times.
And this too :(
And this locks the UI too. (Loaded the code 20 times, every time after a fresh image start up. Two times I got a locked ui after the first two clicks). And I don't understand this code :)
|
Over the last few days I have been looking deeper into the image
locking when suspending a process. It is an
interesting rabbit hole [1] that leads to pondering the Delay
machinery, that leads to some VM questions.
When pressing the interrupt key it seems to always opens the debugger with the following call stack. Semaphore>>critical: 'self wait' BlockClosure>>ensure: 'self valueNoContextSwitch' Semaphore>>critical: 'ensure: [ caught ifTrue: [self signal]] Delay>>schedule 'AccessProtect critical: [' Delay>>wait 'self schedule' WorldState>>interCyclePause: I notice... Delay class >> initialize TimingSemaphore := (Smalltalk specialObjectsArray at: 30). and... Delay class >> startTimerEventLoop TimingSemaphore := Semaphore new. which seems incongruous that TimingSemaphore is set in differently. So while I presume this critical stuff all works fine, just in an exotic way, my entropy-guarding-neuron would just like confirm this is so. -------------- In Delay class >> handleTimerEvent the comment says... "Handle a timer event.... -a timer signal (not explicitly specified)" ...is that event perhaps a 'tick' generated periodically by the VM via that item from specialObjectArray ? Or is there some other mechanism ? -------------- [1] http://www.urbandictionary.com/define.php?term=Rabbit+Hole cheers -ben P.S. I've left the following for some initial context as I change the subject. btw Nicolai, I confirm that my proposed fixes only work on Windows, not Mavericks (and I haven't checked Linux). Nicolai Hess wrote:
|
Hi Ben,
On Fri, Jul 25, 2014 at 7:56 AM, Ben Coman <[hidden email]> wrote: --
The TimingSemaphore gets installed in the specialObjectsArray via primSignal: aSemaphore atMilliseconds: aSmallInteger
"Signal the semaphore when the millisecond clock reaches the value of the second argument. Fail if the first argument is neither a Semaphore nor nil. Essential. See Object documentation whatIsAPrimitive."
<primitive: 136> ^self primitiveFailed and from that the VM sets the nextWakeupUsecs:
primitiveSignalAtMilliseconds "Cause the time semaphore, if one has been registered, to be signalled when the microsecond clock is greater than or equal to
the given tick value. A tick value of zero turns off timer interrupts." | msecsObj msecs deltaMsecs sema |
<var: #msecs type: #usqInt> msecsObj := self stackTop. sema := self stackValue: 1.
msecs := self positive32BitValueOf: msecsObj. self successful ifTrue:
[(objectMemory isSemaphoreOop: sema) ifTrue: [objectMemory splObj: TheTimerSemaphore put: sema.
deltaMsecs := msecs - (self ioMSecs bitAnd: MillisecondClockMask). deltaMsecs < 0 ifTrue: [deltaMsecs := deltaMsecs + MillisecondClockMask + 1].
nextWakeupUsecs := self ioUTCMicroseconds + (deltaMsecs * 1000). ^self pop: 2]. sema = objectMemory nilObject ifTrue:
[objectMemory storePointer: TheTimerSemaphore ofObject: objectMemory specialObjectsOop
withValue: objectMemory nilObject. nextWakeupUsecs := 0. ^self pop: 2]].
self primitiveFailFor: PrimErrBadArgument
Every time the VM checks for interrupts, which, in the Cog.Stack VMs is controlled by the heartbeat frequency, which defaults to 2 milliseconds, the VM checks if the current time has progressed to or beyond nextWakeupUsecs and signals the timer semaphore if so.
and the problem here is not in the VM. So climb out and breath some fresh air ;-)
Aloha, Eliot
|
In reply to this post by Ben Coman
Nicolai Hess wrote:
Hi Nicolai, I have something for you to try. With the problem seeming to be stuck in the semaphore critical section, I read around on mutexes and semaphores and found [1] enlightening, which says: "The correct use of a semaphore is for signaling from one task to another. A mutex is meant to be taken and released, always in that order, by each task that uses the shared resource it protects. By contrast, tasks that use semaphores either signal or wait—not both. [...] Any two RTOS tasks that operate at **different priorities** and coordinate via a mutex, create the opportunity for priority inversion [...while...] semaphores [...] do not cause priority inversion when used for signaling." [1] To summarize with an example, here's how to use a mutex: /* Task 1 */ mutexWait(mutex_mens_room); // Safely use shared resource mutexRelease(mutex_mens_room); /* Task 2 */ mutexWait(mutex_mens_room); // Safely use shared resource mutexRelease(mutex_mens_room); By contrast, you should always use a semaphore like this: /* Task 1 - Producer */ semPost(sem_power_btn); // Send the signal /* Task 2 - Consumer */ semPend(sem_power_btn); // Wait for signal I am out of my depth here, but in light [1] the way the current code seems to mix semaphore/mutex concepts with Semaphore>>forMutalExclusion and Semaphore>>critical: gave me some concern. Anyway, the delay scheduling code seemed more like a producer/consumer signaling problem than a shared resource mutal exclusion problem, so I rewrote it like the former and fixes the problem without causing any *obvious* problems. However I lack experience here and it could definitely use some review. Can you take the attached zip file and... 1. File in "Delay class-new delay code part 1.st" Defines #enableNewDelayCode and #enableNewDelayCode: Manually define the referenced class variables 2. File in "Delay class-new delay code part 2.st" Modifies #handleTimerEvent, guarded by enableNewDelayCode 3. File in "Delay-new delay code part 3.st" Modifies #schedule and #unschedule, guarded by enableNewDelayCode 4. File in "BackgroundWorkDisplayMorph.st" 5. Save checkpoint Image to return to between trials. 6. Open System Browser on BackgroundWorkDisplayMorph>>initialize Trial 1. BackgroundWorkDisplayMorph new openInWorld. Trial 2. Delay enableNewDelayCode: true. BackgroundWorkDisplayMorph new openInWorld. If if its deemed worthwhile, I'll open a ticket to track this more formally. One thing needed is to considered performance impact, I haven't so far. cheers -ben [1] http://www.barrgroup.com/Embedded-Systems/How-To/RTOS-Mutex-Semaphore [2] http://nerdfortress.com/2011/02/18/you-say-semaphore-i-say-mutex/
NewDelayScheduling-BenComan-2014-07-27a.zip (4K) Download Attachment |
2014-07-27 7:09 GMT+02:00 Ben Coman <[hidden email]>:
Interesting, it works. So, it is the Delay>>schedule implementation? Because it uses a #critical block for the wait instead of wait/signal only? Nicolai |
In reply to this post by Ben Coman
Hi Ben,
|
In reply to this post by Ben Coman
Ben Coman wrote: Hi Nicolai, I have something for you to try. Juan, Nicolai, I found I was overzealous in my guarding by looping on /beingWaitedOn/ in #schedule and #unschedule, which caused DelayTest>>testSemaphore to fail. With the code in my previous zip file, you should modify the #ifTrue: part of the following methods... for Delay>>schedule beingWaitedOn ifTrue: [^self error: 'This Delay has already been scheduled.']. "Assuming comparison, #ifTrue:ifFalse and assignment all inlined bytecodes that can't be interrupted." ScheduledDelay == nil ifTrue: [ ScheduledDelay := self ] "No other process will overwrite " ifFalse: [ "Assuming when semaphore is signalled to continue this process, it cannot be interrupted again before assignment" ScheduledDelayNilledSemaphore wait. ScheduledDelay := self. ]. "Signal semaphore in #handleTimerEvent (highest priority process) to action ScheduledDelay, set ScheduledDelay to nil and signal ScheduledDelayNilledSemaphore" TimingSemaphore signal. for Delay>>unschedule "Assuming comparison, #ifTrue:ifFalse and assignment all inlined bytecodes that can't be interrupted." FinishedDelay == nil ifTrue: [ FinishedDelay := self ] ifFalse: [ "Assuming when semaphore is signalled to continue this process, it cannot be interrupted again before assignment" FinishedDelayNilledSemaphore wait. FinishedDelay := self. ]. "Signal semaphore in #handleTimerEvent (highest priority process) to action FinishedDelay, set FinishedDelay to nil and signal FinishedDelayNilledSemaphore" TimingSemaphore signal. For reference, I've also attached a new zip of my Delay class. cheers -ben Delay.zip (7K) Download Attachment |
In reply to this post by Ben Coman
So, we are going to change it in Pharo. What about Squeak ? 2014-07-28 21:39 GMT+02:00 Eliot Miranda <[hidden email]>:
|
Free forum by Nabble | Edit this page |