Case 17373 had an erroneous failure from
ProcessTest>>testHighPriorityOverridesWaitTime and several others seem also to have suffered... * https://pharo.fogbugz.com/default.asp?17373 * https://pharo.fogbugz.com/default.asp?17496 * https://pharo.fogbugz.com/default.asp?17445 * https://pharo.fogbugz.com/default.asp?7444 These have comments that the problem can't be reproduced, which is common of occasional race conditions. So rather than just try to solve this on the issue tracker, I thought I'd share a tip how to isolate and squash such intermittent failures, essentially by stressing that single test a large number of times like this... 1 to: 100 do: [ :n | Transcript crShow: 'Try ' , n printString. ProcessTest debug: #testHighPriorityOverridesWaitTime. ] Since the debugger only focusses on a point in time snapshot of the running system, some Transcript instrumentation was useful to overview the dynamic behaviour, like this... ProcessTest>>testHighPriorityOverridesWaitTime | lowerPriorityWaitingLonger higherPriorityWaitingLess nextReadyProcess | lowerPriorityWaitingLonger := [ ] forkAt: 11 higherPriorityWaitingLess := [ ] forkAt: 12. nextReadyProcess := Processor nextReadyProcess. Transcript cr; tab;show: (lowerPriorityWaitingLonger priority -> lowerPriorityWaitingLonger identityHash); cr; tab;show: (higherPriorityWaitingLess priority -> higherPriorityWaitingLess identityHash); cr; tab;show: (nextReadyProcess priority -> nextReadyProcess identityHash). self assert: nextReadyProcess equals: higherPriorityWaitingLess. which produces... Try 1 11->343137024 12->40659200 12->40659200 Try 2 11->461794304 12->374996992 12->40659200 Observe on the second try nextReadyProcess refers to same process as from the first try. This can be cleaned up just prior to the asset check by adding ... lowerPriorityWaitingLonger terminate. higherPriorityWaitingLess terminate. which produces... Try 1 11->1005688064 12->833273600 12->833273600 ... Try 100 11->764450816 12->807973888 12->807973888 However this is still a little fragile, since some process with priority greater than the running test could slip in before the call to nextReadyProcess, as can be seen by pushing the #to:do: loop out to 100000, which produces... Try 32044 11->784361216 12->431928576 40->397533952 or wrapping it by [ ]forkAt:39, which produces... Try 1 11->431897088 12->400972288 30->338363648 After several awkward failed attempts to fix this, it seemed best to observe actual behaviour rather than peeking into the dynamic data structures. Thus I propose... ProcessTest>>testHighPriorityOverridesWaitTime "The first process to resume will pass straight through the gate while the other waits for the assert to whichRan." | gate checkAssert priorityRan | gate := Semaphore new signal. checkAssert := Semaphore new. [ gate wait. priorityRan := 11. checkAssert signal ] forkAt: 11. [ gate wait. priorityRan := 12. checkAssert signal ] forkAt: 12. checkAssert wait. self assert: priorityRan=12 description: 'Second scheduled but higher priority should run first'. gate signal. checkAssert wait. self assert: whichRan=11 description: 'First scheduled but lower priority should run after'. https://pharo.fogbugz.com/default.asp?17501 |
Hi. I like your proposals. It fixed problem. And more important it provides better documentation for tested behaviour 2016-01-30 10:43 GMT+01:00 Ben Coman <[hidden email]>: Case 17373 had an erroneous failure from |
Free forum by Nabble | Edit this page |