My general understanding of process scheduling is that it is FIFO
within same priority. Indeed Pharo has a test for this ProcessTest>>testSchedulingIsFirstComeFirstServed, which however I thought was a bit fragile and I was looking to revise. But now I'm curious whether same-priority-process FIFO scheduling is an expected guarantee? On both Pharo 5 build 50560 & Squeak 5 build 15113 the following test fails... [ 1 to: 100000 do: [ :n | | ranFirst ranSecond | [ [ ranFirst ifNil: [ ranFirst := 1 ] ifNotNil: [ ranSecond := 1] ] forkAt: 78. [ ranFirst ifNil: [ ranFirst := 2 ] ifNotNil: [ ranSecond := 2] ] forkAt: 78. ] forkAt: 79. self assert: ranFirst=1. self assert: ranSecond=2. ] ] fork. ... but only after 20k iterations, which seems an "almost" guarantee. Interestingly, it doesn't fail when run without the outer fork. Also if I take an alternate approach with less message calls... faults := OrderedCollection new. done := Semaphore new. [ 1 to: 10000000 do: [ :n | | ranFirst ranSecond | [ [ ranFirst ifNil: [ ranFirst := 1 ] ifNotNil: [ ranSecond := 1] ] forkAt: 78. [ ranFirst ifNil: [ ranFirst := 2 ] ifNotNil: [ ranSecond := 2] ] forkAt: 78. ] forkAt: 79. (ranFirst=1 and: ranSecond=2) ifFalse: [ faults add: { n . ranFirst . ranSecond } ]. ]. done signal. ] fork. done wait. faults inspect. I only get seven failures in 10 million iterations. "an OrderedCollection( #( 258535 2 1) #(1148605 2 1) #(3502820 2 1) #(4010935 2 1) #(4533713 2 1) #(6301878 2 1) #(8497001 2 1))" So what I'm wondering is if there is a bug to hunt down, or if FIFO scheduling is simply not an expected guarantee? cheers -ben |
On Sun, Jan 31, 2016 at 6:16 PM, Ben Coman <[hidden email]> wrote:
> My general understanding of process scheduling is that it is FIFO > within same priority. Indeed Pharo has a test for this > ProcessTest>>testSchedulingIsFirstComeFirstServed, which however I > thought was a bit fragile and I was looking to revise. > > But now I'm curious whether same-priority-process FIFO scheduling is > an expected guarantee? > On both Pharo 5 build 50560 & Squeak 5 build 15113 the following test fails... > > [ 1 to: 100000 do: [ :n | > | ranFirst ranSecond | > [ [ ranFirst ifNil: [ ranFirst := 1 ] ifNotNil: [ ranSecond > := 1] ] forkAt: 78. > [ ranFirst ifNil: [ ranFirst := 2 ] ifNotNil: [ ranSecond > := 2] ] forkAt: 78. > ] forkAt: 79. > self assert: ranFirst=1. > self assert: ranSecond=2. ] > ] fork. > > ... but only after 20k iterations, which seems an "almost" guarantee. > Interestingly, it doesn't fail when run without the outer fork. > Also if I take an alternate approach with less message calls... > > faults := OrderedCollection new. > done := Semaphore new. > [ 1 to: 10000000 do: [ :n | | ranFirst ranSecond | > [ [ ranFirst ifNil: [ ranFirst := 1 ] ifNotNil: [ ranSecond > := 1] ] forkAt: 78. > [ ranFirst ifNil: [ ranFirst := 2 ] ifNotNil: [ ranSecond > := 2] ] forkAt: 78. > ] forkAt: 79. > (ranFirst=1 and: ranSecond=2) ifFalse: [ faults add: { n . > ranFirst . ranSecond } ]. > ]. > done signal. > ] fork. > done wait. > faults inspect. > > I only get seven failures in 10 million iterations. > "an OrderedCollection( > #( 258535 2 1) > #(1148605 2 1) > #(3502820 2 1) > #(4010935 2 1) > #(4533713 2 1) > #(6301878 2 1) > #(8497001 2 1))" > > So what I'm wondering is if there is a bug to hunt down, or if FIFO > scheduling is simply not an expected guarantee? > > cheers -ben A while ago in Pharo I added DelayNullScheduler, which stops the timingPriority(=80) process that schedules delays. After selecting it via World > System > Settings > System > Delay Scheduler I cannot reproduce any of the faults(?) above. (Aside: I just found performance a bit better if DelayNullScheduler>>schedule: first does a "Processor yield") With the usual delay scheduler operating, the fault(?) can be bypassed by signalling the timingSemaphore... timingSemaphore := Delay testCaseSupportTimingSemaphore. [ 1 to: 100000 do: [ :n | | ranFirst ranSecond | [ [ ranFirst ifNil: [ ranFirst := 1 ] ifNotNil: [ ranSecond := 1] ] forkAt: 78. [ ranFirst ifNil: [ ranFirst := 2 ] ifNotNil: [ ranSecond := 2] ] forkAt: 78. ] forkAt: 79. self assert: ranFirst=1. self assert: ranSecond=2. ] timingSemaphore signal. ] fork. The fault(?) can also be bypassed by commenting the call to #intercyclePause: out of WorldState>>doOneCycleFor: I'd be glad for any insights. cheers -ben |
Free forum by Nabble | Edit this page |