ProcessTest>>testSchedulingIsFirstComeFirstServed

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

ProcessTest>>testSchedulingIsFirstComeFirstServed

Ben Coman
My general understanding of process scheduling is that it is FIFO
within same priority.  Indeed Pharo has a test for this
ProcessTest>>testSchedulingIsFirstComeFirstServed, which however I
thought was a bit fragile and I was looking to revise.

But now I'm curious whether same-priority-process FIFO scheduling is
an expected guarantee?
On both Pharo 5 build 50560 & Squeak 5 build 15113 the following test fails...

  [  1 to: 100000 do: [  :n |
      | ranFirst ranSecond |
      [    [ ranFirst ifNil: [ ranFirst := 1 ] ifNotNil: [ ranSecond
:= 1] ] forkAt: 78.
           [ ranFirst ifNil: [ ranFirst := 2 ] ifNotNil: [ ranSecond
:= 2] ] forkAt: 78.
      ] forkAt: 79.
     self assert: ranFirst=1.
     self assert: ranSecond=2.     ]
  ] fork.

... but only after 20k iterations, which seems an "almost" guarantee.
Interestingly, it doesn't fail when run without the outer fork.
Also if I take an alternate approach with less message calls...

  faults := OrderedCollection new.
  done := Semaphore new.
  [ 1 to: 10000000 do: [  :n | | ranFirst ranSecond |
       [   [ ranFirst ifNil: [ ranFirst := 1 ] ifNotNil: [ ranSecond
:= 1] ] forkAt: 78.
           [ ranFirst ifNil: [ ranFirst := 2 ] ifNotNil: [ ranSecond
:= 2] ] forkAt: 78.
       ] forkAt: 79.
       (ranFirst=1 and: ranSecond=2) ifFalse: [  faults add: { n .
ranFirst . ranSecond } ].
     ].
    done signal.
  ] fork.
  done wait.
  faults inspect.

I only get seven failures in 10 million iterations.
"an OrderedCollection(
   #(  258535 2 1)
   #(1148605 2 1)
   #(3502820 2 1)
   #(4010935 2 1)
   #(4533713 2 1)
   #(6301878 2 1)
   #(8497001 2 1))"

So what I'm wondering is if there is a bug to hunt down, or if FIFO
scheduling is simply not an expected guarantee?

cheers -ben

Reply | Threaded
Open this post in threaded view
|

Re: ProcessTest>>testSchedulingIsFirstComeFirstServed

Ben Coman
On Sun, Jan 31, 2016 at 6:16 PM, Ben Coman <[hidden email]> wrote:

> My general understanding of process scheduling is that it is FIFO
> within same priority.  Indeed Pharo has a test for this
> ProcessTest>>testSchedulingIsFirstComeFirstServed, which however I
> thought was a bit fragile and I was looking to revise.
>
> But now I'm curious whether same-priority-process FIFO scheduling is
> an expected guarantee?
> On both Pharo 5 build 50560 & Squeak 5 build 15113 the following test fails...
>
>   [  1 to: 100000 do: [  :n |
>       | ranFirst ranSecond |
>       [    [ ranFirst ifNil: [ ranFirst := 1 ] ifNotNil: [ ranSecond
> := 1] ] forkAt: 78.
>            [ ranFirst ifNil: [ ranFirst := 2 ] ifNotNil: [ ranSecond
> := 2] ] forkAt: 78.
>       ] forkAt: 79.
>      self assert: ranFirst=1.
>      self assert: ranSecond=2.     ]
>   ] fork.
>
> ... but only after 20k iterations, which seems an "almost" guarantee.
> Interestingly, it doesn't fail when run without the outer fork.
> Also if I take an alternate approach with less message calls...
>
>   faults := OrderedCollection new.
>   done := Semaphore new.
>   [ 1 to: 10000000 do: [  :n | | ranFirst ranSecond |
>        [   [ ranFirst ifNil: [ ranFirst := 1 ] ifNotNil: [ ranSecond
> := 1] ] forkAt: 78.
>            [ ranFirst ifNil: [ ranFirst := 2 ] ifNotNil: [ ranSecond
> := 2] ] forkAt: 78.
>        ] forkAt: 79.
>        (ranFirst=1 and: ranSecond=2) ifFalse: [  faults add: { n .
> ranFirst . ranSecond } ].
>      ].
>     done signal.
>   ] fork.
>   done wait.
>   faults inspect.
>
> I only get seven failures in 10 million iterations.
> "an OrderedCollection(
>    #(  258535 2 1)
>    #(1148605 2 1)
>    #(3502820 2 1)
>    #(4010935 2 1)
>    #(4533713 2 1)
>    #(6301878 2 1)
>    #(8497001 2 1))"
>
> So what I'm wondering is if there is a bug to hunt down, or if FIFO
> scheduling is simply not an expected guarantee?
>
> cheers -ben

A while ago in Pharo I added DelayNullScheduler, which stops the
timingPriority(=80) process that schedules delays.  After selecting it
via World > System > Settings > System > Delay Scheduler  I cannot
reproduce any of the faults(?) above.   (Aside: I just found
performance a bit better if DelayNullScheduler>>schedule: first does a
"Processor yield")

With the usual delay scheduler operating, the fault(?) can be bypassed
by signalling the timingSemaphore...

   timingSemaphore := Delay testCaseSupportTimingSemaphore.
   [  1 to: 100000 do: [  :n |
       | ranFirst ranSecond |
       [    [ ranFirst ifNil: [ ranFirst := 1 ]
                             ifNotNil: [ ranSecond := 1] ] forkAt: 78.
            [ ranFirst ifNil: [ ranFirst := 2 ]
                            ifNotNil: [ ranSecond := 2] ] forkAt: 78.
       ] forkAt: 79.
      self assert: ranFirst=1.
      self assert: ranSecond=2.     ]
      timingSemaphore signal.
  ] fork.


The fault(?) can also be bypassed by commenting the call to #intercyclePause:
out of WorldState>>doOneCycleFor:


I'd be glad for any insights.

cheers -ben