[ANN] 22477 DelayScheduler cleanup and refactoring [was: Where do we go now ?]

Ben Coman
On Fri, 13 Apr 2018 at 13:56, Benoit St-Jean via Pharo-users <[hidden email]> wrote:
> Do we really need 8 delay schedulers (DelayMicrosecondScheduler, DelayMillisecondScheduler, DelayNullScheduler, DelayExperimentalSpinScheduler, DelaySpinScheduler, DelayTicklessScheduler, DelayExperimentalCourageousScheduler, DelayExperimentalSemaphoreScheduler)?

I've cleaned up the delay scheduling subsystem [1] to condense these alternatives and separate orthogonal functionality.  After a couple of reviews this was integrated and made active in last week's Build 1273.  Could anyone with test scenarios from previous issues with delays have a bash at stressing the latest build?

The old (existing) DelaySpinScheduler remains in the system so it can be activated as a point of comparison (System > Settings > System > Delay Scheduler). Barring any adverse reports, the final step will be to remove the old hierarchy next week.

There remain separate mutex-based and semaphore-based schedulers since:
* they differ by only a couple of overridden methods
* their slightly different implementations help highlight the core algorithm
* they provide a simple in-image example of different synchronisation mechanisms
* in isolating edge cases it's useful to be able to compare results between implementations
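
To give a feel for "differ by only a couple of overridden methods", here is a rough sketch in Python rather than Pharo. The names (`BaseScheduler`, `_enter`, `_leave`) are made up for illustration and do not correspond to the actual Pharo classes or selectors; only the shape of the idea is intended.

```python
import threading


class BaseScheduler:
    """Shared algorithm: keep pending wake-up times in sorted order."""

    def __init__(self):
        self.pending = []

    # The only two methods a subclass has to supply (hypothetical names).
    def _enter(self):
        raise NotImplementedError

    def _leave(self):
        raise NotImplementedError

    def schedule(self, wake_time):
        self._enter()
        try:
            self.pending.append(wake_time)
            self.pending.sort()
        finally:
            self._leave()

    def next_wake_time(self):
        self._enter()
        try:
            return self.pending[0] if self.pending else None
        finally:
            self._leave()


class MutexScheduler(BaseScheduler):
    """Guards the shared list with a mutex."""

    def __init__(self):
        super().__init__()
        self._lock = threading.Lock()

    def _enter(self):
        self._lock.acquire()

    def _leave(self):
        self._lock.release()


class SemaphoreScheduler(BaseScheduler):
    """Guards the shared list with a semaphore holding one permit."""

    def __init__(self):
        super().__init__()
        self._sem = threading.Semaphore(1)

    def _enter(self):
        self._sem.acquire()

    def _leave(self):
        self._sem.release()
```

Because everything else is shared, the same scenario can be run against both classes and the results compared, which is the point of the last bullet.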

I've retained both microsecond and millisecond operation,
but extracted them into "ticker" classes orthogonal to the "scheduling" classes since:
* it makes the core scheduling algorithm independent of the time base
* a millisecond (or other custom) time base might(?) be more efficient on smaller 32-bit embedded systems
* tests can now simulate ticker time, so they no longer interfere with the system's active scheduler and its interaction with the VM (which may be the source of some random CI failures)
* delay scheduler tests are now independent of real time, so delays in tests are no longer affected by varying CI server loads (another possible source of random CI failures)
* when multi-threaded FFI callbacks become available, it may facilitate experimenting with:
     * wake-up from native timers
     * wake-up of embedded Pharo from the encompassing system
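
The ticker/scheduler split can be sketched roughly as follows. This is Python, not Pharo, and every name in it (`MicrosecondTicker`, `SimulatedTicker`, `Scheduler`, `expired`) is hypothetical; it only models the idea that the scheduling algorithm asks an injected ticker for "now" and never cares about the unit, so a test can hand it a ticker whose time moves only on request.

```python
import heapq
import time


class MicrosecondTicker:
    """Real time base in microseconds."""

    def now(self):
        return time.monotonic_ns() // 1_000


class MillisecondTicker:
    """Real time base in milliseconds."""

    def now(self):
        return time.monotonic_ns() // 1_000_000


class SimulatedTicker:
    """Test time base: time moves only when the test advances it."""

    def __init__(self, start=0):
        self._now = start

    def now(self):
        return self._now

    def advance(self, ticks):
        self._now += ticks


class Scheduler:
    """Core algorithm; works in ticks and is unaware of their unit."""

    def __init__(self, ticker):
        self.ticker = ticker
        self._queue = []  # heap of (resumption_tick, sequence, name)
        self._seq = 0     # tie-breaker so equal ticks keep insertion order

    def schedule(self, name, duration):
        resumption = self.ticker.now() + duration
        heapq.heappush(self._queue, (resumption, self._seq, name))
        self._seq += 1

    def expired(self):
        """Pop and return the names of all delays that are now due."""
        due = []
        while self._queue and self._queue[0][0] <= self.ticker.now():
            due.append(heapq.heappop(self._queue)[2])
        return due


# A test drives time by hand, so server load cannot affect the outcome.
ticker = SimulatedTicker()
scheduler = Scheduler(ticker)
scheduler.schedule("a", 50)
scheduler.schedule("b", 20)
ticker.advance(20)
first_due = scheduler.expired()
ticker.advance(30)
second_due = scheduler.expired()
```

Swapping `SimulatedTicker` for one of the real tickers changes the time base without touching `Scheduler`.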


Other refactors: 
* during system snapshot, save/restore of resumption times was happening at user priority, which risked the race condition reported at [2]. *All* modification of resumption times now occurs at timing/highest priority.

[1] https://pharo.manuscript.com/f/cases/22477/DelayScheduler-cleanup-and-refactoring
[2] https://pharo.manuscript.com/f/cases/18359/



Re: [ANN] 22477 DelayScheduler cleanup and refactoring [was: Where do we go now ?]

Sean P. DeNigris
Administrator
Nice writeup! Thanks :)



Cheers,
Sean

Re: [ANN] 22477 DelayScheduler cleanup and refactoring [was: Where do we go now ?]

Stephane Ducasse-3
In reply to this post by Ben Coman
Ben I would like to thank you!
Simply.

Stef
On Tue, Oct 2, 2018 at 5:58 AM Ben Coman <[hidden email]> wrote:

>
> On Fri, 13 Apr 2018 at 13:56, Benoit St-Jean via Pharo-users <[hidden email]> wrote:
>>
>> Do we really need 8 delay schedulers (DelayMicrosecondScheduler, DelayMillisecondScheduler, DelayNullScheduler, DelayExperimentalSpinScheduler, DelaySpinScheduler, DelayTicklessScheduler, DelayExperimentalCourageousScheduler, DelayExperimentalSemaphoreScheduler) ?
>
>
> I've cleaned the delay scheduling subsystem [1] to condense these alternatives and separate orthogonal functionality.  After a couple of reviews this was integrated and made active in last week's Build 1273.  Could anyone with test scenarios from previous issues with delays have a bash at stressing the latest build.
>
> The old(existing) DelaySpinScheduler remains in the system so it can be activated as a point of comparison (System > Settings > System > Delay Scheduler). Pending any adverse reports the final step will be to remove the old hierarchy next week.
>
> There remain separate mutex and semaphore based schedulers since:
> * they differ by only a couple of overridden methods
> * their slightly different implementations help highlight the core algorithm
> * they provide a simple in-Image example of different synchronisation mechanisms.
> * in isolating edge cases its useful to be able to compare results between implementations
>
> I've retained both microsecond and millisecond operation,
> but extracted into "ticker" classes orthogonal to the "scheduling" classes since:
> * it makes the core scheduling algorithm independent of time base
> * millisecond (or other custom) timebase might(?) be more efficient on smaller 32-bit embedded systems
> * tests can now simulate ticker time to avoid tests interfering with system's-active-scheduler VM interaction (which may be the source of some random CI failures)
> * delay scheduler tests are now independent of real-time (which may be the source of some random CI failures where delays are affected by varying CI server loads)
> * when multi-threaded FFI callbacks become available, may facilitate experimenting with:
>      * wake-up from native timers
>      * wake-up of embedded Pharo from the encompassing system
>
>
> Other refactors:
> * during system snapshot, save/restore of resumption times was happening at user priority which risks a race condition reported at [2]. *All* modification of resumption times now occurs at timing/highest priority.
>
>
> [1] https://pharo.manuscript.com/f/cases/22477/DelayScheduler-cleanup-and-refactoring
> [2] https://pharo.manuscript.com/f/cases/18359/


Re: [ANN] 22477 DelayScheduler cleanup and refactoring [was: Where do we go now ?]

Guillermo Polito
Yes +1 on the thanks!
Also it's great to read your emails to learn new stuff :)

On Tue, Oct 2, 2018 at 8:30 PM Stephane Ducasse <[hidden email]> wrote:
> Ben I would like to thank you!
> Simply.
>
> Stef

--
Guille Polito
Research Engineer
Centre de Recherche en Informatique, Signal et Automatique de Lille
CRIStAL - UMR 9189
French National Center for Scientific Research - http://www.cnrs.fr

Web: http://guillep.github.io
Phone: +33 06 52 70 66 13


Re: [ANN] 22477 DelayScheduler cleanup and refactoring [was: Where do we go now ?]

Stephan Eggermont-3
In reply to this post by Ben Coman
Ben Coman <[hidden email]> wrote:
> On Fri, 13 Apr 2018 at 13:56, Benoit St-Jean via Pharo-users <
> [hidden email]> wrote:
>

What is the current status of this? DelaySpinScheduler is default in my
recent Pharo 7 images and makes my images unusable on Ubuntu 18.04LTS.

Stephan





Re: [ANN] 22477 DelayScheduler cleanup and refactoring [was: Where do we go now ?]

Ben Coman
On Mon, 5 Nov 2018 at 20:02, Stephan Eggermont <[hidden email]> wrote:
> What is the current status of this? DelaySpinScheduler is default in my
> recent Pharo 7 images and makes my images unusable on Ubuntu 18.04LTS.

The refactoring was enabled a month ago. DelaySemaphoreScheduler in the refactored hierarchy was made the default so that DelaySpinScheduler could remain in place in the old hierarchy.

On Discord you mentioned that DelaySpinScheduler was the default in build 1358.
But with PharoLauncher I just downloaded "Pharo-7.0.0-alpha.build.1358.sha.f9325e7.arch.32bit"
and for me it shows that DelaySemaphoreScheduler is the default.

Evaluating ```Delay delaySchedulerClass```   ==>   "DelaySemaphoreScheduler"
and the process browser shows...
[attached image: DelayScheduler-build-71358.png]

So I'm curious how DelaySpinScheduler became enabled for you?

cheers -ben


P.S. A case was found where a process running at highestPriority (the same as the delay scheduling loop)
that called  Delay>>schedule:  twice would lock up DelaySpinScheduler, since the delay scheduling loop would not get an opportunity to process and clear the transfer variable "delayToStart".
I'd always worked on the assumption that the delay scheduling loop was the only process running at highest priority.
DelaySemaphoreScheduler works fine with multiple highestPriority processes, so it will be the default going forward.
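
The lock-up can be illustrated with a toy, single-threaded model. This is Python, not the Pharo code, and it rests on assumptions: that the caller spins until the transfer variable is free, and that a process at the same priority as the scheduling loop is never preempted in favour of it. The class and method names are hypothetical, and `max_spins` exists only so the model terminates.

```python
class SingleSlotHandoff:
    """Toy model of a one-variable handoff to a scheduling loop."""

    def __init__(self):
        self.delay_to_start = None  # models the transfer variable
        self.started = []

    def schedule(self, delay, max_spins=1000):
        """Caller side: spin until the slot is free, then fill it.

        Returns False if the slot never became free. The modelled system
        has no such bound; there the caller would simply spin forever.
        """
        spins = 0
        while self.delay_to_start is not None:
            spins += 1
            if spins >= max_spins:
                return False
        self.delay_to_start = delay
        return True

    def run_scheduling_loop_once(self):
        """Scheduling-loop side: take the pending delay and clear the slot."""
        if self.delay_to_start is not None:
            self.started.append(self.delay_to_start)
            self.delay_to_start = None


handoff = SingleSlotHandoff()
first = handoff.schedule("delay 1")   # slot was free, so this succeeds
second = handoff.schedule("delay 2")  # loop never ran, slot still occupied
handoff.run_scheduling_loop_once()    # only now does the slot get cleared
retry = handoff.schedule("delay 2")
```

In the model, the second call fails only because nothing ran the scheduling loop between the two calls, which is the situation a same-priority caller creates. How DelaySemaphoreScheduler avoids this is not modelled here.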

Re: [ANN] 22477 DelayScheduler cleanup and refactoring [was: Where do we go now ?]

Ben Coman


On Mon, 5 Nov 2018 at 22:17, Ben Coman <[hidden email]> wrote:
> On Mon, 5 Nov 2018 at 20:02, Stephan Eggermont <[hidden email]> wrote:
>> What is the current status of this? DelaySpinScheduler is default in my
>> recent Pharo 7 images and makes my images unusable on Ubuntu 18.04LTS.
>
> The refactoring was enabled a month ago. DelaySemaphoreScheduler in the refactored hierarchy was made default so that DelaySpinScheduler could remain in place in the old hierarchy.

Working through this on Discord with Stephan, we discovered that an earlier <Store Settings> had recorded DelaySpinScheduler while it was active,
and that stored setting was asserting itself in every freshly created image.  I guess a lot of people will be susceptible to this,
particularly if, like me, they don't remember where those files are without googling for it.

In this case we could fix it by setting   System > System Settings > System > Delay Scheduler   =   DelaySemaphoreScheduler
and then clicking <Store Settings> in Settings.
But maybe something else needs to be done to head off this trap for everyone else.

cheers -ben