Solving multiple termination bugs - summary & proposal

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Solving multiple termination bugs - summary & proposal

Jaromir Matas
Hi All,

here's a summary of bugs in the current termination procedure:

1. #isTerminated and #isSuspended fail to report correctly - in [1]

2. active process termination bug causing an image crash - in [2]

3. nested unwind bug: skipping some ensure blocks - in [3]

4. likely a bug: a failure to complete nested unwind blocks halfway thru
execution - also in [3]

5. Christoph's discovery: a failure to correctly close the debugger in case
of error or non-local return in unwind - in [4]

6. inconsistent semantics of unwinding protected blocks during active vs.
suspended process termination
(not reported yet - different sets of unwind blocks are executed depending
on whether a process terminates itself or whether some other process
initiates its termination; the proposed solution unifies active and
suspended process termination and is consistent with Visual Works semantics
(and goes beyond))

[1]
http://forum.world.st/The-Inbox-Kernel-jar-1376-mcz-td5127335.html#a5127336
[2]
http://forum.world.st/A-bug-in-active-process-termination-crashing-image-td5128186.html
[3]
http://forum.world.st/Another-bug-in-Process-terminate-in-unwinding-contexts-tp5128171p5128178.html
[4]
http://forum.world.st/Bug-in-Process-gt-gt-terminate-Returning-from-unwind-contexts-td5127570.html#a5128095


I'd like to propose a solution - a rewrite of #terminate - addressing all of
the above issues. Main points of the solution are:

(i) why differentiate active and suspended process termination? I propose to
unify the two: to suspend the active process and terminate it as a suspended
process. As a result the semantics/outcome of the unwind algorithm will be
the same in both cases. Fixes 2 and 6.

(ii) the above leads to a susbstantial simplification of #isTerminated - a
process is terminated when at the last instruction of its bottom context.
Fixes 1.

(iii) unwind algorithm for incomplete unwind blocks is extended to execute
the outer-most instead of the inner-most incomplete unwind context. Fixes 3
and 4.

(iv) indirect termination via #popTo is replaced with a analogue of
#unwindTo: using Eliot's ingenious #runUntilErrorOrReturnFrom: the same way
as in (iii). I'd very much appreciate if someone commented this part because
I haven't found any clue why the indirect termination has implemented using
#popTo and not #runUntilErrorOrReturnFrom. Fixes 5.

I enclose a set of examples used to compare the current Squeak semantics
with the proposed one (and with VW) that can be used to build test cases.

Here's a commented code. I'll be happy to provide detailed step-by-step
guidance if you find this whole idea interesting and worth implementing. I'm
convinced at least parts of the proposal should be integrated as simple
fixes of the current bugs. Thank you for your patience if you're still
reading :) Any feedback extremely welcome. A changeset is enclosed for your
convenience here:  Fix_terminate_v2.cs
<http://forum.world.st/file/t372955/Fix_terminate_v2.cs>  

Process >> terminate
        "Stop the process that the receiver represents so that it answers true to
#isTerminated.
         Unwind to execute pending and unfinished ensure:/ifCurtailed: blocks
before terminating.
         If the process is in the middle of a critical: critical section, release
it properly."

        | ctxt unwindBlock oldList outerMost |
        self isActiveProcess ifTrue: [
                "If terminating the active process, suspend it first and terminate it as a
suspended process.
                Nested #terminate messages could derail the termination so let's enclose
it in #ensure."
                [[] ensure: [self terminate]] fork.
                ^self suspend].

        "Always suspend the process first so it doesn't accidentally get woken up.
         N.B. If oldList is a LinkedList then the process is runnable. If it is a
Semaphore/Mutex et al
         then the process is blocked, and if it is nil then the process is already
suspended."
        oldList := self suspend.
        suspendedContext ifNotNil:
                ["Release any method marked with the <criticalSection> pragma.
                  The argument is whether the process is runnable."
                 self releaseCriticalSection: (oldList isNil or: [oldList class ==
LinkedList]).

                "If terminating a process halfways through an unwind, try to complete that
unwind block first;
                if there are multiple such nested unwind blocks, try to complete the
outer-most one; the inner
                blocks will be completed in the process."
                ctxt := suspendedContext.
                [(ctxt := ctxt findNextUnwindContextUpTo: nil) isNil] whileFalse:
                        "Contexts under evaluation have already set their complete (tempAt: 2) to
true."
                        [(ctxt tempAt:2) ifNotNil: [outerMost := ctxt]].
                outerMost ifNotNil: [
                        "This is the outer-most unwind context currently under evaluation;
                        let's find an inner context executing outerMost's argument block (tempAt:
1)"
                        (suspendedContext findContextSuchThat: [:ctx |
                                ctx closure == (outerMost tempAt: 1)]) ifNotNil: [:inner |
                                        "Let's finish the unfinished unwind context only (i.e. up to inner) and
return here"
                                        suspendedContext runUntilErrorOrReturnFrom: inner.
                                        "Update the receiver's suspendedContext (the previous step reset its
sender to nil)"
                                        suspendedContext := outerMost]].

                "Now all unwind blocks caught halfway through have been completed;
                let's execute the ones still pending. Note: #findNextUnwindContextUpTo:
starts
                searching from the receiver's sender but the receiver itself may be an
unwind context."
                ctxt := suspendedContext.
                ctxt isUnwindContext ifFalse: [ctxt := ctxt findNextUnwindContextUpTo:
nil].
                [ctxt isNil] whileFalse: [
                        (ctxt tempAt: 2) ifNil: [
                                ctxt tempAt: 2 put: true.
                                unwindBlock := ctxt tempAt: 1.
                                "Create a context for the unwind block and execute it on the unwind
block's stack.
                                Note: using #value instead of #runUntilErrorOrReturnFrom: would lead to
executing
                                the unwind on the wrong stack preventing the correct execution of
non-local returns."
                                suspendedContext := unwindBlock asContextWithSender: ctxt.
                                suspendedContext runUntilErrorOrReturnFrom: suspendedContext].
                        ctxt := ctxt findNextUnwindContextUpTo: nil].

                "Set the context to its endPC and its sender to nil for the benefit of
isTerminated."
                ctxt := suspendedContext.
                ctxt terminateTo: nil.
                ctxt pc: ctxt endPC]


Process >> isTerminated
        "Answer if the receiver is terminated. A process is considered terminated
        if the suspendedContext is the bottomContext and the pc is at the endPC"

        self isActiveProcess ifTrue: [^ false].
        ^suspendedContext isNil or: [
                suspendedContext isBottomContext and: [
                        suspendedContext isDead or: [suspendedContext atEnd]]]


Process >> isSuspended
        "A process is suspended if it has non-nil suspendedContext (e.g. new or
        previously suspended with the suspend primitive) and is not terminated or
        waiting in a scheduler or a semaphore queue (i.e. is not runnable or
blocked)."
       
        ^myList isNil
                and: [suspendedContext notNil
                and: [self isTerminated not]]


Debugger >> windowIsClosing
        "My window is being closed; clean up. Restart the low space watcher.
        If the debugged process is a do-it, it may do a non-local return escaping
        termination so we need to insert a #terminate context under the do-it
        to make sure the debugged UI process will terminate."

        | doItContext terminateContext |
        (interruptedProcess == nil or: [interruptedProcess suspendedContext isNil])
ifTrue: [^ self].

        "find a do-it context; answer nil if it doesn't exist in the sender chain"
        doItContext := interruptedProcess suspendedContext
                findContextSuchThat: [:ctx |
                        ctx methodClass = UndefinedObject and: [
                        ctx selector = #DoIt and: [
                        ctx closure isNil]]].
        doItContext ifNotNil: [ "it exists so let's insert a #terminate context"
                terminateContext := Context
                        sender: doItContext sender
                        receiver: interruptedProcess
                        method: (Process>>#terminate)
                        arguments: {}.
                doItContext privSender: terminateContext ].

        interruptedProcess terminate.
        interruptedProcess := nil.
       
        contextStack := nil.
        receiverInspector := nil.
        contextVariablesInspector := nil.
        Smalltalk installLowSpaceWatcher.  "restart low space handler"




A few examples to illustrate the idea (many more enclosed here
Some_examples_to_examine_terminate_bugs.txt
<http://forum.world.st/file/t372955/Some_examples_to_examine_terminate_bugs.txt>  
):

Ex. 1:

| p |
"Suspend process inside ensure block and make sure x1 x2 and x3 are printed.
Currently only x1 is printed."
p := [
        [
                [ ] ensure: [
                        [ ] ensure: [
                                Processor activeProcess suspend.
                                Transcript show: 'x1'].
                        Transcript show: 'x2']
        ] ensure: [
                Transcript show: 'x3']
] fork.
Processor yield.
p terminate

---> x1   (instead of x1 x2 x3)


Ex. 2:

| p |
Currently only x3 is printed."
p := [
        [
                [ ] ensure: [
                        [ ] ensure: [
                                Processor activeProcess terminate.
                                Transcript show: 'x1'].
        "terminate active process inside ensure block and make sure x1 x2 and x3
are printed.
                Transcript show: 'x2']
        ] ensure: [
                Transcript show: 'x3']
] fork.
Processor yield

---> x3   (instead of x1 x2 x3)


Ex. 3:

"unwind after error inside ensure block and make sure x1 x2 and x3 are
printed.
[
        [ ] ensure: [
                [ ] ensure: [
                        self error.
                        Transcript show: 'x1'].
                Transcript show: 'x2']
] ensure: [
        ^Transcript show: 'x3']


---> x3   (instead of x1 x2 x3)

Note: nested errors during unwind follow #runUntilErrorOrReturnFrom: logic,
i.e. return exception and let user decide...



-----
^[^ Jaromir
--
Sent from: http://forum.world.st/Squeak-Dev-f45488.html

^[^ Jaromir
Reply | Threaded
Open this post in threaded view
|

Re: Solving multiple termination bugs - summary & proposal

Jaromir Matas
Hi,

The enclosed changeset fixes all bugs/inconsistencies concerning process
termination listed below. It's a simplified version of the previous
proposal. Feedback very welcome indeed!

The idea is to simply extend the existing mechanism for completing
halfway-through unwind blocks and let it deal with *all* unwind blocks.

best,

Fix_terminate_v3.cs <http://forum.world.st/file/t372955/Fix_terminate_v3.cs>  


Jaromir Matas wrote

> Hi All,
>
> here's a summary of bugs in the current termination procedure:
>
> 1. #isTerminated and #isSuspended fail to report correctly - in [1]
>
> 2. active process termination bug causing an image crash - in [2]
>
> 3. nested unwind bug: skipping some ensure blocks - in [3]
>
> 4. likely a bug: a failure to complete nested unwind blocks halfway thru
> execution - also in [3]
>
> 5. Christoph's discovery: a failure to correctly close the debugger in
> case
> of error or non-local return in unwind - in [4]
>
> 6. inconsistent semantics of unwinding protected blocks during active vs.
> suspended process termination





-----
^[^ Jaromir
--
Sent from: http://forum.world.st/Squeak-Dev-f45488.html

^[^ Jaromir
Reply | Threaded
Open this post in threaded view
|

Re: Solving multiple termination bugs - summary & proposal

Nicolas Cellier
Hi Jaromir,
many thanks for those efforts and for perseverating.
It must be a bit frustrating to not receive any answer to this good
work and deep investment, but there is a good reason for that: the
subject is tough, and not everyone is able to give you a quick answer
without reinvesting an equivalent slot of time as you personally did.

This change-set looks good to my eyes, the level of comment is good
too and gives me some additional confidence.
I'm probably not at your level of expertise yet, but I vote for
integrating this in trunk.
In the meantime, I'll leave with it in my image(s) for a few days.
You can as well submit an inbox contribution as you did previously,
this is the preferred way, unless the changes span many packages.

cheers

Nicolas

Le sam. 10 avr. 2021 à 10:58, Jaromir Matas <[hidden email]> a écrit :

>
> Hi,
>
> The enclosed changeset fixes all bugs/inconsistencies concerning process
> termination listed below. It's a simplified version of the previous
> proposal. Feedback very welcome indeed!
>
> The idea is to simply extend the existing mechanism for completing
> halfway-through unwind blocks and let it deal with *all* unwind blocks.
>
> best,
>
> Fix_terminate_v3.cs <http://forum.world.st/file/t372955/Fix_terminate_v3.cs>
>
>
> Jaromir Matas wrote
> > Hi All,
> >
> > here's a summary of bugs in the current termination procedure:
> >
> > 1. #isTerminated and #isSuspended fail to report correctly - in [1]
> >
> > 2. active process termination bug causing an image crash - in [2]
> >
> > 3. nested unwind bug: skipping some ensure blocks - in [3]
> >
> > 4. likely a bug: a failure to complete nested unwind blocks halfway thru
> > execution - also in [3]
> >
> > 5. Christoph's discovery: a failure to correctly close the debugger in
> > case
> > of error or non-local return in unwind - in [4]
> >
> > 6. inconsistent semantics of unwinding protected blocks during active vs.
> > suspended process termination
>
>
>
>
>
> -----
> ^[^ Jaromir
> --
> Sent from: http://forum.world.st/Squeak-Dev-f45488.html
>

Reply | Threaded
Open this post in threaded view
|

Re: Solving multiple termination bugs - summary & proposal

Nicolas Cellier
I'll live with it, xcuz the bad anglishe.

Le sam. 10 avr. 2021 à 23:21, Nicolas Cellier
<[hidden email]> a écrit :

>
> Hi Jaromir,
> many thanks for those efforts and for perseverating.
> It must be a bit frustrating to not receive any answer to this good
> work and deep investment, but there is a good reason for that: the
> subject is tough, and not everyone is able to give you a quick answer
> without reinvesting an equivalent slot of time as you personally did.
>
> This change-set looks good to my eyes, the level of comment is good
> too and gives me some additional confidence.
> I'm probably not at your level of expertise yet, but I vote for
> integrating this in trunk.
> In the meantime, I'll leave with it in my image(s) for a few days.
> You can as well submit an inbox contribution as you did previously,
> this is the preferred way, unless the changes span many packages.
>
> cheers
>
> Nicolas
>
> Le sam. 10 avr. 2021 à 10:58, Jaromir Matas <[hidden email]> a écrit :
> >
> > Hi,
> >
> > The enclosed changeset fixes all bugs/inconsistencies concerning process
> > termination listed below. It's a simplified version of the previous
> > proposal. Feedback very welcome indeed!
> >
> > The idea is to simply extend the existing mechanism for completing
> > halfway-through unwind blocks and let it deal with *all* unwind blocks.
> >
> > best,
> >
> > Fix_terminate_v3.cs <http://forum.world.st/file/t372955/Fix_terminate_v3.cs>
> >
> >
> > Jaromir Matas wrote
> > > Hi All,
> > >
> > > here's a summary of bugs in the current termination procedure:
> > >
> > > 1. #isTerminated and #isSuspended fail to report correctly - in [1]
> > >
> > > 2. active process termination bug causing an image crash - in [2]
> > >
> > > 3. nested unwind bug: skipping some ensure blocks - in [3]
> > >
> > > 4. likely a bug: a failure to complete nested unwind blocks halfway thru
> > > execution - also in [3]
> > >
> > > 5. Christoph's discovery: a failure to correctly close the debugger in
> > > case
> > > of error or non-local return in unwind - in [4]
> > >
> > > 6. inconsistent semantics of unwinding protected blocks during active vs.
> > > suspended process termination
> >
> >
> >
> >
> >
> > -----
> > ^[^ Jaromir
> > --
> > Sent from: http://forum.world.st/Squeak-Dev-f45488.html
> >

Reply | Threaded
Open this post in threaded view
|

Re: Solving multiple termination bugs - summary & proposal

Nicolas Cellier
BTW, I have moved your previous attempts to treated inbox, so please,
do not fear to pollute the inbox again :)

Le sam. 10 avr. 2021 à 23:25, Nicolas Cellier
<[hidden email]> a écrit :

>
> I'll live with it, xcuz the bad anglishe.
>
> Le sam. 10 avr. 2021 à 23:21, Nicolas Cellier
> <[hidden email]> a écrit :
> >
> > Hi Jaromir,
> > many thanks for those efforts and for perseverating.
> > It must be a bit frustrating to not receive any answer to this good
> > work and deep investment, but there is a good reason for that: the
> > subject is tough, and not everyone is able to give you a quick answer
> > without reinvesting an equivalent slot of time as you personally did.
> >
> > This change-set looks good to my eyes, the level of comment is good
> > too and gives me some additional confidence.
> > I'm probably not at your level of expertise yet, but I vote for
> > integrating this in trunk.
> > In the meantime, I'll leave with it in my image(s) for a few days.
> > You can as well submit an inbox contribution as you did previously,
> > this is the preferred way, unless the changes span many packages.
> >
> > cheers
> >
> > Nicolas
> >
> > Le sam. 10 avr. 2021 à 10:58, Jaromir Matas <[hidden email]> a écrit :
> > >
> > > Hi,
> > >
> > > The enclosed changeset fixes all bugs/inconsistencies concerning process
> > > termination listed below. It's a simplified version of the previous
> > > proposal. Feedback very welcome indeed!
> > >
> > > The idea is to simply extend the existing mechanism for completing
> > > halfway-through unwind blocks and let it deal with *all* unwind blocks.
> > >
> > > best,
> > >
> > > Fix_terminate_v3.cs <http://forum.world.st/file/t372955/Fix_terminate_v3.cs>
> > >
> > >
> > > Jaromir Matas wrote
> > > > Hi All,
> > > >
> > > > here's a summary of bugs in the current termination procedure:
> > > >
> > > > 1. #isTerminated and #isSuspended fail to report correctly - in [1]
> > > >
> > > > 2. active process termination bug causing an image crash - in [2]
> > > >
> > > > 3. nested unwind bug: skipping some ensure blocks - in [3]
> > > >
> > > > 4. likely a bug: a failure to complete nested unwind blocks halfway thru
> > > > execution - also in [3]
> > > >
> > > > 5. Christoph's discovery: a failure to correctly close the debugger in
> > > > case
> > > > of error or non-local return in unwind - in [4]
> > > >
> > > > 6. inconsistent semantics of unwinding protected blocks during active vs.
> > > > suspended process termination
> > >
> > >
> > >
> > >
> > >
> > > -----
> > > ^[^ Jaromir
> > > --
> > > Sent from: http://forum.world.st/Squeak-Dev-f45488.html
> > >

Reply | Threaded
Open this post in threaded view
|

Re: Solving multiple termination bugs - summary & proposal

Jaromir Matas
Hi Nicolas,
Thanks a lot for your encouragement!
Sent to the Inbox. I'll add a set of tests a bit later.

>
> BTW, I have moved your previous attempts to treated inbox, so please,
> do not fear to pollute the inbox again :)

Yes, thanks again, that's exactly what I was fearing - creating more noise
in the Inbox :)

>
> Le sam. 10 avr. 2021 à 23:21, Nicolas Cellier
> <[hidden email]> a écrit :
> >
> > Hi Jaromir,
> > many thanks for those efforts and for perseverating.
> > It must be a bit frustrating to not receive any answer to this good
> > work and deep investment, but there is a good reason for that: the
> > subject is tough, and not everyone is able to give you a quick answer
> > without reinvesting an equivalent slot of time as you personally did.
> >
> > This change-set looks good to my eyes, the level of comment is good
> > too and gives me some additional confidence.
> > I'm probably not at your level of expertise yet, but I vote for
> > integrating this in trunk.
> > In the meantime, I'll leave with it in my image(s) for a few days.
> > You can as well submit an inbox contribution as you did previously,
> > this is the preferred way, unless the changes span many packages.
> >
> > cheers
> >
> > Nicolas

best,



-----
^[^ Jaromir
--
Sent from: http://forum.world.st/Squeak-Dev-f45488.html

^[^ Jaromir