The semantics of halfway-executed unwind contexts during process termination


The semantics of halfway-executed unwind contexts during process termination

Christoph Thiede

Hi all, hi Jaromir,


I'm raising a new question in this post that is related to the following threads, but I think that it deserves its own thread due to the fundamental criticism expressed: [1, 2]


I just took a closer look at ProcessTest >> #testNestedUnwind and I have to say that I don't agree with it. I'm sorry that I did not mention this earlier, but somehow this aspect of Jaromir's substantial recent work had escaped my attention until today. For reference, so that we all know what we are talking about, here is the test in question:

testNestedUnwind
	"Test all nested unwind blocks are correctly unwound; all unwind blocks halfway through their execution should be completed or at least attempted to complete, not only the innermost one"

	| p x1 x2 x3 |
	x1 := x2 := x3 := false.
	p := [
		[
			[ ] ensure: [ "halfway through completion when suspended"
				[ ] ensure: [ "halfway through completion when suspended"
					Processor activeProcess suspend.
					x1 := true].
				x2 := true]
		] ensure: [ "not started yet when suspended"
			x3 := true]
	] fork.
	Processor yield.
	p terminate.
	self assert: x1 & x2 & x3.


I'm not convinced about the assertions in this test. :-) In fact, I would expect only x3 to be true but x1 and x2 to be false!
IMHO, when terminating a process, halfway-executed unwind contexts should not be continued. Only not-yet-activated unwind contexts should be triggered.
Here are my arguments:

  • Regular unwinding and process termination should have exactly the same behavior.
Assume we modify the example from the test like this:
[
	[
		[
			[ ] ensure: [ "halfway through completion when suspended"
				[ ] ensure: [ "halfway through completion when suspended"
					self error.
					x1 := true].
				x2 := true]
		] ensure: [ "not started yet when suspended"
			x3 := true]
	] on: Error do: []
] fork.
The differences are: i) an error handler has been inserted at the bottom of the process, and ii) instead of terminating the process, an error is raised in the innermost block.
In this example, only x3 will be set to true, because the exceptional control flow explicitly discontinues the logic running inside the block protected by the error handler. Only not-yet-activated unwind contexts will be triggered as part of the unwinding, which here applies only to the outermost unwind context.
In my view, process termination should have exactly the same semantics as using an exception to abort the control flow.
If we did not catch the error in the above example but instead pressed Abandon in the debugger that appears, I see no reason why we would want to execute a different set of unwind contexts.
  • Last but not least, the fact that an error has been signaled means that the signalerContext is "infected", so under no circumstances should abandoning the process resume the execution of this infected context! (The only exception is when you consciously do so via the "Proceed" button in a debugger.) This might become more vivid if I replace the innermost block with the following:
x1 := (2 / 0  "error!") > 0.
Actually, it is enough to run the following stand-alone:
| x1 |
[] ensure: [
	x1 := (2 / 0  "error!") > 0]
If you debug the Abandon button, you can see that another error occurs while terminating the process, which is a MessageNotUnderstood for #> in ZeroDivide. The only reason why a second debugger does not appear is the current bug in Process >> #terminate which "absorbs" subsequent errors in this situation and which is currently being discussed in [2].
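For reference, a related effect can be provoked without going through the debugger at all. The following is only a rough workspace sketch (the error class differs from the #> example above), but it sets up the same situation: an error is raised while a halfway-executed unwind block is being completed during termination.
```
| p |
p := [[] ensure: [
	Processor activeProcess suspend.
	(2 / 0 "error!") > 0]] fork.
Processor yield.
p terminate.
"Completing the suspended ensure: block evaluates 2 / 0 during the unwind;
per the behavior described above, the resulting error is currently absorbed
by Process >> #terminate instead of opening a debugger."
```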

Sorry for the long message! I hope that you agree with my arguments, and if not, I am very keen to hear yours. :-) Unless there are objections, I would like to request that #testNestedUnwind be changed as described above and that the changed version be used as the general basis for the ongoing discussions in [1, 2]. But maybe I am also just committing a fatal case of false reasoning ... :-)

Best,
Christoph

[1] http://forum.world.st/The-Inbox-Kernel-ct-1405-mcz-td5129706.html
[2] http://forum.world.st/stepping-over-non-local-return-in-a-protected-block-td5128777.html


Carpe Squeak!

Re: The semantics of halfway-executed unwind contexts during process termination

Nicolas Cellier
Hi Christoph,
I guess that the original intention was to perform the cleanup (like releasing resources) even if we close the debugger in the midst of unwinding.
Whether this is a good idea or not is indeed questionable if the debugger was opened due to an unhandled error during the execution of an unwind block.
But the debugger was not necessarily opened during unwinding...
The question you raise is whether we are trying to be too clever... It might be.



On Mon, 17 May 2021 at 17:30, Thiede, Christoph <[hidden email]> wrote:

> [...]



Re: The semantics of halfway-executed unwind contexts during process termination

Jaromir Matas
Hi Christoph,

> IMHO, when terminating a process, halfway-executed unwind contexts
> should not be continued. Only not-yet-activated unwind contexts should be
> triggered.


Yes, I too was wondering why there are different unwind semantics in various
situations (error handling, active process unwind, active and suspended
process termination), and why there isn't just one common semantics for all. My
conclusion was that completing halfway-through unwind blocks is much more
complex than running just the not-yet-started unwind blocks.

As a result (in my opinion), Squeak implemented only the completion of the most
recent halfway-through unwind block during termination, and VisualWorks
went a step further and implemented the completion of the outermost
halfway-through unwind block.

Both, however, left the termination of the active process at the basic level -
no completion of halfway-through blocks. In my attempt I proposed to unify
the semantics of active process and suspended process termination by
suspending the active process and terminating it as a suspended process.

I was considering a fun discussion about extending the error handling and
unwind semantics to match the termination unwind semantics - i.e. including
the completion of halfway-through unwind blocks during normal returns - but
that's most likely in the "way too clever" territory :)

Your proposition goes in the opposite direction however - to reduce the
termination semantics to match the current error handling and active process
unwind semantics.

Well, I personally prefer completing the halfway-through unwind blocks
where possible. In my mind it means "try to repair or clean up as much as
possible before ending a process". I still think completing halfway-through
unwind blocks is worth the extra effort.

Regarding the example:


> [
> [
> [
> [ ] ensure: [ "halfway through completion"
> [ ] ensure: [ "halfway through completion"
> self error.
> x1 := true].
> x2 := true]
> ] ensure: [ "not started yet"
> x3 := true]
> ] on: Error do: []
> ] fork
> {x1 . x2 . x3} ---> #(nil nil true)
>
> In my view, process termination should have exactly the same semantics as
> using an exception to abort the control flow.
> If we did not catch the error in the above example but instead pressed
> Abandon in the debugger that appears, I see no reason why we would want to
> execute a different set of unwind contexts.

I disagree here: If we did not catch the error, we would be in a different
situation: an error would have occurred that we had not anticipated,
so abandoning the debugger would be a different intentional action than a
controlled and anticipated return from an exception. I might argue we could
even attempt to unwind as if terminating, but I'm not sure it would be
justified. So actually I must admit that different semantics here may even
be desirable. I'm not sure in this regard.

So to conclude, unification was my original driver but I'm no longer so
sure... Termination may be a different beast than a regular return or
handled error after all.

Thanks for this discussion, I look forward to taking a closer look at your
changeset!

Hi Nicolas,


Nicolas Cellier wrote

> Hi Christoph,
> I guess that the original intention was to perform the cleanup (like
> releasing resources) even if we close the debugger in the midst of
> unwinding.
> Whether this is a good idea or not is indeed questionable if the debugger
> was opened due to an unhandled error during the execution of an unwind block.
> But the debugger was not necessarily opened during unwinding...
> The question you raise is whether we are trying to be too clever... It
> might be.

Yes indeed, trying to be too clever is very dangerous!! I'm old enough to
know first hand :D
Thanks,

best,




-----
^[^ Jaromir

Re: The semantics of halfway-executed unwind contexts during process termination

Jaromir Matas
Hi Christoph,

I posted some additional arguments in
http://forum.world.st/Solving-multiple-termination-bugs-summary-proposal-tp5128285p5129859.html

> [...] the fact that an error has been signaled means that the
> signalerContext is "infected", so under no circumstances should
> abandoning the process resume the execution of this infected context!

This is what really got me thinking... yes, resuming errors sounds like a
bad idea. But the point is: if you terminate a totally healthy process in the
middle of its unwind block, then there's no reason to prevent its normal
completion. The thing is, you don't know in advance. But a debugger is a
different story - you see an error and make a conscious decision - Proceed
or Abandon? That's why I was looking for a Kill button :) Currently the
consensus is Abandon = terminate; however, this is not a given and can be
reconsidered... e.g. use the unwind semantics of a regular #return/#resume/etc.
- without unwinding halfway-through blocks - that could make good sense...

It means two different unwind semantics could really be desirable and
justified: If a healthy process terminates, let it unwind as much as
possible, including all unwind blocks halfway through execution.

If a broken process is terminated by abandoning the debugger, use the current
"return" unwind semantics - i.e. execute only not-yet-started unwind blocks.

What do you think?

I'm looking forward to your thoughts.
best,



-----
^[^ Jaromir

Re: The semantics of halfway-executed unwind contexts during process termination

Christoph Thiede
Hi Jaromir, hi Nicolas,

thanks for your feedback. I think I see the conflict between useful clean-ups during termination on the one hand and way-too-clever clean-ups when abandoning an errored process on the other hand.

Jaromir, your proposal to provide multiple selectors for modeling separate modes of termination sounds like a very good idea to me. But how many different modes do we actually need? So far I can count three modes:

(i) run no unwind contexts (harshest possible way; currently only achievable by doing "suspendedContext privSender: nil" prior to terminating)
(ii) run not-yet started unwind contexts (this is what I proposed in fix-Process-terminate.1.cs [1])
(iii) run all unwind contexts, including those that already have been started (this is the most friendly way that you implemented in #terminate recently)

Can you please confirm whether this enumeration is correct and complete?
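For illustration, the difference between (i) and (iii) can already be tried in a workspace using only the messages mentioned in this thread (a rough sketch; mode (ii) has no dedicated selector yet, and since the ensure: block below has not started when the process suspends, (ii) and (iii) would behave the same here):
```
| p q |
"(iii) plain #terminate: the pending ensure: block is run"
p := [[Processor activeProcess suspend] ensure: [Transcript show: 'cleaned up'; cr]] fork.
Processor yield.
p terminate.

"(i) cut the sender chain first: no unwind block is run at all"
q := [[Processor activeProcess suspend] ensure: [Transcript show: 'cleaned up'; cr]] fork.
Processor yield.
q suspendedContext privSender: nil.
q terminate.
```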

What seems disputable to me are the following questions:

1. Which mode should we use in which situations?

I think this debate could benefit from a few more concrete usage scenarios. I'm just collecting some here (thinking aloud):

- Process Browser: We can provide multiple options in the process menu.
- Debugger: I agree with you that Abandon should always run not-yet-started unwind contexts but never resume halfway-executed unwind contexts. So this maps to mode (ii) from above.
- Skimming through most senders of #terminate in the image, they often orchestrate helper processes, deal with unhandled errors or timeouts, or do similar stuff - usually they should be very fine with the friendly version of #terminate, i.e. mode (iii) from above. I think.
- Regarding mode (i), I think you would need it extremely seldom, but maybe in situations like when your stack contains a loop, your unwind contexts will cause a recursion/new error, or you deliberately want to prevent any unwind context from running. No objections against adding a small but decent button for this in the debugger. :-)

Would you agree with these behaviors? Maybe you can add further examples to the list?

2. How should we name them?

Direct proposal: (i) #kill and (iii) #terminate.
After looking up the original behavior of #terminate in Squeak 5.3, I think it would be consistent to resume all halfway-executed unwind contexts in this method. So yes, I also withdraw my criticism about #testNestedUnwind. :-)

But I don't have any good idea for version (ii) yet. Call it #abandon like in the debugger? Then again, #abandon is rather a verb from the Morphic language. Further possible vocables (according to my synonym thesaurus) include #end, #stop, #finish, #unwind, #abort, #exit. Please help... :-)

Best,
Christoph

[1] http://forum.world.st/template/NamlServlet.jtp?macro=print_post&node=5129805

Carpe Squeak!

Re: The semantics of halfway-executed unwind contexts during process termination

Jaromir Matas
Hi Christoph,

> Jaromir, your proposal to provide multiple selectors for modeling separate
> modes of termination sounds like a very good idea to me. But how many
> different modes do we actually need? So far I can count three modes:
>
> (i) run no unwind contexts (harshest possible way; currently only
> achievable by doing "suspendedContext privSender: nil" prior to
> terminating)
> (ii) run not-yet started unwind contexts (this is what I proposed in
> fix-Process-terminate.1.cs [1])
> (iii) run all unwind contexts, including those that already have been
> started (this is the most friendly way that you implemented in #terminate
> recently)

I think this is it.

Literally minutes ago I had to use privSender: nil to get rid of a debugger
:) A full terminate really is too strong for recovering from fatal errors.

> ... my point here is: Proceeding from an error almost always doesn't seem
> "right". :-) It is always a decision by the debugging programmer to
> override the default control flow and switch to the "next plausible
> alternative control flow", i.e., resume as if the error would have never
> been raised.

Yes - I'd add that even an error may quite often be completely benign, like
'Transcript show: 1/0' - possibly a typo - so you may just want to Proceed or
fully terminate. In case the error damages a whole subsequent chain of
events, you're absolutely right that a full termination seems a silly option
and a light version of terminate may be the most appropriate.

So I fully agree that the decision about which termination mode to use stays
with the user - I'm all for giving the user the choices you suggested.

> 1. Which mode should we use in which situations?
>
> I think this debate could benefit from a few more concrete usage
> scenarios. I'm just collecting some here (thinking aloud):
>
> - Process Browser: We can provide multiple options in the process menu.
> - Debugger: I agree with you that Abandon should always run not-yet-started
> unwind contexts but never resume halfway-executed unwind contexts.
> So this maps to mode (ii) from above.
> - Skimming through most senders of #terminate in the image, they often
> orchestrate helper processes, deal with unhandled errors or timeouts, or
> do similar stuff - usually they should be very fine with the friendly
> version of #terminate, i.e. mode (iii) from above. I think.
> - Regarding mode (i), I think you would need it extremely seldom, but
> maybe in situations like when your stack contains a loop, your unwind
> contexts will cause a recursion/new error, or you deliberately want to
> prevent any unwind context from running. No objections against adding a
> small but decent button for this in the debugger. :-)
>
> Would you agree with these behaviors? Maybe you can add further examples
> to the list?

Yes

Process Browser - the right click menu could provide all options

Debugger - Abandon could be the lightweight version you proposed. Why not
have a proper Abandon button for it?
        The right-click menu on a context could offer the Kill option (next to
'peel to first like this'); no button necessary.
        Now the question is what should be behind the "window close" red-circle-x -
the heavyweight terminate? I'm thinking of this scenario: if the debugger comes
back after closing the window, you start wondering what happened and use Abandon;
if that still doesn't help, you right-click and kill it?

My usual scenario is (limited experience however): look at something in the
debugger (on a healthy process) and close the window (i.e. full termination
is appropriate and I'd even say preferable). If something goes wrong - then
I'd welcome a hint there are options - thus the proper Abandon button - what
do you think?

> 2. How should we name them?
>
> Direct proposal: (i) #kill and (iii) #terminate.
> After looking up the original behavior of #terminate in Squeak 5.3, I
> think it would be consistent to resume all halfway-executed unwind
> contexts in this method. So yes, I also withdraw my criticism about
> #testNestedUnwind. :-)
>
> But I don't have any good idea for version (ii) yet. Call it #abandon like
> in the debugger? Then again, #abandon is rather a verb from the Morphic
> language. Further possible vocables (according to my synonym thesaurus)
> include #end, #stop, #finish, #unwind, #abort, #exit. Please help... :-)

I'd probably go with something like #terminateLight because it's a proper
process termination, including unwinds, except for the ones currently in
progress - so it is a light version of #terminate :) I've checked VisualWorks:
they chose #terminateUnsafely for this type of termination, which I don't like
much; it sounds too negative. The real meaning is rather
#terminateAsSafelyAsPossibleGivenTheCircumstances ;).

I'm wondering whether #unwindTo: (used only by Generator) is bugged (with
regard to dealing with non-local returns), and whether it could be fixed/unified
with your approach. Look at these examples:
```
p := [[Processor activeProcess suspend] valueUninterruptably] fork.
Processor yield.
p suspendedContext unwindTo: nil
```
or
```
p := [[:exit | [Processor activeProcess suspend] ensure: [exit value]] valueWithExit] fork.
Processor yield.
p suspendedContext unwindTo: nil
```

If you do `p terminate` instead of `p suspendedContext unwindTo: nil`, it
works fine, but #unwindTo: causes a "block cannot return" error - I think it's
the same bug all over again :) #value evaluates the non-local return on the
wrong stack...



Regarding our "cannot return" discussion - I have to think about it and I'll
post my reply later in [1] to keep it separate :)

Thanks again and regards,

[1]
http://forum.world.st/The-Inbox-Kernel-ct-1405-mcz-td5129706.html#a5130114







-----
^[^ Jaromir