Smalltalk › Squeak › Squeak - Dev

The Inbox: Kernel-ct.1405.mcz

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

7 messages Options

commits-2

May 15, 2021; 1:56pm

The Inbox: Kernel-ct.1405.mcz

22167 posts

A new version of Kernel was added to project The Inbox:
http://source.squeak.org/inbox/Kernel-ct.1405.mcz

==================== Summary ====================

Name: Kernel-ct.1405
Author: ct
Time: 15 May 2021, 3:56:24.713389 pm
UUID: 9a92be9b-d778-b54f-b659-713451a2ddb2
Ancestors: Kernel-nice.1402

Counterproposal to Kernel-jar.1404 for fixing VM crashes when resuming from a BlockCannotReturn. Instead of enforcing retrial, repair the context stack if the receiver has ended. There are two reasons for that:

1. Not in all situations, the receiver of #cannotReturn: is actually unable to resume. Consider this example for a disproof:
a := [true ifTrue: [^ 1]. 2].
"Both statements need to be executed separately in a Workspace so that [a outerContext sender] becomes nil!"
a value.
In this situation, it is valid to resume from BlockCannotReturn and currently also possible in the Trunk. Note that BlockCannotReturn even overrides #isResumable to answer true, though the class comment discrecommends resuming it.

2. The pattern proposed by Jaromir reminds me of the current implementation of Object >> #doesNotUnderstand: or Object >> #at:, that is, when the error was resumed, just try it again in the manner of a (potentially) infinite recursion. While the issue with infinite debuggers (which was eventually tripped by exactly this pattern) has been solved some time ago [1], I do not really agree with the pattern in general - it makes it unnecessarily hard for ToolSet implementors to consciously resume from an error instead of retrying it (which we have an extra selector on Exception for).

[1] http://forum.world.st/Please-try-out-Fixes-for-debugger-invocation-during-code-simulation-td5127684.html

=============== Diff against Kernel-nice.1402 ===============

Item was added:
+ ----- Method: BlockCannotReturn>>defaultResumeValue (in category 'defaults') -----
+ defaultResumeValue
+
+ ^ self result!

Item was changed:
----- Method: Context>>cannotReturn: (in category 'private-exceptions') -----
cannotReturn: result

+ closureOrNil ifNotNil: [
+ | resumptionValue |
+ resumptionValue := self cannotReturn: result to: self home sender.
+ self pc > self endPC ifTrue: [
+ "This block has ended, continue with sender"
+ thisContext privSender: self sender].
+ ^ resumptionValue].
- closureOrNil ifNotNil: [^ self cannotReturn: result to: self home sender].
Processor debugWithTitle: 'Computation has been terminated!!' translated full: false.!

Christoph Thiede

May 15, 2021; 2:06pm

Re: The Inbox: Kernel-ct.1405.mcz

1513 posts

The #defaultResumeValue was actually a commit slip and should not have been
included in this version.

Nevertheless, this raises another question - what would you expect from this
example to return?

a := [true ifTrue: [^ 1] yourself].
"Both statements need to be executed separately in a Workspace so that [a
outerContext sender] becomes nil!"
[a value] on: BlockCannotReturn do: [:ex | ex resume].

Should it be 1 or nil? In the Trunk, is it nil, if we override
#defaultResumeValue as below, it will be 1.

(Note that I appended #yourself to the block to prevent inlined compilation
- when inlined, the snippet always returns nil because the optimization does
not assume that the methodReturn could be skipped ...)

Best,
Christoph

-----
Carpe Squeak!
--
Sent from: http://forum.world.st/Squeak-Dev-f45488.html

Carpe Squeak!

Jaromir Matas

May 15, 2021; 9:20pm

Re: The Inbox: Kernel-ct.1405.mcz

142 posts

> Not in all situations, the receiver of #cannotReturn: is actually unable
> to resume. Consider this example for a disproof:
> `a := [true ifTrue: [^ 1]. 2].`
> "Both statements need to be executed separately in a Workspace so that
> [a outerContext sender] becomes nil!"
> `a value.`
> In this situation, it is valid to resume from BlockCannotReturn and
> currently also possible in the Trunk. Note that BlockCannotReturn even
> overrides #isResumable to answer true, though the class comment
> discrecommends resuming it.

My interpretation of this example is the home sender of ^1 is gone once the
first do-it ends. So the second do-it correctly, in my opinion, invokes the
cannot return error. Current Trunk returning 2 seems wildly incorrect to me.

Resuming BlockCannotReturn sounds crazy to me by definition and you're
right: it's set as resumable, I haven't noticed. I'd set it non-resumable.
If a block cannot return, why should we be tempted to do that? :)

> Nevertheless, this raises another question - what would you expect from
> this
> example to return?
>
> `a := [true ifTrue: [^ 1] yourself].`
> "Both statements need to be executed separately in a Workspace so that [a
> outerContext sender] becomes nil!"
> `[a value] on: BlockCannotReturn do: [:ex | ex resume].`
>
> Should it be 1 or nil? In the Trunk, is it nil, if we override
> \#defaultResumeValue as below, it will be 1.

This is a mean example... My fix ended in an infinite loop :) I tried to
fix it but the only clean solution that occurred to me is to set
BlockCannotReturn as non-resumable.

But again, my interpretation here is any attempt to "repair" the context
that cannot return means a substantial change of the non-local return
semantics. It means I'd return nil because the meaning of the error is: I
cannot return 1 to my home sender. Here's one of my examples I'm planning to
send as test cases to the Inbox soon:

[
[
[ ] ensure: [
[] ensure: [
^Transcript show: 'x1'].
Transcript show: 'x2']
] ensure: [
Transcript show: 'x3'].
Transcript show: 'x4'
] fork

In this case the expected outcome is ---> x1 x3. Neither x2 nor x4 should be
printed (x2 is intentionally skipped by the non-local return and x4 is
outside the ensure blocks). With the fix you propose the outcome is either
---> x1 x2 x3 if pressed Abandon or ---> x1 x2 x3 x4 if pressed Proceed -
this would be equivalent to no non-local return at all :)

I hope I'll be able to put the tests together and publish in a few days.

Juan Vuletich showed me a beautiful example about the non-local return
semantics - take a look in [1] in the middle of the post.

Thanks for discussing this!

best,

[1] [[Cuis-dev\] Unwind mechanism during termination is broken and
inconsistent](https://lists.cuis.st/mailman/archives/cuis-dev/2021-April/003055.html)

-----
^[^ Jaromir
--
Sent from: http://forum.world.st/Squeak-Dev-f45488.html

^[^ Jaromir

Christoph Thiede

May 17, 2021; 2:26pm

Re: The Inbox: Kernel-ct.1405.mcz

1513 posts

Hi Jaromir,

first of all, please let me clarify one thing:

BlockCannotReturn is an Error, of course, and thus under normal circumstances should never be resumed. Just like it would usually be an anti-pattern to resume from a subscript error, KeyNotFound error, or anything else. Not for nothing, Error implements #isResumeable with false so that sending #resume[:] to an error will usually raise an IllegalResumeAttempt.

Nevertheless, our debugger supports enforcing the resumption of an exception using Proceed, even if the exception is not marked as resumable. So this should only be used for debugging scenarios, but still, I think that we could - and maybe also should - make it as convenient as possible to perform this non-recommended operation if you are debugging some unusual scenarios. This is mainly helpful to manipulate the behavior of a program during runtime (similar to the "return entered value" in the stack list menu of the debugger).

> Resuming BlockCannotReturn sounds crazy to me by definition and you're right: it's set as resumable, I haven't noticed. I'd set it non-resumable. If a block cannot return, why should we be tempted to do that? :)

I don't know whether there was a special reason to mark it as resumable, but if there wasn't one, I totally agree with you on this point! Just talkin' about the conscious decision of the debugging person to ignore the non-resumability of this exception and resume it anyway, just like you can do it for any other normal error. :-)

> > `a := [true ifTrue: [^ 1]. 2].`

> > "Both statements need to be executed separately in a Workspace so that [a outerContext sender] becomes nil!"

> > `a value.`

> > In this situation, it is valid to resume from BlockCannotReturn and currently also possible in the Trunk. Note that BlockCannotReturn even overrides #isResumable to answer true, though the class comment discrecommends resuming it.

> My interpretation of this example is the home sender of ^1 is gone once the first do-it ends. So the second do-it correctly, in my opinion, invokes the cannot return error. Current Trunk returning 2 seems wildly incorrect to me.

Again, just for clarification: The example does raise a BlockCannotReturn in the second expression in the current Trunk as it should do. :-) Maybe my explanation was not precise enough here. I was only referring to the unusual edge case in which you attempt to proceed the process from this BlockCannotReturn error. In the Trunk, the expression will return 2 in this case. With your example, you won't be able to escape from the situation without pressing Abandon.

> [

> [ ] ensure: [

> [] ensure: [

> ^Transcript show: 'x1'].

> Transcript show: 'x2']

> ] ensure: [

> Transcript show: 'x3'].

> Transcript show: 'x4'

> ] fork

> In this case the expected outcome is ---> x1 x3. Neither x2 nor x4 should be printed (x2 is intentionally skipped by the non-local return and x4 is outside the ensure blocks). With the fix you propose the outcome is either ---> x1 x2 x3 if pressed Abandon or ---> x1 x2 x3 x4 if pressed Proceed - this would be equivalent to no non-local return at all :)

Wait, wait, wait. This smells to me. :-) #cannotReturn: should not be *resumed* after the error was abandoned. Otherwise, something is wrong with the termination logic. Process >> #terminate *must not* resume in this place. Terminating means only executing all uncompleted unwind contexts. I just reverted to the version ct 1/17/2021 18:35 of Process >> #terminate and with regard to your example, both implementations of #cannotReturn: behave the save (---> x1 x3) as expected. Hm, I'm sorry, but Process >> #terminate is not yet done correctly IMHO. I'll send you another message concerning this method soon, but I'm checking many things while drafting the message, so please have some more patience. :-)

> Thanks for discussing this!

Thank *you* for whipping the "machine room" of Squeak into shape again and bringing up all these interesting questions along the way! :-)

Best,

Christoph

Von: Squeak-dev <[hidden email]> im Auftrag von Jaromir Matas <[hidden email]>
Gesendet: Samstag, 15. Mai 2021 23:20:54
An: [hidden email]
Betreff: Re: [squeak-dev] The Inbox: Kernel-ct.1405.mcz

Hi Christoph,

> Counterproposal to Kernel-jar.1404 for fixing VM crashes when resuming
> from a BlockCannotReturn. Instead of enforcing retrial, repair the context
> stack if the receiver has ended.

I was considering the idea whether it could make sense to "fix" the stack
but dumped it eventually because it would completely change the semantics of
non-local returns. In my opinion once the home context sender is not
available it means it's gone irreparably. There are two situation to
consider: double return to the same context within one stack (e.g. the
return context is gone or it may even still exist but its pc has moved) or
the home sender is on a different context stack - in case of forks etc.
Non-local returns between forks could in theory work but not in the current
environment; Squeak strictly requires the home context sender to be on the
same stack.

> Not in all situations, the receiver of #cannotReturn: is actually unable
> to resume. Consider this example for a disproof:
>     `a := [true ifTrue: [^ 1]. 2].`
>     "Both statements need to be executed separately in a Workspace so that
> [a outerContext sender] becomes nil!"
>     `a value.`
> In this situation, it is valid to resume from BlockCannotReturn and
> currently also possible in the Trunk. Note that BlockCannotReturn even
> overrides #isResumable to answer true, though the class comment
> discrecommends resuming it.

My interpretation of this example is the home sender of ^1 is gone once the
first do-it ends. So the second do-it correctly, in my opinion, invokes the
cannot return error. Current Trunk returning 2 seems wildly incorrect to me.

Resuming BlockCannotReturn sounds crazy to me by definition and you're
right: it's set as resumable, I haven't noticed. I'd set it non-resumable.
If a block cannot return, why should we be tempted to do that? :)

> Nevertheless, this raises another question - what would you expect from
> this
> example to return?
>
> `a := [true ifTrue: [^ 1] yourself].`
> "Both statements need to be executed separately in a Workspace so that [a
> outerContext sender] becomes nil!"
> `[a value] on: BlockCannotReturn do: [:ex | ex resume].`
>
> Should it be 1 or nil? In the Trunk, is it nil, if we override
> \#defaultResumeValue as below, it will be 1.

This is a mean example... My fix ended in an infinite loop :) I tried to
fix it but the only clean solution that occurred to me is to set
BlockCannotReturn as non-resumable.

But again, my interpretation here is any attempt to "repair" the context
that cannot return means a substantial change of the non-local return
semantics. It means I'd return nil because the meaning of the error is: I
cannot return 1 to my home sender. Here's one of my examples I'm planning to
send as test cases to the Inbox soon:

[
        [
                [ ] ensure: [
                        [] ensure: [
                                ^Transcript show: 'x1'].
                        Transcript show: 'x2']
        ] ensure: [
                Transcript show: 'x3'].
        Transcript show: 'x4'
] fork

In this case the expected outcome is ---> x1 x3. Neither x2 nor x4 should be
printed (x2 is intentionally skipped by the non-local return and x4 is
outside the ensure blocks). With the fix you propose the outcome is either
---> x1 x2 x3 if pressed Abandon or ---> x1 x2 x3 x4 if pressed Proceed -
this would be equivalent to no non-local return at all :)

I hope I'll be able to put the tests together and publish in a few days.

Juan Vuletich showed me a beautiful example about the non-local return
semantics - take a look in [1] in the middle of the post.

Thanks for discussing this!

best,

[1] [[Cuis-dev\] Unwind mechanism during termination is broken and
inconsistent](https://lists.cuis.st/mailman/archives/cuis-dev/2021-April/003055.html)

-----
^[^ Jaromir
--
Sent from: http://forum.world.st/Squeak-Dev-f45488.html

Smalltalk - Squeak - Dev | Mailing List Archive

forum.world.st

Squeak - Dev forum and mailing list archive. The general-purpose Squeak developers list

Carpe Squeak!

Jaromir Matas

May 21, 2021; 12:22pm

Re: The Inbox: Kernel-ct.1405.mcz

142 posts

This post was updated on May 21, 2021; 6:07pm.

Hi Christoph,

Christoph Thiede wrote
>> > [
>> > [
>> > [ ] ensure: [
>> > [] ensure: [
>> > ^Transcript show: 'x1'].
>> > Transcript show: 'x2']
>> > ] ensure: [
>> > Transcript show: 'x3'].
>> > Transcript show: 'x4'
>> > ] fork
>> >
>> > In this case the expected outcome is ---> x1 x3. Neither x2 nor x4
>> should be printed (x2 is intentionally skipped by the non-local return
>> and x4 is outside the ensure blocks). With the fix you propose the
>> outcome is either ---> x1 x2 x3 if pressed Abandon or ---> x1 x2 x3 x4 if
>> pressed Proceed - this would be equivalent to no non-local return at all
>> :)
>>
>> Wait, wait, wait. This smells to me. :-) #cannotReturn: should not be
>> *resumed* after the error was abandoned. Otherwise, something is wrong
>> with the termination logic. Process >> #terminate *must not* resume in
>> this place. Terminating means only executing all uncompleted unwind
>> contexts. I just reverted to the version ct 1/17/2021 18:35 of Process >>
>> #terminate and with regard to your example, both implementations of
>> #cannotReturn: behave the save (---> x1 x3) as expected. Hm, I'm sorry,
>> but Process >> #terminate is not yet done correctly IMHO.

What happened: in your changeset you made #cannotReturn: return to its
sender after choosing Proceed, i.e. the execution continued into the
preceding #ensure context. This, I feel, introduces an incorrect semantics
here: the real sender of the #cannotReturn: was the VM that tried to execute
a non-local return and failed. For lack of other options (I guess) the VM
set the #ensure: context as a sender of #cannotReturn: - my guess the main
purpose of this link is to keep the stack chain available for unwind - but
not for resuming the execution - so this is my objection.

Proceeding after BlockCannotReturn actually means: Proceed as if no
non-local return was ever there. This doesn't seem right to me but maybe
there could be good a reason to do this in debugging, I don't know.

The crucial point here is #terminate now attempts to complete the outer-most
unfinished unwind block instead of the inner-most only (i.e. the deepest
unfinished unwind block as opposed to the most shallow one). In this
particular example current #terminate correctly leaves the unfinished unwind
block after abandoning BlockCannotReturn (skipping 'x2') and finds another
unwind block with 'x3'. But if you apply your #cannotReturn patch and press
Abandon (edited 5/21) then #cannotReturn: returns and current #terminate simply continues
the unwind within the current unwind block and finds 'x2'.

To avoid any confusion: by no means #terminate resumes after
BlockCannotReturn - absolutely not, #terminate just continues unwinding
remaining unwind blocks only; that's different.

> With your example, you won't be able to escape from the situation without
> pressing Abandon.

Well, yes, that was the point: I can't imagine a reasonable next step from a
non-local return with no home context to return to... That's why I looped
#cannotReturn: to itself with the only way out via Abandon, i.e. terminating
:)

I'm sending an alternative proposal to solve the infinite recursion of
BlockCannotReturn:

```
cannotReturn: result

closureOrNil ifNotNil: [self cannotReturn: result to: self home sender.
[self cannotReturnRecursive: result to: self home sender.
self notify: 'Invoking an infinite loop'.
true] whileTrue].
Processor debugWithTitle: 'Computation has been terminated!' translated
full: false.
```

where #cannotReturnRecursive:to: sets a Boolean variable for the user to be
able to deal with the recursion. Resuming BCR or not should no longer be an
issue...

I know you're questioning whether Abandoning the debugger should be
equivalent to terminating; or more precisely you're suggesting termination
logic should be reduced to follow a normal return or exception return logic,
i.e. skipping the unwind blocks currently under evaluation as discussed in
[1]. As you know I disagree here and maintain the general termination logic
should be as broad as possible but I see your point in reducing the
termination logic in case of abandoning a debugger in case the debugged
process is broken.

Thanks and best regards!

[1]
http://forum.world.st/The-semantics-of-halfway-executed-unwind-contexts-during-process-termination-td5129800.html

-----
^[^ Jaromir
--
Sent from: http://forum.world.st/Squeak-Dev-f45488.html

^[^ Jaromir

Christoph Thiede

May 28, 2021; 6:54pm

Re: The Inbox: Kernel-ct.1405.mcz

1513 posts

Hi Jaromir,

preamble: Yes, I agree that we need something different than the current #terminate for the debugger's Abandon - see [1]. Correct me if I am wrong, but I think further termination details do not need to be discussed in this thread but only in [1]. :-)

> This [proposal], I feel, introduces an incorrect semantics here: the real sender of the #cannotReturn: was the VM that tried to execute a non-local return and failed. For lack of other options (I guess) the VM set the #ensure: context as a sender of #cannotReturn: - my guess the main purpose of this link is to keep the stack chain available for unwind - but not for resuming the execution - so this is my objection.

Thanks for the explanation. This is where our mental models appear to clash: I am not thinking of the VM as an instance that can be the sender of a method - the VM always resists on a meta-level apart from the actual code or stack trace. A sender has to be a context, and a context has to have things such a method, a pc, and a receiver - but what would these properties be for the VM itself? I would rather compare #cannotReturn: to #doesNotUnderstand:, which is a substitute for a different send/instruction in the causing method and validly has that causing method as a sender. You could also think of "^ foo" as a shortcut for "thisContext (home) return: foo" to make sense of this.

> Proceeding after BlockCannotReturn actually means: Proceed as if no non-local return was ever there.

Yes ...

> This doesn't seem right to me but maybe there could be good a reason to do this in debugging, I don't know.

... my point here is: Proceeding from an error almost always doesn't seem "right". :-) It is always a decision by the debugging programmer to override the default control flow and switch to the "next plausible alternative control flow", i.e., resume as if the error would have never been raised. Applied to the attempt to return from a method, for me, this means to ignore the return (thinking of it in message sends: to ignore the "thisContext (home) return"). Yeah, and if there is no further statement after that return, my best understanding of the user's intention to "proceed" would be to return to the place from where the block has been invoked ...

> I'm sending an alternative proposal to solve the infinite recursion of BlockCannotReturn

Hm, in this example, you moved the relevant logic into Process >> #complete:to:. This means that BlockCannotReturn(Exception) >> #resumeUnchecked: will not have the same effect, won't it? :-(
Also, can you convince me why you would need some extra state in the exception for this?

Argh, here is another example which does not yet match my expectations:

sender := thisContext swapSender: nil.
true ifTrue: [^ 1]. "Proceed the BlockCannotReturn"
thisContext privSender: sender.
^ 2

I think this should eventually answer 2. Apparently, the VM already has reset the pc in this example so we are helpless here. ^^

Best,
Christoph

[1] http://forum.world.st/The-semantics-of-halfway-executed-unwind-contexts-during-process-termination-tp5129800p5130110.html

Carpe Squeak!

Jaromir Matas

May 30, 2021; 4:50pm

Re: The Inbox: Kernel-ct.1405.mcz

142 posts

Hi Christoph,

>
> ... my point here is: Proceeding from an error almost always doesn't seem
> "right". :-) It is always a decision by the debugging programmer to
> override the default control flow and switch to the "next plausible
> alternative control flow", i.e., resume as if the error would have never
> been raised. Applied to the attempt to return from a method, for me, this
> means to ignore the return (thinking of it in message sends: to ignore the
> "thisContext (home) return"). Yeah, and if there is no further statement
> after that return, my best understanding of the user's intention to
> "proceed" would be to return to the place from where the block has been
> invoked ...

Agreed :) The more I think about it the more I like it ;) And well, the
non-local return could have been a typo anyways... so actually, this makes
the best sense and preserves all options open for the user - perfect!

> Also, can you convince me why you would need some extra state in the
> exception for this?

No, I hated adding an extra state :) The only thing I really need for the
full #terminate to work correctly is a way to distinguish between normal
proceeding the computation and resuming the unwind procedure when the
BlockCannotReturn error occurs inside an unwind block currently being
evaluated. All I need then is a Warning the computation proceeds beyond the
BlockCannotReturn; here's what I mean:

```
cannotReturn: result

closureOrNil ifNotNil: [
| resumptionValue |
resumptionValue := self cannotReturn: result to: self home sender.
ProceedBlockCannotReturn new signal: 'This block has ended, continue with
sender?'.
self pc > self endPC ifTrue: [
"This block has ended, continue with sender"
thisContext privSender: self sender].
^ resumptionValue].
Processor debugWithTitle: 'Computation has been terminated!' translated
full: false
```

So if you're fine with this addition, full #terminate would recognize when
it's beyond BlockCannotReturn and would continue unwinding the non-local
return accordingly.

I think it's useful to warn the user about such an unusual (and new) option
as Proceeding safely beyond BlockCannotReturn anyway :)

>
> Argh, here is another example which does not yet match my expectations:
> ```
> sender := thisContext swapSender: nil.
> true ifTrue: [^ 1]. "Proceed the BlockCannotReturn"
> thisContext privSender: sender.
> ^ 2
> ```
> I think this should eventually answer 2. Apparently, the VM already has
> reset the pc in this example so we are helpless here.

There's something wrong with this example :) (or my understanding of it)

1) if you print-it or do-it you get Computation terminated instead of Block
cannot return

2) so I wrapped it in [] value
```
[sender := thisContext swapSender: nil.
true ifTrue: [^ 1]. "Proceed the BlockCannotReturn"
thisContext privSender: sender.
^ 2] value
```
Now it raises BlockCannotReturn and returns 2 with your changeset (if
Proceeded)...

BUT

if you debug it and:
A) step through the ^1 - you get Message not understood
B) step into ^1 - you don't get any error and can happily continue

I'm confused...

Many thanks for your comments and your proposed solution to cannot return;
I'll update #terminate and remove my previous attempts from the Inbox.
best,

-----
^[^ Jaromir
--
Sent from: http://forum.world.st/Squeak-Dev-f45488.html

^[^ Jaromir