Question regarding compilation of #caseOf:[otherwise:] messages

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Question regarding compilation of #caseOf:[otherwise:] messages

Christoph Thiede

Hi all,


this a question regarding the Compiler's implementation. I spent some time on it but could not yet find a satisfying explanation. Maybe someone of you can help me?


Let's look into MessageNode >> #emitCodeForCase:encoder:value:, which contains the following passage:


(isLast and: [allReturn]) ifTrue:

    [self emitCodeForJump: elseSize encoder: encoder]


Btw, the equivalent in #sizeCodeForCase:value: is:


(isLast and: [allReturn]) ifTrue: [thenSize := thenSize + (self sizeCode: encoder forJump: elseSize)].


In the bytecode of a method, this will look like this:


89 <23> pushConstant: 2

90 <88> dup

91 <76> pushConstant: 1

92 <B6> send: =

93 <9A> jumpFalse: 97

94 <87> pop

95 <22> pushConstant: #one

96 <7C> returnTop

97 <77> pushConstant: 2

98 <B6> send: =

99 <9A> jumpFalse: 103

100 <21> pushConstant: #two

101 <7C> returnTop

102 <90> jumpTo: 104 "here!"

103 <20> pushConstant: 42

104 <87> pop


I cannot see when this jump would be ever reached.

Case 1: Any case is selected, and because allReturn is true, its inlined value block is guaranteed to #returnTop and thus not to reach the last jump (in the above example, this happens in pc101).

Case 2: No case is selected, so the execution will jump straightly to the inlined otherwiseBlock (this is pc99 in the example).

For what purpose do we need this jump (ex: pc102)? Should we maybe remove it?


Are there any special situations in which #returns is true but there is any code path that does *not* return? Some weird exception handling trick, for example?

Or was the dubious jump just designed as a fallback to catch any hypothetic failures in any other component of the compiler system?


I removed the generation of this jump for testing and could not investigate any anomalies so far. All compiler tests are passing - except of a number of Decompiler tests, which probably would need to be updated as well.

Please find a minimum changeset in the attachment that demonstrates my question.


I am looking forward to your replies!


Best,

Christoph




cleanup caseOf.1.cs (4K) Download Attachment
Carpe Squeak!
Reply | Threaded
Open this post in threaded view
|

Re: Question regarding compilation of #caseOf:[otherwise:] messages

Nicolas Cellier
Hi Christoph,
could it be for the case when there is no otherwise?

Le sam. 22 févr. 2020 à 18:24, Thiede, Christoph <[hidden email]> a écrit :

Hi all,


this a question regarding the Compiler's implementation. I spent some time on it but could not yet find a satisfying explanation. Maybe someone of you can help me?


Let's look into MessageNode >> #emitCodeForCase:encoder:value:, which contains the following passage:


(isLast and: [allReturn]) ifTrue:

    [self emitCodeForJump: elseSize encoder: encoder]


Btw, the equivalent in #sizeCodeForCase:value: is:


(isLast and: [allReturn]) ifTrue: [thenSize := thenSize + (self sizeCode: encoder forJump: elseSize)].


In the bytecode of a method, this will look like this:


89 <23> pushConstant: 2

90 <88> dup

91 <76> pushConstant: 1

92 <B6> send: =

93 <9A> jumpFalse: 97

94 <87> pop

95 <22> pushConstant: #one

96 <7C> returnTop

97 <77> pushConstant: 2

98 <B6> send: =

99 <9A> jumpFalse: 103

100 <21> pushConstant: #two

101 <7C> returnTop

102 <90> jumpTo: 104 "here!"

103 <20> pushConstant: 42

104 <87> pop


I cannot see when this jump would be ever reached.

Case 1: Any case is selected, and because allReturn is true, its inlined value block is guaranteed to #returnTop and thus not to reach the last jump (in the above example, this happens in pc101).

Case 2: No case is selected, so the execution will jump straightly to the inlined otherwiseBlock (this is pc99 in the example).

For what purpose do we need this jump (ex: pc102)? Should we maybe remove it?


Are there any special situations in which #returns is true but there is any code path that does *not* return? Some weird exception handling trick, for example?

Or was the dubious jump just designed as a fallback to catch any hypothetic failures in any other component of the compiler system?


I removed the generation of this jump for testing and could not investigate any anomalies so far. All compiler tests are passing - except of a number of Decompiler tests, which probably would need to be updated as well.

Please find a minimum changeset in the attachment that demonstrates my question.


I am looking forward to your replies!


Best,

Christoph




Reply | Threaded
Open this post in threaded view
|

Re: Question regarding compilation of #caseOf:[otherwise:] messages

Christoph Thiede

Hi Nicolas,


could it be for the case when there is no otherwise?


In this case, a "self caseError" will be generated (though this is no correct behavior either IMO, see Problems with #caseError). I don't see how this should affect the jumps or returns before the otherwise part ...

Best,
Christoph


Von: Squeak-dev <[hidden email]> im Auftrag von Nicolas Cellier <[hidden email]>
Gesendet: Samstag, 22. Februar 2020 18:36:01
An: The general-purpose Squeak developers list
Betreff: Re: [squeak-dev] Question regarding compilation of #caseOf:[otherwise:] messages
 
Hi Christoph,
could it be for the case when there is no otherwise?

Le sam. 22 févr. 2020 à 18:24, Thiede, Christoph <[hidden email]> a écrit :

Hi all,


this a question regarding the Compiler's implementation. I spent some time on it but could not yet find a satisfying explanation. Maybe someone of you can help me?


Let's look into MessageNode >> #emitCodeForCase:encoder:value:, which contains the following passage:


(isLast and: [allReturn]) ifTrue:

    [self emitCodeForJump: elseSize encoder: encoder]


Btw, the equivalent in #sizeCodeForCase:value: is:


(isLast and: [allReturn]) ifTrue: [thenSize := thenSize + (self sizeCode: encoder forJump: elseSize)].


In the bytecode of a method, this will look like this:


89 <23> pushConstant: 2

90 <88> dup

91 <76> pushConstant: 1

92 <B6> send: =

93 <9A> jumpFalse: 97

94 <87> pop

95 <22> pushConstant: #one

96 <7C> returnTop

97 <77> pushConstant: 2

98 <B6> send: =

99 <9A> jumpFalse: 103

100 <21> pushConstant: #two

101 <7C> returnTop

102 <90> jumpTo: 104 "here!"

103 <20> pushConstant: 42

104 <87> pop


I cannot see when this jump would be ever reached.

Case 1: Any case is selected, and because allReturn is true, its inlined value block is guaranteed to #returnTop and thus not to reach the last jump (in the above example, this happens in pc101).

Case 2: No case is selected, so the execution will jump straightly to the inlined otherwiseBlock (this is pc99 in the example).

For what purpose do we need this jump (ex: pc102)? Should we maybe remove it?


Are there any special situations in which #returns is true but there is any code path that does *not* return? Some weird exception handling trick, for example?

Or was the dubious jump just designed as a fallback to catch any hypothetic failures in any other component of the compiler system?


I removed the generation of this jump for testing and could not investigate any anomalies so far. All compiler tests are passing - except of a number of Decompiler tests, which probably would need to be updated as well.

Please find a minimum changeset in the attachment that demonstrates my question.


I am looking forward to your replies!


Best,

Christoph




Carpe Squeak!
Reply | Threaded
Open this post in threaded view
|

Re: Question regarding compilation of #caseOf:[otherwise:] messages

Nicolas Cellier
It was just a wild guess, because I did not look at the code nor any other detail.
If you already analyzed it, I trust you, you are more advanced than me on the subject :)
Fixing the Decompiler might be a bit more involved the way it is written now...
But with your fast progress and fearless diving in lower level details, I think you can do it :)

Le sam. 22 févr. 2020 à 18:44, Thiede, Christoph <[hidden email]> a écrit :

Hi Nicolas,


could it be for the case when there is no otherwise?


In this case, a "self caseError" will be generated (though this is no correct behavior either IMO, see Problems with #caseError). I don't see how this should affect the jumps or returns before the otherwise part ...

Best,
Christoph


Von: Squeak-dev <[hidden email]> im Auftrag von Nicolas Cellier <[hidden email]>
Gesendet: Samstag, 22. Februar 2020 18:36:01
An: The general-purpose Squeak developers list
Betreff: Re: [squeak-dev] Question regarding compilation of #caseOf:[otherwise:] messages
 
Hi Christoph,
could it be for the case when there is no otherwise?

Le sam. 22 févr. 2020 à 18:24, Thiede, Christoph <[hidden email]> a écrit :

Hi all,


this a question regarding the Compiler's implementation. I spent some time on it but could not yet find a satisfying explanation. Maybe someone of you can help me?


Let's look into MessageNode >> #emitCodeForCase:encoder:value:, which contains the following passage:


(isLast and: [allReturn]) ifTrue:

    [self emitCodeForJump: elseSize encoder: encoder]


Btw, the equivalent in #sizeCodeForCase:value: is:


(isLast and: [allReturn]) ifTrue: [thenSize := thenSize + (self sizeCode: encoder forJump: elseSize)].


In the bytecode of a method, this will look like this:


89 <23> pushConstant: 2

90 <88> dup

91 <76> pushConstant: 1

92 <B6> send: =

93 <9A> jumpFalse: 97

94 <87> pop

95 <22> pushConstant: #one

96 <7C> returnTop

97 <77> pushConstant: 2

98 <B6> send: =

99 <9A> jumpFalse: 103

100 <21> pushConstant: #two

101 <7C> returnTop

102 <90> jumpTo: 104 "here!"

103 <20> pushConstant: 42

104 <87> pop


I cannot see when this jump would be ever reached.

Case 1: Any case is selected, and because allReturn is true, its inlined value block is guaranteed to #returnTop and thus not to reach the last jump (in the above example, this happens in pc101).

Case 2: No case is selected, so the execution will jump straightly to the inlined otherwiseBlock (this is pc99 in the example).

For what purpose do we need this jump (ex: pc102)? Should we maybe remove it?


Are there any special situations in which #returns is true but there is any code path that does *not* return? Some weird exception handling trick, for example?

Or was the dubious jump just designed as a fallback to catch any hypothetic failures in any other component of the compiler system?


I removed the generation of this jump for testing and could not investigate any anomalies so far. All compiler tests are passing - except of a number of Decompiler tests, which probably would need to be updated as well.

Please find a minimum changeset in the attachment that demonstrates my question.


I am looking forward to your replies!


Best,

Christoph





Reply | Threaded
Open this post in threaded view
|

Re: Question regarding compilation of #caseOf:[otherwise:] messages

Christoph Thiede

If you already analyzed it, I trust you, you are more advanced than me on the subject :)


I would not say that I have an overview of the entire compiler system at the moment. Someone else should definitively review this before. :)

Fixing the Decompiler might be a bit more involved the way it is written now...
> But with your fast progress and fearless diving in lower level details, I think you can do it :)

Hm, maybe this could be some of my next adventures :-)
(At the moment I'm trying to optimize #caseOf: to use binary search if appropriate, let's see what will come of it ^^)

Best,
Christoph

Von: Squeak-dev <[hidden email]> im Auftrag von Nicolas Cellier <[hidden email]>
Gesendet: Samstag, 22. Februar 2020 18:50:26
An: The general-purpose Squeak developers list
Betreff: Re: [squeak-dev] Question regarding compilation of #caseOf:[otherwise:] messages
 
It was just a wild guess, because I did not look at the code nor any other detail.
If you already analyzed it, I trust you, you are more advanced than me on the subject :)
Fixing the Decompiler might be a bit more involved the way it is written now...
But with your fast progress and fearless diving in lower level details, I think you can do it :)

Le sam. 22 févr. 2020 à 18:44, Thiede, Christoph <[hidden email]> a écrit :

Hi Nicolas,


could it be for the case when there is no otherwise?


In this case, a "self caseError" will be generated (though this is no correct behavior either IMO, see Problems with #caseError). I don't see how this should affect the jumps or returns before the otherwise part ...

Best,
Christoph


Von: Squeak-dev <[hidden email]> im Auftrag von Nicolas Cellier <[hidden email]>
Gesendet: Samstag, 22. Februar 2020 18:36:01
An: The general-purpose Squeak developers list
Betreff: Re: [squeak-dev] Question regarding compilation of #caseOf:[otherwise:] messages
 
Hi Christoph,
could it be for the case when there is no otherwise?

Le sam. 22 févr. 2020 à 18:24, Thiede, Christoph <[hidden email]> a écrit :

Hi all,


this a question regarding the Compiler's implementation. I spent some time on it but could not yet find a satisfying explanation. Maybe someone of you can help me?


Let's look into MessageNode >> #emitCodeForCase:encoder:value:, which contains the following passage:


(isLast and: [allReturn]) ifTrue:

    [self emitCodeForJump: elseSize encoder: encoder]


Btw, the equivalent in #sizeCodeForCase:value: is:


(isLast and: [allReturn]) ifTrue: [thenSize := thenSize + (self sizeCode: encoder forJump: elseSize)].


In the bytecode of a method, this will look like this:


89 <23> pushConstant: 2

90 <88> dup

91 <76> pushConstant: 1

92 <B6> send: =

93 <9A> jumpFalse: 97

94 <87> pop

95 <22> pushConstant: #one

96 <7C> returnTop

97 <77> pushConstant: 2

98 <B6> send: =

99 <9A> jumpFalse: 103

100 <21> pushConstant: #two

101 <7C> returnTop

102 <90> jumpTo: 104 "here!"

103 <20> pushConstant: 42

104 <87> pop


I cannot see when this jump would be ever reached.

Case 1: Any case is selected, and because allReturn is true, its inlined value block is guaranteed to #returnTop and thus not to reach the last jump (in the above example, this happens in pc101).

Case 2: No case is selected, so the execution will jump straightly to the inlined otherwiseBlock (this is pc99 in the example).

For what purpose do we need this jump (ex: pc102)? Should we maybe remove it?


Are there any special situations in which #returns is true but there is any code path that does *not* return? Some weird exception handling trick, for example?

Or was the dubious jump just designed as a fallback to catch any hypothetic failures in any other component of the compiler system?


I removed the generation of this jump for testing and could not investigate any anomalies so far. All compiler tests are passing - except of a number of Decompiler tests, which probably would need to be updated as well.

Please find a minimum changeset in the attachment that demonstrates my question.


I am looking forward to your replies!


Best,

Christoph





Carpe Squeak!
Reply | Threaded
Open this post in threaded view
|

Re: Question regarding compilation of #caseOf:[otherwise:] messages

Levente Uzonyi
On Sat, 22 Feb 2020, Thiede, Christoph wrote:

>
> > If you already analyzed it, I trust you, you are more advanced than me on the subject :)
>
>
> I would not say that I have an overview of the entire compiler system at the moment. Someone else should definitively review this before. :)
>
> > Fixing the Decompiler might be a bit more involved the way it is written now... > But with your fast progress and fearless diving in lower level details, I think you can do it :)
>
> Hm, maybe this could be some of my next adventures :-)
> (At the moment I'm trying to optimize #caseOf: to use binary search if appropriate, let's see what will come of it ^^)
When is it appropriate to use binary search there?

Levente

>
> Best,
> Christoph
>
> __________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
> Von: Squeak-dev <[hidden email]> im Auftrag von Nicolas Cellier <[hidden email]>
> Gesendet: Samstag, 22. Februar 2020 18:50:26
> An: The general-purpose Squeak developers list
> Betreff: Re: [squeak-dev] Question regarding compilation of #caseOf:[otherwise:] messages  
> It was just a wild guess, because I did not look at the code nor any other detail.
> If you already analyzed it, I trust you, you are more advanced than me on the subject :)
> Fixing the Decompiler might be a bit more involved the way it is written now...
> But with your fast progress and fearless diving in lower level details, I think you can do it :)
>
> Le sam. 22 févr. 2020 à 18:44, Thiede, Christoph <[hidden email]> a écrit :
>
>       Hi Nicolas,
>
>
>       > could it be for the case when there is no otherwise?
>
>
> In this case, a "self caseError" will be generated (though this is no correct behavior either IMO, see Problems with #caseError). I don't see how this should affect the jumps or returns before the otherwise part ...
>
> Best,
> Christoph
>
> __________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
> Von: Squeak-dev <[hidden email]> im Auftrag von Nicolas Cellier <[hidden email]>
> Gesendet: Samstag, 22. Februar 2020 18:36:01
> An: The general-purpose Squeak developers list
> Betreff: Re: [squeak-dev] Question regarding compilation of #caseOf:[otherwise:] messages  
> Hi Christoph,
> could it be for the case when there is no otherwise?
>
> Le sam. 22 févr. 2020 à 18:24, Thiede, Christoph <[hidden email]> a écrit :
>
>       Hi all,
>
>
>       this a question regarding the Compiler's implementation. I spent some time on it but could not yet find a satisfying explanation. Maybe someone of you can help me?
>
>
>       Let's look into MessageNode >> #emitCodeForCase:encoder:value:, which contains the following passage:
>
>
>       (isLast and: [allReturn]) ifTrue:
>
>     [self emitCodeForJump: elseSize encoder: encoder]
>
>
> Btw, the equivalent in #sizeCodeForCase:value: is:
>
>
>       (isLast and: [allReturn]) ifTrue: [thenSize := thenSize + (self sizeCode: encoder forJump: elseSize)].
>
>
> In the bytecode of a method, this will look like this:
>
>
>       89 <23> pushConstant: 2
>
> 90 <88> dup
>
> 91 <76> pushConstant: 1
>
> 92 <B6> send: =
>
> 93 <9A> jumpFalse: 97
>
> 94 <87> pop
>
> 95 <22> pushConstant: #one
>
> 96 <7C> returnTop
>
> 97 <77> pushConstant: 2
>
> 98 <B6> send: =
>
> 99 <9A> jumpFalse: 103
>
> 100 <21> pushConstant: #two
>
> 101 <7C> returnTop
>
> 102 <90> jumpTo: 104 "here!"
>
> 103 <20> pushConstant: 42
>
> 104 <87> pop
>
>
> I cannot see when this jump would be ever reached.
>
> Case 1: Any case is selected, and because allReturn is true, its inlined value block is guaranteed to #returnTop and thus not to reach the last jump (in the above example, this happens in pc101).
>
> Case 2: No case is selected, so the execution will jump straightly to the inlined otherwiseBlock (this is pc99 in the example).
>
> For what purpose do we need this jump (ex: pc102)? Should we maybe remove it?
>
>
> Are there any special situations in which #returns is true but there is any code path that does *not* return? Some weird exception handling trick, for example?
>
> Or was the dubious jump just designed as a fallback to catch any hypothetic failures in any other component of the compiler system?
>
>
> I removed the generation of this jump for testing and could not investigate any anomalies so far. All compiler tests are passing - except of a number of Decompiler tests, which probably would need to be updated as well.
>
> Please find a minimum changeset in the attachment that demonstrates my question.
>
>
> I am looking forward to your replies!
>
>
> Best,
>
> Christoph
>
>
>
>
>

Reply | Threaded
Open this post in threaded view
|

Re: Question regarding compilation of #caseOf:[otherwise:] messages

Christoph Thiede
> > (At the moment I'm trying to optimize #caseOf: to use binary search if appropriate, let's see what will come of it ^^)

> When is it appropriate to use binary search there?

Looking up numbers can be reduced from O(n) to O(log n) complexity.

x caseOf: {
 [2] -> [1].
 [3] -> [2].
 [5] -> [3].
 [7] -> [4].
 [11] -> [5] ... }

This could also help us to both clean up and accelerate #interpretNextV3ClosuresInstructionFor:.

Best,
Christoph

Von: Squeak-dev <[hidden email]> im Auftrag von Levente Uzonyi <[hidden email]>
Gesendet: Sonntag, 23. Februar 2020 00:19:09
An: The general-purpose Squeak developers list
Betreff: Re: [squeak-dev] Question regarding compilation of #caseOf:[otherwise:] messages
 
On Sat, 22 Feb 2020, Thiede, Christoph wrote:

>
> > If you already analyzed it, I trust you, you are more advanced than me on the subject :)
>
>
> I would not say that I have an overview of the entire compiler system at the moment. Someone else should definitively review this before. :)
>
> > Fixing the Decompiler might be a bit more involved the way it is written now... > But with your fast progress and fearless diving in lower level details, I think you can do it :)
>
> Hm, maybe this could be some of my next adventures :-)
> (At the moment I'm trying to optimize #caseOf: to use binary search if appropriate, let's see what will come of it ^^)

When is it appropriate to use binary search there?

Levente

>
> Best,
> Christoph
>
> __________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
> Von: Squeak-dev <[hidden email]> im Auftrag von Nicolas Cellier <[hidden email]>
> Gesendet: Samstag, 22. Februar 2020 18:50:26
> An: The general-purpose Squeak developers list
> Betreff: Re: [squeak-dev] Question regarding compilation of #caseOf:[otherwise:] messages  
> It was just a wild guess, because I did not look at the code nor any other detail.
> If you already analyzed it, I trust you, you are more advanced than me on the subject :)
> Fixing the Decompiler might be a bit more involved the way it is written now...
> But with your fast progress and fearless diving in lower level details, I think you can do it :)
>
> Le sam. 22 févr. 2020 à 18:44, Thiede, Christoph <[hidden email]> a écrit :
>
>       Hi Nicolas,
>
>
>       > could it be for the case when there is no otherwise?
>
>
> In this case, a "self caseError" will be generated (though this is no correct behavior either IMO, see Problems with #caseError). I don't see how this should affect the jumps or returns before the otherwise part ...
>
> Best,
> Christoph
>
> __________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
> Von: Squeak-dev <[hidden email]> im Auftrag von Nicolas Cellier <[hidden email]>
> Gesendet: Samstag, 22. Februar 2020 18:36:01
> An: The general-purpose Squeak developers list
> Betreff: Re: [squeak-dev] Question regarding compilation of #caseOf:[otherwise:] messages  
> Hi Christoph,
> could it be for the case when there is no otherwise?
>
> Le sam. 22 févr. 2020 à 18:24, Thiede, Christoph <[hidden email]> a écrit :
>
>       Hi all,
>
>
>       this a question regarding the Compiler's implementation. I spent some time on it but could not yet find a satisfying explanation. Maybe someone of you can help me?
>
>
>       Let's look into MessageNode >> #emitCodeForCase:encoder:value:, which contains the following passage:
>
>
>       (isLast and: [allReturn]) ifTrue:
>
>     [self emitCodeForJump: elseSize encoder: encoder]
>
>
> Btw, the equivalent in #sizeCodeForCase:value: is:
>
>
>       (isLast and: [allReturn]) ifTrue: [thenSize := thenSize + (self sizeCode: encoder forJump: elseSize)].
>
>
> In the bytecode of a method, this will look like this:
>
>
>       89 <23> pushConstant: 2
>
> 90 <88> dup
>
> 91 <76> pushConstant: 1
>
> 92 <B6> send: =
>
> 93 <9A> jumpFalse: 97
>
> 94 <87> pop
>
> 95 <22> pushConstant: #one
>
> 96 <7C> returnTop
>
> 97 <77> pushConstant: 2
>
> 98 <B6> send: =
>
> 99 <9A> jumpFalse: 103
>
> 100 <21> pushConstant: #two
>
> 101 <7C> returnTop
>
> 102 <90> jumpTo: 104 "here!"
>
> 103 <20> pushConstant: 42
>
> 104 <87> pop
>
>
> I cannot see when this jump would be ever reached.
>
> Case 1: Any case is selected, and because allReturn is true, its inlined value block is guaranteed to #returnTop and thus not to reach the last jump (in the above example, this happens in pc101).
>
> Case 2: No case is selected, so the execution will jump straightly to the inlined otherwiseBlock (this is pc99 in the example).
>
> For what purpose do we need this jump (ex: pc102)? Should we maybe remove it?
>
>
> Are there any special situations in which #returns is true but there is any code path that does *not* return? Some weird exception handling trick, for example?
>
> Or was the dubious jump just designed as a fallback to catch any hypothetic failures in any other component of the compiler system?
>
>
> I removed the generation of this jump for testing and could not investigate any anomalies so far. All compiler tests are passing - except of a number of Decompiler tests, which probably would need to be updated as well.
>
> Please find a minimum changeset in the attachment that demonstrates my question.
>
>
> I am looking forward to your replies!
>
>
> Best,
>
> Christoph
>
>
>
>
>


Carpe Squeak!
Reply | Threaded
Open this post in threaded view
|

Re: Question regarding compilation of #caseOf:[otherwise:] messages

Eliot Miranda-2
In reply to this post by Levente Uzonyi
Hi Levente,

> On Feb 22, 2020, at 3:19 PM, Levente Uzonyi <[hidden email]> wrote:
>
> On Sat, 22 Feb 2020, Thiede, Christoph wrote:
>
>> > If you already analyzed it, I trust you, you are more advanced than me on the subject :)
>> I would not say that I have an overview of the entire compiler system at the moment. Someone else should definitively review this before. :)
>> > Fixing the Decompiler might be a bit more involved the way it is written now... > But with your fast progress and fearless diving in lower level details, I think you can do it :)
>> Hm, maybe this could be some of my next adventures :-)
>> (At the moment I'm trying to optimize #caseOf: to use binary search if appropriate, let's see what will come of it ^^)
>
> When is it appropriate to use binary search there?

I like this thought!  It made me think of two things.  One would be how to implement properly a bytecode for case dispatch.  I thought of indexing a literal array of bytecode PCs.  But better is I think doing an indexed jump into a sequence of maximally sized jumps.  So one would go romething like

       bytecodes to compute an index on top of stack
       computedJump
       jump L0
       jump L1
       ...
       jump LN
       bytecodes to dispatch case error

But I think binary dispatch is much more compelling than introducing a bytecode for a very rare case (excuse the pun).

So the second thought, which applies both to the existing implementation and a binary dispatch is that we *badly* need to fix the debugger to not step through the case dispatch but instead step directly to the block that the case selects is to the otherwise:.  It’s tedious to step through the huge case in instruction generation in the JIT, and stepping though in binary dispatch order would be confusing, albeit much shorter.

>
> Levente
>
>> Best,
>> Christoph
>> __________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
>> Von: Squeak-dev <[hidden email]> im Auftrag von Nicolas Cellier <[hidden email]>
>> Gesendet: Samstag, 22. Februar 2020 18:50:26
>> An: The general-purpose Squeak developers list
>> Betreff: Re: [squeak-dev] Question regarding compilation of #caseOf:[otherwise:] messages  
>> It was just a wild guess, because I did not look at the code nor any other detail.
>> If you already analyzed it, I trust you, you are more advanced than me on the subject :)
>> Fixing the Decompiler might be a bit more involved the way it is written now...
>> But with your fast progress and fearless diving in lower level details, I think you can do it :)
>> Le sam. 22 févr. 2020 à 18:44, Thiede, Christoph <[hidden email]> a écrit :
>>
>>      Hi Nicolas,
>>
>>      > could it be for the case when there is no otherwise?
>> In this case, a "self caseError" will be generated (though this is no correct behavior either IMO, see Problems with #caseError). I don't see how this should affect the jumps or returns before the otherwise part ...
>> Best,
>> Christoph
>> __________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
>> Von: Squeak-dev <[hidden email]> im Auftrag von Nicolas Cellier <[hidden email]>
>> Gesendet: Samstag, 22. Februar 2020 18:36:01
>> An: The general-purpose Squeak developers list
>> Betreff: Re: [squeak-dev] Question regarding compilation of #caseOf:[otherwise:] messages  
>> Hi Christoph,
>> could it be for the case when there is no otherwise?
>> Le sam. 22 févr. 2020 à 18:24, Thiede, Christoph <[hidden email]> a écrit :
>>
>>      Hi all,
>>
>>      this a question regarding the Compiler's implementation. I spent some time on it but could not yet find a satisfying explanation. Maybe someone of you can help me?
>>
>>      Let's look into MessageNode >> #emitCodeForCase:encoder:value:, which contains the following passage:
>>
>>      (isLast and: [allReturn]) ifTrue:
>>     [self emitCodeForJump: elseSize encoder: encoder]
>> Btw, the equivalent in #sizeCodeForCase:value: is:
>>
>>      (isLast and: [allReturn]) ifTrue: [thenSize := thenSize + (self sizeCode: encoder forJump: elseSize)].
>> In the bytecode of a method, this will look like this:
>>
>>      89 <23> pushConstant: 2
>> 90 <88> dup
>> 91 <76> pushConstant: 1
>> 92 <B6> send: =
>> 93 <9A> jumpFalse: 97
>> 94 <87> pop
>> 95 <22> pushConstant: #one
>> 96 <7C> returnTop
>> 97 <77> pushConstant: 2
>> 98 <B6> send: =
>> 99 <9A> jumpFalse: 103
>> 100 <21> pushConstant: #two
>> 101 <7C> returnTop
>> 102 <90> jumpTo: 104 "here!"
>> 103 <20> pushConstant: 42
>> 104 <87> pop
>> I cannot see when this jump would be ever reached.
>> Case 1: Any case is selected, and because allReturn is true, its inlined value block is guaranteed to #returnTop and thus not to reach the last jump (in the above example, this happens in pc101).
>> Case 2: No case is selected, so the execution will jump straightly to the inlined otherwiseBlock (this is pc99 in the example).
>> For what purpose do we need this jump (ex: pc102)? Should we maybe remove it?
>> Are there any special situations in which #returns is true but there is any code path that does *not* return? Some weird exception handling trick, for example?
>> Or was the dubious jump just designed as a fallback to catch any hypothetic failures in any other component of the compiler system?
>> I removed the generation of this jump for testing and could not investigate any anomalies so far. All compiler tests are passing - except of a number of Decompiler tests, which probably would need to be updated as well.
>> Please find a minimum changeset in the attachment that demonstrates my question.
>> I am looking forward to your replies!
>> Best,
>> Christoph
>

Reply | Threaded
Open this post in threaded view
|

Re: Question regarding compilation of #caseOf:[otherwise:] messages

Levente Uzonyi
In reply to this post by Christoph Thiede
On Sun, 23 Feb 2020, Thiede, Christoph wrote:

> > > (At the moment I'm trying to optimize #caseOf: to use binary search if appropriate, let's see what will come of it ^^)
>
> > When is it appropriate to use binary search there?
>
> Looking up numbers can be reduced from O(n) to O(log n) complexity.
>
> x caseOf: {
>  [2] -> [1].
>  [3] -> [2].
>  [5] -> [3].
>  [7] -> [4].
>  [11] -> [5] ... }
>
What if the value of x is nil?
What if the branches are not sorted by "key"?
What if there are multiple branches with the same "key"?
What if some "keys" are not literals?

Levente

> This could also help us to both clean up and accelerate #interpretNextV3ClosuresInstructionFor:.
>
> Best,
> Christoph
>
> _________________________________________________________________________________________________________________________________________________________________________________________________________________________________
> Von: Squeak-dev <[hidden email]> im Auftrag von Levente Uzonyi <[hidden email]>
> Gesendet: Sonntag, 23. Februar 2020 00:19:09
> An: The general-purpose Squeak developers list
> Betreff: Re: [squeak-dev] Question regarding compilation of #caseOf:[otherwise:] messages  
> On Sat, 22 Feb 2020, Thiede, Christoph wrote:
>
> >
> > > If you already analyzed it, I trust you, you are more advanced than me on the subject :)
> >
> >
> > I would not say that I have an overview of the entire compiler system at the moment. Someone else should definitively review this before. :)
> >
> > > Fixing the Decompiler might be a bit more involved the way it is written now... > But with your fast progress and fearless diving in lower level details, I think you can do it :)
> >
> > Hm, maybe this could be some of my next adventures :-)
> > (At the moment I'm trying to optimize #caseOf: to use binary search if appropriate, let's see what will come of it ^^)
>
> When is it appropriate to use binary search there?
>
> Levente
>
> >
> > Best,
> > Christoph
> >
> >________________________________________________________________________________________________________________________________________________________________________________________________________________________________
> __________________________________________________________________________________________
> > Von: Squeak-dev <[hidden email]> im Auftrag von Nicolas Cellier <[hidden email]>
> > Gesendet: Samstag, 22. Februar 2020 18:50:26
> > An: The general-purpose Squeak developers list
> > Betreff: Re: [squeak-dev] Question regarding compilation of #caseOf:[otherwise:] messages  
> > It was just a wild guess, because I did not look at the code nor any other detail.
> > If you already analyzed it, I trust you, you are more advanced than me on the subject :)
> > Fixing the Decompiler might be a bit more involved the way it is written now...
> > But with your fast progress and fearless diving in lower level details, I think you can do it :)
> >
> > Le sam. 22 févr. 2020 à 18:44, Thiede, Christoph <[hidden email]> a écrit :
> >
> >       Hi Nicolas,
> >
> >
> >       > could it be for the case when there is no otherwise?
> >
> >
> > In this case, a "self caseError" will be generated (though this is no correct behavior either IMO, see Problems with #caseError). I don't see how this should affect the jumps or returns before the otherwise part ...
> >
> > Best,
> > Christoph
> >
> >________________________________________________________________________________________________________________________________________________________________________________________________________________________________
> __________________________________________________________________________________________
> > Von: Squeak-dev <[hidden email]> im Auftrag von Nicolas Cellier <[hidden email]>
> > Gesendet: Samstag, 22. Februar 2020 18:36:01
> > An: The general-purpose Squeak developers list
> > Betreff: Re: [squeak-dev] Question regarding compilation of #caseOf:[otherwise:] messages  
> > Hi Christoph,
> > could it be for the case when there is no otherwise?
> >
> > Le sam. 22 févr. 2020 à 18:24, Thiede, Christoph <[hidden email]> a écrit :
> >
> >       Hi all,
> >
> >
> >       this a question regarding the Compiler's implementation. I spent some time on it but could not yet find a satisfying explanation. Maybe someone of you can help me?
> >
> >
> >       Let's look into MessageNode >> #emitCodeForCase:encoder:value:, which contains the following passage:
> >
> >
> >       (isLast and: [allReturn]) ifTrue:
> >
> >     [self emitCodeForJump: elseSize encoder: encoder]
> >
> >
> > Btw, the equivalent in #sizeCodeForCase:value: is:
> >
> >
> >       (isLast and: [allReturn]) ifTrue: [thenSize := thenSize + (self sizeCode: encoder forJump: elseSize)].
> >
> >
> > In the bytecode of a method, this will look like this:
> >
> >
> >       89 <23> pushConstant: 2
> >
> > 90 <88> dup
> >
> > 91 <76> pushConstant: 1
> >
> > 92 <B6> send: =
> >
> > 93 <9A> jumpFalse: 97
> >
> > 94 <87> pop
> >
> > 95 <22> pushConstant: #one
> >
> > 96 <7C> returnTop
> >
> > 97 <77> pushConstant: 2
> >
> > 98 <B6> send: =
> >
> > 99 <9A> jumpFalse: 103
> >
> > 100 <21> pushConstant: #two
> >
> > 101 <7C> returnTop
> >
> > 102 <90> jumpTo: 104 "here!"
> >
> > 103 <20> pushConstant: 42
> >
> > 104 <87> pop
> >
> >
> > I cannot see when this jump would be ever reached.
> >
> > Case 1: Any case is selected, and because allReturn is true, its inlined value block is guaranteed to #returnTop and thus not to reach the last jump (in the above example, this happens in pc101).
> >
> > Case 2: No case is selected, so the execution will jump straightly to the inlined otherwiseBlock (this is pc99 in the example).
> >
> > For what purpose do we need this jump (ex: pc102)? Should we maybe remove it?
> >
> >
> > Are there any special situations in which #returns is true but there is any code path that does *not* return? Some weird exception handling trick, for example?
> >
> > Or was the dubious jump just designed as a fallback to catch any hypothetic failures in any other component of the compiler system?
> >
> >
> > I removed the generation of this jump for testing and could not investigate any anomalies so far. All compiler tests are passing - except of a number of Decompiler tests, which probably would need to be updated as well.
> >
> > Please find a minimum changeset in the attachment that demonstrates my question.
> >
> >
> > I am looking forward to your replies!
> >
> >
> > Best,
> >
> > Christoph
> >
> >
> >
> >
> >
>
>

Reply | Threaded
Open this post in threaded view
|

Re: Question regarding compilation of #caseOf:[otherwise:] messages

Christoph Thiede
In reply to this post by Eliot Miranda-2

Hi Levente, hi Eliot,


What if the value of x is nil?

> [...]
> What if there are multiple branches with the same "key"?
> What if some "keys" are not literals?

In these three cases, we just can reuse the existing linear dispatching. I only thought of optimizing the case if all keys are distinct SmallIntegers (and their number is greater than two).

> What if the branches are not sorted by "key"?

Then we can sort them, or would this be a problem?


Eliot, what is your idea of indexed jumps more in detail?

Actually I thought of simply converting the "conditional heap" structure you can find in #interpretNextV3ClosuresInstructionFor: into bytecodes. Something like:


    1. pushTempVar: #x
    2. dup
    3. pushLit: 2
    4. send: #<=
    5. jumpIfFalse: 18
    6. dup
    7. pushLit: 1
    8. send: #=
    9. jumpIfFalse: 23
    10. pop
    11. push: #one
    12. jump: 25
    13. pushLit: 2
    14. send: #=
    15. jumpIfFalse: 23
    16. push: #two
    17. jump: 25
    18. pushLit: 3
    19. send: #=
    20. jumpIfFalse: 23
    21. push: #three
    22. jump: 25
    23. pushReceiver
    24. send: #caseError
    25. ...

for:

x caseOf: {[1]->[#one]. [2]->[#two]. [3]->[#three]}

Can you follow me? How would you think of this approach? :)

So the second thought, which applies both to the existing implementation and a binary dispatch is that we *badly* need to fix the debugger to not step through the case dispatch but instead step directly to the block that the case selects is to the otherwise:

What do you mean by this? At the moment, the debugger halts at each case key block, and there is another halt that is invisible (unless you use the BytecodeDebugger) before sending #caseError. I think this is quite convenient when debugging a #caseOf:otherwise: message?

Best,
Christoph



Von: Squeak-dev <[hidden email]> im Auftrag von Eliot Miranda <[hidden email]>
Gesendet: Sonntag, 23. Februar 2020 18:38:15
An: The general-purpose Squeak developers list
Betreff: Re: [squeak-dev] Question regarding compilation of #caseOf:[otherwise:] messages
 
Hi Levente,

> On Feb 22, 2020, at 3:19 PM, Levente Uzonyi <[hidden email]> wrote:
>
> On Sat, 22 Feb 2020, Thiede, Christoph wrote:
>
>> > If you already analyzed it, I trust you, you are more advanced than me on the subject :)
>> I would not say that I have an overview of the entire compiler system at the moment. Someone else should definitively review this before. :)
>> > Fixing the Decompiler might be a bit more involved the way it is written now... > But with your fast progress and fearless diving in lower level details, I think you can do it :)
>> Hm, maybe this could be some of my next adventures :-)
>> (At the moment I'm trying to optimize #caseOf: to use binary search if appropriate, let's see what will come of it ^^)
>
> When is it appropriate to use binary search there?

I like this thought!  It made me think of two things.  One would be how to implement properly a bytecode for case dispatch.  I thought of indexing a literal array of bytecode PCs.  But better is I think doing an indexed jump into a sequence of maximally sized jumps.  So one would go romething like

       bytecodes to compute an index on top of stack
       computedJump
       jump L0
       jump L1
       ...
       jump LN
       bytecodes to dispatch case error

But I think binary dispatch is much more compelling than introducing a bytecode for a very rare case (excuse the pun).

So the second thought, which applies both to the existing implementation and a binary dispatch is that we *badly* need to fix the debugger to not step through the case dispatch but instead step directly to the block that the case selects is to the otherwise:.  It’s tedious to step through the huge case in instruction generation in the JIT, and stepping though in binary dispatch order would be confusing, albeit much shorter.

>
> Levente
>
>> Best,
>> Christoph
>> __________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
>> Von: Squeak-dev <[hidden email]> im Auftrag von Nicolas Cellier <[hidden email]>
>> Gesendet: Samstag, 22. Februar 2020 18:50:26
>> An: The general-purpose Squeak developers list
>> Betreff: Re: [squeak-dev] Question regarding compilation of #caseOf:[otherwise:] messages 
>> It was just a wild guess, because I did not look at the code nor any other detail.
>> If you already analyzed it, I trust you, you are more advanced than me on the subject :)
>> Fixing the Decompiler might be a bit more involved the way it is written now...
>> But with your fast progress and fearless diving in lower level details, I think you can do it :)
>> Le sam. 22 févr. 2020 à 18:44, Thiede, Christoph <[hidden email]> a écrit :
>>
>>      Hi Nicolas,
>>
>>      > could it be for the case when there is no otherwise?
>> In this case, a "self caseError" will be generated (though this is no correct behavior either IMO, see Problems with #caseError). I don't see how this should affect the jumps or returns before the otherwise part ...
>> Best,
>> Christoph
>> __________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
>> Von: Squeak-dev <[hidden email]> im Auftrag von Nicolas Cellier <[hidden email]>
>> Gesendet: Samstag, 22. Februar 2020 18:36:01
>> An: The general-purpose Squeak developers list
>> Betreff: Re: [squeak-dev] Question regarding compilation of #caseOf:[otherwise:] messages 
>> Hi Christoph,
>> could it be for the case when there is no otherwise?
>> Le sam. 22 févr. 2020 à 18:24, Thiede, Christoph <[hidden email]> a écrit :
>>
>>      Hi all,
>>
>>      this a question regarding the Compiler's implementation. I spent some time on it but could not yet find a satisfying explanation. Maybe someone of you can help me?
>>
>>      Let's look into MessageNode >> #emitCodeForCase:encoder:value:, which contains the following passage:
>>
>>      (isLast and: [allReturn]) ifTrue:
>>     [self emitCodeForJump: elseSize encoder: encoder]
>> Btw, the equivalent in #sizeCodeForCase:value: is:
>>
>>      (isLast and: [allReturn]) ifTrue: [thenSize := thenSize + (self sizeCode: encoder forJump: elseSize)].
>> In the bytecode of a method, this will look like this:
>>
>>      89 <23> pushConstant: 2
>> 90 <88> dup
>> 91 <76> pushConstant: 1
>> 92 <B6> send: =
>> 93 <9A> jumpFalse: 97
>> 94 <87> pop
>> 95 <22> pushConstant: #one
>> 96 <7C> returnTop
>> 97 <77> pushConstant: 2
>> 98 <B6> send: =
>> 99 <9A> jumpFalse: 103
>> 100 <21> pushConstant: #two
>> 101 <7C> returnTop
>> 102 <90> jumpTo: 104 "here!"
>> 103 <20> pushConstant: 42
>> 104 <87> pop
>> I cannot see when this jump would be ever reached.
>> Case 1: Any case is selected, and because allReturn is true, its inlined value block is guaranteed to #returnTop and thus not to reach the last jump (in the above example, this happens in pc101).
>> Case 2: No case is selected, so the execution will jump straightly to the inlined otherwiseBlock (this is pc99 in the example).
>> For what purpose do we need this jump (ex: pc102)? Should we maybe remove it?
>> Are there any special situations in which #returns is true but there is any code path that does *not* return? Some weird exception handling trick, for example?
>> Or was the dubious jump just designed as a fallback to catch any hypothetic failures in any other component of the compiler system?
>> I removed the generation of this jump for testing and could not investigate any anomalies so far. All compiler tests are passing - except of a number of Decompiler tests, which probably would need to be updated as well.
>> Please find a minimum changeset in the attachment that demonstrates my question.
>> I am looking forward to your replies!
>> Best,
>> Christoph
>



Carpe Squeak!