I tried to solve the issues "13260 inspecting variables in the debugger fails in some cases" and "14058 Inconsistent information in debugger" and found some oddities in the way Opal (the compiler and the debugger map) handles temporary variables. I am not sure if these are bugs.

1. Order of temporaries:

Evaluating this method

|a b c|
a := 1.
b := 2.
c := 3.
thisContext tempNames logCr.
["in a block!" thisContext tempNames logCr. a + b + c] value.

prints in the Transcript
#(#a #b #c)
#(#a #b #c)

- ok, both contexts (the method context and the block context) use the same ordering for the temporaries.

Writing to one variable in the block context requires an indirection vector for that variable:

|a b c|
a := 1.
b := 2.
c := 3.
thisContext tempNames logCr.
[b := 1. thisContext tempNames logCr. a + b + c] value.

Here the variable b is written within the block context. This method prints
#(#b #a #c)
#(#a #c #b)

- strange: the variable order has changed and is not the same in both contexts.

2. OCOptimizedBlockScope

Consider a whileTrue: call
["A-Block"] whileTrue: ["B-Block"]
or an ifTrue:
("some boolean value") ifTrue: ["C-Block"]
Stepping into the blocks with the debugger and reading the values of a temporary variable shows that every block has an OCOptimizedBlockScope as its scope. Are all three blocks the same kind of "optimized block"?

3. Blocks within optimized blocks

This method prints the number of temporaries on the Transcript (the on:do: call is just a dummy for a "block within a block", but the important part is the use of the three variables in the do-block):

|a b c|
a := 1.
b := 2.
c := 3.
(c > 0) ifTrue: [
	[a asString] on: Error do: [:ex | a value: a value: b value: c].
].
thisContext method numTemps logCr.
a

The output on the Transcript is "3" - ok.

This method is like the one above, but the nested block is in the receiver of the whileTrue:[] call (again an optimized block, right?):

|a b c|
a := 1.
b := 2.
c := 3.
[
	[a asString] on: Error do: [:ex | a value: a value: b value: c].
	c < 0] whileTrue: [].
thisContext method numTemps logCr.
a

The output on the Transcript is "1", and it depends on the number of variables used in the do-block. The same method but with the do-block do: [:ex | a value: a value: b] would print 2 as the number of temporaries.

nicolai
> On 12 Oct 2014, at 11:49, Nicolai Hess <[hidden email]> wrote:
>
> I tried to solve the issues
> "13260 inspecting variables in the debugger fails in some cases" and
> "14058 Inconsistent information in debugger"
> and found some oddities in the way Opal (the compiler and the debugger map)
> handles temporary variables.
> I am not sure if these are bugs.
>
> 1. Order of temporaries:
> Evaluating this method
>
> |a b c|
> a := 1.
> b := 2.
> c := 3.
> thisContext tempNames logCr.
> ["in a block!" thisContext tempNames logCr. a + b + c] value.
>
> prints in the Transcript
> #(#a #b #c)
> #(#a #b #c)
>
> - ok, both contexts, the method and the block context, use the same
> ordering for the temporaries.
>
> Writing to one variable in the block context requires an indirection vector
> for that variable:
>
> |a b c|
> a := 1.
> b := 2.
> c := 3.
> thisContext tempNames logCr.
> [b := 1. thisContext tempNames logCr. a + b + c] value.
>
> Here the variable b is written within the block context. This method prints
> #(#b #a #c)
> #(#a #c #b)
>
> - strange: the variable order has changed and is not the same in both contexts.

Yes, but behind the scenes a lot is going on… When you assign to a variable in a block, this variable cannot be allocated like a normal temp. Instead an array (the temp vector) is allocated, and this is where the variable lives. Then *this array* is held in a "temp" and is accessed in the block just like the variables in the no-assignment case. This means that there cannot be an order: some variables live in arrays, others in the context. The only thing that you have is a scope that knows how a name maps to a specific offset, or to two offsets (the offset of the temp vector plus the offset inside this temp vector).

One could beautify it a bit by sorting the names according to the original definition order, though (we should look into that). But even if we do, this does not mean that b has offset 2 in the example, as it lives in an array...

> 2. OCOptimizedBlockScope
>
> Consider a whileTrue: call
> ["A-Block"] whileTrue: ["B-Block"]
> or an ifTrue:
> ("some boolean value") ifTrue: ["C-Block"]
> Stepping into the blocks with the debugger and reading the
> values of a temporary variable shows that every block has
> an OCOptimizedBlockScope as its scope.
> Are all three blocks the same kind of "optimized block"?

The first approach used only real existing scopes; that is, instead of an optimised scope, we pointed to the scope that all these optimized blocks are flattened into. But that was not nice and we could not get it to work. Instead, we opted for modelling the concept of the optimised scope: on the AST level, the block has a scope (like all blocks), but when you add an entry (or query it), it forwards to the "real" existing scope. I still like it; it makes things very explicit.

> 3. Blocks within optimized blocks
>
> This method prints the number of temporaries on the Transcript
> (the on:do: call is just a dummy for a "block within a block",
> but the important part is the use of the three variables
> in the do-block):
>
> |a b c|
> a := 1.
> b := 2.
> c := 3.
> (c > 0) ifTrue: [
> 	[a asString] on: Error do: [:ex | a value: a value: b value: c].
> ].
> thisContext method numTemps logCr.
> a
>
> The output on the Transcript is "3"
> - ok
>
> This method is like the one above, but the nested block
> is in the receiver of the whileTrue:[] call (again an optimized block, right?):
>
> |a b c|
> a := 1.
> b := 2.
> c := 3.
>
> [
> 	[a asString] on: Error do: [:ex | a value: a value: b value: c].
> 	c < 0] whileTrue: [].
> thisContext method numTemps logCr.
> a
>
> The output on the Transcript is "1",
> and it depends on the number of variables used in the do-block.
> The same method but with the do-block
> do: [:ex | a value: a value: b]
> would print 2 as the number of temporaries.

Hmm… I need to think about this one. It looks strange, yes.

Marcus
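[Editorial note] The temp-vector explanation above can be made concrete with a hedged sketch of what the compiler conceptually does to the second example. This is an illustration only; the actual bytecode and slot names differ:

```smalltalk
"conceptual rewrite of:  |a b c| a := 1. b := 2. c := 3. [b := 1. a + b + c] value."
| a c vec |
a := 1.
vec := Array new: 1.      "indirection vector; slot 1 plays the role of b"
vec at: 1 put: 2.         "b := 2"
c := 3.
[ vec at: 1 put: 1.       "b := 1 inside the block writes through the shared vector"
  a + (vec at: 1) + c ] value.
```

After this transformation the method's real temps are a, c and the vector itself; "b" has no slot of its own, which is why a single definition order for all names no longer exists.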
Hi Marcus,
On Tue, Oct 14, 2014 at 4:09 AM, Marcus Denker <[hidden email]> wrote:
Nicolai, you're probably aware of it, but there's an extensive write-up of the design on my blog: http://www.mirandabanda.org/cogblog/2008/07/24/closures-part-iii-the-compiler/ (especially the section "The Closure Analysis").

> This means that there can not be an order: some variables live in arrays, others in the context.

I disagree slightly. The Squeak compiler maintains order, both in temps and in indirect temps. But whether a temp is direct or indirect depends on a simple analysis of a single method. I still think it is part of the Smalltalk culture to KISS and e.g. maintain the order of temps.

> One could beautify it a bit by sorting the names according to the original definition order, though.

The current exception system needs to access temps, e.g. ensure:'s and ifCurtailed:'s complete and on:do:'s handlerActive. Isn't it really important to maintain ordering? It certainly makes introspection easier. One can go through a name-based interface, but that is likely to be much, much slower in what is a performance-critical part of the system.
Right.
best, Eliot
2014-10-14 17:41 GMT+02:00 Eliot Miranda <[hidden email]>:

Thank you Eliot, Marcus. Yes, I know your blog; it is a great source of information. And yes, I like the way this "special" non-scope is modelled with its own scope class.

the ifTrue: block argument.
On Tue, Oct 14, 2014 at 11:15 AM, Nicolai Hess <[hidden email]> wrote:
There's no scope at run-time, only at compile time. At compile time the BlockNodes for optimized blocks get checked for the optimized selectors (ifTrue:, whileTrue:, and:, et al.) and have a flag set if they can be optimized (i.e. have the right number of arguments). At run-time there's no evidence of an optimized block; one is left in the enclosing method/block activation.
How so? Is it the source extent that is wrong?
best, Eliot
2014-10-14 20:31 GMT+02:00 Eliot Miranda <[hidden email]>:
1. Sometimes the debugger can not inspect a single tempvar (the list of all temps is working).
2. The number of temp vars (in the method header) is wrong:

| a b c |
a := 1.
b := 2.
c := 3.
[[a asString] on: Error do: [:ex | a value: a value: b value: c].
c < 0] whileTrue.

method header numTemps: 1

| a b c |
a := 1.
b := 2.
c := 3.
c < 0 ifTrue: [
	[ a asString ] on: Error do: [ :ex | a value: a value: b value: c ].
	c < 0 ].
^ a

method header numTemps: 3

Both methods define a block with a nested block; in the first method it is the receiver of #whileTrue:, in the second method it is the argument to #ifTrue:. I think both blocks are optimized, but somehow the block receiving the whileTrue: message makes the computation of the number of tempvars wrong.
What is happening is that, in the case of loops, the variables are wrongly put in a temp vector; the one temp is this vector. This should be fixed.
I think the problem is that what we did is to mark a variable as escaping when it is *read* in an optimised block. But what we should do is to mark only the *writes* to a variable that happen in a loop as escaping, even in the case when they happen in the same scope.

This means the read is simpler:

visitVariableNode: aVariableNode
	| var |
	var := (self lookupVariableForRead: aVariableNode)
		ifNil: [ self undeclaredVariable: aVariableNode ].
	aVariableNode binding: var.
	var isUninitialized
		ifTrue: [ self uninitializedVariable: aVariableNode ].

While in the write we need to mark:

lookupVariableForWrite: aVariableNode
	| var |
	var := scope lookupVar: aVariableNode name.
	var ifNil: [ ^ var ].
	var isSpecialVariable ifTrue: [ self storeIntoSpecialVariable: aVariableNode ].
	var isWritable ifFalse: [ self storeIntoReadOnlyVariable: aVariableNode ].
	var isTemp ifTrue: [
		"when in a loop, all writes escape"
		scope outerScope isInsideOptimizedLoop ifTrue: [ var markEscapingWrite ].
		"else only escaping when they will end up in different closures"
		(var scope outerNotOptimizedScope ~= scope outerNotOptimizedScope)
			ifTrue: [ var markEscapingWrite ] ].
	^ var

I checked that all Opal tests that are failing are actually testing the wrong thing, and I double-checked that all those methods are now compiled like with the old compiler… Now recompiling the image.

Marcus
Hmm… nope… something is wrong.

Marcus
>> While in the write we need to mark:
>>
>> lookupVariableForWrite: aVariableNode
>> 	| var |
>> 	var := scope lookupVar: aVariableNode name.
>> 	var ifNil: [ ^ var ].
>> 	var isSpecialVariable ifTrue: [ self storeIntoSpecialVariable: aVariableNode ].
>> 	var isWritable ifFalse: [ self storeIntoReadOnlyVariable: aVariableNode ].
>> 	var isTemp ifTrue: [
>> 		"when in a loop, all writes escape"
>> 		scope outerScope isInsideOptimizedLoop ifTrue: [ var markEscapingWrite ].
>> 		"else only escaping when they will end up in different closures"
>> 		(var scope outerNotOptimizedScope ~= scope outerNotOptimizedScope)
>> 			ifTrue: [ var markEscapingWrite ] ].
>> 	^ var
>>
>> I checked that all Opal tests that are failing are actually testing the wrong thing,
>> and I double-checked that all those methods are now compiled like with the old compiler…
>> Now recompiling the image.
>
> Hmm… nope… something is wrong.

The "outerScope" was wrong. So without that, I can recompile the image and your example is compiled without temp vectors…

Marcus
2014-10-16 17:40 GMT+02:00 Marcus Denker <[hidden email]>:

Good! I'll open an issue on FogBugz and create a slice with your changes.

About the reordering of tempvars: it would be nice if the order were preserved, but I think we can live with it. Although there is an issue with the debugger: "14058 Inconsistent information in debugger". I solved that one by using tempvar names instead of the index, but this has another issue (sometimes the last (or most nested) tempvar is always nil). The real error is that EyeDebuggerContextInspector does not recognize when the index changes (somewhere in EyeDebuggerContextInspector>>#updateList it compares the new elements (with new indices) with the old elements list and skips the update).
On 16 Oct 2014 at 22:07:03, Nicolai Hess ([hidden email]) wrote:
Yes, we should order them. But keep in mind that this does not mean that you can access them via "at: index". The index is "virtual", so var3 can actually be var2, because the *real* var2 lives together with var1 in a tempVector (which is now var1).

Knowledge has to come from somewhere… e.g. the DebuggerMethodMap builds up a description, while we (for now) just delegate to the AST (+ scopes + semantic vars). This means that #tempAt: always sees the real world: e.g. in a method with 3 vars that all live in the tempVector, tempAt: 2 will raise an error. namedTempAt: does what the old one did. This now can get confused in the Opal case with the changing order, which we should fix. (And yes, we should check how much slower the AST-based scheme is for accessing temps by offset... we could cache a representation similar to what the old DebuggerMethodMap builds.)

Ok… that could be fixed with ordered temp names?

Marcus
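[Editorial note] The tempAt:/namedTempAt: distinction can be illustrated with a hedged sketch; it assumes a Pharo context `ctx` for a method declaring | a b c | where all three variables ended up in one temp vector, and the exact answers depend on the image and compiler version:

```smalltalk
"ctx is a context whose three declared temps all live in a single
 indirection vector"
ctx tempAt: 1.        "answers the temp vector itself - the only real temp"
ctx tempAt: 2.        "raises an error: there is no second real temp"
ctx namedTempAt: 2.   "answers the value of the second *named* temp (b),
                       looked up through the scope/map information"
```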
On Thu, Oct 16, 2014 at 11:53 PM, Marcus Denker <[hidden email]> wrote:
Not quite. What namedTempAt:[put:] do is access the temps in the order they appear in the list answered by ContextPart>>tempNames, which is "self debuggerMap tempNamesForContext: self". This is a flattened list that includes both direct and indirect temps. This is also the list that the Squeak debugger uses to populate the bottom-right context inspector in the debugger.

namedTempAt:[put:] matches tempAt:[put:] until you get to an indirect temp. In a method, the Squeak compiler always puts the indirect temps after all the direct temps, so the two match until the first indirect temp. But in a closure the situation is potentially much more complex, because indirection vectors can be part of a closure's copied values, so it may have any number of indirection vectors copied from outer scopes, its own local direct temps and its own local indirect temps. The copied values get pushed after the arguments and before the local temps. So the ordering potentially gets complicated.
Or use DebuggerMethodMap, rewriting the code to make it more comprehensible... Why reinvent the wheel?
best, Eliot
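[Editorial note] The copying Eliot describes can be observed in the image with BlockClosure>>numCopiedValues; this is a hedged example, and the exact answer can depend on the compiler version:

```smalltalk
| i blk |
i := 0.
blk := [ :x | i := i + x ].
"i is written both before and inside the block, so it must live in an
 indirection vector that is copied into blk; the block then has one
 copied value (the vector), not a private copy of i itself"
blk numCopiedValues logCr.
```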
I ran some test cases with your changes. I expected some tests to fail now, but actually they don't fail, while others do.

2014-10-16 22:06 GMT+02:00 Nicolai Hess <[hidden email]>:
--
Marcus Denker

On 17 Oct 2014 at 12:46:20, Nicolai Hess ([hidden email]) wrote:
Yes, but I think these tests are wrong (this needs to be carefully checked).
2014-10-17 13:28 GMT+02:00 Marcus Denker <[hidden email]>:
Ok, some checks:

OCASTCheckerTest

testSemanticAnalysisOnNonMethodNode
- is this a test case? what does it test?

testNoRemoteBlockArgument / testNoRemoteBlockTemp
- both look similar.

testsingleRemoteTempVarWhileWithTempNotInlined
- I would add assertions for the "temp" var.

The others look fine.

OCASTClosureAnalyzerTest

testOptimizedBlockWriteInNestedBlockCase2:
- the test methods like (OCOpalExamples>>#optimizedBlockWriteInNestedBlockCase2) aren't supposed to be runnable code, but I would change the "(true) whileTrue:" to "[true] whileTrue:" anyway.

testOptimizedBlockWriteInNestedBlockCase3
- look at the comment for:
	self assert: ast scope copiedVars size = 1.
	"Is this correct, I think that the copied vars should be empty."
  I think the number of copiedVars is right, because there is an outer block.
- the same for testOptimizedBlockWriteInNestedBlockCase4: the whileTrue is inside of a block.

In testOptimizedBlockWrittenAfterClosedOverCase1 there is a comment
	"problem: as block is optimized, this var does not need to be remote"
	self assert: (scopes third tempVector at: 'temp') isRemote.
But I think this is correct; the temp var is within a block.

The others look fine.

Changing the methods visitVariableNode: and lookupVariableForWrite: makes these tests fail:

testOptimizedBlockWrittenAfterClosedOverCase1
- it fails in: self deny: (ast scope lookupVar: 'index') isEscaping.
  I think the test is right; the var index is only used within the whileTrue block.
- the same for testOptimizedBlockWrittenAfterClosedOverCase2.

testExampleSimpleBlockLocalWhile fails; I think the test is right, but I am not sure.

testOptimizedBlockWriteInBlock fails in: self assert: ast scope tempVars size = 1.
But I think the test is right; there is only one local var, and it is written only in the whileTrue:[block].
The same in testOptimizedBlockWriteInNestedBlockCase2, although the whileTrue: is in another ifTrue:[] block.
The same for the other failing tests.
The more I tried to change/fix the compiler and to check the failing test cases, the more I learned about it. Finally I think the compiler and the test cases are mostly right and my understanding was wrong.

My (wrong) assumptions were:
- the "numTemps" in the method header is always the number of declared temps, regardless of whether they end up in an indirection vector
- the inlined ifTrue: and the inlined whileTrue: blocks are handled the same way

Both assumptions are wrong.

A method that starts with

foo
	| a b c |

and whose temps all end up in an indirection vector has only 1 tempvar, the "indirection vector", and even "numTemps" in the method header is 1.

ifTrue: and whileTrue: are handled differently. For example, in this case

|a|
a := 1.
(...) ifTrue: [
	"... assign to a ..."
	"... create a block which uses a ..." ]

the block only needs a copy of "a", as it is not changed after the creation of the block.

|a|
a := 1.
[...] whileTrue: [
	"... assign to a ..."
	"... create a block which uses a ..." ]

Here we need an indirection vector, because the while-block *can* loop repeatedly and change the value of "a" after a block was created.

If this is all correct, there is only one problem in this special case (my example above): if there is no assignment within the loop at all, there is no need to create an indirection vector, and all used vars (a b c) can just be copied.

Or am I still confused?!

nicolai
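[Editorial note] The whileTrue: case above can be made concrete with a small hedged experiment showing why writes inside a loop force sharing rather than copying:

```smalltalk
| a blocks |
a := 0.
blocks := OrderedCollection new.
[ a < 3 ] whileTrue: [
	a := a + 1.
	blocks add: [ a ] ].
"Smalltalk closure semantics require all three blocks to share the
 same a, so each block should answer its final value, 3"
blocks collect: [ :b | b value ].
```

If "a" were merely copied into each block at creation time, the blocks would answer 1, 2 and 3; the indirection vector is what implements the shared, mutable binding the language semantics demand.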
On Tue, Oct 21, 2014 at 1:02 PM, Nicolai Hess <[hidden email]> wrote:
Ah, no. It is the sum of the arguments and the local temporaries. So it is the position to set the stack pointer to on activating a method.
Right.
Right. In general the compiler cannot tell whether the whileTrue: will be executed 0, 1 or many times. In the cases where it could tell, it would take lots of machinery to do so, and it would only be able to tell in relatively few cases. So the compiler always assumes the whileTrue: is evaluated more than once, and hence that it is possible for a block to be created and "a" modified in a subsequent iteration of the loop. Hence it must put "a" in an indirection vector.
No you're not. You get it now :-). One question is whether you can explain it all better than I did on my blog, because people seem to find this area of the system hard and take a while to "get it".
best, Eliot
2014-10-22 0:45 GMT+02:00 Eliot Miranda <[hidden email]>:
@eliot Your blog contains all the information needed. But if someone don't have already a deep knowledge of the compiler and VM-internals, it can be a bit overwhelming. And I must admit I have overread or skipped some parts, because I thought they are not important or because I didn't understand :) And now I found the place in the (old) compiler, where the blocks in optimized loops are handled. You search for assignments in a loop and if one is found, restart the "analyseClosure" as if the write had happened after the close over. Now I need to find a way to do the same with opal compiler. It is (and looks a lot) different than the old compiler. It tags the variables with a usage marker #write, #read and an "escaping" flag. And it does only one pass over the tree. But this way, it can not distinguish between a write that happens before entering the loop from a write happening within the loop. @marcus I propese an extra usage tag #writeInLoop, that is only set on assignments within a loop. And then, mark an "escaping" var in an optimized loop only as "escapingWrite", if it is marked as "writeInLoop".
Yes! That should solve it.