The Inbox: EToys-ct.367.mcz

The Inbox: EToys-ct.367.mcz

commits-2
A new version of EToys was added to project The Inbox:
http://source.squeak.org/inbox/EToys-ct.367.mcz

==================== Summary ====================

Name: EToys-ct.367
Author: ct
Time: 15 October 2019, 2:46:24.862129 pm
UUID: 1394344f-b1e3-5640-a13a-70c5dffd51f4
Ancestors: EToys-mt.361

Allow for embedding SyntaxMorphs into test tiles.

=============== Diff against EToys-mt.361 ===============

Item was added:
+ ----- Method: SyntaxMorph>>parseNodeWith:asStatement: (in category '*Etoys-Squeakland-code generation') -----
+ parseNodeWith: encoder asStatement: aBoolean
+
+ ^ self parseNode!


[Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles

Christoph Thiede

Hi all,


I'm currently trying to implement #parseNodeWith:asStatement: on SyntaxMorph, in order to embed SyntaxMorphs into regular tiles. (Did this ever work in the past?)

I'm afraid the attempt in the commit above (EToys-ct.367) does not work yet: you can create a script editor, but parsing produces a wrong result, so you cannot execute the script.


To reproduce:

Compile the following:

MyPlayer >> examplePlayerCode
	self forward: 6 * 7.
	self turn: (11 raisedTo: 13 modulo: 97)

and evaluate:

| e p |
p := Morph new openInWorld assuredPlayer.
e := (MyPlayer >> #examplePlayerCode) decompile asScriptEditorFor: p.
e openInHand.


In Player>>#acceptScript:for:, #generate: is sent to the node, and when I decompile the resulting method, the output is strange:

examplePlayerCodeTest
	self forward: 6 * 7.
	self forward: (#forward: forward: #forward:).


I don't know how to use #generate: exactly, but other senders usually appear to recompile a method before passing it to #generate:.

For comparison:

[ (Collection >> #asArray) decompile generate: CompiledMethodTrailer empty ] fails, but

[ m := (Collection >> #asArray) decompile.
  m := Compiler new compile: m in: Collection notifying: nil ifFail: #foo.
  m generate: CompiledMethodTrailer empty ] works.

Why is recompilation required, and why is decompilation alone insufficient? Is this a bug, or is it expected behavior?

However, in the case of SyntaxMorph, I don't know how to recompile the node beforehand, since a SyntaxMorph should be able to represent a node of an arbitrary type, not just a MessageNode. So how can I generate code from SyntaxMorphs?


tl;dr: What is the full story of #generate: and how can it be made to work in this example?

Many thanks in advance! :-)


Best,

Christoph



Carpe Squeak!

Re: [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles

Christoph Thiede

Hi all! :-)

Just an update on the decompilation question:

Christoph Thiede wrote:
> I don't know how to use #generate: exactly, but other senders usually appear to recompile a method before passing it to #generate:.
> For comparison:
>
> [ (Collection >> #asArray) decompile generate: CompiledMethodTrailer empty ] fails, but
>
> [ m := (Collection >> #asArray) decompile.
>   m := Compiler new compile: m in: Collection notifying: nil ifFail: #foo.
>   m generate: CompiledMethodTrailer empty ] works.
>
> Why is that recompilation required but decompilation is insufficient? Is this some bug, or is it expected behavior?

The general approach seems to be correct, but I think I found an error in the decompilation of literal variables such as Array. I sent Compiler-ct.425 to the inbox, which should fix this issue.

I am going to complete the implementation of SyntaxMorph >> #parseNode :-)

Best,
Christoph


Re: [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles

Eliot Miranda-2
Hi Christoph,

On Mar 27, 2020, at 12:45 PM, Thiede, Christoph <[hidden email]> wrote:



> The general approach seems to be correct, but I think I found an error in the decompilation of literal variables such as Array. I sent Compiler-ct.425 to the inbox which should fix this issue.

I moved this to inbox.  It looks correct.  Can you check it against the old bytecode set too?  We don’t want it to break old-style blocks.



Re: [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles

Christoph Thiede

Hi Eliot,


> It looks correct.  Can you check it against the old bytecode set too?  We don’t want it to break old-style blocks.


Good point. I ran

(Object >> #asOrderedCollection) decompile generate valueWithReceiver: 42 arguments: #().

for both bytecode sets, and both were fine.

But:

(Collection >> #asArray) decompile generate valueWithReceiver: {42} asOrderedCollection arguments: #().

breaks - in both bytecode sets. This is weird.
I will have a look into it, maybe I can discover what's wrong.

In addition, I propose to write tests for this. But isn't it a non-goal of the decompiler to yield exactly the same parse tree or source code as the original method? If so, we will need to write a lot of fixtures for the tests.
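
One way to test this without fixtures might be to compare behaviour rather than source: run the original method and the decompiled + regenerated one against the same receiver and compare the results. A minimal sketch as an SUnit test method follows. The test name and the TestCase subclass it would live in are hypothetical; the selectors sent are the ones already used in this thread.

```smalltalk
testRegeneratedMethodBehavesLikeOriginal
	"Hypothetical test method; the surrounding TestCase subclass is assumed.
	 Compares the result of the original method with the result of the
	 decompiled and regenerated method for the same receiver."
	| original regenerated |
	original := Object >> #asOrderedCollection.
	regenerated := original decompile generate.
	self
		assert: (original valueWithReceiver: 42 arguments: #())
		equals: (regenerated valueWithReceiver: 42 arguments: #())
```

Such a test only checks observable behaviour for one receiver, so it would not notice differences in the generated bytecode itself.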

Best,
Christoph





Decompiler buggy (was: AW: [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles)

Christoph Thiede

Hi Eliot, hi all,


ah, I finally found the bug, but this was a really hard hunt! :D


The solution is absolutely simple, again:

codeAnySelector: selector

	^SelectorNode new
		key: selector
-		index: 0
+		index: nil
		type: SendType


Seriously, did the Decompiler ever reliably produce re-generatable parse trees in the past? But it should do so, shouldn't it? :-)


Before the above patch, the following example was broken, too:

class := Object newSubclass.
class compile: 'foo ^ 1 + 1'.
(class >> #foo) decompile generate valueWithReceiver: class new arguments: #(). "SmallInteger does not understand #foo"

Now I'm wondering what the actual semantics of the index variable are. Its comment about "various uses depending on the class of the receiver" is quite generic. Do you know any more details about this?
Should we also use nil instead of 0 in DecompilerConstructor >> #codeAnyLiteral:? At first glance, senders of #encodeLiteral: do not appear to set it to zero manually (so they leave it nil). But without any documentation of what index means, this is only speculation, as I could not find any other example where decompilation + regeneration produces a method that cannot be executed properly.

By the way, here is another interesting one-liner:

(Object newSubclass environment: self environment; compile: 'foo ^(ObjectTracer on: nil) class'; >> #foo) decompile generate valueWithReceiver: nil arguments: #()

Interestingly, it opens a debugger. In other words, #class is sent as a regular selector. The decompiler does not know anything about special selectors at the moment. Is this desired behavior? I wonder whether it should be the parse tree's responsibility to apply this kind of optimization, rather than the Compiler's.
After all, the Compiler is not the only client that requests code generation from parse trees. Etoys is a good example of a client from another domain that uses this service, too. Should these important optimizations of Smalltalk expressions be withheld from all these other clients?
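
Whether a given selector counts as a special selector can be checked in the image. The workspace sketch below assumes that `Smalltalk specialSelectors` answers an Array of alternating selectors and argument counts; this is an assumption about the image in use and may differ between versions.

```smalltalk
"Assumption: Smalltalk specialSelectors answers an Array of
 alternating selectors and argument counts."
| specials |
specials := Smalltalk specialSelectors.
specials includes: #class.	"print it: is #class a special selector in this image?"
specials includes: #foo.	"print it: an ordinary selector, for comparison"
```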

Best,
Christoph



Re: Decompiler buggy (was: AW: [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles)

Nicolas Cellier
Hi Christoph,

On Sat, 28 Mar 2020 at 01:12, Thiede, Christoph <[hidden email]> wrote:

> Hi Eliot, hi all,
>
> ah, I finally found the bug, but this was a really hard hunt! :D
>
> The solution is absolutely simple, again:
>
> codeAnySelector: selector
>
>     ^SelectorNode new
>         key: selector
> -       index: 0
> +       index: nil
>         type: SendType


Good find!

> Seriously, did the Decompiler ever reliably produce re-generatable parse trees in the past? But it should do so, shouldn't it? :-)


Maybe it did (see below). But I'm not sure that it was a feature...
Isn't it mostly used for replacing absent source code... that will eventually be reparsed? (!)

> Before the above patch, the following example was broken, too:
>
> class := Object newSubclass.
> class compile: 'foo ^ 1 + 1'.
> (class >> #foo) decompile generate valueWithReceiver: class new arguments: #(). "SmallInteger does not understand #foo"
>
> Now I'm wondering what are the actual semantics of the index variable. Its method comment about "various uses depending on the class of the receiver" is quite generic - do you know some more details about this?
> Should we also use nil instead of 0 in DecompilerConstructor >> #codeAnyLiteral:? At first glance, senders of #encodeLiteral: do not appear to set it to zero manually (so they leave it nil), but unless there is any documentation of the index meaning, this is speculation only, as I could not find any other example where decompilation + regeneration produce a method that cannot be executed properly.

It's very low level, a kind of reflection of the bytecode encoding.
Once upon a time (< Squeak 4.0), the code was even more horrible to follow!

LeafNode>>key: object index: i type: type
    self key: object code: (self code: i type: type)

LeafNode>>code: index type: type
    index isNil
        ifTrue: [^type negated].
    (CodeLimits at: type) > index
        ifTrue: [^(CodeBases at: type) + index].
    ^type * 256 + index

As you can see, index i is passed as the argument to the #code: keyword (why? because the keyword documents the output, not the input);
then the index parameter of #code:type: shadows the index instance variable...
And the index instance variable was not set... Kind of a brainfuck.

We still have #code:type: and the index variable shadowing in current trunk...
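
To see what the quoted method computes, its logic can be replayed in a workspace. This is only a sketch: the two tables and the value standing in for SendType are assumptions chosen for illustration and may not match any particular image, and codeFor is a hypothetical block name.

```smalltalk
"Workspace sketch of the pre-4.0 LeafNode>>code:type: logic quoted above.
 The table contents and sendType := 5 are assumptions for illustration only."
| codeBases codeLimits sendType codeFor |
sendType := 5.
codeBases := #(0 16 32 64 208).
codeLimits := #(16 16 32 32 16).
codeFor := [:index :type |
	index isNil
		ifTrue: [type negated]
		ifFalse: [(codeLimits at: type) > index
			ifTrue: [(codeBases at: type) + index]
			ifFalse: [type * 256 + index]]].
codeFor value: nil value: sendType.	"print it: -5 with the tables assumed here"
codeFor value: 0 value: sendType.	"print it: 208 with the tables assumed here"
```

With a nil index the code comes out negative, which presumably marks a leaf whose literal slot still has to be reserved by the encoder. With index 0 the code already looks like a resolved reference to slot 0, which would fit the symptoms seen before the patch.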

> By the way, here is another interesting one-liner:
>
> (Object newSubclass environment: self environment; compile: 'foo ^(ObjectTracer on: nil) class'; >> #foo) decompile generate valueWithReceiver: nil arguments: #()
>
> Interestingly, it opens a debugger - in other words, #class is sent as a regular selector. The decompiler does not know anything about special selectors at the moment. Is this desired behavior? I wonder whether it should be the parse tree's responsibility to install such kind of optimizations, rather than the responsibility of the Compiler.
> Because in reality, Compiler is not the only client that requests code generation from parse trees. Etoys is a good example for a client from another domain that uses this service, too. Should all these other clients be withheld these important optimizations of Smalltalk expressions?

After parsing, there are other compilation phases, for analyzing variable scope, clean blocks, etc.
It's possible to scatter the implementation of the various phases across the nodes themselves, but the trend is rather to use a visitor pattern;
it gathers the handling in specialized classes that hold all the state (rather than passing it as message arguments).
The Pharo team did a complete re-engineering of the compiler (OpalCompiler) that you could study.

Best,
Christoph


Von: Thiede, Christoph
Gesendet: Freitag, 27. März 2020 23:16 Uhr
An: The general-purpose Squeak developers list
Betreff: AW: [squeak-dev] [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles
 

Hi Eliot,


It looks correct.  Can you check it against the old bytecode set too?  We don’t want it to break old-style blocks.


Good point. I ran

(Object >> #asOrderedCollection) decompile generate valueWithReceiver: 42 arguments: #().

for both bytecode sets, and both were fine.

But:

(Collection >> #asArray) decompile generate valueWithReceiver: {42} asOrderedCollection arguments: #().

breaks - in both bytecode sets. This is weird.
I will have a look into it, maybe I can discover what's wrong.

In addition, I propose to write tests for this. But it's not the goal of the decompiler to yield exactly the same parse tree or source code as the original method consisted of? In this case, we will need to write a lot of fixtures for the tests.

Best,
Christoph




Von: Squeak-dev <[hidden email]> im Auftrag von Eliot Miranda <[hidden email]>
Gesendet: Freitag, 27. März 2020 21:33 Uhr
An: The general-purpose Squeak developers list
Betreff: Re: [squeak-dev] [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles
 
Hi Christoph,

On Mar 27, 2020, at 12:45 PM, Thiede, Christoph <[hidden email]> wrote:



Hi all! :-)

Just an update of the decompilation question:
Christoph Thiede wrote
I don't know how to use #generate: exactly, but other senders usually appear to recompile a method before passing it to #generate:.
For comparison:

[ (Collection >> #asArray) decompile generate: CompiledMethodTrailer empty ] fails, but

[ m := (Collection >> #asArray) decompile.

  m := Compiler new compile: m in: Collection notifying: nil ifFail: #foo.
  m generate: CompiledMethodTrailer empty ] works.
Why is that recompilation required but decompilation is insufficient? Is this some bug, or is it expected behavior?
The general approach seems to be correct, but I think I found an error in the decompilation of literal variables such as Array. I sent Compiler-ct.425 to the inbox which should fix this issue.

I moved this to inbox.  It looks correct.  Can you check it against the old bytecode set too?  We don’t want it to break old-style blocks.


I am going to complete the implementation of SyntaxMorph >> #parseNode :-)

Best,
Christoph


Re: Decompiler buggy (was: AW: [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles)

Christoph Thiede

Hey Nicolas,


> > Seriously, did the Decompiler ever reliably produce re-generatable parse trees in the past? But it should do so, shouldn't it? :-)

> Maybe it did (see below). But I'm not sure that it was a feature...
> Isn't it mostly used for replacing absent source code... that will eventually be reparsed? (!)

Well, it may be debatable whether decompiled trees should be optimized, but returning trees from anywhere that do not satisfy particular validity conditions (such as index being set if and only if a special selector should be encoded) definitely appears wrong and buggy to me.
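
The property at stake can be phrased as a round-trip check. The following is a minimal workspace sketch, not a definitive test: it uses only expressions that already appear in this thread (#decompile, #generate, #valueWithReceiver:arguments:), and the throwaway class and its #foo method are purely illustrative.

```smalltalk
| class original regenerated |
"Throwaway class with one trivial method (illustrative only)."
class := Object newSubclass.
class compile: 'foo ^ 1 + 1'.
original := class >> #foo.
"Decompile to a parse tree and generate a fresh CompiledMethod from it."
regenerated := original decompile generate.
"Both methods should behave identically if the decompiled tree is valid."
(original valueWithReceiver: class new arguments: #())
	= (regenerated valueWithReceiver: class new arguments: #())
```

If the decompiled tree is re-generatable, the last expression should answer true; with the index: 0 bug discussed in this thread, evaluating the regenerated method raised a doesNotUnderstand error instead.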

> As you see, index i passed as argument to #code: keyword (? it's because it's documenting the output, not the input); then code: parameter shadowing the index instance variable...

So would you agree to patch DecompilerConstructor >> #codeAnyLiteral:, too? :)

> After parsing, there are other compilation phases, for analyzing variable scope, clean blocks, etc...

Hm ... the Compiler divides compilation into two main stages (see #evaluateCue:ifFail:): The first stage is the actual "compilation", that is, translating the source into a parse tree in the parser. The second stage generates a compiled method, which is done by simply sending #generate(WithTempNames) to the parse tree. To me, this appears to be a good logical separation.
Things like scope analysis are done, as you say, in the second stage, of course. But I would not expect optimizations such as special selectors to be applied already in the first stage (this was also kind of confusing when I tried to debug certain optimizations such as that of #caseOf:). Isn't it the general idea of a parse tree to be an intermediate representation between a plain source string and a VM-specific set of bytecodes? Certain optimizations are not even relevant for other parser clients, for example, code analysis tools.
It would be great to decouple these stages even more - let's say, we could move all the #noteSpecialSelector: and #transform: senders and apply them directly before, or inside of, #generate only. A visitor sounds like a good pattern for this; I have not yet had many opportunities to apply this pattern in practice.
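
The two stages described above can be walked through by hand in a workspace. This is only a sketch under assumptions: it assumes that Parser>>parse:class: is available in your image and accepts a read stream; #generate and #valueWithReceiver:arguments: are used as in the other examples in this thread.

```smalltalk
| tree method |
"Stage 1: source -> parse tree (a MethodNode).
 Assumption: Parser>>parse:class: exists in your image."
tree := Parser new parse: 'foo ^ 1 + 1' readStream class: Object.
"Stage 2: parse tree -> CompiledMethod."
method := tree generate.
"Run the generated method without installing it in any class."
method valueWithReceiver: Object new arguments: #()
```

Inspecting tree between the two stages shows which decisions (such as special selector handling) have already been made by the parser before #generate is ever sent.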

> Pharo team did a complete re-engineering of compiler (OpalCompiler) that you could study.

Wow, I read some slides about OpalCompiler and it sounds great! Allow me one question: why haven't we already adopted this concept in Squeak? What are the disadvantages of this redesign? We could achieve so much more if everyone was pulling in the same direction (I know that it was the Pharo people who forked Squeak, however ...).

Best,
Christoph


From: Squeak-dev <[hidden email]> on behalf of Nicolas Cellier <[hidden email]>
Sent: Saturday, March 28, 2020 14:09:15
To: The general-purpose Squeak developers list
Subject: Re: [squeak-dev] Decompiler buggy (was: AW: [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles)

Hi Christoph,

On Sat, Mar 28, 2020 at 01:12, Thiede, Christoph <[hidden email]> wrote:

Hi Eliot, hi all,


ah, I finally found the bug, but this was a really hard hunt! :D


The solution is absolutely simple, again:


  codeAnySelector: selector
      ^SelectorNode new
          key: selector
-         index: 0
+         index: nil
          type: SendType


Good find!

Seriously, did the Decompiler ever reliably produce re-generatable parse trees in the past? But it should do so, shouldn't it? :-)


Maybe it did (see below). But I'm not sure that it was a feature...
Isn't it mostly used for replacing absent source code... that will eventually be reparsed? (!)

Before the above patch, the following example was broken, too:

class := Object newSubclass.
class compile: 'foo ^ 1 + 1'.
(class >> #foo) decompile generate valueWithReceiver: class new arguments: #(). "SmallInteger does not understand #foo"

Now I'm wondering what are the actual semantics of the index variable. Its method comment about "various uses depending on the class of the receiver" is quite generic - do you know some more details about this? 
Should we also use nil instead of 0 in DecompilerConstructor >> #codeAnyLiteral:? At first glance, senders of #encodeLiteral: do not appear to set it to zero manually (so they leave it nil), but unless there is any documentation of the index meaning, this is speculation only, as I could not find any other example where decompilation + regeneration produce a method that cannot be executed properly.

It's very low level, some kind of reflection of bytecode encoding.
Once upon a time (< Squeak4.0), the code was even more horrible to follow!

LeafNode>>key: object index: i type: type
    self key: object code: (self code: i type: type)

LeafNode>>code: index type: type
    index isNil
        ifTrue: [^type negated].
    (CodeLimits at: type) > index
        ifTrue: [^(CodeBases at: type) + index].
    ^type * 256 + index

As you see, index i passed as argument to #code: keyword (? it's because it's documenting the output, not the input);
then code: parameter shadowing the index instance variable...
And the index instance variable was not set... Kind of brainfuck.

We still have code:type: and index variable shadowing in current trunk...

By the way, here is another interesting one-liner:

(Object newSubclass environment: self environment; compile: 'foo ^(ObjectTracer on: nil) class'; >> #foo) decompile generate valueWithReceiver: nil arguments: #()

Interestingly, it opens a debugger - in other words, #class is sent as a regular selector. The decompiler does not know anything about special selectors at the moment. Is this desired behavior? I wonder whether it should be the parse tree's responsibility to install this kind of optimization, rather than the responsibility of the Compiler.
In reality, the Compiler is not the only client that requests code generation from parse trees. Etoys is a good example of a client from another domain that uses this service, too. Should all these other clients be denied these important optimizations of Smalltalk expressions?

After parsing, there are other compilation phases, for analyzing variable scope, clean blocks, etc...
It's possible to scatter the implementation of the various phases across the nodes themselves, but the trend is rather to use a visitor pattern;
it gathers the handling in some specialized classes that hold all the state (rather than passing it as message arguments).
Pharo team did a complete re-engineering of compiler (OpalCompiler) that you could study.

Best,
Christoph


From: Thiede, Christoph
Sent: Friday, March 27, 2020 23:16
To: The general-purpose Squeak developers list
Subject: AW: [squeak-dev] [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles
 

Hi Eliot,


It looks correct.  Can you check it against the old bytecode set too?  We don’t want it to break old-style blocks.


Good point. I ran

(Object >> #asOrderedCollection) decompile generate valueWithReceiver: 42 arguments: #().

for both bytecode sets, and both were fine.

But:

(Collection >> #asArray) decompile generate valueWithReceiver: {42} asOrderedCollection arguments: #().

breaks - in both bytecode sets. This is weird.
I will have a look into it, maybe I can discover what's wrong.

In addition, I propose to write tests for this. But it's not the goal of the decompiler to yield exactly the same parse tree or source code as the original method, is it? In that case, we will need to write a lot of fixtures for the tests.

Best,
Christoph




From: Squeak-dev <[hidden email]> on behalf of Eliot Miranda <[hidden email]>
Sent: Friday, March 27, 2020 21:33
To: The general-purpose Squeak developers list
Subject: Re: [squeak-dev] [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles
 
Hi Christoph,

On Mar 27, 2020, at 12:45 PM, Thiede, Christoph <[hidden email]> wrote:



Hi all! :-)

Just an update of the decompilation question:
Christoph Thiede wrote
I don't know how to use #generate: exactly, but other senders usually appear to recompile a method before passing it to #generate:.
For comparison:

[ (Collection >> #asArray) decompile generate: CompiledMethodTrailer empty ] fails, but

[ m := (Collection >> #asArray) decompile.

  m := Compiler new compile: m in: Collection notifying: nil ifFail: #foo.
  m generate: CompiledMethodTrailer empty ] works.
Why is that recompilation required but decompilation is insufficient? Is this some bug, or is it expected behavior?
The general approach seems to be correct, but I think I found an error in the decompilation of literal variables such as Array. I sent Compiler-ct.425 to the inbox which should fix this issue.

I moved this to inbox.  It looks correct.  Can you check it against the old bytecode set too?  We don’t want it to break old-style blocks.


I am going to complete the implementation of SyntaxMorph >> #parseNode :-)

Best,
Christoph

Carpe Squeak!

Re: Decompiler buggy (was: AW: [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles)

Nicolas Cellier


On Sun, Mar 29, 2020 at 15:31, Thiede, Christoph <[hidden email]> wrote:


> Wow, I read some slides about OpalCompiler and it sounds great! Allow me one question: why haven't we already adopted this concept in Squeak? What are the disadvantages of this redesign? We could achieve so much more if everyone was pulling in the same direction (I know that it was the Pharo people who forked Squeak, however ...).


One disadvantage is that it's about twice as slow.
This is because bytecode instructions are reified.
Thus, instead of source -> parse tree -> compiledMethod,
the flow is source -> parse tree -> instructions -> compiledMethod.

It would be possible to make the intermediate instruction representation optional, though.

The second disadvantage is that, IMO, it's a bit over-engineered.
One consequence is that patching the compiler to accept methods with more than 15 arguments (required for Smallapack),
or to accept legacy FFI syntax, took me more effort than patching the legacy Squeak compiler.

However, it's a nice piece of code.
It should not be too hard to port to Squeak.
It does rely on revamped parse tree nodes, though (those of the Refactoring Browser, with slight evolutions...).



Re: Decompiler buggy (was: AW: [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles)

Eliot Miranda-2
In reply to this post by Nicolas Cellier
Hi Christoph,

    please read what I'm about to say carefully.  This message is aimed at you :-)

On Sat, Mar 28, 2020 at 6:09 AM Nicolas Cellier <[hidden email]> wrote:
> It's very low level, some kind of reflection of bytecode encoding.
> Once upon a time (< Squeak4.0), the code was even more horrible to follow!
>
> LeafNode>>key: object index: i type: type
>     self key: object code: (self code: i type: type)
>
> LeafNode>>code: index type: type
>     index isNil
>         ifTrue: [^type negated].
>     (CodeLimits at: type) > index
>         ifTrue: [^(CodeBases at: type) + index].
>     ^type * 256 + index

Exactly.  This is actually obsolete genius by Dan Ingalls.  If you have a look at the original Smalltalk-80 bytecode compiler you'll see that the parse tree nodes both represent the parse tree *and* generate the output bytecodes.  This was really important on 16-bit Smalltalk-80 since it meant that the bytecode compiler was extremely compact and concise.  Objects were in extremely short supply: 32k objects in a normal implementation (with 15-bit SmallIntegers), and 48k objects in a "stretch" implementation that had 14-bit SmallIntegers.

Now that we have 32-bit and 64-bit implementations this concision is obsolete and what we need is flexibility and clarity.

I had done some reimplementation work on the bytecode compiler in 2009 to add the closure bytecodes, and to add a proper code generation back end in the BytecodeEncoder framework, but I never finished the cleanup. The index and code inst vars in the LeafNode hierarchy are vestiges of the old implementation.  It would be really good to get rid of the code inst var altogether and to be left only with index, with index being the literal index for literal nodes (perhaps with negative indices being used for special selectors), the inst var index for inst var nodes, the temp var offset for temp var nodes, etc.
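
To make the relationship between index and code concrete, here is a workspace sketch that replays the arithmetic of the old #code:type: quoted above. The table contents and the value of the send type are assumptions chosen for illustration; check ParseNode's class-side initialization in your image for the real values.

```smalltalk
| sendType codeBases codeLimits encode |
"Assumed values, for illustration only."
sendType := 5.
codeBases := #(0 16 32 64 208).
codeLimits := #(16 16 32 32 16).
"Same three cases as the quoted LeafNode>>code:type:."
encode := [:index :type |
	index isNil
		ifTrue: [type negated]
		ifFalse: [(codeLimits at: type) > index
			ifTrue: [(codeBases at: type) + index]
			ifFalse: [type * 256 + index]]].
encode value: nil value: sendType. "-5: no code assigned yet"
encode value: 0 value: sendType. "208: index 0 is taken literally, base + 0"
encode value: 20 value: sendType. "1300: beyond the limit, 5 * 256 + 20"
```

Under these assumptions, index: nil and index: 0 lead to very different codes: nil marks the node as not yet encoded, while 0 is treated as a real index. That would be consistent with the SelectorNode patch earlier in this thread, where replacing index: 0 by index: nil fixed the regenerated methods.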

But this really needs someone with fresh eyes and energy.  My plate is full.  When I did think of doing this I realized that it is probably wise to clone the compiler altogether and do the development and testing work in the clone before moving it back to LeafNode et al for the first functional commit.  This is to avoid breaking the compiler while trying to fix it.

So Christoph, do you accept my challenge and will you try to eliminate the code inst var from LeafNode?

 

As you see, index i passed as argument to #code: keyword (? it's because it's documenting the output, not the input);
then code: parameter shadowing the index instance variable...
And the index instance variable was not set... Kind of brainfuck.

We still have code:type: and index variable shadowing in current trunk...

By the way, here is another interesting one-liner:

(Object newSubclass environment: self environment; compile: 'foo ^(ObjectTracer on: nil) class'; >> #foo) decompile generate valueWithReceiver: nil arguments: #()

Interestingly, it opens a debugger - in other words, #class is sent as a regular selector. The decompiler does not know anything about special selectors at the moment. Is this desired behavior? I wonder whether it should be the parse tree's responsibility to install such kind of optimizations, rather than the responsibility of the Compiler.
Because in reality, Compiler is not the only client that requests code generation from parse trees. Etoys is a good example for a client from another domain that uses this service, too. Should all these other clients be withheld these important optimizations of Smalltalk expressions?

After parsing, there are other compilation phases, for analyzing variable scope, clean blocks, etc...
It's possible to scatter the implementation of various phases in the nodes themselves, but the trend is rather to use a visitor pattern;
it gather the handling in some specialized classes that hold all the states (rather than pass them as message arguments).
Pharo team did a complete re-engineering of compiler (OpalCompiler) that you culd study.

Best,
Christoph


Von: Thiede, Christoph
Gesendet: Freitag, 27. März 2020 23:16 Uhr
An: The general-purpose Squeak developers list
Betreff: AW: [squeak-dev] [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles
 

Hi Eliot,


It looks correct.  Can you check it against the old bytecode set too?  We don’t want it to break old-style blocks.


Good point. I ran

(Object >> #asOrderedCollection) decompile generate valueWithReceiver: 42 arguments: #().

for both bytecode sets, and both were fine.

But:

(Collection >> #asArray) decompile generate valueWithReceiver: {42} asOrderedCollection arguments: #().

breaks - in both bytecode sets. This is weird.
I will have a look into it, maybe I can discover what's wrong.

In addition, I propose to write tests for this. But it is not the goal of the decompiler to yield exactly the same parse tree or source code as the original method, is it? If not, we will need to write a lot of fixtures for the tests.
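
A minimal sketch of what such a test could look like if it compares behavior instead of parse trees or source text, which would avoid most fixtures. It only uses the messages already shown in this thread; the throwaway class and the variable names are made up for the example.

```smalltalk
| class original regenerated |
"Throwaway class so that nothing else in the image is touched."
class := Object newSubclass.
class compile: 'foo ^ 1 + 1'.
original := class >> #foo.
"Decompile to a parse tree and generate a fresh CompiledMethod from it."
regenerated := original decompile generate.
"Compare the results of running both methods, not their source or parse trees."
self assert: (regenerated valueWithReceiver: class new arguments: #())
	= (original valueWithReceiver: class new arguments: #()).
```

The same shape could be looped over a list of existing side-effect-free methods, with a suitable receiver for each.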

Best,
Christoph




From: Squeak-dev <[hidden email]> on behalf of Eliot Miranda <[hidden email]>
Sent: Friday, 27 March 2020, 21:33
To: The general-purpose Squeak developers list
Subject: Re: [squeak-dev] [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles
 
Hi Christoph,

On Mar 27, 2020, at 12:45 PM, Thiede, Christoph <[hidden email]> wrote:



Hi all! :-)

Just an update of the decompilation question:
Christoph Thiede wrote
I don't know how to use #generate: exactly, but other senders usually appear to recompile a method before passing it to #generate:.
For comparison:

[ (Collection >> #asArray) decompile generate: CompiledMethodTrailer empty ] fails, but

[ m := (Collection >> #asArray) decompile.

  m := Compiler new compile: m in: Collection notifying: nil ifFail: #foo.
  m generate: CompiledMethodTrailer empty ] works.
Why is that recompilation required but decompilation is insufficient? Is this some bug, or is it expected behavior?
The general approach seems to be correct, but I think I found an error in the decompilation of literal variables such as Array. I sent Compiler-ct.425 to the inbox which should fix this issue.

I moved this to inbox.  It looks correct.  Can you check it against the old bytecode set too?  We don’t want it to break old-style blocks.


I am going to complete the implementation of SyntaxMorph >> #parseNode :-)

Best,
Christoph

From: Squeak-dev <[hidden email]> on behalf of Thiede, Christoph
Sent: Tuesday, 15 October 2019, 21:08:24
To: [hidden email]
Subject: [squeak-dev] [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles
 

Hi all,


I'm currently trying to implement #parseNodeWith: on SyntaxMorph, in order to embed SyntaxMorphs into regular tiles. (Did this ever work in the past?)

I'm afraid the attempt in the commit below does not work yet; you can create a script editor, but parsing is erroneous, so you cannot execute the script.


To reproduce:

Compile the following:

MyPlayer >> examplePlayerCode

self forward: 6 * 7.

self turn: (11 raisedTo: 13 modulo: 97)

and evaluate:

| e p |
p := Morph new openInWorld assuredPlayer.
e := (MyPlayer >> #examplePlayerCode) decompile asScriptEditorFor: p.
e openInHand.


In Player>>#acceptScript:for:, #generate: is called on node, and when I decompile the result, I get a strange result:


examplePlayerCodeTest

self forward: 6 * 7.

self

forward: (#forward: forward: #forward:).


I don't know how to use #generate: exactly, but other senders usually appear to recompile a method before passing it to #generate:.

For comparison:

[ (Collection >> #asArray) decompile generate: CompiledMethodTrailer empty ] fails, but

[ m := (Collection >> #asArray) decompile.

  m := Compiler new compile: m in: Collection notifying: nil ifFail: #foo.
  m generate: CompiledMethodTrailer empty ] works.

Why is that recompilation required but decompilation is insufficient? Is this some bug, or is it expected behavior?


However, in the case of SyntaxMorph, I don't know how to recompile the node beforehand, as a SyntaxMorph should be able to represent a node of an arbitrary type, which need not be a MessageNode. So how can I generate code from SyntaxMorphs?


tl;dr: What is the full story of #generate: and how can it be made to work in this example?

Many thanks in advance! :-)


Best,

Christoph



From: Squeak-dev <[hidden email]> on behalf of [hidden email] <[hidden email]>
Sent: Tuesday, 15 October 2019, 14:46
To: [hidden email]
Subject: [squeak-dev] The Inbox: EToys-ct.367.mcz
 
A new version of EToys was added to project The Inbox:
http://source.squeak.org/inbox/EToys-ct.367.mcz

==================== Summary ====================

Name: EToys-ct.367
Author: ct
Time: 15 October 2019, 2:46:24.862129 pm
UUID: 1394344f-b1e3-5640-a13a-70c5dffd51f4
Ancestors: EToys-mt.361

Allow for embedding SyntaxMorphs into test tiles.

=============== Diff against EToys-mt.361 ===============

Item was added:
+ ----- Method: SyntaxMorph>>parseNodeWith:asStatement: (in category '*Etoys-Squeakland-code generation') -----
+ parseNodeWith: encoder asStatement: aBoolean
+
+        ^ self parseNode!







--
_,,,^..^,,,_
best, Eliot



Re: Decompiler buggy (was: AW: [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles)

Christoph Thiede

Hi Eliot, this sounds like a reasonable piece of work. I'll need to reverse-engineer all the relevant stuff first, but I will put it onto my list with a priority above average :)


One question in general, both index and code appear to be referenced by LeafNode itself mainly for accessing and initialization purposes. Why can't we define these inst vars per subclass and use an abstract getter in LeafNode (if necessary at all)? I have the feeling that this could simplify explanation and understanding of the several meanings of index.


Best,
Christoph

From: Squeak-dev <[hidden email]> on behalf of Eliot Miranda <[hidden email]>
Sent: Sunday, 29 March 2020, 19:49:33
To: The general-purpose Squeak developers list
Subject: Re: [squeak-dev] Decompiler buggy (was: AW: [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles)
 
Hi Christoph,

    please read what I'm about to say carefully.  This message is aimed at you :-)

On Sat, Mar 28, 2020 at 6:09 AM Nicolas Cellier <[hidden email]> wrote:
Hi Christoph,

On Sat, Mar 28, 2020 at 01:12, Thiede, Christoph <[hidden email]> wrote:

Hi Eliot, hi all,


ah, I finally found the bug, but this was a really hard hunt! :D


The solution is absolutely simple, again:


codeAnySelector: selector
        ^SelectorNode new
                key: selector
-               index: 0
+               index: nil
                type: SendType


Good find!

Seriously, did the Decompiler ever reliably produce re-generatable parse trees in the past? But it should do so, shouldn't it? :-)


Maybe it did (see below). But I'm not sure that it was a feature...
Isn't it mostly used for replacing absent source code... that will eventually be reparsed? (!)

Before the above patch, the following example was broken, too:

class := Object newSubclass.
class compile: 'foo ^ 1 + 1'.
(class >> #foo) decompile generate valueWithReceiver: class new arguments: #(). "SmallInteger does not understand #foo"

Now I'm wondering what are the actual semantics of the index variable. Its method comment about "various uses depending on the class of the receiver" is quite generic - do you know some more details about this? 
Should we also use nil instead of 0 in DecompilerConstructor >> #codeAnyLiteral:? At first glance, senders of #encodeLiteral: do not appear to set it to zero manually (so they leave it nil), but unless there is any documentation of the index meaning, this is speculation only, as I could not find any other example where decompilation + regeneration produce a method that cannot be executed properly.

It's very low level, some kind of reflection of the bytecode encoding.
Once upon a time (< Squeak4.0), the code was even more horrible to follow!

LeafNode>>key: object index: i type: type
    self key: object code: (self code: i type: type)

LeafNode>>code: index type: type
    index isNil
        ifTrue: [^type negated].
    (CodeLimits at: type) > index
        ifTrue: [^(CodeBases at: type) + index].
    ^type * 256 + index
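
To make the arithmetic of #code:type: concrete, here is a small workspace sketch that replays it with a block. The table contents and the value of sendType are assumptions chosen for illustration; the real values live in the class variables CodeBases and CodeLimits and in the SendType constant, and they may differ in your image.

```smalltalk
| codeBases codeLimits sendType encode |
"Assumed values, for illustration only."
codeBases := #(0 16 32 64 208).
codeLimits := #(16 16 32 32 16).
sendType := 5.
"Same three cases as LeafNode>>code:type: above."
encode := [:index :type |
	index isNil
		ifTrue: [type negated]
		ifFalse: [(codeLimits at: type) > index
			ifTrue: [(codeBases at: type) + index]
			ifFalse: [type * 256 + index]]].
encode value: nil value: sendType. "-5: no index assigned yet"
encode value: 0 value: sendType. "208: short form, base + 0"
encode value: 20 value: sendType. "1300: long form, 5 * 256 + 20"
```

Under these assumptions, nil and 0 are clearly different inputs: nil marks a node that has not been assigned an index yet, while 0 is taken as a real index. That at least fits the observation that index: 0 in #codeAnySelector: produced wrong sends while index: nil does not.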

Exactly.  This is actually obsolete genius by Dan Ingalls.  If you have a look at the original Smalltalk-80 bytecode compiler you'll see that the parse tree nodes both represent the parse tree *and* generate the output bytecodes.  This was really important on 16-bit Smalltalk-80 since it meant that the bytecode compiler was extremely compact and concise.  Objects were in extremely short supply, 32k objects in a normal implementation (with 15-bit SmallIntegers), and 48k objects in a "stretch" implementation that had 14-bit SmallIntegers.

Now we have 32-bit and 64-bit implementations this concision is obsolete and what we need is flexibility and clarity.  

I had done some reimplementation work on the bytecode compiler in 2009 to add the closure bytecodes, and to add a proper code generation back end in the BytecodeEncoder framework, but I never finished the cleanup. The index and code inst vars in the LeafNode hierarchy are vestiges of the old implementation.  It would be really good to get rid of the code inst var altogether and to be left only with index, and index being the literal index for literal nodes (perhaps negative indices being used for special selectors), index being the inst var index for inst var nodes, and index being the temp var offset for temp var nodes, etc.

But this really needs someone with fresh eyes and energy.  My plate is full.  When I did think of doing this I realized that it is probably wise to clone the compiler altogether and do the development and testing work in the clone before moving it back to LeafNode et al for the first functional commit.  This to avoid breaking the compiler while trying to fix it.

So Christoph, do you accept my challenge and will you try and eliminate the code inst var from LeafNode?

 

As you see, the index i is passed as the argument of the #code: keyword (why? because the keyword documents the output, not the input);
then the parameter of #code:type: shadows the index instance variable...
And the index instance variable was not set... Kind of brainfuck.

We still have code:type: and index variable shadowing in current trunk...

By the way, here is another interesting one-liner:

(Object newSubclass environment: self environment; compile: 'foo ^(ObjectTracer on: nil) class'; >> #foo) decompile generate valueWithReceiver: nil arguments: #()

Interestingly, it opens a debugger - in other words, #class is sent as a regular selector. The decompiler does not know anything about special selectors at the moment. Is this desired behavior? I wonder whether it should be the parse tree's responsibility to install this kind of optimization, rather than the responsibility of the Compiler.
Because in reality, Compiler is not the only client that requests code generation from parse trees. Etoys is a good example of a client from another domain that uses this service, too. Should all these other clients be denied these important optimizations of Smalltalk expressions?

After parsing, there are other compilation phases, for analyzing variable scope, clean blocks, etc...
It's possible to scatter the implementation of the various phases across the nodes themselves, but the trend is rather to use a visitor pattern;
it gathers the handling in some specialized classes that hold all the state (rather than passing it as message arguments).
The Pharo team did a complete re-engineering of the compiler (OpalCompiler) that you could study.

Best,
Christoph


--
_,,,^..^,,,_
best, Eliot


Carpe Squeak!

Re: Decompiler buggy (was: AW: [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles)

Eliot Miranda-2
Hi Christoph,

On Sun, Mar 29, 2020 at 11:21 AM Thiede, Christoph <[hidden email]> wrote:

Hi Eliot, this sounds like a reasonable piece of work. I'll need to reverse-engineer all the relevant stuff first, but I will put it onto my list with a priority above average :)


Thank you!
 

One question in general, both index and code appear to be referenced by LeafNode itself mainly for accessing and initialization purposes. Why can't we define these inst vars per subclass and use an abstract getter in LeafNode (if necessary at all)? I have the feeling that this could simplify explanation and understanding of the several meanings of index.


Sounds reasonable to me. Whatever seems best to you. But look at the BytecodeEncoder API before you introduce too much abstraction.  And I'm eager to review code, help, etc.
 

Best,
Christoph




--
_,,,^..^,,,_
best, Eliot