AST Transformation Techniques

Sean P. DeNigris
Administrator

To generate code, I used to use templates like the following (contrived example):

```smalltalk
template := 'method1
    ^ {returnValue}'.
template format: { #returnValue -> 2 } asDictionary
```

This is *okay* for simple cases, but can get unwieldy, and doesn't benefit from e.g. (early) compiler warnings. It is also lacking when one wants to offer the behavior as a method and also as a script with no dependencies (maybe for CI).
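To make the "no early warnings" point concrete, here is a minimal sketch. It assumes a Pharo image in which `RBParser` responds to `parseFaultyMethod:` and nodes respond to `isFaulty`; the broken template is invented for illustration. The mistake sits in a plain string, so nothing flags it until the expanded source is parsed:

```smalltalk
"Sketch only: the template below has an unbalanced parenthesis, which
nothing reports until the expanded string is handed to the parser."
| template source tree |
template := 'method1
    ^ ( {returnValue}'.
source := template format: { #returnValue -> 2 } asDictionary.
tree := RBParser parseFaultyMethod: source.
tree isFaulty
```

Had the same code lived in a real method, the editor would have rejected it when the method was saved.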

To overcome some of these limitations, I started saving the code template as an actual method and transforming the AST, e.g.:

```smalltalk
"Convert baseline##: method"
methodTree := (self methodNamed: selector) parseTree.
methodTree selector: #baseline:.
methodTree pragmas at: 1 put: (RBPragmaNode selector: #baseline arguments: #()).
commonBlockBody := methodTree statements first arguments last body.
commonBlockBody statements
    detect: [ :e | e selector = #repository: ]
    ifFound: [ :repoSetter | commonBlockBody removeNode: repoSetter ].

"Compile baseline method"
baseline compile: methodTree newSource classified: 'baseline'
```

When I started doing this, I noticed two things:

1. There were idioms that were repeated over and over.

2. It still isn't super easy to turn a method into a script with no dependencies (e.g. because of self sends).

So, I did a spike to wrap a few common idioms for this use case. It works like this. Say you have a method like this:

```smalltalk
SmallRemoteGitRepository>>#addRemote: urlString as: nameString
    "Assumes repo is loaded"
    | repo remote |
    repo := IceRepository registry detect: [ :e | e name = self projectName ].
    remote := IceGitRemote name: nameString url: urlString.
    repo addRemote: remote.
    remote fetch
```

And you want to also provide it as source code for a script with no dependencies. You can do:

```smalltalk
SmallRemoteGitRepository>>#scriptToAddRemote: urlString as: nameString
    | method transformer |
    method := SmallRemoteGitRepository methodNamed: #addRemote:as:.
    "yourself makes sure the transformer itself is assigned,
    not whatever the last cascaded message returns"
    transformer := method peAST_Transformer
        beScript;
        addStatementFirst: #projectName , ' := ' , self projectName printString;
        addStatementFirst: #nameString , ' := ' , nameString printString;
        addStatementFirst: #urlString , ' := ' , urlString printString;
        replaceNodeDetect: [ :e | e isMessage and: [ e selector = #projectName ] ]
            withNode: (RBVariableNode named: #projectName);
        yourself.
    ^ transformer newSource
```

which would generate (for example, for a particular instance):

```smalltalk
| repo remote projectName |
urlString := '[hidden email]:magritte-metamodel/magritte.git'.
nameString := 'upstream'.
projectName := 'Magritte'.
repo := IceRepository registry detect: [ :e | e name = projectName ].
remote := IceGitRemote name: nameString url: urlString.
repo addRemote: remote.
remote fetch
```
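For comparison, the `self projectName` replacement on its own can be sketched with the stock `RBParseTreeRewriter`, without the spike. This is only an illustration under assumptions: the method source is a shortened, hypothetical version of `addRemote:as:`, and it covers just the node replacement, not `beScript` or the added assignments:

```smalltalk
"Sketch only: rewrite the self send into a plain variable reference
using the stock RBParseTreeRewriter (no peAST_Transformer involved)."
| tree rewriter |
tree := RBParser parseMethod: 'addRemote: urlString as: nameString
    | repo |
    repo := IceRepository registry detect: [ :e | e name = self projectName ].
    ^ repo'.
rewriter := RBParseTreeRewriter new.
rewriter replace: 'self projectName' with: 'projectName'.
"executeTree: answers true when at least one node was rewritten"
(rewriter executeTree: tree)
    ifTrue: [ tree := rewriter tree ].
tree body formattedCode
```

The pattern-based rewriter handles the replacement, but turning the method into a standalone script (dropping the method header and declaring the former arguments) is still manual, which is the part the wrapper above tries to cover.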

So my questions are:

- Does something like this already exist?

- Are there better ways to solve this problem?

- Does this solution look promising/generally-helpful?

Cheers,
Sean
Re: AST Transformation Techniques

hogoww

Hi Sean,

I'm aware of Julien Delplanque, who has done something a bit like this, and of my own project.
Julien's has good documentation.
Repo: https://github.com/juliendelplanque/PharoCodeGenerator

There is also my own project, which takes a different approach that might be closer to what you want.
(I didn't document it as well though...): https://github.com/hogoww/PlainPharoCode

Briefly, my way of doing things is to use blocks instead of strings.
This ensures that the generated code is at least well formed.
This also enables code highlighting [...].
It then replaces variables using the lexical scope (except for block args, if I recall correctly, since shadowing is not a feature of Pharo).

```smalltalk
generateValidatorVisitOf: aClass
    "add an empty behavior visit method"
    | method body anInstance visitClass
    errors | "IV of the visitor"
    anInstance := aClass name asAnInstance.
    visitClass := aClass name asVisitClassSelector.
    body := [ :aClass |
        [ super visitClass: {aClass} ]
            on: AssertionFailure
            do: [ errors := errors + 1 ] ].
    method := visitClass
        asMethodWithBody: body
        withArguments: { #aClass -> anInstance } asDictionary.
    structureValidatorVisitorClass compile: method asString classified: 'visiting'
```

In the code above:

- visitClass: (inside the block) is a temp variable used as a selector; it will be replaced when the #asMethodWithBody:withArguments: message send is evaluated.
- The temp var visitClass is a string that gives the name of the generated method when the #asMethodWithBody:withArguments: message is sent.
- body holds the block that we want to generate.
- The temp var errors has no value, which means it already has the right name; it will not be replaced.
- The #asMethodWithBody:withArguments: message send does the required replacements.

As said earlier, block arguments cannot be replaced using the same mechanism, because Pharo disallows shadowing.
We currently have to compile the string version of the method. Although slow, this has not yet been an issue.

(I hope this color-coded way of describing a method is readable :D)
This will, for example, give the following output:

```smalltalk
visitExpression: anExpression
    [ super visitExpression: anExpression ]
        on: AssertionFailure
        do: [ errors := errors + 1 ]
```

I took a simple-ish example, but we can do more advanced stuff, like concatenating blocks.

You can find more examples in the following project, in which I use PlainPharoCode exclusively to generate my code: https://github.com/hogoww/C-AST.
(In the ASTCGenerator class)
(You will need a Moose image).

I hope you find this interesting !

Have a nice day !

Pierre.

On 16/11/2020 23:55, [hidden email] wrote:

> To generate code, I used to use templates like the following (contrived example): [...]

Re: AST Transformation Techniques

Sean P. DeNigris
Administrator
hogoww wrote
> I'm aware of Julien Delplanque which has done something a bit like this,
> or my own project.

Thanks! Both interesting and good to know about. Your project took me a
while to understand by reading and rereading the examples in the readme, but
I really like the block idea. I think Julien's could work for my use case
because there is a feature to generate his AST model from an existing
method. One day, I'd love to sit down and properly compare all three
approaches. I'll add it to the never-ending TODOs!



Cheers,
Sean
Re: AST Transformation Techniques

hernanmd
In reply to this post by hogoww

Hi Pierre,

On Tue, 17 Nov 2020 at 5:04, hogoww (<[hidden email]>) wrote:

> I took a simple-ish example, but we can do more advanced stuff, like concatenating blocks.


This is something which I've been expecting and trying to find some time to do myself.
Just to be clear, you mean something like:

    [ :arg1 | arg1 + 0 ] , [ :arg2 | arg2 + 7 ]

which would give:

    [ :arg1 :arg2 |
        arg1 + 0.
        arg2 + 7 ]

Did I understand correctly?
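At the AST level, the merge described above can be pictured with plain RB nodes. This is only a sketch of the idea, not how PlainPharoCode implements it; it assumes the stock `RBBlockNode class>>arguments:body:` and `RBSequenceNode class>>statements:` constructors and ignores block temporaries:

```smalltalk
"Sketch only: merge two parsed blocks into one at the AST level.
This illustrates the idea, not how PlainPharoCode implements it."
| first second merged |
first := RBParser parseExpression: '[ :arg1 | arg1 + 0 ]'.
second := RBParser parseExpression: '[ :arg2 | arg2 + 7 ]'.
"Arguments and statements of both blocks are simply appended in order"
merged := RBBlockNode
    arguments: first arguments , second arguments
    body: (RBSequenceNode statements: first body statements , second body statements).
merged formattedCode
```

A real implementation would also have to decide what to do when both blocks declare an argument with the same name.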

Thanks for this report!

Cheers,

Hernán


Re: AST Transformation Techniques

hogoww

Indeed !

I'm going to show you a simplified example here, hopefully it'll be understandable.
Bit of context, I'm generating code for visitors, from a Famix metamodel.

    body := [ :anInstance | ] asPlainCodeBC.

    collections do: [ :propertyName |
        body
            add: [ :anInstance |
                anInstance propertyName
                    ifNotNil: [ :aCollection |
                        aCollection do: [ :aMember | aMember acceptVisitor: self ] ] ]
            withArguments: { #aCollection -> propertyName. #aMember -> propertyName asAnInstance } asDictionary ].

    singleElements do: [ :propertyName |
        body
            add: [ :anInstance |
                anInstance propertyName
                    ifNotNil: [ :aPropertyName | aPropertyName acceptVisitor: self ] ]
            withArguments: { #aPropertyName -> propertyName asAnInstance } asDictionary ].

We first declare a block that we can concatenate stuff to (body), then we add other blocks to it.
In the first loop, we have to visit every object in the collections held by the instance variables of a class, so we add one block per collection to go through.

The first argument of #add:withArguments: is the actual block that will be added to the resulting block.
The second argument holds the replacements for *this* block.
(Note about this one: I do not remember well, but I think that you can use one scope per block added. It is copied when the block is added, so you can modify the variables.)

The second loop is exactly the same as the first one, but for instance variables that do not have to be iterated over.
If you understood the first one, you'll understand the second one!
Otherwise the second one might help you with the first one!

And an example of generated code:

visitStructureDefinition: aStructureDefinition
    aStructureDefinition members
        ifNotNil:
            [ :members | members do: [ :aMembers | aMembers acceptVisitor: self ] ].
    aStructureDefinition declaration
        ifNotNil: [ :aDeclaration | aDeclaration acceptVisitor: self ]

members is an indexable object (a collection), so we want to visit each object contained in it.
declaration is a non-indexable object, so we send it the visit message directly.
The visit method for its type will take care of visiting its slots properly.

Note that I initially went with #+, but #, makes as much or more sense ! :)


Pierre

On 18/11/2020 08:42, Hernán Morales Durand wrote:

> Just to be clear, you mean something like: [...]