Generate Bytecodes manually


Udo Schneider
All,

is there a supported interface to generate methods (incl. their
bytecodes) manually?

Best Regards,

Udo



Re: Generate Bytecodes manually

Marcus Denker-4

On Jul 16, 2013, at 4:10 PM, Udo Schneider <[hidden email]> wrote:

> All,
>
> is there a supported interface to generate methods (incl. their bytecodes) manually?
>

In 3.0 we have two nice additions:

1) you can easily create CompiledMethods from RB ASTs

| cm methodNode |

methodNode := RBMethodNode
        selector: #test
        body: (RBReturnNode value: (RBLiteralNode value: 1)) asSequenceNode.
       
methodNode doSemanticAnalysis.
cm := methodNode generate.

cm valueWithReceiver: nil arguments: #()


2) For low level stuff there is IRBuilder. IR is a bytecode-level representation
(CFG) that abstracts away from the details yet is very close.



| cm |

"Push 3, duplicate it, send #= (3 = 3) and return the result."
cm := IRBuilder buildMethod: [ :builder |
        builder pushLiteral: 3;
                pushDup;
                send: #=;
                returnTop ].

cm valueWithReceiver: nil arguments: #()




Re: Generate Bytecodes manually

Udo Schneider


On 16.07.13 16:39, Marcus Denker wrote:
> In 3.0 we have two nice additions:
>
> 1) you can easily create CompiledMethods from RB ASTs
Looks nice but isn't usable for me as I don't have an AST.

> 2) For low level stuff there is IRBuilder. IR is a bytecode-level representation
> (CFG) that abstracts away from the details yet is very close.
Now we're talking!!! This looks exactly like what I need.

Just a theoretical question: Can every sequence of bytecodes be
decompiled to a Smalltalk method? I doubt it but wanted to ask nevertheless.

Best Regards,

Udo





Re: Generate Bytecodes manually

Marcus Denker-4

On Jul 16, 2013, at 5:10 PM, Udo Schneider <[hidden email]> wrote:

>
>
> On 16.07.13 16:39, Marcus Denker wrote:
>> In 3.0 we have two nice additions:
>>
>> 1) you can easily create CompiledMethods from RB ASTs
> Looks nice but isn't usable for me as I don't have an AST.
>
>> 2) For low level stuff there is IRBuilder. IR is a bytecode-level representation
>> (CFG) that abstracts away from the details yet is very close.
> Now we're talking!!! This looks exactly like what I need.
>
> Just a theoretical question: Can every sequence of bytecodes be decompiled to a Smalltalk method? I doubt it but wanted to ask nevertheless.
>
No. A decompiler has to be very specific for the code generated by the compiler.

For Pharo3, we even opted to not support decompilation at all… (IR -> AST). We will instead save a higher-level representation that has much
more information (e.g. names of variables). For deployment, people can strip the names, giving them the same "obfuscation" the decompiler
provides now.

BC -> IR decompilation always works.
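
As a rough illustration of the BC -> IR direction, the sketch below builds a method with IRBuilder, decompiles its bytecode back to IR, and regenerates a method from that IR. It assumes that IRBytecodeDecompiler>>decompile: and IRMethod>>compiledMethod are available in the image; neither is confirmed in this thread, and both may differ between Pharo versions.

```smalltalk
| cm ir result |

"Build a small method directly from IR, as in the IRBuilder example."
cm := IRBuilder buildMethod: [ :builder |
	builder pushLiteral: 3;
		pushDup;
		send: #=;
		returnTop ].

"Assumption: IRBytecodeDecompiler rebuilds an IRMethod from the bytecode."
ir := IRBytecodeDecompiler new decompile: cm.

"Assumption: IRMethod>>compiledMethod generates bytecode again from the IR."
result := ir compiledMethod valueWithReceiver: nil arguments: #()
```

If the round trip works as assumed, the regenerated method should behave like the original one.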

        Marcus



Re: Generate Bytecodes manually

Udo Schneider
On 16.07.13 17:15, Marcus Denker wrote:
> No. A decompiler has to be very specific for the code generated by the compiler.
I thought so.

> For Pharo3, we even opted to not support decompilation at all… (IR -> AST). We will instead save a higher-level representation that has much
> more information (e.g. names of variables). For deployment, people can strip the names, giving them the same "obfuscation" the decompiler
> provides now.
Sounds reasonable - so if I generate IR (-> BC) from non-Smalltalk
sources ... is there any recommended way to make the
non-Smalltalk-source visible in browsers? Up to now I would simply
generate IR/BC directly which would mean that a class would have a
method but w/o the source.

Or am I missing a crucial piece here?

Udo





Re: Generate Bytecodes manually

Marcus Denker-4

On Jul 16, 2013, at 8:55 PM, Udo Schneider <[hidden email]> wrote:

> On 16.07.13 17:15, Marcus Denker wrote:
>> No. A decompiler has to be very specific for the code generated by the compiler.
> I thought so.
>
>> For Pharo3, we even opted to not support decompilation at all… (IR -> AST). We will instead save a higher-level representation that has much
>> more information (e.g. names of variables). For deployment, people can strip the names, giving them the same "obfuscation" the decompiler
>> provides now.
> Sounds reasonable - so if I generate IR (-> BC) from non-Smalltalk sources ... is there any recommended way to make the non-Smalltalk-source visible in browsers? Up to now I would simply generate IR/BC directly which would mean that a class would have a method but w/o the source.
>
> Or am I missing a crucial piece here?
>

What you need to do for that is to implement a class that has the same public API as OpalCompiler and Compiler…
Then you can set this as the compiler of your class (override #compiler on the class side).

This compiler then takes care to create the CompiledMethod with your bytecode *and* your source.

The tools *should* be able to display just anything, but that needs to be double-checked as we changed the tools a lot
and there is no example of methods with non-Smalltalk source right now.

Implementing Debugger support is possible, too. For that you need to have a mapping between pc and source for your
language.

What we need is a tutorial on how to do all that, based on a simple toy language ;-)


        Marcus



Re: Generate Bytecodes manually

Udo Schneider
On 17.07.13 11:05, Marcus Denker wrote:
> What you need to do for that is to implement a class that has the same public API as OpalCompiler and Compiler…
> Then you can set this as the compiler of your class (override #compiler on the class side).
>
> This compiler then takes care to create the CompiledMethod with your bytecode *and* your source.
>
> The tools *should* be able to display just anything, but that needs to be double-checked as we changed the tools a lot
> and there is no example of methods with non-Smalltalk source right now.
Thanks for the detailed steps - I think I have all the pieces now for
tackling a really crazy idea I have.

> Implementing Debugger support is possible, too. For that you need to have a mapping between pc and source for your
> language.
I didn't even think of the debugger. Mapping the source I have and the
pc should be easy - although I'm not quite sure whether it's useful for
what I'm trying. But we'll see.

Udo




Re: Generate Bytecodes manually

Udo Schneider
I just stumbled over some IRBuilder behaviour which I'm not sure about.

1) Jump/Target Pairs
As far as I understood each jump target can only be used by one jump. So
if multiple jumps need to jump to the same location I have to create
multiple targets (one per jump) pointing to the same location.
I can work with this behavior just fine - I'm just wondering what's the
rationale behind this restriction? IMHO the bytecode itself does not
impose that restriction.

2) IR -> AST
I created some IR sequences which work perfectly fine, as expected.
However trying to get the compiled method's AST results in an endless
loop. I assume this is expected behavior as the process IR -> AST only
works for a subset of instruction sequences normally generated by
Smalltalk code. But I just wanted to be sure.

Thanks,

Udo




Re: Generate Bytecodes manually

Marcus Denker-4

On Jul 20, 2013, at 8:59 PM, Udo Schneider <[hidden email]> wrote:

> I just stumbled over some IRBuilder behaviour which I'm not sure about.
>
> 1) Jump/Target Pairs
> As far as I understood each jump target can only be used by one jump. So if multiple jumps need to jump to the same location I have to create multiple targets (one per jump) pointing to the same location.
> I can work with this behavior just fine - I'm just wondering what's the rationale behind this restriction? IMHO the bytecode itself does not impose that restriction.
>
It's not needed for the compiler...

> 2) IR -> AST
> I created some IR sequences which work perfectly fine, as expected. However trying to get the compiled method's AST results in an endless loop. I assume this is expected behavior as the process IR -> AST only works for a subset of instruction sequences normally generated by Smalltalk code. But I just wanted to be sure.
>
IR -> AST is *NOT* there. (There is no code that does IR -> AST.)

There is just the old BC -> old AST decompiler, which only works on code compiled with the *OLD* compiler.

The decompiler right now is called when there is no byte code (the old decompiler).
This will be fixed soonish.

        Marcus




Re: Generate Bytecodes manually

Udo Schneider
On 20.07.13 21:05, Marcus Denker wrote:
> It's not needed for the compiler...
Thought so - I can work around it using the public IRBuilder API. So no
problem.
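
The workaround described here, one target per jump with all targets at the same location, might look roughly like the sketch below. The labels #exitA and #exitB and the literals are made up for illustration. The sketch assumes that jumpAheadTo:if: and jumpAheadTarget: behave as in the IRBuilder shipped with Pharo 3, where each jumpAheadTarget: resolves exactly one pending jump.

```smalltalk
| cm result |

cm := IRBuilder buildMethod: [ :builder |
	builder
		pushLiteral: false;
		jumpAheadTo: #exitA if: true;	"first jump, not taken here"
		pushLiteral: true;
		jumpAheadTo: #exitB if: true;	"second jump, taken here"
		pushLiteral: #fallThrough;
		returnTop;
		"Two targets declared back to back, so both jumps
		 end up at the same instruction."
		jumpAheadTarget: #exitA;
		jumpAheadTarget: #exitB;
		pushLiteral: #joined;
		returnTop ].

result := cm valueWithReceiver: nil arguments: #()
```

Both conditional jumps pop their boolean, so the stack depth is the same whichever jump reaches the shared location.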

> The decompiler right now is called when there is no byte code (the old decompiler).
> This will be fixed soonish.
Good to know - I'll stay tuned. For those sequences where the
decompilation worked it was already quite impressive to see the
Smalltalk code for something which was created in a totally different
language :-)

Thanks,

Udo