About the new compiler, Part I


About the new compiler, Part I

Marcus Denker
Part I: Compiling is slower, so it's bad
=========================================

This part will not talk about closures at all, just about the overall
system architecture of the compiler.
I will write, together with Math, something about Closures, their design and
performance later.

Before I go into the details, I think I need to clarify one thing:
from the start, I always saw Squeak in terms
of its possibilities, never as "what it is now". I think all that
follows comes from that point of view.
If you see Squeak as an artifact that should never be changed, then you
won't like any of the arguments that
follow. But I am truly convinced that the greatness of Squeak was never
the artifact; it was always the possibility
it promised.

So: the Compiler is slower to compile code (by a factor of ca. 4). But then,
this is not that much of a problem. Most people
will now think about the slowness of loading code with MC, but that
has multiple causes; the compiler is not
the slowest thing here... (in 3.9, we managed to speed up code loading
with MC by a factor of two just by caching
the class category in the "category" variable of the class object).

Another "problem" of the NewCompiler is that it has far more classes
than the old one. Horrible for some people,
but interestingly, this makes it far easier to understand than the old
one...

So: Slower and More Classes. What do we get for that? Of course we
need some payoff. One is that the NewCompiler
is *far* easier to understand. I can explain it to a student in half
an hour, and the student can hack it.

Slower. The slowness is caused by two things: 1) use of SmaCC for
parsing, 2) a multi-pass, visitor-based architecture.
Before explaining (with examples) why both are indeed very nice to
have, I will in this Part I just give a short overview of the
architecture.

1) Scanning/Parsing. This is where the text is analyzed and a tree
structure (AST) is built up. The NewCompiler does not
    implement the Scanner/Parser "by hand"; instead it uses a
Parser Generator. This is an application that takes a
    description of the grammar of the language in a fairly
standardized form (BNF). This is then compiled using the Parser Generator
    to build the Scanner/Parser classes.
    The nice things about this are three:
        1) easier to understand and modify (the grammar is formulated as a grammar)
        2) fewer bugs, as the grammar is transformed automatically. This is
not that important for a simple language like Smalltalk.
        3) easy to modify, easy to add a slightly changed / extended
Smalltalk-like language. (We will see examples of that later)
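To make the parser-generator idea concrete, here is a toy sketch (in Python, with made-up names; SmaCC itself works on Smalltalk grammars) of what a generator automates: turning a small BNF-style rule into parsing code that builds a tree.

```python
import re

# Toy illustration of what a parser generator automates. The grammar
#   expr    ::= primary unary*
#   primary ::= identifier
#   unary   ::= identifier
# is turned (here by hand) into a parser that builds a send-tree.

TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9]*")

def tokenize(source):
    """Split source text into identifier tokens."""
    return TOKEN.findall(source)

def parse_unary_expression(tokens):
    """Parse 'receiver msg1 msg2 ...' into a nested send-tree."""
    receiver = ("var", tokens[0])
    for selector in tokens[1:]:
        receiver = ("send", receiver, selector)
    return receiver

tree = parse_unary_expression(tokenize("point x abs"))
# ('send', ('send', ('var', 'point'), 'x'), 'abs')
```

A generator does exactly this mechanically from the grammar file, which is why hand-written parser bugs largely disappear.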
       
       
2) The AST (Abstract Syntax Tree).
    This encodes the syntax as a tree. For a compiler, a very simple
AST is mostly enough. For the NewCompiler, the AST of
    the Refactoring Browser was used instead. This is overkill for
a compiler, but it has some cool aspects:
        1) One AST for the system. No duplicated code between the RB and the
Compiler. Fewer bugs.
        2) The RB can use the Parser of the Compiler. No problem with
differences, less duplicated code.
    The nice things about the RB AST are two: Visitors and
Transformations. Visitors are used a lot in the NewCompiler.
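The visitor idea that makes one shared AST so reusable can be sketched like this (a minimal Python illustration, not the actual RB node classes): every node type has an accept hook, and different tools plug different visitors into the same tree.

```python
# Minimal sketch of an AST with a visitor hook: nodes know how to
# dispatch to a visitor, visitors decide what to do per node type.

class MessageNode:
    def __init__(self, receiver, selector):
        self.receiver, self.selector = receiver, selector
    def accept(self, visitor):
        return visitor.visit_message(self)

class VariableNode:
    def __init__(self, name):
        self.name = name
    def accept(self, visitor):
        return visitor.visit_variable(self)

class SelectorCollector:
    """One possible visitor: collect every selector sent in a tree."""
    def __init__(self):
        self.selectors = []
    def visit_message(self, node):
        node.receiver.accept(self)
        self.selectors.append(node.selector)
    def visit_variable(self, node):
        pass  # nothing to collect at a leaf

tree = MessageNode(MessageNode(VariableNode("point"), "x"), "abs")
collector = SelectorCollector()
tree.accept(collector)
# collector.selectors == ['x', 'abs']
```

A code generator, a pretty-printer, and a refactoring all become just different visitor classes over the one tree.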
       
       
3) Now we have the AST. The AST does not encode any semantics yet,
e.g. the meaning of variables. We know that there is a
    variable "a", but is it a definition or a use? Are variables
shadowing each other?
    This we need to check, adding information to all variables. For
this, we have a Visitor that walks over the AST,
    grows a scope chain for each block entered, records variable
definitions, and annotates each variable use with the definition
    that it refers to.
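A toy version of such a name-analysis pass (a Python sketch with an invented tree representation; the real pass works on RB AST nodes) could look like this:

```python
# Sketch of the name-analysis pass: walk the tree, push a scope per
# block, record definitions, and annotate each use with the scope
# that defines it (which also exposes shadowing).

class Scope:
    def __init__(self, parent=None):
        self.parent, self.defined = parent, set()
    def define(self, name):
        self.defined.add(name)
    def lookup(self, name):
        """Walk the scope chain outward to find the defining scope."""
        scope = self
        while scope is not None:
            if name in scope.defined:
                return scope
            scope = scope.parent
        return None

def annotate(node, scope):
    """node is ('block', temps, body) or ('use', name, annotations)."""
    kind = node[0]
    if kind == "block":
        inner = Scope(scope)          # new scope for each block entered
        for temp in node[1]:
            inner.define(temp)
        for child in node[2]:
            annotate(child, inner)
    elif kind == "use":
        node[2]["defining_scope"] = scope.lookup(node[1])

# A method defining temp 'a', with a nested block that uses it:
use_of_a = ("use", "a", {})
method = ("block", ["a"], [("block", [], [use_of_a])])
annotate(method, Scope())
# use_of_a's annotation now points at the outer (method) scope
```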

4) The annotated AST can now be used to generate code. The NewCompiler
uses a very nice small "Squeak Assembler" to emit
    the code. The class for code generation is ASTTranslator. Again a
visitor: it walks over the tree and calls IRBuilder to
    emit code.
    Here in ASTTranslator the magic happens for the inlining of if/and/
while and so on:

        acceptMessageNode: aMessageNode

                aMessageNode isInlineIf ifTrue: [^ self emitIfNode: aMessageNode].
                aMessageNode isInlineIfNil ifTrue: [^ self emitIfNilNode: aMessageNode].
                aMessageNode isInlineAndOr ifTrue: [^ self emitAndOrNode: aMessageNode].
                aMessageNode isInlineWhile ifTrue: [^ self emitWhileNode: aMessageNode].
                aMessageNode isInlineToDo ifTrue: [^ self emitToDoNode: aMessageNode].
                aMessageNode isInlineCase ifTrue: [^ self emitCaseNode: aMessageNode].
                ^ self emitMessageNode: aMessageNode
               

        Comment out the line with "isInlineIfNil" --> now the compiler generates a
normal message send for ifNil:. Easy.
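The pattern behind acceptMessageNode: can be sketched generically (Python, with a hypothetical selector table): test for each inlinable selector and fall through to the generic send, so removing one test simply re-enables the ordinary message send for that selector.

```python
# Sketch of the inlining dispatch: each special selector gets custom
# bytecode; anything else (or anything disabled) becomes a plain send.

INLINED = {"ifTrue:": "emit_if", "whileTrue:": "emit_while"}

def accept_message_node(selector, inline_enabled=True):
    if inline_enabled and selector in INLINED:
        return INLINED[selector]   # special-cased, inlined bytecode
    return "emit_send"             # ordinary message send

accept_message_node("ifTrue:")                         # 'emit_if'
accept_message_node("ifTrue:", inline_enabled=False)   # 'emit_send'
```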
                       
       
5) The IRBuilder. A symbolic assembler for Squeak bytecode. Here is an
example of
    a method that compares the first iVar with the argument and
returns 'yes' or 'no':

        ir := IRBuilder new
                        numRargs: 2;
                        addTemps: #(self a); "rcvr, arg"
                        pushTemp: #self;
                        pushInstVar: 1;
                        pushTemp: #a;
                        send: #=;
                        jumpAheadTo: #else if: false;
                        pushLiteral: 'yes';
                        returnTop;
                        jumpAheadTarget: #else;
                        pushLiteral: 'no';
                        returnTop;
                        ir.
                       
                We can run this method like this:

                ir compiledMethod valueWithReceiver: (5@4) arguments: #(5)


        As the "ir" suggests, this bytecode assembler does not directly
generate bytecode,
        but builds up an Intermediate Representation instead. This is
nice, as it allows
        the backend to be changed quite easily, so changing the bytecode
encoding is easy
        with this kind of architecture. The other thing is that on the IR we
can do transformations,
        too. (Example later.)

         The IRBuilder can be used to implement compilers for other
languages; it's actually very simple
         to do: walk the AST of your language and call methods on the
IRBuilder. (Example later.)
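The idea of building an instruction list first and encoding it later can be sketched with a toy stack machine (Python, all names invented; the real IRBuilder emits Squeak IR, not this). It mirrors the yes/no example above: compare, conditionally jump, return.

```python
# Toy analogue of a symbolic assembler: calls append instructions to
# a list (the IR) instead of encoding bytecodes directly, so the
# backend encoding could be swapped without touching the frontend.

class IRBuilder:
    def __init__(self):
        self.instructions = []
    def push(self, value):
        self.instructions.append(("push", value)); return self
    def send_eq(self):
        self.instructions.append(("eq", None)); return self
    def jump_false(self, label):
        self.instructions.append(("jump_false", label)); return self
    def label(self, name):
        self.instructions.append(("label", name)); return self
    def return_top(self):
        self.instructions.append(("return", None)); return self

def run(ir):
    """Interpret the IR directly (stand-in for encoding + executing)."""
    targets = {arg: i for i, (op, arg) in enumerate(ir) if op == "label"}
    stack, pc = [], 0
    while pc < len(ir):
        op, arg = ir[pc]
        if op == "push":
            stack.append(arg)
        elif op == "eq":
            b, a = stack.pop(), stack.pop(); stack.append(a == b)
        elif op == "jump_false" and not stack.pop():
            pc = targets[arg]
        elif op == "return":
            return stack.pop()
        pc += 1

ir = (IRBuilder().push(5).push(5).send_eq()
      .jump_false("else").push("yes").return_top()
      .label("else").push("no").return_top()).instructions
run(ir)  # 'yes'
```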

       
6) The ByteCodeBuilder is now called by the IRBuilder to emit bytecode and
build a CompiledMethod object.
    This is the only class that encodes the actual bytecode set used.

So, comparing that to the old compiler, it's actually amazing that
this is only slower by a factor of 4. It's not
a black-box compiler; it's a "Compiler Framework" that can be used to
experiment. Building your own compiler
is simplified a lot with this framework.

Part II will show some examples of how we used this compiler
framework in the past. I will try to write that
later today.

There are some slides that explain all this in some more detail:

http://www.iam.unibe.ch/~denker/talks/07SCGSmalltalk/11Bytecode.pdf


        Marcus
--
Marcus Denker  --  [hidden email]
http://www.iam.unibe.ch/~denker





Re: About the new compiler, Part I

Damien Cassou-3
Very interesting, Marcus, thank you very much. Do you think there is
still room for optimizations while keeping the Visitor pattern?

--
Damien Cassou


Re: About the new compiler, Part I

stephane ducasse
In reply to this post by Marcus Denker
Hi marcus

Thanks for the information. I have the impression that you stressed
the speed of the new compiler too much.
I imagine that it has the same architecture as the one of VW.
I know that you use it a lot and that Reflectivity is completely based
on it and working great.
Now, what are the next steps?
Is the decompiler working? Error handling?
Can we use it for Squeak...

So could you let us know the status of
        NewCompiler
        Compiler
in terms of error handling, decompilation, and block closure support.

Stef



On Jan 19, 2008, at 11:19 AM, Marcus Denker wrote:



Re: About the new compiler, Part I

Joshua Gargus-2
Is SmaCC the major bottleneck, or is it tree traversal?  If scanning/
parsing is the slowest part, perhaps it is worth evaluating Alex
Warth's OMeta as a potential replacement
(http://www.cs.ucla.edu/~awarth/ometa/).

Thanks for the write-up about the new compiler, I'm sure that many  
others found it as informative as I did.  I'm looking forward to Part  
II.

Cheers,
Josh


On Jan 19, 2008, at 2:55 AM, stephane ducasse wrote:




Re: About the new compiler, Part I

John Brant-2
In reply to this post by Marcus Denker
Marcus Denker wrote:

> 4) The annotated AST now can be used to generate Code. The NewCompiler
> uses a very nice small "Squeak Assembler" to emit
>    the code. The class for Code Generation is ASTTranslator. Again a
> visitor, walks over the tree and calls IRBuilder to
>    emit code.
>    Here in ASTTranslator the magic happens for the inlining of
> if/and/while and so on:
>
>     acceptMessageNode: aMessageNode
>
>         aMessageNode isInlineIf ifTrue: [^ self emitIfNode: aMessageNode].
>         aMessageNode isInlineIfNil ifTrue: [^ self emitIfNilNode:
> aMessageNode].
>         aMessageNode isInlineAndOr ifTrue: [^ self emitAndOrNode:
> aMessageNode].
>         aMessageNode isInlineWhile ifTrue: [^ self emitWhileNode:
> aMessageNode].
>         aMessageNode isInlineToDo ifTrue: [^ self emitToDoNode:
> aMessageNode].
>         aMessageNode isInlineCase ifTrue: [^ self emitCaseNode:
> aMessageNode].
>         ^ self emitMessageNode: aMessageNode
>        
>
>     Uncomment the line with "isInlineIfNil" --> no compiler generates
> normal message send for ifNil:. Easy.

Since you have the transformations available, you could use them as
macros to eliminate some of the cases here. For example, #ifNil: can be
transformed into "isNil ifTrue:".

When I was working on #Smalltalk for .NET, I used the transformations
for all of your special cases. After the code was parsed, it would
perform the transformations on the parse tree to get a new tree that was
then compiled. Each transformation was named and could be added or
removed on a class by class basis. For the #ifNil: message, I would add
a global rule:
        Name: #ifNil:
        Search: ``@a ifNil: ``@b
        Replace: ``@a isNil ifTrue: ``@b
If someone wanted to remove the #ifNil: transformation for a particular
class, they could add a class rule with the same name and without any
search/replace expressions.
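The rule-based rewriting John describes can be sketched with plain tuples standing in for parse trees (a Python toy, not the RB rewrite engine): the ifNil: rule replaces matching sends bottom-up, leaving everything else untouched.

```python
# Toy tree rewriter for the rule  ``@a ifNil: ``@b  =>
# ``@a isNil ifTrue: ``@b, over trees of the shape
# ('send', receiver, selector, *args) with opaque leaves.

def rewrite_if_nil(node):
    if not (isinstance(node, tuple) and node and node[0] == "send"):
        return node                      # leaf: leave as-is
    receiver = rewrite_if_nil(node[1])   # rewrite children first
    args = tuple(rewrite_if_nil(a) for a in node[3:])
    if node[2] == "ifNil:":
        # the named rule fires: replace with isNil ifTrue:
        return ("send", ("send", receiver, "isNil"), "ifTrue:", *args)
    return ("send", receiver, node[2], *args)

tree = ("send", ("var", "x"), "ifNil:", ("block",))
rewrite_if_nil(tree)
# ('send', ('send', ('var', 'x'), 'isNil'), 'ifTrue:', ('block',))
```

Disabling the rule per class, as John describes, then amounts to skipping this transformation for trees compiled in that class.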

In addition to the transformations, I also created a special message
that was evaluated in the context of the current compiler. In
#Smalltalk, if you sent a #primitive:* message to Compiler global, it
would evaluate the primitive block in the compiler's context. For
example, you could have code like:
     myMethod
         | a |
         a := self printString.
         Compiler
             primitive: [:compiler :block |
                 1 to: block statements size by: 2
                     do: [:i |
                         compiler
                             emitStatement: (block statements at: i)]]
             argument: [a := a , '1'. a := a , '2'. a := a , '3'].
         ^a

When this is compiled, it would get to the Compiler #primitive:argument:
message, and then the compiler would evaluate the #primitive: block
argument passing the compiler and the "[a := a , '1'. a := a , '2'. a :=
a , '3']" parse tree as the arguments. In this example, the primitive
block tells the compiler to emit only the odd statements in the block.
Effectively, this would generate code that would look like:
     myMethod
         | a |
         a := self printString.
         a := a , '1'.
         a := a , '3'.
         ^a

In addition to the argument: keyword, I also have an evaluate: keyword
for the #primitive:* message send. While the argument: keyword causes
the corresponding parse tree to be passed to the primitive: block, the
evaluate: keyword causes the compiler to compile the code that pushes
its corresponding argument on the stack. For example, in #Smalltalk, I
handle a == ifTrue: pattern using:
        Name: ==ifTrue:
        Search: ``@a == ``@b ifTrue: ``@trueBlock `{:node | node isBlock and:
[node arguments isEmpty]}'
        Replace: Compiler
                primitive:
                        [:methodCompiler :block |
                        methodCompiler generateIdentityIfTrue: block ifFalse: nil]
                evaluate: ``@a
                evaluate: ``@b
                argument: ``@trueBlock

When the primitive block is evaluated, it knows that the code to push
the ``@a and ``@b objects has been generated, so it just needs to
generate a jump if equal and the block.

By combining the transformations with the Compiler #primitive:*
messages, we can eliminate almost all of your special cases in your
method above. In #Smalltalk, I only have two special cases. One for the
Compiler #primitive:* messages and another one that handles sending
messages to non #Smalltalk objects. For Squeak, you wouldn't need to
worry about sending messages to non Squeak objects, so you could
simplify the method above to only one special case.

Once I had implemented this for #Smalltalk, it was interesting
performing tests comparing the optimized code to the unoptimized code.
The unoptimized code was much bigger since every ifTrue:, whileTrue:,
etc. block was a real block and needed all of the code for a real block.
Also, it was interesting finding a real benchmark that wouldn't crash
with a stack overflow. For example, if you have no optimizations, then
something as simple as "100000000 timesRepeat: []" will blow the stack.


John Brant


Re: About the new compiler, Part I

Nicolas Cellier-3
In reply to this post by stephane ducasse
The OldCompiler also has a parse tree, so no change at this level.
The visitor pattern versus direct tree recursion at worst doubles the method
stack; not a drama, because fewer parameters should be passed (they are stored in
visitor inst vars). IMO the visitor leads to better code (maintainable,
extensible).
Anyway, you follow in the steps of ParcPlace there (they switched a long time
ago, maybe in the Objectworks version).

The OldCompiler code to resolve variable scoping is:
- hard to understand
- buggy
        * http://bugs.squeak.org/view.php?id=6704
        * http://bugs.squeak.org/view.php?id=6831
        * http://bugs.squeak.org/view.php?id=6720
        * http://bugs.squeak.org/view.php?id=3448
Anyone care to deny a clear relationship?
Many thanks for rewriting this part cleanly!

The NewCompiler has one step more than VW: the abstract bytecode sequence
produced by IRBuilder.
VW directly encodes bytecodes on a stream; there is no reification at this level.

This step might stress the ObjectMemory by accumulating objects that are not
reclaimed immediately (exactly like the parse tree does), and it adds one more walk.
For these reasons I guess it is responsible for roughly a factor
of 2 (surely Marcus has the tallies).

If this is true, and speed really matters, a rewrite is possible here
and would give another tradeoff. But does speed matter that much?

It seems that Marcus has good applications for this reification, as
explained in Part II, and they are implemented really easily compared to pure
stream processing like VW::InstructionClient.

So a big thank you to Marcus and his crew; I see much more progress here
than regression.

Nicolas

stephane ducasse a écrit :

> Hi marcus
>
> Thanks for the information. I have the impression that you stressed too
> much the speed of the new compiler.
> I imagine that it has the same architecture as the one of VW.
> I know that you use it a lot and that reflectivity is completely based
> on it and working great.
> Now what are the next steps?
> The decompiler is working/error handling.
> Can we use it for Squeak...
>
> So could you let us know the status
>     newcompiler
>     compiler
> in terms of error hanlding, decompilation, block closures support.
>
> Stef
>
>
>
> On Jan 19, 2008, at 11:19 AM, Marcus Denker wrote:
>
>> Part I: Compiling is slower, so it's bad
>> =========================================
>>
>> This part will not talk about closures at all, just about the overall
>> system archicture of the compiler.
>> I will write with Math something about Closures, their design and
>> performance later.
>>
>> Before I go into the details, I think I need to clearify one thing:
>> from the start, I always saw Squeak in terms
>> of it's possibilities, never as "what it is now". I think all what
>> follows comes from that point of view.
>> If you see Squeak an artifact that should never be changed, then you
>> won't like any of the arguments that
>> follow. But I am truly convinced that the greatness of Squeak never
>> was the artifact, it was always the possibility
>> it promised.
>>
>> So. The Compiler is slower to compile code (factor ca. 4). But then,
>> this is not that much of a problem. Most people
>> will now think about the slowness of loading code with MC, but that
>> has multiple causes; the compiler is not
>> the slowest thing here... (in 3.9, we managed to speed up code loading
>> with MC by a factor of two just by caching
>> the class category in the "category" variable of the class object).
>>
>> Another "problem" of the NewCompiler is that it has far more classes
>> than the old one. Horrible for some people,
>> but interestingly, this makes it far easier to understand than the old
>> one...
>>
>> So, Slower and More Classes. What do we get for that? Of course we
>> need some payoff. One is that the NewCompiler
>> is *far* easier to understand. I can explain it to a student in half
>> an hour, and the student can hack it.
>>
>> Slower. The slowness is caused by two things: 1) the use of SmaCC for
>> parsing, and 2) a multi-pass, visitor-based architecture.
>> Before explaining (with examples) why both are indeed very nice to
>> have, I will in this Part I just do a short overview of the architecture.
>>
>> 1) Scanning/Parsing. This is where the text is analyzed and a
>> tree-structure (AST) is built up. The NewCompiler does not
>>   implement the Scanner/Parser "by hand"; instead it uses a
>> Parser Generator. This is an application that takes a
>>   description of the Grammar of the language in a fairly standardized
>> form (BNF). This is then compiled using the Parser Generator
>>   to build the Scanner/Parser classes.
>>   There are three nice things about this:
>>     1) easier to understand and modify (the grammar is formulated as a
>> grammar)
>>     2) fewer bugs, as the grammar is transformed automatically. This is
>> not that important for a simple language like Smalltalk.
>>     3) easy to modify, easy to add a slightly changed / extended
>> Smalltalk-like language. (We will see examples of that later.)
>>    
>>    
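>> To make 1) concrete, here is a BNF-style sketch (pseudocode for
>> illustration, not the actual SmaCC grammar file) of the message part
>> of a Smalltalk grammar:
>>
>>     KeywordExpression ::= BinaryExpression (Keyword BinaryExpression)+
>>     BinaryExpression  ::= UnaryExpression (BinarySelector UnaryExpression)*
>>     UnaryExpression   ::= Primary UnarySelector*
>>     Primary           ::= Identifier | Literal | BlockLiteral | '(' Expression ')'
>>
>> The Parser Generator compiles a description in this form directly
>> into the Scanner/Parser classes.
>>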
>> 2) The AST (Abstract Syntax Tree).
>>   This encodes the syntax as a tree. For a compiler, a very simple AST
>> is mostly enough. For the NewCompiler, the AST of
>>   the Refactoring Browser was used instead. This is overkill for a
>> compiler, but it has some cool aspects:
>>     1) One AST for the system. No duplicated code between the RB and
>> the Compiler. Fewer bugs.
>>     2) The RB can use the Parser of the Compiler. No problems with
>> differences, less duplicated code.
>>   Two especially nice things about the RB AST: Visitors and
>> Transformations. Visitors are used a lot in the NewCompiler.
>>    
>>    
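>> As an illustration of the Visitor side (a hypothetical sketch,
>> assuming the standard RB visitor protocol, where the default visit
>> methods traverse the child nodes): a subclass of RBProgramNodeVisitor
>> with a "count" instance variable can count message sends with a
>> single method:
>>
>>     acceptMessageNode: aMessageNode
>>         count := count + 1.
>>         super acceptMessageNode: aMessageNode  "visits receiver and arguments"
>>
>> Every tool that walks the AST, from the compiler to the RB itself,
>> can be written in this style.
>>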
>> 3) Now we have the AST. The AST does not encode any semantics yet,
>> e.g. the meaning of variables. We know that there is a
>>   Variable "a", but is it a definition or a use? Are variables shadowing
>> each other?
>>   This we need to check, adding information to all variables. For
>> this, we have a Visitor that walks over the AST,
>>   grows a scope-chain for each block entered, records variable
>> definitions, and annotates each variable use with the definition
>>   it refers to.
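>>
>> Conceptually (a simplified sketch, not the real NewCompiler classes),
>> the scope lookup is just a chain: each block entered pushes a new
>> scope whose outerScope points to the enclosing one.
>>
>>     lookupVariable: aName
>>         ^ definitions
>>             at: aName
>>             ifAbsent: [outerScope
>>                 ifNil: [nil "undeclared"]
>>                 ifNotNil: [outerScope lookupVariable: aName]]
>>
>> Shadowing then falls out of the lookup order: the innermost
>> definition wins.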
>>
>> 4) The annotated AST can now be used to generate Code. The NewCompiler
>> uses a very nice small "Squeak Assembler" to emit
>>   the code. The class for Code Generation is ASTTranslator. Again a
>> visitor: it walks over the tree and calls the IRBuilder to
>>   emit code.
>>   Here in ASTTranslator the magic happens for the inlining of
>> if/and/while and so on:
>>
>>     acceptMessageNode: aMessageNode
>>
>>         aMessageNode isInlineIf ifTrue: [^ self emitIfNode:
>> aMessageNode].
>>         aMessageNode isInlineIfNil ifTrue: [^ self emitIfNilNode:
>> aMessageNode].
>>         aMessageNode isInlineAndOr ifTrue: [^ self emitAndOrNode:
>> aMessageNode].
>>         aMessageNode isInlineWhile ifTrue: [^ self emitWhileNode:
>> aMessageNode].
>>         aMessageNode isInlineToDo ifTrue: [^ self emitToDoNode:
>> aMessageNode].
>>         aMessageNode isInlineCase ifTrue: [^ self emitCaseNode:
>> aMessageNode].
>>         ^ self emitMessageNode: aMessageNode
>>        
>>
>>     Comment out the line with "isInlineIfNil" --> the compiler generates
>> a normal message send for ifNil:. Easy.
>>            
>>    
>> 5) The IRBuilder. A symbolic assembler for Squeak Bytecode. Here is an
>> example of
>>   a method that compares the first iVar with the argument and returns
>> 'yes' or 'no':
>>
>>     ir := IRBuilder new
>>             numRargs: 2;
>>             addTemps: #(self a);        "rcvr, arg"
>>             pushTemp: #self;
>>             pushInstVar: 1;
>>             pushTemp: #a;
>>             send: #=;
>>             jumpAheadTo: #else if: false;
>>             pushLiteral: 'yes';
>>             returnTop;
>>             jumpAheadTarget: #else;
>>             pushLiteral: 'no';
>>             returnTop;
>>             ir.
>>            
>>         we can run this method like this:
>>            
>>         ir compiledMethod valueWithReceiver: (5@4) arguments: #(5)
>>
>>
>>     As the "ir" suggests, this bytecode assembler does not directly
>> generate bytecode,
>>     but builds up an Intermediate Representation instead. This is
>> nice, as it allows
>>     the backend to be changed quite easily; changing the bytecode
>> encoding is easy
>>     with this kind of architecture. The other thing is that on the IR
>> we can do transformations,
>>     too. (Example later.)
>>
>>        The IRBuilder can be used to implement compilers for other
>> languages; it's actually very simple
>>        to do: walk the AST of your language and call methods on the
>> IRBuilder. (Example later.)
>>
>>    
>> 6) ByteCodeBuilder is then called by the IRBuilder to emit bytecode and
>> build a CompiledMethod object.
>>   This is the only class that encodes the actual bytecode set used.
>>
>> So, comparing that to the old compiler, it's actually amazing that
>> this is just slower by a factor of 4. It's not
>> a black-box compiler, it's a "Compiler Framework" that can be used to
>> experiment. Building your own compiler
>> is simplified a lot with this framework.
>>
>> Part II will show some examples for how we used this compiler
>> framework in the past. I will try to write that
>> later today.
>>
>> There are some slides that explain all this in some more detail:
>>
>> http://www.iam.unibe.ch/~denker/talks/07SCGSmalltalk/11Bytecode.pdf
>>
>>
>>     Marcus
>> --
>> Marcus Denker  --  [hidden email]
>> http://www.iam.unibe.ch/~denker
>>
>>
>>
>>
>>
>
>
>



Re: About the new compiler, Part I

Marcus Denker
In reply to this post by John Brant-2

On 19.01.2008, at 18:11, John Brant wrote:

> Marcus Denker wrote:
>
>> 4) The annotated AST now can be used to generate Code. The  
>> NewCompiler uses a very nice small "Squeak Assembler" to emit
>>   the code. The class for Code Generation is ASTTranslator. Again a  
>> visitor, walks over the tree and calls IRBuilder to
>>   emit code.
>>   Here in ASTTranslator the magic happens for the inlining of if/
>> and/while and so on:
>>    acceptMessageNode: aMessageNode
>>        aMessageNode isInlineIf ifTrue: [^ self emitIfNode:  
>> aMessageNode].
>>        aMessageNode isInlineIfNil ifTrue: [^ self emitIfNilNode:  
>> aMessageNode].
>>        aMessageNode isInlineAndOr ifTrue: [^ self emitAndOrNode:  
>> aMessageNode].
>>        aMessageNode isInlineWhile ifTrue: [^ self emitWhileNode:  
>> aMessageNode].
>>        aMessageNode isInlineToDo ifTrue: [^ self emitToDoNode:  
>> aMessageNode].
>>        aMessageNode isInlineCase ifTrue: [^ self emitCaseNode:  
>> aMessageNode].
>>        ^ self emitMessageNode: aMessageNode
>>        Comment out the line with "isInlineIfNil" --> the compiler
>> generates a normal message send for ifNil:. Easy.
>
> Since you have the transformations available, you could use them as  
> macros to eliminate some of the cases here. For example, #ifNil: can  
> be transformed into "isNil ifTrue:".
>

Very nice, yes!
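
A sketch of how that could look, using the standard RB rewrite-pattern
syntax (the surrounding code is hypothetical; a real integration would
live inside the compiler itself):

    rewriter := ParseTreeRewriter new.
    rewriter
        replace: '``@rcvr ifNil: [``@stmts]'
        with: '``@rcvr isNil ifTrue: [``@stmts]'.
    (rewriter executeTree: aMethodNode)
        ifTrue: [aMethodNode := rewriter tree]

After such a pre-pass, the code generator would only ever see the
inlined isNil ifTrue: form.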

There was some use of the Rewriter in, I think, #emitCaseNode:. For
now this was removed, as we do not (yet) want to make the NewCompiler
depend on the
RB package. (Partly because there are people whose only comment on the
RB is "oh, that's a lot of classes". Even today, the usefulness of
refactorings is not
yet common knowledge... it's a bit like with Tests. Hard to change
habits...)

The first step for that would be to make the RB actually use the
SmaCC-based compiler; there is already work being done on that.
Gwenaël Casaccio already has a SmaCC-based parser for the RB
Pattern-Rewriter and is fixing the SmaCC parser / RB Framework to
work together so that we can remove the RBParser.


        Marcus
--
Marcus Denker  --  [hidden email]
http://www.iam.unibe.ch/~denker





Re: About the new compiler, Part I

Paolo Bonzini-2

>> Since you have the transformations available, you could use them as
>> macros to eliminate some of the cases here. For example, #ifNil: can
>> be transformed into "isNil ifTrue:".

Stephen Compall has been working on "message macros" for GNU Smalltalk
with his Presource package, living at
http://smalltalk.gnu.org/project/presource (Stephen is CCed).

In Presource, code is created from CodeTemplate objects like this:

     ^(CodeTemplate fromExpr:
          '[:`testVar | `@branch] value: `@testValue') expand:
        (LookupTable from: {'`testVar' -> testVar.
                            '`@testValue' -> testValue.
                            '`@branch' -> elseBranch})

Examples of macros include changing the #from: method call (Squeak's
#newFrom:) to

     LookupTable new
        at: '`testVar' put: testVar;
         ...;
        yourself

or things like "{ a. ' '. b } join" to a Stream, like

     a copy writeStream
        nextPutAll: ' ';
        nextPutAll: b;
        contents

Macros are often used in combination with the {...} array syntax or
class methods, because these give an easy way to know the receiver's
class (and thus to know whether the macro applies).

By the way, his work led to the discovery of a couple of RBParser bugs.
  See http://snipurl.com/1xu6y for a c.l.s post describing them.

Paolo


Re: About the new compiler, Part I

Lukas Renggli
> >> Since you have the transformations available, you could use them as
> >> macros to eliminate some of the cases here. For example, #ifNil: can
> >> be transformed into "isNil ifTrue:".
>
> Stephen Compall has been working on "message macros" for GNU Smalltalk
> with his Presource package, living at
> http://smalltalk.gnu.org/project/presource (Stephen is CCed).

I also wrote a macro system ;-)

In my case this is for Squeak, in the context of improving the DSL
(domain-specific language) capabilities of Smalltalk. A simple macro
transformation rule based on RB rules looks, for example, like this:

inlineBetweenAnd
        <rule: 100>
       
        ^ OrderedCollection new
                add: (DSLMacroStringDefinition new
                        search: '``@a between: ``@b and: ``@c';
                        replace: '``@a >= ``@b and: [ ``@a <= ``@c ]';
                        verification: [ :context :node |
                                node receiver isImmediate
                                        and: [ node arguments allSatisfy: [ :each | each isImmediate ] ] ]);
                yourself

The transformations are not restricted to source-code transformation,
but can also be used to improve the editor. The example below uses a
mixture of an RB matcher and several regular expressions matcher. It
highlights XHTML tags within literal strings:

htmlTagsInString
        <rule: 100>
       
        ^ DSLSearchPattern new
                expression: '`{ :node | node isLiteral and: [ node value isString ] }';
                action: (Array
                        with: (DSLMatchPattern new
                                expression: '</?(\w+)\s*([^>]*)>';
                                action: Color blue asStyle;
                                at: 2 action: TextEmphasis bold asStyle;
                                at: 3 action: (DSLMatchPattern new
                                        expression: '(\w+)=("[^"]*")';
                                        at: 3 action: TextColor magenta asStyle))
                        with: (DSLRangePattern new
                                begin: '<!--'; end: '-->';
                                outerAction: (Color r: 0 g: 0.5 b: 0) asStyle))

Cheers,
Lukas

--
Lukas Renggli
http://www.lukas-renggli.ch


Re: About the new compiler, Part I

John Brant-2
In reply to this post by Marcus Denker
Marcus Denker wrote:

> There was some use of the Rewriter in the, I think #emitCaseNode:. For
> now this was removed as we do not (yet) want to make the NewCompiler
> depend on the
> RB package. (partly because there are people whose only comment on the
> RB is "oh, that's a lot of classes". Even today, the usefulness of
> refactorings is not
> yet common knowledge... it's a bit like with Tests. Hard to change
> habits...).

Unless things are packaged differently in Squeak, the rewriter is in the
same package as the parse tree nodes. So you should be able to use the
rewriter without including more stuff.


John Brant


Re: About the new compiler, Part I

John Brant-2
In reply to this post by Paolo Bonzini-2
Paolo Bonzini wrote:

> By the way, his work led to the discovery of a couple of RBParser bugs.
>  See http://snipurl.com/1xu6y for a c.l.s post describing them.

I thought I had replied to this, but it appears that I didn't. Anyway,
the problem with both of these rewrites is that you are trying to
replace a cascade message with something that isn't a message. For
example, when you perform the '``@receiver display: ``@object' ->
'``@object' rewrite on the following:

    (stream display: z) display: (stream display: x);
                        display: y; nextPut: $q

It first matches the "(stream display: z) display: (stream display: x)"
expression. It then searches in the ``@receiver and ``@object for more
matches and replaces them, so after replacing the subexpressions we get
"``@receiver = z" and "``@object = x". Now, we return the overall
replacement for the cascaded #display:. In this case we return the "x"
expression. Since this is not a message, we cannot use it in a cascaded
message, so in VW (I don't know about other implementations), we print
an error message to the transcript and we ignore the rewrite for this
node. The error message is "Cannot replace message node inside of
cascaded node with non-message node."

If we allowed this bad replacement, we would essentially be transforming
the code to:
        x;
                y; nextPut: $q
And this isn't valid Smalltalk code.


John Brant


Re: About the new compiler, Part I

keith1y
In reply to this post by Marcus Denker

> So.The Compiler is slower to compile code (factor ca. 4). But then,
> this is not that of a problem. Most people
> will now think about the slowness of loading code with MC, but that
> has multiple causes, the compiler is not
> the slowest thing here... (in 3.9, we managed to speed up code loading
> with MC by a factor of two just by caching
> the class category in the "category" variable of the class object).
MC will get much much faster in coming versions.

Currently MC does not support Atomic loading. MC1.5 is slower than MC1
because it iterates through all of the changes to be applied in several
passes.

MC1.5 has code in place ready to use SystemEditor to apply changes.
Firstly, this does not need to iterate through the changes as many times,
and secondly, the code is much simpler overall. This will be enabled in MC1.6.

I would not be at all surprised if we can gain back a factor of 4 or
more from this.

I would like to extend MC to support binary loading in the future (1.7?)

Keith


Re: About the new compiler, Part I

Marcus Denker
In reply to this post by John Brant-2

On 20.01.2008, at 17:33, John Brant wrote:

> Marcus Denker wrote:
>
>> There was some use of the Rewriter in the, I think #emitCaseNode:.  
>> For now this was removed as we do not (yet) want to make the  
>> NewCompiler depend on the
>> RB package. (partly because there are people whose only comment on  
>> the RB is "oh, that's a lot of classes". Even today, the usefulness  
>> of refactorings is not
>> yet common knowledge... it's a bit like with Tests. Hard to change  
>> habits...).
>
> Unless things are packaged differently in Squeak, the rewriter is in  
> the same package as the parse tree nodes. So you should be able to  
> use the rewriter without including more stuff.
>

There is an "AST" package and a "NewParser" package, the rest is in  
"RefactoringEngine" and "NewCompiler".


        Marcus
--
Marcus Denker  --  [hidden email]
http://www.iam.unibe.ch/~denker





Re: About the new compiler, Part I

S11001001
In reply to this post by John Brant-2
On Sun, 2008-01-20 at 10:34 -0600, John Brant wrote:
> Since this is not a message, we cannot use it in a cascaded
> message, so in VW (I don't know about other implementations), we print
> an error message to the transcript and we ignore the rewrite for this
> node. The error message is "Cannot replace message node inside of
> cascaded node with non-message node."

The current behavior of GNU Smalltalk is to do the same, signalling a
Warning instead.  I'm not in favor of allowing the replacement.

However, lookForMoreMatchesInContext: is sent before this determination
is made, so the rewrite is not truly "ignored" because recursive rewrite
is invoked on the parts of the match that matched ``-variables, in some
cases mutating those parts with the results.

After rejection, rewriting naturally proceeds into the subnodes of the
original message node, whereupon rewrite rules are applied to parts of
the tree that were already rewritten.  GNU Smalltalk prevents this
double-application by deepcopying the ``-variable values in the context
before rewriting.

--
But you know how reluctant paranormal phenomena are to reveal
themselves when skeptics are present. --Robert Sheaffer, SkI 9/2003




Re: About the new compiler, Part I

Philippe Marschall
In reply to this post by keith1y
2008/1/20, Keith Hodges <[hidden email]>:

>
> > So.The Compiler is slower to compile code (factor ca. 4). But then,
> > this is not that of a problem. Most people
> > will now think about the slowness of loading code with MC, but that
> > has multiple causes, the compiler is not
> > the slowest thing here... (in 3.9, we managed to speed up code loading
> > with MC by a factor of two just by caching
> > the class category in the "category" variable of the class object).
> MC will get much much faster in coming versions.
>
> Currently MC does not support Atomic loading. MC1.5 is slower than MC1
> because it iterates through all of the changes to be applied in several
> passes.
>
> MC1.5 has code in place ready to use SystemEditor to apply changes.
> Firstly This does not need to iterate through the changes as many times,
> and secondly the code is much simpler overall. This will be enabled in MC1.6
>
> I would not be at all surprised if we cannot gain back a factor of 4 or
> more from this.

The main problem is not MC but PackageInfo. This is what kills MC
performance for bigger packages.

Cheers
Philippe

> I would like to extend MC to support binary loading in the future (1.7?)
>
> Keith
>
>


Re: About the new compiler, Part I

Giovanni Corriga
Philippe Marschall ha scritto:
>
> The main problem not MC but PackageInfo. This is what kills MC
> performance for bigger packages.
>

Would it make sense to introduce a first class Package object?

        Giovanni



Re: About the new compiler, Part I

Philippe Marschall
2008/1/26, Giovanni Corriga <[hidden email]>:
> Philippe Marschall ha scritto:
> >
> > The main problem not MC but PackageInfo. This is what kills MC
> > performance for bigger packages.
> >
>
> Would it make sense to introduce a first class Package object?

That's one of the subjects you should avoid on this list.

Cheers
Philippe


Re: About the new compiler, Part I

Giovanni Corriga
Philippe Marschall ha scritto:
> 2008/1/26, Giovanni Corriga <[hidden email]>:
>> Philippe Marschall ha scritto:
>>> The main problem not MC but PackageInfo. This is what kills MC
>>> performance for bigger packages.
>>>
>> Would it make sense to introduce a first class Package object?
>
> That one of the subjects you should avoid on this list.

Heh. What I had in mind was a PackageInfo on steroids, not something
like Java packages/namespaces.

        Giovanni