Re: [squeak-dev] A Bootstrap Compiler


Re: [squeak-dev] A Bootstrap Compiler

Eliot Miranda-2
 
Hi Yoshiki,

On Tue, Dec 28, 2010 at 2:19 AM, Yoshiki Ohshima <[hidden email]> wrote:
> Hi,
>
> We've been playing with John's MicroSqueak and it occurred to me that
> having a bytecode compiler that is implemented outside of Squeak opens
> some possibilities, such as generating a growable image file from all
> text files, or making deep changes to the system without shooting
> yourself.
>
> I wrote a longer explanation so if you are interested, please go to:
>
> https://github.com/yoshikiohshima/SqueakBootstrapper
>
> and check it out.

I simply don't see the benefit of putting energy into other languages.  I see the benefit of a textual bootstrap.  But why is it worth-while implementing that in C instead of Smalltalk?  If Smalltalk is more productive (which it is) then writing such a bootstrap in C is a waste of effort, reinvents several wheels at considerable expense and produces an artifact that is less flexible, less extensible, less useful than implementing the same functionality in Smalltalk.

On the other hand, as Andreas suggests, trying to implement something using the simulator looks to be really powerful.  Recently I've been playing tangentially in this area.  In recent days I've produced a new code generator for Cog that has some useful speedups (Compiler recompileAll ~ 9% faster, benchFib 2x).  To test the code generator I needed to check stack depths at various points in JIT compilation and execution of the JITted code.  I have a Smalltalk class StackDepthFinder that answers the stack depths for each bytecode of a method.  By adding two classes VMObjectProxy and VMCompiledMethodProxy I could apply StackDepthFinder to methods in the simulator's heap and hence derive stack depths for any method in the simulator's image.  To test the JIT it was also convenient to be able to JIT methods in my work image, synthesised test cases etc, not just methods in the simulated image.  Again a facade class allows the simulator to JIT any method in my work image.  This worked well and was easy to implement.  Extending in this direction seems straightforward.
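
As a rough illustration of the proxy idea, here is a small Python sketch. It is not the actual Squeak/VMMaker code: the toy bytecode set and the names `HostMethod`, `HeapMethodProxy` and `stack_depths` are all hypothetical, invented only to show how an analysis written against a small interface can be pointed at a method stored in a simulated heap.

```python
# Toy bytecode set (hypothetical): opcode -> net stack effect.
STACK_EFFECT = {
    "push": +1,    # push a value
    "pop": -1,     # discard top of stack
    "send1": -1,   # one-argument send: pops receiver and arg, pushes result
    "return": -1,  # pops the value being returned
}


class HostMethod:
    """A method held as an ordinary object in the working image."""

    def __init__(self, opcodes):
        self._opcodes = list(opcodes)

    def bytecodes(self):
        return list(self._opcodes)


class HeapMethodProxy:
    """Presents a method stored as raw bytes in a simulated heap
    through the same bytecodes() interface as HostMethod."""

    DECODE = {0: "push", 1: "pop", 2: "send1", 3: "return"}

    def __init__(self, heap, address, length):
        self._heap = heap
        self._address = address
        self._length = length

    def bytecodes(self):
        raw = self._heap[self._address:self._address + self._length]
        return [self.DECODE[b] for b in raw]


def stack_depths(method):
    """Answer the stack depth after each bytecode (straight-line code only)."""
    depths = []
    depth = 0
    for op in method.bytecodes():
        depth += STACK_EFFECT[op]
        if depth < 0:
            raise ValueError("stack underflow at " + op)
        depths.append(depth)
    return depths
```

Because `stack_depths` only ever calls `bytecodes()`, it works unchanged on `HostMethod(["push", "push", "send1", "return"])` and on `HeapMethodProxy(bytearray([9, 9, 0, 0, 2, 3]), 2, 4)`, which decodes the same four bytecodes out of the byte array.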

One starts up with a simulator and an empty heap and bootstraps objects into that heap, using whatever bytecode set and object format one chooses.  One can test the image using the simulator which should be quite fast enough if the image is a small kernel.  All the implementation is useful and adds to the simulator/VMMaker ecosystem.  All the code is Squeak and can reuse substantial parts of the system.  Seems like a win to me.  I think I'll take this approach in implementing the new object format.  It could be a new backend to MicroSqueak.
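
To make the "empty heap" starting point concrete, here is a minimal Python sketch of bootstrapping a few objects into a simulated heap. The object format ([slot count][class oop][slots...] in 32-bit little-endian words) and the names `Heap` and `bootstrap` are made up for this example and do not correspond to any real Squeak object format or to the simulator's API.

```python
import struct

WORD = 4  # bytes per word in this made-up 32-bit object format


class Heap:
    """An initially empty simulated heap that objects are bootstrapped into."""

    def __init__(self):
        self.memory = bytearray()

    def allocate(self, class_oop, slots):
        """Append an object laid out as [slot count][class oop][slots...]
        and answer its address (byte offset into the heap)."""
        address = len(self.memory)
        words = [len(slots), class_oop] + list(slots)
        self.memory += struct.pack("<%dI" % len(words), *words)
        return address

    def word_at(self, address, index):
        return struct.unpack_from("<I", self.memory, address + index * WORD)[0]

    def class_of(self, oop):
        return self.word_at(oop, 1)

    def slot(self, oop, index):
        return self.word_at(oop, 2 + index)

    def set_class(self, oop, class_oop):
        struct.pack_into("<I", self.memory, oop + WORD, class_oop)


def bootstrap():
    heap = Heap()
    # nil must exist before anything can refer to it; its class is patched later.
    nil = heap.allocate(class_oop=0, slots=[])
    # Each class here has a single slot: its superclass.
    # (Metaclasses are left out, so the class field of a class stays 0.)
    object_class = heap.allocate(class_oop=0, slots=[nil])
    undefined_class = heap.allocate(class_oop=0, slots=[object_class])
    # Now that its class exists, close the circular reference.
    heap.set_class(nil, undefined_class)
    return heap, nil, object_class, undefined_class
```

The point of the sketch is the ordering problem every bootstrap has to solve: objects that refer to each other circularly are allocated with placeholder fields first and patched once their referents exist.
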
cheers
Eliot

> Thank you!
>
> -- Yoshiki



Re: [squeak-dev] A Bootstrap Compiler

Igor Stasenko

On 28 December 2010 19:22, Eliot Miranda <[hidden email]> wrote:

> Hi Yoshiki,
>
> On Tue, Dec 28, 2010 at 2:19 AM, Yoshiki Ohshima <[hidden email]> wrote:
>>
>>  Hi,
>>
>>  We've been playing with John's MicroSqueak and it occurred to me that
>> having a bytecode compiler that is implemented outside of Squeak opens
>> some possibilities, such as generating a growable image file from all
>> text files, or making deep changes to the system without shooting
>> yourself.
>>
>> I wrote a longer explanation so if you are interested, please go to:
>>
>> https://github.com/yoshikiohshima/SqueakBootstrapper
>>
>> and check it out.
>
> I simply don't see the benefit of putting energy into other languages.  I
> see the benefit of a textual bootstrap.  But why is it worth-while
> implementing that in C instead of Smalltalk?  If Smalltalk is more
> productive (which it is) then writing such a bootstrap in C is a waste of
> effort, reinvents several wheels at considerable expense and produces an
> artifact that is less flexible, less extensible, less useful than
> implementing the same functionality in Smalltalk.
> On the other hand, as Andreas suggests, trying to implement something using
> the simulator looks to be really powerful.  Recently I've been playing
> tangentially in this area.  In recent days I've produced a new code generator
> for Cog that has some useful speedups (Compiler recompileAll ~ 9% faster,
> benchFib 2x).  To test the code generator I needed to check stack depths at
> various points in JIT compilation and execution of the JITted code.  I have
> a Smalltalk class StackDepthFinder that answers the stack depths for each
> bytecode of a method.  By adding two classes VMObjectProxy and
> VMCompiledMethodProxy I could apply StackDepthFinder to methods in the
> simulator's heap and hence derive stack depths for any method in the
> simulator's image.  To test the JIT it was also convenient to be able to JIT
> methods in my work image, synthesised test cases etc, not just methods in
> the simulated image.  Again a facade class allows the simulator to JIT any
> method in my work image.  This worked well and was easy to implement.
> Extending in this direction seems straightforward.
> One starts up with a simulator and an empty heap and bootstraps objects into
> that heap, using whatever bytecode set and object format one chooses.  One
> can test the image using the simulator which should be quite fast enough if
> the image is a small kernel.  All the implementation is useful and adds to
> the simulator/VMMaker ecosystem.  All the code is Squeak and can reuse
> substantial parts of the system.  Seems like a win to me.  I think I'll take
> this approach in implementing the new object format.  It could be a new
> backend to MicroSqueak.


Likewise, in NativeBoost, to debug generated native code I simply
instruct the assembler to generate an int3 instruction at the point
where I need it. Then, after accepting the method and running a doit
that uses this method, I pop into a debugger (like gdb) and can see,
step by step, what it does.
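
For readers who have not used this trick, a small Python sketch of the idea follows. It is not NativeBoost's API: `TinyAssembler` and `answer_42` are hypothetical names, and the sketch only builds the x86 machine-code bytes without executing them. The opcodes themselves are standard x86: `int3` is the single byte 0xCC, `mov eax, imm32` is 0xB8 followed by a little-endian 32-bit value, and `ret` is 0xC3.

```python
import struct


class TinyAssembler:
    """Collects x86 machine code for a handful of instructions."""

    def __init__(self):
        self.code = bytearray()

    def int3(self):
        # Breakpoint trap: an attached debugger stops here.
        self.code += b"\xCC"
        return self

    def mov_eax_imm32(self, value):
        # B8 id: load a 32-bit immediate into EAX.
        self.code += b"\xB8" + struct.pack("<I", value & 0xFFFFFFFF)
        return self

    def ret(self):
        self.code += b"\xC3"
        return self

    def machine_code(self):
        return bytes(self.code)


def answer_42(debug=False):
    """Generate code for a routine that returns 42 in EAX.
    With debug=True, a breakpoint is planted at the entry point."""
    asm = TinyAssembler()
    if debug:
        asm.int3()
    return asm.mov_eax_imm32(42).ret().machine_code()
```

Calling `answer_42(debug=True)` yields the same routine with one extra 0xCC byte in front, which is all it takes for a debugger attached to the running process to stop at exactly that point.
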

I don't remember being able to do that from any other language,
because the code-compile-run cycle makes it simply impossible.
Ah yes, there is the 'compile and continue' feature for C, but I never
used it again after a couple of failures. Besides, compile & continue
often takes as long as simply building everything from scratch, so by
the time it is done compiling, you may have forgotten what you were
doing there :)


--
Best regards,
Igor Stasenko AKA sig.

Re: [squeak-dev] A Bootstrap Compiler

stephane ducasse-2
In reply to this post by Eliot Miranda-2

Eliot

I would love to see that during the school at INRIA. Could you give a session on that?
We need a better infrastructure for building new images.
BTW: I had the same question regarding the use of the C compiler. What we tried with the students here was to use the system itself,
but we could not work full time on it, and we also started from a larger kernel (and wanted to remove more - big mistake).

Stef



Re: [squeak-dev] A Bootstrap Compiler

Yoshiki Ohshima-2
In reply to this post by Eliot Miranda-2
 
At Tue, 28 Dec 2010 10:22:35 -0800,
Eliot Miranda wrote:
>
> I simply don't see the benefit of putting energy into other languages. I see the benefit of a textual bootstrap. But why
> is it worth-while implementing that in C instead of Smalltalk? If Smalltalk is more productive (which it is) then
> writing such a bootstrap in C is a waste of effort, reinvents several wheels at considerable expense and produces an
> artifact that is less flexible, less extensible, less useful than implementing the same functionality in Smalltalk.

  Like I wrote in another reply, it is more about the meta-language
(in this case, a PEG or PEG-like generator).  At the last S3
conference, Ian, Takashi and I showed that the same S-expression
language can be targeted to x86 code and Adobe Bytecode.  Similarly,
a good bootstrapping strategy would be to write the major part of the
compiler in something like PEG and provide different backends to
produce different executables.  For my Bootstrap Compiler, it is not
C so much as Leg; C is convenient and has a workable parser
generator, so it was an okay step, I thought.  (Admittedly, Leg does
not support structural matching of trees; one of the phases in the
reference implementation is written directly in C, even though the
pattern in the code closely resembles the PEG implementation.)
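
A toy Python sketch of that "one front end, several backends" structure is shown below. It is not Leg and not the Bootstrap Compiler: the grammar, the tuple-based tree, and the functions `parse`, `to_stack_code` and `to_sexpr` are hypothetical, chosen only to show one PEG-style parser feeding two different code generators.

```python
DIGITS = "0123456789"


def parse(source):
    """PEG-style recursive descent for:
         expr <- term ('+' term)*
         term <- [0-9]+ / '(' expr ')'
       Answers a tree of ('num', n) and ('add', left, right) tuples."""
    text = source.replace(" ", "")

    def expr(pos):
        node, pos = term(pos)
        while pos < len(text) and text[pos] == "+":
            right, pos = term(pos + 1)
            node = ("add", node, right)
        return node, pos

    def term(pos):
        if pos < len(text) and text[pos] == "(":
            node, pos = expr(pos + 1)
            if pos >= len(text) or text[pos] != ")":
                raise SyntaxError("expected ')' at %d" % pos)
            return node, pos + 1
        end = pos
        while end < len(text) and text[end] in DIGITS:
            end += 1
        if end == pos:
            raise SyntaxError("expected number at %d" % pos)
        return ("num", int(text[pos:end])), end

    node, pos = expr(0)
    if pos != len(text):
        raise SyntaxError("unexpected input at %d" % pos)
    return node


def to_stack_code(node):
    """Backend 1: instructions for a simple stack machine."""
    if node[0] == "num":
        return [("push", node[1])]
    return to_stack_code(node[1]) + to_stack_code(node[2]) + [("add",)]


def to_sexpr(node):
    """Backend 2: the same tree rendered as an S-expression."""
    if node[0] == "num":
        return str(node[1])
    return "(+ %s %s)" % (to_sexpr(node[1]), to_sexpr(node[2]))
```

Both backends consume the same tree from `parse("1 + (2 + 3)")`, so adding another target means writing one more tree walker rather than another parser.
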

> On the other hand, as Andreas suggests, trying to implement something using the simulator looks to be really powerful.
> Recently I've been playing tangentially in this area. In recent days I've produced a new code generator for Cog that has
> some useful speedups (Compiler recompileAll ~ 9% faster, benchFib 2x). To test the code generator I needed to check
> stack depths at various points in JIT compilation and execution of the JITted code. I have a Smalltalk class
> StackDepthFinder that answers the stack depths for each bytecode of a method. By adding two classes VMObjectProxy and
> VMCompiledMethodProxy I could apply StackDepthFinder to methods in the simulator's heap and hence derive stack depths
> for any method in the simulator's image. To test the JIT it was also convenient to be able to JIT methods in my work
> image, synthesised test cases etc, not just methods in the simulated image. Again a facade class allows the simulator to
> JIT any method in my work image. This worked well and was easy to implement. Extending in this direction seems
> straight-forward.

  Ah, okay.  Just as an analogy: in the way Slang works, the
meta-language here happens to be Slang/Smalltalk, and different
executables are produced with different backends.  As you said,
however, Slang is not a language implementation but a practical
vehicle to get to a place; with a nicer meta-language, the picture
could look prettier.

> One starts up with a simulator and an empty heap and bootstraps objects into that heap, using whatever bytecode set and
> object format one chooses. One can test the image using the simulator which should be quite fast enough if the image is
> a small kernel. All the implementation is useful and adds to the simulator/VMMaker ecosystem. All the code is Squeak and
> can reuse substantial parts of the system. Seems like a win to me. I think I'll take this approach in implementing the
> new object format. It could be a new backend to MicroSqueak.
> cheers
>
> Eliot

  Thank you!

Re: [squeak-dev] A Bootstrap Compiler

Göran Krampe
In reply to this post by Eliot Miranda-2
 
Hi all!

On 12/28/2010 07:22 PM, Eliot Miranda wrote:
> I simply don't see the benefit of putting energy into other languages.
>   I see the benefit of a textual bootstrap.  But why is it worth-while
> implementing that in C instead of Smalltalk?

Well, one benefit would be to fit more easily into the ecosystems of
Linux distros etc, where people generally get nervous about having to
start from an "unknown" binary.

In other words - if a system can be built solely from text files using
"stock" tools like a C compiler (or any tool typically available in
these ecosystems) then the acceptance is much higher.

A similar example was when I wrote a "module" (=build script) for Lunar
Linux which is a source distro much like Gentoo - for the GHC compiler
(Haskell). GHC needs an existing GHC to compile itself, so the script
had to first download a binary GHC in order to compile the new one.

Not that I think this argument is "worth" it perhaps, but hey, any and
all tools/efforts that give us flexibility are cool in my book. :)

regards, Göran