
[squeak-dev] Burn the Squeak Image! (Why I am running for board)

74 messages

Igor Stasenko

Re: [squeak-dev] re: MicroSqueak

2009/3/2 Göran Krampe <[hidden email]>:
> Igor Stasenko wrote:
>>
>> I have a similar idea: capture methods/classes while running code
>> to discover what objects I need to clone into a separate heap to make it
>> run under another interpreter instance in Hydra.
>
> Mmm, you are aware of this stuff that Craig has in Spoon right? The
> "imprinting" stuff IIRC.
>

Sure, I'm aware of that.
Craig is using a modified VM to mark methods while code is running. But one
could do much the same using already available tools.
Yes, it will be slower, but this process (shrinking) doesn't have to be
performed regularly, so why care?

> regards, Göran
>
>
>

--
Best regards,
Igor Stasenko AKA sig.

Jecel Assumpcao Jr

[squeak-dev] re: MicroSqueak

In reply to this post by ccrraaiigg

Craig Latta wrote on Sun, 01 Mar 2009 21:20:31 -0800

> Sure, I personally think that should be where the process starts
> (otherwise I suspect unnecessary things get included), but I'm
> interested in approaches from that point that differ from mine.

You are probably aware of the type inference work Ole Agesen did in
Self?

http://selflanguage.org/documentation/published/gold.html

Some of this has since been done for Squeak as well, but not (as far as
I know) for generating minimal images. Of course, type inference has
trouble with things like #perform. In the case of Self, primitive
failures did silly string manipulation and that caused a lot of
unrelated code (string stuff) to get pulled into every image. Another
source of "leaks" was that even if you only used integer math in your
application, the type inferencer couldn't prove that floating point
would never be needed due to the way a few Integer methods were written.

In the end, it seems likely to me that the best result will be obtained
by a combination of methods.

-- Jecel

Eliot Miranda-2

Re: [squeak-dev] Re: Burn the Squeak Image! (Why I am running for board)

In reply to this post by Dan Ingalls

On Sun, Mar 1, 2009 at 6:11 PM, Daniel Ingalls <[hidden email]> wrote:

> Eliot Miranda <[hidden email]> wrote...
>
>> You're absolutely right. The major image-level change I will require is for Behavior to implement identityHash with a primitive that is different from that in Object. Doing this allows me to implement a hidden class table in the VM where a class's identity hash is the index into the class table. An instance of a class has the class's class table index (the class's id hash) stored in its header, not a direct pointer to the class. So every object has a more compact class reference, say 16, 20 or 24 bits. Also, class references in in-line and method caches are class indices, not direct class references, which means less work on GC. But to ensure a class can be entered in the table by the VM at an unused index, Behavior>>identityHash must be a special primitive that the VM implements as searching the table for an unused index.
>
> Hi, Eliot -
>
> I've been mostly lurking for a while here, but this topic has become more interesting with each tidbit. I just wanted to say that I love the synergy between hash and class table rolled into the elimination of compact classes. It's an improvement in every way. I can't wait to see this all come to life. You go, guy!

Thanks, Dan! The other goodness in this is that the 2-word header layout can be used in both a 32-bit and a 64-bit VM, which also means that the inline caching code that e.g. involves embedding a class index in a register load instruction is the same in both 32 and 64 bits, and hence is a leg up for a fast 64-bit VM.

> - Dan
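
As a rough illustration of the class-table idea described in this message, here is a minimal sketch in Python rather than VM code. The names ClassTable, ObjectHeader, identity_hash and class_at are hypothetical, index 0 is assumed to be reserved, and the 20-bit default is simply one of the widths mentioned above. The real VM layout may well differ.

```python
class ClassTable:
    """Toy model: a class's identity hash doubles as its index in a hidden table."""

    def __init__(self, index_bits=20):
        self.capacity = 1 << index_bits
        self.slots = {}        # index -> class (sparse dict, for the sketch only)
        self.index_of = {}     # id(class) -> index already assigned
        self.next_probe = 1    # assumption: index 0 is reserved

    def identity_hash(self, cls):
        """Return the class's table index, entering it at an unused index if new."""
        key = id(cls)
        if key in self.index_of:
            return self.index_of[key]
        # Search the table for an unused index, wrapping around past the end.
        for _ in range(self.capacity):
            index = self.next_probe
            self.next_probe = (self.next_probe + 1) % self.capacity or 1
            if index not in self.slots:
                self.slots[index] = cls
                self.index_of[key] = index
                return index
        raise MemoryError("class table is full")

    def class_at(self, index):
        return self.slots[index]


class ObjectHeader:
    """An instance header stores the compact class index, not a class pointer."""

    def __init__(self, table, cls):
        self.class_index = table.identity_hash(cls)


table = ClassTable(index_bits=4)
header = ObjectHeader(table, int)
```

Looking the class up again is a plain table access, `table.class_at(header.class_index)`, and inline caches could compare the small index instead of a full pointer.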

Stephen Pair

Re: [squeak-dev] re: MicroSqueak

In reply to this post by Eliot Miranda-2

On Mon, Mar 2, 2009 at 12:38 AM, Eliot Miranda <[hidden email]> wrote:

> On Sun, Mar 1, 2009 at 9:20 PM, Craig Latta <[hidden email]> wrote:
> [snip]
>
>> As for whether to produce an object memory statically and then set it running, or transform an object memory which is always running... I think the resulting memory will need to load modules live anyway, so one might as well do all the transformations that way. Perhaps this is simply an aesthetic choice.
>
> Surely repeatability mandates that one produce an object memory statically and then set it running? Because of things like delays the always running memory is almost never in a predictable state, so one always ends up with different bits even if they represent the same functionality.
>
> E.

Maybe you could get the repeatability with a process that is roughly:

a) write the spec for the capability of the image (a method that exercises everything you want to be able to do)

b) use the class/method copying & DNU trickery and do the runtime analysis to figure out the classes and methods needed to support that capability

c) do something a little more surgical to build a new image by copying over the behaviors and methods, but construct the processes and stacks more deliberately (so you aren't so tied to the running image's state)

I'd think in this way you could do something that was reproducible to the extent that resulting image was only dependent on the running image for its behaviors and other necessary objects (various singletons and whatnot), but otherwise not affected by various processes and random other things that might be in that image. Once you had (b) and (c) mostly ironed out, it would be a process of refining the specification in (a) to get to a suitable minimal image.

- Stephen
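
Steps (a) and (b) above can be sketched roughly as follows. This is an illustration in Python, not Squeak code: it swaps the class/method copying and DNU trickery for Python's profiling hook (`sys.setprofile`), and `trace_used`, `methods_to_copy`, `Account` and `spec` are hypothetical names made up for the example.

```python
import sys


def trace_used(spec):
    """Run spec() and return the set of (class name, function name) pairs it called."""
    used = set()

    def profiler(frame, event, arg):
        # "call" fires on entry to every Python-level function or method.
        if event == "call":
            receiver = frame.f_locals.get("self")
            owner = type(receiver).__name__ if receiver is not None else None
            used.add((owner, frame.f_code.co_name))

    previous = sys.getprofile()
    sys.setprofile(profiler)
    try:
        spec()
    finally:
        sys.setprofile(previous)
    return used


class Account:
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        self.balance -= amount

    def audit(self):  # never exercised by the spec, so it would be left behind
        return self.balance >= 0


def spec():
    """Step (a): exercise everything the target image must be able to do."""
    account = Account()
    account.deposit(10)
    account.withdraw(3)


def methods_to_copy(used):
    """Step (b): group the observed methods by class, ignoring plain functions."""
    needed = {}
    for owner, name in sorted(pair for pair in used if pair[0] is not None):
        needed.setdefault(owner, []).append(name)
    return needed


used = trace_used(spec)
needed = methods_to_copy(used)
```

Refining the spec then changes `needed` in a repeatable way, which is the property step (c) would build on.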

Eliot Miranda-2

Re: [squeak-dev] re: MicroSqueak

On Mon, Mar 2, 2009 at 1:48 PM, Stephen Pair <[hidden email]> wrote:

> [snip]

Agreed. The nice thing is being able to run (a) in the IDE, so that when something is missing it manifests as an Undeclared or an MNU.

One issue is ensuring that, during simulation, objects like nil, true and false behave as they will in the result, not as defined in the host image. One thing one could do is arrange that the compiler for MObject uses instances of MSymbol for all code under MObject. Then e.g. a doesNotUnderstand: handler on SmallInteger, UndefinedObject, Boolean et al. might be able to forward things correctly and make the simulation more accurate.

- Stephen

Igor Stasenko

Re: [squeak-dev] re: MicroSqueak

2009/3/3 Eliot Miranda <[hidden email]>:

> [snip]
> Agreed. The nice thing is being able to run a) in the IDE so that when
> something is missing it manifests as an Undeclared or an MNU.
> One thing is ensuring that when simulating objects like nil, true and false
> behave as they will in the result, not as defined in the host image. One
> thing one could do is arrange that the compiler for MObject uses instances
> of MSymbol for all code under MObject. Then e.g. a doesNotUnderstand:
> handler on SmallInteger, UndefinedObject, Boolean et al might be able to
> forward things correctly and arrange that the simulation was more accurate.
>

That's why I wrote my own parser/compiler in Moebius.
It is designed so that the parser and compiler output is under the full
control of an object which plays the role of the environment.
So you can produce an instance of CompiledMethod as output, or encode the
result in machine code, or represent methods as raw bytes which can then
be put in the image you are constructing.
Even the nil, true and false singleton values are under the control of the
environment.

Read more about it here:
http://code.google.com/p/moebius-st/wiki/Parser

Simulation of SmallInts could be made easy: we could simply make a
class named BoxedSmallInteger
and use it to represent all literal values in methods. At the final
stage of image creation we can unbox them and replace them with SmallIntegers.
We're in Smalltalk, after all, where such things are possible,
unlike in many other languages :)

>>
>> - Stephen

--
Best regards,
Igor Stasenko AKA sig.

Eliot Miranda-2

Re: [squeak-dev] re: MicroSqueak

On Mon, Mar 2, 2009 at 3:25 PM, Igor Stasenko <[hidden email]> wrote:

> [snip]
> Simulation of SmallInts could be made easy - we could simply make a
> class, named BoxedSmallInteger
> and use it for representing all literal values in methods. At final
> stage of image creating we can unbox them and replace by smallints.

Cool. So the compiler can avoid using the pushNil, pushFalse and pushTrue bytecodes. It must send some message to coerce every MBoolean result into a host Boolean before doing a conditional jump. It must wrap all MSmallInteger relational primitive invocations with code to coerce the Boolean to the matching MBoolean.

A doesNotUnderstand: will produce an instance of Message and send doesNotUnderstand:, so MObject needs a doesNotUnderstand: handler that sends MSymbol #doesNotUnderstand: with a coercion of the Message to an MMessage. Any other holes that need to be plugged?

Alternatively just create a subclass of InstructionStream and/or ContextPart and interpret all code and have the interpretation manage MObject. That might be slow but easier to get going.

> We're in smalltalk, after all, where such things is possible to do,
> unlike many other languages :)

Right on!


Eliot Miranda-2

Re: [squeak-dev] re: MicroSqueak

On Mon, Mar 2, 2009 at 3:51 PM, Eliot Miranda <[hidden email]> wrote:

> [snip]
> Alternatively just create a subclass of InstructionStream and/or ContextPart and interpret all code and have the interpretation manage MObject. That might be slow but easier to get going.

But perhaps a better alternative is just to use Hydra and provide a remote debugging interface to another Hydra space. So there's a version of the Hydra spawn operation that constructs the heap from MObject. That machinery would be easy to extend, right Igor?

How about the remote debugging? How minimal is the debugging stub that must exist in the spawned MImage? Would one need VM changes (e.g. a callback handler for a recursive doesNotUnderstand: error)?


Igor Stasenko

Re: [squeak-dev] re: MicroSqueak

2009/3/3 Eliot Miranda <[hidden email]>:

> [snip]
>> Cool. So the compiler can avoid using the pushNil, pushFalse and pushTrue
>> bytecodes. It must send some message to coerce every MBoolean result into a
>> host Boolean before doing a conditional jump. It must wrap all
>> MSmallInteger relational primitive invocations with code to coerce the
>> Boolean to the matching MBoolean.

Right, something like:

MSmallInteger >> #<

< object
    "Compare the unboxed host values, then wrap the host Boolean in an MBoolean."
    ^ MBoolean from: boxed < object boxedValue
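
For readers less familiar with Smalltalk, the boxing idea can be sketched in Python as follows. This is only an illustration under assumptions: `from_host`, `to_host`, `boxed_value` and `unbox_literals` are hypothetical names, and the two module-level singletons stand in for whatever true and false objects the target image would use.

```python
class MBoolean:
    """Boolean of the image under construction, distinct from the host's bool."""

    def __init__(self, value):
        self.value = bool(value)

    @classmethod
    def from_host(cls, host_bool):
        # Coerce a host Boolean to the matching target-image singleton.
        return M_TRUE if host_bool else M_FALSE

    def to_host(self):
        # Needed before any host-level conditional jump.
        return self.value


M_TRUE = MBoolean(True)
M_FALSE = MBoolean(False)


class BoxedSmallInteger:
    """Wrapper used for integer literals while code runs inside the host."""

    def __init__(self, boxed):
        self.boxed = boxed

    def boxed_value(self):
        return self.boxed

    def __lt__(self, other):
        # Compare the raw host values, then wrap the host result.
        return MBoolean.from_host(self.boxed < other.boxed_value())


def unbox_literals(literals):
    """Final stage: replace every boxed literal with its raw integer."""
    return [lit.boxed_value() if isinstance(lit, BoxedSmallInteger) else lit
            for lit in literals]


result = BoxedSmallInteger(1) < BoxedSmallInteger(2)
```

Note that `result` is an MBoolean object, so host code has to call `result.to_host()` before branching on it, which mirrors the coercion step discussed above.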

>> A doesNotUnderstand: will produce an instance of Message and send
>> doesNotUnderstand:, so MObject needs a doesNotUnderstand: handler that sends
>> MSymbol #doesNotUnderstand: with a coercion of the Message to an MMessage.

Why bother converting it, when you can simply replace the Message class
with MMessage in the special objects array while running 'sandboxed'
code :)

A second thing about #doesNotUnderstand:
people often forget that some classes can have their own custom
#doesNotUnderstand: method.
I'd rather not place any expectations on DNU while designing a
micro-image bootstrapper.

And I seem to have lost track of where this discussion is heading.
Where/when would we want to run code in the host environment?

In Moebius, since it is hosted in Squeak, and since we have a better
parser ;) I can simulate the method behavior at multiple stages,
including just after parsing a method.
I created a simple MockContext class, which can evaluate things
strictly in the manner in which they were parsed from the method source.
This, for instance, allows us to test parser output in a complete
black-box fashion:

testEvaluateParserOutput
    | method |
    method := self parse: ' = x ^ x == self ' class: Object.
    self assert: [
        (CVMockContext evaluate: method arguments: #(1 1)) == true].
    self assert: [
        (CVMockContext evaluate: method arguments: #(true false)) == false]

testEvaluateParserOutput2
    | method |
    method := self parse: ' foo ^ #(nil true false) ' class: Object.
    self assert: [
        (CVMockContext evaluate: method arguments: #(nil)) = #(nil true false)].
    method := self parse: ' foo ^ #(#a #b #c) ' class: Object.
    self assert: [
        (CVMockContext evaluate: method arguments: #(nil)) = #(#a #b #c)].
    method := self parse: ' foo ^ #() ' class: Object.
    self assert: [
        (CVMockContext evaluate: method arguments: #(nil)) = #()].

Note that 'method' above is not a CompiledMethod; it is an AST form
of the parsed source, encoded as lambda message sends.

>> Any other holes that need to be plugged?
>> Alternatively just create a subclass of InstructionStream and/or
>> ContextPart and interpret all code and have the interpretation manage
>> MObject. That might be slow but easier to get going.

>
> But perhaps a better alternative is just to use Hydra and provide a remote
> debugging interface to another Hydra space. So there's a version of the
> Hydra spawn operation that constructs the heap from MObject. That machinery
> would be easy to extend, right Igor?

Right, this is what I'm writing between the lines :)
I want to have a generic toolset which could easily produce
micro-images for any purpose, including a kernel image, of course.
Lately, Klaus and I discussed one more little primitive for Hydra,
and one more kind of channel: an object channel. It will allow you
to transfer objects (even cloning subgraphs) between images, not
just a dumb raw sequence of bytes :) This could ease developing tools
which require interaction between images, because you don't have to
care about serializing/deserializing stuff; you literally just send
what you want to the other side.

> How about the remote debugging? How minimal is the debugging stub that must
> exist in the spawned MImage? Would one need VM changes (e.g. a callback
> handler for a recursive doesNotUnderstand: error)?
>

I think that putting debugger support into the VM would be a big mistake.
Debugging is a fairly complex domain, and I don't think that we need
to deal with it at the VM level, where there are no objects, just oops,
headers and bits. This is the right way to get a hellishly complex and
unmanageable artifact.

The debugger, like anything else, is invoked using a regular message send
(during Error>>signal). So it is easy to hook into it and steer it in the
right direction.
I made a simple class, HydraDebugToolSet, which replaces the image's
default toolset for images which are running in the background.
As a result, when an error happens, it sends an error message to the
#transcript channel of the main interpreter.
Nothing stops us from going a bit further and requesting the main
interpreter to establish a remote debugging session (except that we
don't have Debuggers with remote debugging capabilities ;) ).
But I know there is already at least one remote debugger
implementation in Squeak: the GemStone tools. It uses the OB tools to
generate the UI and other stuff.
I'm not sure what license it has, or whether it could be taken as a base
for a remote debugging tool for Squeak.
(It would be nice to have a basic remote debugging framework in
Squeak, which could allow different backends: either G/S, a remote
socket connection, or Hydra channels.)
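
The toolset-replacement idea can be illustrated with a small Python sketch. It is an analogy only: a thread stands in for a background image, a `queue.Queue` stands in for the #transcript channel, and `ChannelDebugToolSet` and `run_in_background` are hypothetical names, not the HydraDebugToolSet API.

```python
import queue
import threading


class ChannelDebugToolSet:
    """Stand-in toolset: report the error on a channel instead of opening a debugger."""

    def __init__(self, channel, image_name):
        self.channel = channel
        self.image_name = image_name

    def handle_error(self, error):
        # Send a plain description to the main interpreter's channel.
        self.channel.put((self.image_name, type(error).__name__, str(error)))


def run_in_background(toolset, action):
    """Run action in a worker; any error is routed through the toolset."""

    def body():
        try:
            action()
        except Exception as error:
            toolset.handle_error(error)

    worker = threading.Thread(target=body)
    worker.start()
    return worker


transcript = queue.Queue()
worker = run_in_background(ChannelDebugToolSet(transcript, "background-1"),
                           lambda: 1 // 0)
worker.join()
report = transcript.get_nowait()
```

A remote debugging session would replace the one-way report with a two-way protocol over the same kind of channel.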


--
Best regards,
Igor Stasenko AKA sig.

Stephen Pair

Re: [squeak-dev] re: MicroSqueak

In reply to this post by Eliot Miranda-2

On Mon, Mar 2, 2009 at 6:59 PM, Eliot Miranda <[hidden email]> wrote:

But perhaps a better alternative is just to use Hydra and provide a remote debugging interface to another Hydra space. So there's a version of the Hydra spawn operation that constructs the heap from MObject. That machinery would be easy to extend, right Igor?

I like it. It made me immediately think of gestation and child birth. You could call this early stage interface the umbilical interface. But seriously, if you really want to get at the very smallest possible starting image, constructing one that is a sort of embryo that is still dependent on its host and unable to live in the world on its own is probably the way to go. This minimal image wouldn't need a file system interface, a compiler, and probably lots of other things that one built to live on its own would need.

- Stephen

Igor Stasenko

Re: [squeak-dev] re: MicroSqueak

In reply to this post by Igor Stasenko

2009/3/3 Igor Stasenko <[hidden email]>:

> 2009/3/3 Eliot Miranda <[hidden email]>:
>>
>>
>> On Mon, Mar 2, 2009 at 3:51 PM, Eliot Miranda <[hidden email]>
>> wrote:
>>>
>>>
>>> On Mon, Mar 2, 2009 at 3:25 PM, Igor Stasenko <[hidden email]> wrote:
>>>>
>>>> 2009/3/3 Eliot Miranda <[hidden email]>:
>>>> >
>>>> >
>>>> > On Mon, Mar 2, 2009 at 1:48 PM, Stephen Pair <[hidden email]>
>>>> > wrote:
>>>> >>
>>>> >> On Mon, Mar 2, 2009 at 12:38 AM, Eliot Miranda
>>>> >> <[hidden email]>
>>>> >> wrote:
>>>> >>>
>>>> >>>
>>>> >>> On Sun, Mar 1, 2009 at 9:20 PM, Craig Latta <[hidden email]> wrote:
>>>> >>>>
>>>> >>>> [snip]
>>>> >>>> As for whether to produce an object memory statically and then
>>>> >>>> set
>>>> >>>> it running, or transform an object memory which is always running...
>>>> >>>> I think
>>>> >>>> the resulting memory will need to load modules live anyway, so one
>>>> >>>> might as
>>>> >>>> well do all the transformations that way. Perhaps this is simply an
>>>> >>>> aesthetic choice.
>>>> >>>
>>>> >>> Surely repeatability mandates that one produce an object memory
>>>> >>> statically
>>>> >>> and then set it running? Because of things like delays the always
>>>> >>> running
>>>> >>> memory is almost never in a predictable state, so one always ends up
>>>> >>> with
>>>> >>> different bits even if they represent the same functionality.
>>>> >>> E.
>>>> >>
>>>> >> Maybe you could get the repeatability with a process that is roughly:
>>>> >> a) write the spec for the capability of the image (a method that
>>>> >> exercises
>>>> >> everything you want to be able to do)
>>>> >> b) use the class/method copying & DNU trickery and do the runtime
>>>> >> analysis
>>>> >> to figure out the classes and methods needed to support that
>>>> >> capability
>>>> >> c) do something a little more surgical to build a new image by copying
>>>> >> over the behaviors and methods, but construct the processes and stacks
>>>> >> more
>>>> >> deliberately (so you aren't so tied to the running image's state)
>>>> >> I'd think in this way you could do something that was reproducible to
>>>> >> the
>>>> >> extent that resulting image was only dependent on the running image
>>>> >> for its
>>>> >> behaviors and other necessary objects (various singletons and
>>>> >> whatnot), but
>>>> >> otherwise not affected by various processes and random other things
>>>> >> that
>>>> >> might be in that image. Once you had (b) and (c) mostly ironed out,
>>>> >> it
>>>> >> would be a process of refining the specification in (a) to get to a
>>>> >> suitable
>>>> >> minimal image.
>>>> >
>>>> > Agreed. The nice thing is being able to run a) in the IDE so that when
>>>> > something is missing it manifests as an Undeclared or an MNU.
>>>> > One thing is ensuring that when simulating objects like nil, true and
>>>> > false
>>>> > behave as they will in the result, not as defined in the host image.
>>>> > One
>>>> > thing one could do is arrange that the compiler for MObject uses
>>>> > instances
>>>> > of MSymbol for all code under MObject. Then e.g. a doesNotUnderstand:
>>>> > handler on SmallInteger, UndefinedObject, Boolean et al might be able
>>>> > to
>>>> > forward things correctly and arrange that the simulation was more
>>>> > accurate.
>>>> >
>>>>
>>>> Thats why i wrote own parser/compiler in Moebius.
>>>> It is designed in a way, that a parser & compiler output is under full
>>>> control of an object which plays role as environment.
>>>> So, you can produce an instance of CompiledMethod as output, or encode
>>>> result in machine code, or represent methods as a raw bytes which then
>>>> could be put in the image you constructing.
>>>> Even nil,true,false singleton values are under control of environment.
>>>>
>>>> Read more about it here.
>>>> http://code.google.com/p/moebius-st/wiki/Parser
>>>>
>>>> Simulation of SmallInts could be made easy - we could simply make a
>>>> class, named BoxedSmallInteger
>>>> and use it for representing all literal values in methods. At final
>>>> stage of image creating we can unbox them and replace by smallints.
>>>
>>> Cool. So the compiler can avoid using the pushNil, pushFalse and pushTrue
>>> bytecodes. It must send some message to coerce every MBoolean result into a
>>> host Boolean before doing a conditional jump. It must wrap all
>>> MSmallInteger relational primitive invocations with code to coerce the
>>> Boolean to the matching MBoolean.
>
> right , something like:
>
> MSmallInteger >> #<
> < object
>     ^ MBoolean from: boxed < object boxedValue
>
>>> A doesNotUnderstand: will produce an instance of Message and send
>>> doesNotUnderstand:, so MObject needs a doesNotUnderstand: handler that sends
>>> MSymbol #doesNotUnderstand: with a coercion of the Message to an MMessage.
>
> why care converting it, when you can simply replace a Message class
> with MMessage in special objects array, while running a 'sandboxed'
> code :)
>
> Second thing about #doesNotUnderstand:
> Often people forgetting that some classes could have own custom
> #doesNotUnderstand: method.
> I'd rather do not put any expectations on DNU while designing
> micro-image bootstrapper.
>
> And i seem missing the direction where this discussion turned out.
> Where/when we would want to run a code in host environment?
>
> In Moebius, since it is hosted in Squeak, and since we having better
> parser ;) i could simulate the method behavior at multiple stages
> including just after parsing a method.
> I created a simple MockContext class, which can evaluate things
> strictly in a manner how it is parsed from method source.
> This, for instance, allows us to test parser output in a complete
> black-box fashion:
>
> testEvaluateParserOutput
>     | method |
>
>     method := self parse: ' = x ^ x == self ' class: Object.
>     self assert: [
>         (CVMockContext evaluate: method arguments: #(1 1)) == true].
>     self assert: [
>         (CVMockContext evaluate: method arguments: #(true false)) == false]
>
> testEvaluateParserOutput2
>     | method |
>     method := self parse: ' foo ^ #(nil true false) ' class: Object.
>     self assert: [
>         (CVMockContext evaluate: method arguments: #(nil)) = #(nil true false)].
>     method := self parse: ' foo ^ #(#a #b #c) ' class: Object.
>     self assert: [
>         (CVMockContext evaluate: method arguments: #(nil)) = #(#a #b #c)].
>     method := self parse: ' foo ^ #() ' class: Object.
>     self assert: [
>         (CVMockContext evaluate: method arguments: #(nil)) = #()].
>
> note, that 'method' above is not a CompiledMethod, it is an AST form
> of parsed source, encoded as lambda message sends.
>
>
>>> Any other holes that need to be plugged?
>>> Alternatively just create a subclass of InstructionStream and/or
>>> ContextPart and interpret all code and have the interpretation manage
>>> MObject. That might be slow but easier to get going.
>
>>
>> But perhaps a better alternative is just to use Hydra and provide a remote
>> debugging interface to another Hydra space. So there's a version of the
>> Hydra spawn operation that constructs the heap from MObject. That machinery
>> would be easy to extend, right Igor?
>
> Right, this is what i'm writing between the lines :)
> I willing to have a generic toolset which could easily produce a
> micro-images for any purposes, including a kernel-image, of course.
> Lately , we discussed a one more little primitive for Hydra with Klaus
> , and one more kind of channel - an object channel. It will allow you
> to transfer objects (even cloning a subgraphs) between images , not
> just dumb raw number of bytes :) This could ease developing tools
> which require interaction between images, because you don't have to
> care about serializing/deserializing stuff - you literally just send
> what you want to other side.
>

A little more about that.
The VM is capable of recognizing the most basic object types (ints, strings,
bytes etc.), which allows us to transfer objects in a JSON-like style.
Suppose you want to transfer an instance of class #Foo, which can't be
recognized by the VM.
You can serialize it as an array, like:
#(#Foo ivar1value ivar2value ...)

Then, at the receiving side, it is quite simple to reconstruct it into
an instance of the Foo class.
Compare this with the amount of processing you may need if you are limited
to exchanging raw byte buffers.
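
As a rough illustration of that array encoding, here is a workspace sketch in plain Squeak. It does not touch any Hydra channel API; it only shows the flattening and rebuilding steps. Point stands in for Foo, and the block names (flatten, rebuild) are made up for this example.

```smalltalk
"Workspace sketch; Point stands in for Foo. No Hydra API is used here."
| flatten rebuild instance original wire copy |

"Flatten an object into #(ClassName ivar1value ivar2value ...)."
flatten := [:obj |
    (Array with: obj class name) ,
        ((1 to: obj class instSize) collect: [:i | obj instVarAt: i])].

"Rebuild an instance from such an array on the receiving side."
rebuild := [:array |
    instance := (Smalltalk at: array first) basicNew.
    2 to: array size do: [:i |
        instance instVarAt: i - 1 put: (array at: i)].
    instance].

original := 3 @ 4.
wire := flatten value: original.    "the array that would go over the channel"
copy := rebuild value: wire.        "a new Point equal to the original"
```

Here wire should hold #(#Point 3 4), and copy should be equal to, but not identical with, original. This only handles instance variables that are themselves basic values; nested objects would need the same treatment applied recursively.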

>> How about the remote debugging? How minimal is the debugging stub that must
>> exist in the spawned MImage? Would one need VM changes (e.g. a callback
>> handler for a recursive doesNotUnderstand: error)?
>>
>
> I think that putting debugger support into VM will be a big mistake.
> Debugging is a fairly complex domain, and i don't think that we need
> to deal with this at VM level, where is no objects but oops, headers &
> bits.. This is right way to get a hellishly complex & unmanageable
> artifact.
>
> Debugger, as anything else is invoked using regular message send -
> (during Error>>signal). So, it is easy to hook into it and turn into
> right direction.
> I made a simple class HydraDebugToolSet, which replaces an image
> default toolset for images which running in background.
> In result, when error happens, it sends an error message to a
> #transcript channel of main interpreter.
> Nothing stops us from getting a bit further and request main
> interpreter to establish a remote debugging session (except that we
> don't have Debuggers with remote debugging capabilities ;) ).
> But i know there is already at least one remote debugger
> implementation in Squeak - GemStone tools. It is using OB tools to
> generate UI & other stuff.
> I'm not sure, what license it having, and could it be took as base for
> remote debugging tool for Squeak.
> (it would be nice to have a basic remote debugging framework in
> squeak, which could allow different backends - either G/S , remote
> socket connection, or via Hydra channels).
>
>>>> We're in smalltalk, after all, where such things is possible to do,
>>>> unlike many other languages :)
>>>
>>> Right on!
>>>
>>>>
>>>> >>
>>>> >> - Stephen
>>>>
>>>> --
>>>> Best regards,
>>>> Igor Stasenko AKA sig.
>>>>
>
>
> --
> Best regards,
> Igor Stasenko AKA sig.
>

--
Best regards,
Igor Stasenko AKA sig.

Michael van der Gulik-2

Re: [squeak-dev] re: MicroSqueak

In reply to this post by Eliot Miranda-2

On Tue, Mar 3, 2009 at 12:51 PM, Eliot Miranda <[hidden email]> wrote:

Cool. So the compiler can avoid using the pushNil, pushFalse and pushTrue bytecodes. It must send some message to coerce every MBoolean result into a host Boolean before doing a conditional jump.

Is it possible to have the compiler not generate conditional jumps but rather actually evaluate True>>ifTrue:, False>>ifTrue: etc etc for your own True and False classes (MTrue and MFalse???)?

Is there something I'm forgetting which makes this obviously not work? Are conditional jumps really required?

Gulik.

--
http://gulik.pbwiki.com/

Igor Stasenko

Re: [squeak-dev] re: MicroSqueak

In reply to this post by Stephen Pair

2009/3/3 Stephen Pair <[hidden email]>:

> On Mon, Mar 2, 2009 at 6:59 PM, Eliot Miranda <[hidden email]>
> wrote:
>>
>> But perhaps a better alternative is just to use Hydra and provide a remote
>> debugging interface to another Hydra space. So there's a version of the
>> Hydra spawn operation that constructs the heap from MObject. That machinery
>> would be easy to extend, right Igor?
>
> I like it. It made me immediately think of gestation and child birth. You
> could call this early stage interface the umbilical interface. But
> seriously, if you really want to get at the very smallest possible starting
> image, constructing one that is a sort of embryo that is still dependent on
> its host and unable to live in the world on its own is probably the way to
> go. This minimal image wouldn't need a file system interface, a compiler,
> and probably lots of other things that one built to live on its own would
> need.

Right, it is waiting to be implemented.
Currently, in the example of HydraClone>>cloneIdleProcess, I stub out
all class/metaclass references with dumb anonymous instances of
Class, which have the format field set and an empty method dictionary.
This is to make sure that the VM will not crash if it occasionally
steps on a stubbed class :)
To get the effect of a host<->embryo relation, we need to invent a
special stub, which will carry enough information to pass to the
host image and get back an object that then replaces the stub (via
#become:) as the real class or method or whatever.
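
A minimal workspace sketch of the stub-replacement step, under loud assumptions: the stub layout (a two-element Array holding #stub and a class name) is hypothetical, the "request to the host" is faked with a local lookup, and becomeForward: is used as a one-way variant of #become: so that the real object is left untouched.

```smalltalk
"Workspace sketch. The stub format is hypothetical, and everything runs
 inside one image; no Hydra call is involved."
| stub holder real |
stub := Array with: #stub with: #OrderedCollection.
holder := Array with: stub.    "something in the embryo that refers to the stub"

"In a real setup this lookup would be a request sent to the host image."
real := Smalltalk at: (stub at: 2).

"Redirect every reference to the stub so that it points at the real object."
stub becomeForward: real.

holder first == OrderedCollection
```

After the becomeForward:, holder first is the class itself rather than the placeholder array. Across two images, the host would of course have to transfer the object first instead of handing over a local reference.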

P.S. There is a lot of synergy with Spoon. From time to time people
point this out.
I just want to make it clear: I'm aware of it, and I even think it is
worth integrating Spoon features with Hydra to avoid reinventing the wheel,
especially on the language side.

> - Stephen
>
>

--
Best regards,
Igor Stasenko AKA sig.

ccrraaiigg

[squeak-dev] re: producing minimal systems (was "MicroSqueak")

In reply to this post by Eliot Miranda-2

Hi--

Ah, another day, another omnibus response. :)

Eliot writes:

> Surely repeatability mandates that one produce an object memory
> statically and then set it running? Because of things like delays the
> always running memory is almost never in a predictable state, so one
> always ends up with different bits even if they represent the same
> functionality.

Sure, but that's all I care about: something minimal for a given
set of functionality (in my case, loading the next module). I've made an
initial memory from which others can be made by loading modules. If I
need to make a new initial memory for some reason, I don't care whether
it has every bit in the same place, but I do expect it to have the
equivalent objects fulfilling equivalent roles.

I'm after repeatability of functionality without excess. So, no,
repeatability doesn't mandate producing an object memory statically from
within a host. Simulation was the first thing I tried. In my experience,
it has been much easier (although still not a walk in the park :) to
avoid simulation for this purpose and use remote messaging on a real
object memory instead. I do, however, run the minimal memory in the
interpreter simulator occasionally, to debug and produce visualizations[1].

> ...perhaps a better alternative is just to use Hydra and provide a
> remote debugging interface to another Hydra space.
>
> ...
>
> How about the remote debugging? How minimal is the debugging stub
> that must exist in the spawned MImage? Would one need VM changes...?

Just a reminder at this point that Spoon has remote debugging
between object memories, with support for context stacks spanning
multiple physical machines. It uses remote messaging, which uses a small
change to the VM's method lookup, but otherwise no special VM support is
necessary. And no need for a myriad of shadow classes (MBoolean,
MockContexts, specialized parsers, et al).

Igor writes:

> ...there is a lot of synergy with Spoon. People point this out from
> time to time.

Yes, this would be one of those times. :) I only wish that I
could have finished this while I was on break, or that I could be
employed to do this. :)

Janko writes:

> A question from someone not so [knowledgeable about] Smalltalk
> internals: is Spoon compatible with the proposed MicroSqueak? That is,
> can it be Spoon based on top of MicroSqueak?

I haven't seen any indication that MicroSqueak is actually
minimal, so I'm not sure why one would want to do that. It seems like
Spoon and MicroSqueak are two fundamentally different approaches. Spoon
isn't really something that runs "on top of" something else; once you
add anything more to it, it's not minimal anymore.

Now, if instead you meant Naiad (Spoon's module system) and Other
(Spoon's remote messaging framework)... sure, those will run in any
Smalltalk. They have to, in order to provide a convenient way of moving
code between old and new systems.

Stephen Pair writes:

> Maybe you could get the repeatability with a process that is
> roughly...

Hmm, didn't you just write that, and didn't I respond that I'd
already done an equivalent thing? :)

Jecel writes:

> You are probably aware of the type inference work Ole Agesen did in
> Self?

Yes, thanks!

Igor writes:

> I having a similar idea to capture methods/classes while running code
> to discover what objects i need to clone into separate heap to make it
> running under another interpreter instance in Hydra.

Göran responds:

> Mmm, you are aware of this stuff that Craig has in Spoon right? The
> "imprinting" stuff IIRC.

(Thanks, Göran!)

Igor responds:

> sure, I'm aware of that. Craig using changed VM to mark methods while
> code running. But one could do much the same using already available
> tools. Yes, it will be slower, but this process (shrinking) don't have
> to be performed regularly, so why care.

True, imprinting uses method activation marking, but imprinting
has nothing to do with shrinking. Imprinting is useful for transferring
methods from one object memory to another as they are run, in real-time.
If the marking weren't done in the VM, it wouldn't work. Given that, it
really was easiest to just do shrinking with a slightly-modified version
of the garbage collector. I think it makes a lot of sense.

But speaking of shrinking, I'll just reiterate the most extreme
result so far, a 1337-byte object memory[2], suitable for t-shirts...

As an aside... I'm still amazed at how apprehensive people are
about modifying the VM, after all the time Squeak's been around. The
relative ease with which one can do it, with the ability to debug using
Smalltalk tools, is one of the main compelling things about the system.
It lets us just go ahead and change the VM *if that's what's
appropriate*, rather than try to work around it as with previous
systems. (True, it can be much easier still... and we'll get there
faster if more people dive in.)

***

Finally, an obligatory repetition of what I'm working on now: I'm
implementing Naiad, Spoon's module system[3]. I have a headful object
memory with editions in it describing its classes, methods, modules,
etc. (see [3] for terminology). I have a minimal object memory, and I
have another headful memory with tools in it for manipulating memories
remotely (remote system browser, remote inspectors, remote debugger, etc.).

Both the minimal memory and the tools memory can connect to the
history memory and use that instead of a changes file. I can do
traditional things like looking up versions of a particular method, but
also more sophisticated queries (for example, methods written by a
particular author over some time period, removed over some other period,
and that access a particular instance variable).

My current task is creating a minimal history memory, to go along
with the minimal memory. I'm transferring editions for all the
components of the minimal memory into a copy of the minimal memory, and
fixing bugs that I uncover in the process. Then I'll have the pieces of
the next Spoon system: a minimal object memory, and a minimal history
memory that describes it. I will release that, along with changesets for
the remote tools that previous (3.2 to 4.0) object memories can use.
Then people can start composing Naiad modules for all the behavior that
I removed (e.g., graphics support).

I've made several releases of Spoon in the past (most notably
2004-02-14, 2005-12-11, 2006-10-25, and 2007-04-12), but every time the
lack of the module system limited the pool of truly interested folks to
a handful. Given how difficult it is to make releases before the module
system exists, I've decided to focus on the module system (asking for
feedback about the module system design[3] in the meantime). I still
think it's going to be worth the time, and making it my day job would
still speed it up.

-C

[1] http://netjam.org/spoon/viz
[2] http://netjam.org/spoon/smallest
[3] http://netjam.org/spoon/naiad

--
Craig Latta
www.netjam.org
next show: 2009-03-13 (www.thishere.org)
