[squeak-dev] Burn the Squeak Image! (Why I am running for board)

Eliot Miranda-2

Re: [squeak-dev] Re: Burn the Squeak Image! (Why I am running for board)

On Sun, Mar 1, 2009 at 1:01 PM, Igor Stasenko <[hidden email]> wrote:

2009/3/1 Eliot Miranda <[hidden email]>:

>
>
> On Sun, Mar 1, 2009 at 12:23 PM, Igor Stasenko <[hidden email]> wrote:
>>
>> 2009/3/1 Andreas Raab <[hidden email]>:
>> > Igor Stasenko wrote:
>> >>
>> >> Changing the object formats alone does not gives any benefits. What is
>> >> the point in having new format when you keep using old semantic model
>> >> as before?
>> >
>> > Speed. That is the only point of the exercise to begin with.
>> >
>> >> This is like swapping instance variables order in your class.. Apart
>> >> from a better aestetical view it gives you nothing :)
>> >
>> > If swapping ivars in a class would give me a 3x in performance I'd be
>> > doing
>> > this all day long...
>> >
>> but we both know that this is too good to be true. :)
>> unless you change the way how things working, you can't achieve
>> significant performance boost. And often this means rewriting
>> interfaces, which inevitably leads to changing a lot of code on
>> language side etc.
>
> Uh, no. Here is the inline cache check in Cog, which is as complicated as
> it is because of compact classes:
> 00009588: movl %edx, %eax : 89 D0
> 0000958a: andl $0x00000001, %eax : 83 E0 01
> 0000958d: jnz .+0x00000011 (0x000095a0=singleRelease@40) : 75 11
> 0000958f: movl %ds:(%edx), %eax : 8B 42 00
> 00009592: shrl $0x0a, %eax : C1 E8 0A
> 00009595: andl $0x0000007c, %eax : 83 E0 7C
> 00009598: jnz .+0x00000006 (0x000095a0=singleRelease@40) : 75 06
> 0000959a: movl %ds:0xfffffffc(%edx), %eax : 8B 42 FC
> 0000959d: andl $0xfffffffc, %eax : 83 E0 FC
> 000095a0: cmpl %ecx, %eax : 39 C8
> 000095a2: jnz .+0xffffffda (0x0000957e=LSICMissCall) : 75 DA
> In VisualWorks the code looks like
>    movl %ebx, %eax
>    andl $3, %eax
>    jnz LCompare
>    movl (%ebx), %eax
> LCompare:
>    cmpl %eax, %edx
>    jnz +0xffffff??=LSICMissCall
> That's 9 or 11 instructions (compact vs non-compact) vs 6 instructions in
> the common case, but vitally, for non-compact classes 2 memory reads vs one.
> So indeed object representation can make a major difference in run-time
> performance. Consider how much quicker object allocation is in VW, which
> does not have to check if the receiving class is compact or not, compared to
> Squeak. Consider how much quicker string access is in VW, which has
> immediate characters, than Squeak with the character table and the inability
> to do == comparisons on Unicode characters. etc. etc.
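
As a rough illustration of the two listings quoted above, the following Python sketch models how each sequence derives the value that is compared against the inline cache, and counts the memory reads along the way. It is only a toy model: the `memory` dictionary, the function names, and the reading of the header layout are assumptions taken from the disassembly, not code from either VM.

```python
# Toy model of the two inline-cache checks. `memory` maps byte addresses to
# 32-bit words. Function names and the dict-based memory are hypothetical;
# shifts and masks are copied from the listings quoted in the message.

def squeak_cache_tag(memory, oop):
    """Return (tag, memory_reads) following the Cog listing."""
    if oop & 1:                          # andl $1: tagged SmallInteger
        return 1, 0
    header = memory[oop]                 # first read: base header word
    compact = (header >> 10) & 0x7C      # shrl $0x0a; andl $0x7c
    if compact:                          # compact class: scaled index is the tag
        return compact, 1
    class_word = memory[oop - 4]         # second read: word before the header
    return class_word & 0xFFFFFFFC, 2    # andl $0xfffffffc clears the low bits


def vw_cache_tag(memory, oop):
    """Return (tag, memory_reads) following the VisualWorks listing."""
    tag = oop & 3                        # andl $3: immediates carry tag bits
    if tag:
        return tag, 0
    return memory[oop], 1                # single read: first word of the object
```

Calling `squeak_cache_tag` on an object whose compact-class field is zero reports two reads, while `vw_cache_tag` never reports more than one, which is the difference the message singles out.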

Sometimes I have trouble expressing my thoughts clearly... sorry.

No need to apologise. Everyone (me especially) can take a few mail messages to converge on the right meaning from time to time.

I didn't mean that changing the object format does not improve speed.
I meant that such changes alone are very hard to adopt without ANY
changes on the language side.
See
Behavior>>becomeCompact
becomeCompactSimplyAt: index
becomeUncompact

see also
#compactClassesArray

and I suspect this list is only the tip of the iceberg (for instance,
you may need to change SpaceTally to report things properly).

You're absolutely right. The major image-level change I will require is for Behavior to implement identityHash with a primitive that is different from that in Object. Doing this allows me to implement a hidden class table in the VM where a class's identity hash is the index into the class table. An instance of a class has the class's class table index (the class's id hash) stored in its header, not a direct pointer to the class. So every object has a more compact class reference, say 16, 20 or 24 bits. Also, class references in in-line and method caches are class indices, not direct class references, which means less work on GC. But to ensure a class can be entered in the table by the VM at an unused index, Behavior>>identityHash must be a special primitive that the VM implements as searching the table for an unused index.
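
A minimal sketch of the class-table idea described in the paragraph above, written in Python purely for illustration. The names `Behavior`, `ClassTable`, `identity_hash`, `instantiate` and `class_of` are hypothetical, the table size is arbitrary, and the linear search is just the simplest way to find an unused index; nothing here is taken from the actual VM.

```python
class Behavior:
    """Stand-in for a class object; it gets a table index on first hash."""

    def __init__(self, name):
        self.name = name
        self.class_index = None


class ClassTable:
    """Toy class table in which a class's identity hash is its table index."""

    def __init__(self, index_bits=16):
        # Assumption: a fixed-size table addressed by `index_bits` bits.
        self.entries = [None] * (1 << index_bits)

    def identity_hash(self, behavior):
        """Model of the special primitive: enter the class at an unused index."""
        if behavior.class_index is None:
            for index, entry in enumerate(self.entries):
                if entry is None:
                    self.entries[index] = behavior
                    behavior.class_index = index
                    break
            else:
                raise MemoryError("class table is full")
        return behavior.class_index

    def instantiate(self, behavior):
        """The instance header stores the class index, not a class pointer."""
        return {"class_index": self.identity_hash(behavior)}

    def class_of(self, instance):
        return self.entries[instance["class_index"]]
```

In this model an inline or method cache would hold the small integer from `identity_hash` rather than a reference to the `Behavior`, so a collector that moves class objects has no cache entries to update.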

But the image-level code for compact classes can still exist; it's just that the VM will ignore it :)

Of course you're right about SpaceTally. Perhaps the VM should provide some primitives that allow SpaceTally to be parameterised. Of course, it's not until one tries to use such a parameterised SpaceTally on more than one object representation that one knows the design works across a range of object representations. And it's not as if one will be trying new object representations every week (although I don't know :) ). But it might be worth the effort. Probably simpler, though, is to have the kernel of SpaceTally's computations in the microkernel image, and to write some suitable tests to be run in a derivative image.

But my experience with 64-bit VW is that there are very few changes. We had the Behavior>>identityHash primitive, the primitive that answered the size of the hash field, and that's about it. Note that the image already computes the size of SmallInteger by doing subtractions until overflow at start-up.

Now, defending your point: it really would be much easier to deal
with such things in a micro-image (consider the amount of code and
tests you need to run when producing a new update).

This makes you, as a VM developer, responsible for good integration
of the VM with the language side.
Then the rest of the images, which are based on it, will have to use things
strictly in the manner laid down in the kernel.
It is important to draw a line between the kernel and the rest of the code in
the image, which depends on it.

Right. Agreed. And experience shows (16-bit => 32-bit, Squeak & VW 64-bit) that the new constraints introduced by the microkernel will be very few and unobtrusive.

Best

Eliot

>>
>> > Cheers,
>> > - Andreas
>> >
>> --
>> Best regards,
>> Igor Stasenko AKA sig.
>
> Best
> Eliot
>
>
>

--

Best regards,
Igor Stasenko AKA sig.

Michael van der Gulik-2

Re: [squeak-dev] Burn the Squeak Image! (Why I am running for board)

In reply to this post by Tapple Gao

On Sat, Feb 28, 2009 at 9:20 PM, Matthew Fulmer <[hidden email]> wrote:

Squeak is a growing community with diverse needs. We have long
outgrown the monolithic image left to us by our founders, Dan
Ingalls and company.

Yes, but only the enlightened community members realise this. The rest of them still think Morphic was well designed.

2004: DPON, by Michael van der Gulik: A project to revive Henrik
Gedenryd's modularity framework abandoned in Squeak 3.3

Er... no it isn't! Namespaces/Packages is similar, but I'm redesigning stuff rather than reviving stuff.

We need to build things for those who would build better images
themselves. Having many good images to choose from makes
everybody happier. The only issue with the situation is that
they are not always compatible. I believe this is the core issue
that the board and the squeak release team needs to address.

I fully agree with this.

By this time next year, every squeak distribution
(squeak.org, Pharo, eToys, cobalt) will be running a
standard version of the following three packages:
- Collections
- Streams
- Compiler

- Kernel ?

Gulik.

--
http://gulik.pbwiki.com/

Igor Stasenko

Re: [squeak-dev] Re: Burn the Squeak Image! (Why I am running for board)

In reply to this post by Eliot Miranda-2

2009/3/1 Eliot Miranda <[hidden email]>:

>
>
> On Sun, Mar 1, 2009 at 1:01 PM, Igor Stasenko <[hidden email]> wrote:
>>
>> 2009/3/1 Eliot Miranda <[hidden email]>:
>> >
>> >
>> > On Sun, Mar 1, 2009 at 12:23 PM, Igor Stasenko <[hidden email]>
>> > wrote:
>> >>
>> >> 2009/3/1 Andreas Raab <[hidden email]>:
>> >> > Igor Stasenko wrote:
>> >> >>
>> >> >> Changing the object formats alone does not gives any benefits. What
>> >> >> is
>> >> >> the point in having new format when you keep using old semantic
>> >> >> model
>> >> >> as before?
>> >> >
>> >> > Speed. That is the only point of the exercise to begin with.
>> >> >
>> >> >> This is like swapping instance variables order in your class.. Apart
>> >> >> from a better aestetical view it gives you nothing :)
>> >> >
>> >> > If swapping ivars in a class would give me a 3x in performance I'd be
>> >> > doing
>> >> > this all day long...
>> >> >
>> >> but we both know that this is too good to be true. :)
>> >> unless you change the way how things working, you can't achieve
>> >> significant performance boost. And often this means rewriting
>> >> interfaces, which inevitably leads to changing a lot of code on
>> >> language side etc.
>> >
>> > Uh, no. Here is the inline cache check in Cog, which is as complicated
>> > as
>> > it is because of compact classes:
>> > 00009588: movl %edx, %eax : 89 D0
>> > 0000958a: andl $0x00000001, %eax : 83 E0 01
>> > 0000958d: jnz .+0x00000011 (0x000095a0=singleRelease@40) : 75 11
>> > 0000958f: movl %ds:(%edx), %eax : 8B 42 00
>> > 00009592: shrl $0x0a, %eax : C1 E8 0A
>> > 00009595: andl $0x0000007c, %eax : 83 E0 7C
>> > 00009598: jnz .+0x00000006 (0x000095a0=singleRelease@40) : 75 06
>> > 0000959a: movl %ds:0xfffffffc(%edx), %eax : 8B 42 FC
>> > 0000959d: andl $0xfffffffc, %eax : 83 E0 FC
>> > 000095a0: cmpl %ecx, %eax : 39 C8
>> > 000095a2: jnz .+0xffffffda (0x0000957e=LSICMissCall) : 75 DA
>> > In VisualWorks the code looks like
>> >    movl %ebx, %eax
>> >    andl $3, %eax
>> >    jnz LCompare
>> >    movl (%ebx), %eax
>> > LCompare:
>> >    cmpl %eax, %edx
>> >    jnz +0xffffff??=LSICMissCall
>> > That's 9 or 11 instructions (compact vs non-compact) vs 6 instructions
>> > in
>> > the common case, but vitally, for non-compact classes 2 memory reads vs
>> > one.
>> > So indeed object representation can make a major difference in run-time
>> > performance. Consider how much quicker object allocation is in VW,
>> > which
>> > does not have to check if the receiving class is compact or not,
>> > compared to
>> > Squeak. Consider how much quicker string access is in VW, which has
>> > immediate characters, than Squeak with the character table and the
>> > inability
>> > to do == comparisons on Unicode characters. etc. etc.
>>
>> Sometimes I having troubles with expressing my thoughts clearly.. sorry.
>
> No need to apologise. Everyone (me especially) can take a few mail messages
> to converge on the right meaning from time to time.
>>
>> I din't mean that changing object format does not improves the speed.
>> I meant that such changes alone is very hard to adopt without ANY
>> changes on language side.
>> See
>> Behavior>>becomeCompact
>> becomeCompactSimplyAt: index
>> becomeUncompact
>>
>> see also
>> #compactClassesArray
>>
>> and i suspect this list is only a top of the iceberg (for instance,
>> you may need to change SpaceTally to report things properly).
>
> You're absolutely right. The major image-level change I will require is for
> Behavior to implement identityHash with a primitive that is different form
> that in Object. Doing this allows me to implement a hidden class table in
> the VM where a class's identity hash is the index into the class table. An
> instance of a class has the class's class table index (the class's id hash)
> stored in its header, not a direct pointer to the class. So every object
> has a more compact class reference, say 16, 20 or 24 bits. Also, class
> references in in-line and method caches are class indices, not direct class
> references, which means less work on GC. But to ensure a class can be
> entered in the table by the VM at an unused index Behaviour>>identityHash
> must be a special primitive that the VM implements as searching the table
> for an unused index.
> But the imager-level code for compact classes can still exist; its just that
> the VM will ignore it :)
> Of course you're right about SpaceTally. Perhaps the VM should provide some
> primitives that allow SpaceTally to be parameterised. of course it's not
> until one tries to use such a parameterised SpaceTally on more than one
> object representation that one knows the design works across a range of
> object representations. And its not as if one will be trying new object
> representations every week (although I don't know :) ). But it might be
> worth the effort. But probably simpler is having the kernel of SpaceTaly's
> computatioons be in the microkernel image, and writing some suitable tests
> to be run in a derivative image.
> But my experience with 64-bit VW is that there are very few changes. We had
> the Behaviour>>identityHash primtiive, the primitive that answered the size
> of the hash field, and that's about it. Note that the image already
> computes the size of SmallInteger by doing subtractions until overflow at
> start-up.
>>
>> Now, defending you point, that really, it would be much easier to deal
>> with such things in a micro-image (consider the amount of code and
>> tests which you need to perform when producing new update).
>>
>> This makes you, as a VM developer be responsible from good integration
>> of VM with language side.
>> Then rest images, which is based on it will have to use things
>> strictly in manner, as it put in kernel.
>> It is important to draw a line between kernel and rest of the code in
>> image, which depends on it.
>
> Right. Agreed. And experience shows (16-bit => 32-bit, Squeak & VW 64-bit)
> that the new constraints introduced by the microkernel will be very few and
> unobtrusive.

I don't agree with calling kernel modularisation a constraint :)
A constraint is when you say: hey pals, you can't have a headless
image - use one with Morphic instead, and if you really really want
it, we invented a workaround - a headless mode in the VM. ;)

> Best
> Eliot
>

--
Best regards,
Igor Stasenko AKA sig.

ccrraaiigg

[squeak-dev] re: MicroSqueak (was "Burn the Squeak Image!")

In reply to this post by Bert Freudenberg

Hi Bert (and Eliot)--

> [John Maloney] did [MicroSqueak] a couple of years ago and showed it
> to us last summer. As Eliot wrote, it builds an image in memory from a
> hierarchy of classes.

How did he know which classes were necessary for the new image?

thanks,

-C

--
Craig Latta
www.netjam.org
next show: 2009-03-13 (www.thishere.org)

ccrraaiigg

[squeak-dev] re: Burn the Squeak Image! (Why Matthew is running for board)

In reply to this post by Eliot Miranda-2

Hi Eliot--

> Craig, do you agree?

Yes.

> If so, how much of this do you have already?

Well, you mentioned:

> This image needs minimal scripting support to respond to command-line
> bootstrap commands (including cross-platform stdin & stdout and a file
> interface)...

I do this through a web interface, which I prefer, but a module to
do this through command-line parameters, stdin, and files would be
straightforward.

> ...a compiler with which to compile code...

This is already a loadable thing, but I rarely load it because I
don't need the compiler to install compiled methods.

> ...collections, magnitudes, exceptions (as necessary)...

Yes.

> a default error handler that dumps the stack to stdout and then
> abort...

Yes.

> ...and that's about it.

Okay, there you go.

thanks,

-C

--
Craig Latta
www.netjam.org
next show: 2009-03-13 (www.thishere.org)

Eliot Miranda-2

Re: [squeak-dev] re: MicroSqueak (was "Burn the Squeak Image!")

In reply to this post by ccrraaiigg

On Sun, Mar 1, 2009 at 4:55 PM, Craig Latta <[hidden email]> wrote:

Hi Bert (and Eliot)--

> [John Maloney] did [MicroSqueak] a couple of years ago and showed it
> to us last summer. As Eliot wrote, it builds an image in memory from a
> hierarchy of classes.

How did he know which classes were necessary for the new image?

There is a hierarchy rooted at MObject, all of which becomes the new classes in the new image. The generator renames the classes so that in the generated image MObject is called Object and so on for all subclasses.
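
The renaming step described above can be pictured with a small Python sketch. It assumes the shadow classes are distinguished only by an "M" prefix and that the hierarchy is available as a mapping from class name to subclass names; apart from MObject, the class names in the usage example are hypothetical, and this is not MicroSqueak's actual generator.

```python
def renamed_hierarchy(root, subclasses, prefix="M"):
    """Map every class under `root` to its name in the generated image."""
    mapping = {}
    pending = [root]
    while pending:
        name = pending.pop()
        # Strip the shadow prefix so that MObject becomes Object, and so on.
        mapping[name] = name[len(prefix):] if name.startswith(prefix) else name
        pending.extend(subclasses.get(name, []))
    return mapping


# Hypothetical shadow hierarchy for illustration.
shadow = {"MObject": ["MCollection", "MMagnitude"], "MCollection": ["MArray"]}
names = renamed_hierarchy("MObject", shadow)
```

Only classes reachable from the root appear in the result, which mirrors the statement that everything under MObject, and nothing else, ends up in the new image.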

HTH

thanks,

-C

--
Craig Latta
www.netjam.org
next show: 2009-03-13 (www.thishere.org)

ccrraaiigg

[squeak-dev] re: MicroSqueak

Hi Eliot--

> > How did [John Maloney] know which classes were necessary for [a] new
> > [MicroSqueak] image?
>
> There is a hierarchy rooted at MObject, all of which becomes the new
> classes in the new image. The generator renames the classes so that
> in the generated image MObject is called Object and so on for all
> subclasses.

Right, but how did he know which classes should be in that
hierarchy? I.e., how did he decide what was essential and what wasn't?

-C

--
Craig Latta
www.netjam.org
next show: 2009-03-13 (www.thishere.org)

Eliot Miranda-2

Re: [squeak-dev] re: MicroSqueak

On Sun, Mar 1, 2009 at 5:51 PM, Craig Latta <[hidden email]> wrote:

Hi Eliot--

> > How did [John Maloney] know which classes were necessary for [a] new
> > [MicroSqueak] image?
>
> There is a hierarchy rooted at MObject, all of which becomes the new
> classes in the new image. The generator renames the classes so that
> in the generated image MObject is called Object and so on for all
> subclasses.

Right, but how did he know which classes should be in that hierarchy? I.e., how did he decide what was essential and what wasn't?

You should ask John, but the aim was to get a minimal Hello World so there isn't a compiler for example. But surely the answer is "whatever you need to get what you want done", right? So if the goal is a microkernel from which one can bootstrap an image one needs a compiler, a file system interface, and for sanity a minimal error-reporting framework right?

-C

--
Craig Latta
www.netjam.org
next show: 2009-03-13 (www.thishere.org)

Dan Ingalls

Re: [squeak-dev] Re: Burn the Squeak Image! (Why I am running for board)

In reply to this post by Tapple Gao

Eliot Miranda <[hidden email]> wrote...

You're absolutely right. The major image-level change I will require is for Behavior to implement identityHash with a primitive that is different from that in Object. Doing this allows me to implement a hidden class table in the VM where a class's identity hash is the index into the class table. An instance of a class has the class's class table index (the class's id hash) stored in its header, not a direct pointer to the class. So every object has a more compact class reference, say 16, 20 or 24 bits. Also, class references in in-line and method caches are class indices, not direct class references, which means less work on GC. But to ensure a class can be entered in the table by the VM at an unused index, Behavior>>identityHash must be a special primitive that the VM implements as searching the table for an unused index.

Hi, Eliot -

I've been mostly lurking for a while here, but this topic has become more interesting with each tidbit. I just wanted to say that I love the synergy between hash and class table rolled into the elimination of compact classes. It's an improvement in every way. I can't wait to see this all come to life. You go, guy!

- Dan

ccrraaiigg

[squeak-dev] re: MicroSqueak

In reply to this post by Eliot Miranda-2

> You should ask John, but the aim was to get a minimal Hello World so
> there isn't a compiler for example. But surely the answer is
> "whatever you need to get what you want done", right?

Heh, sure... but you're not answering my question. :) I'm
wondering how others have gone about deriving the complete set of
classes and methods a system needs to perform a particular task (so that
I can compare to how I do it). Indeed, I should ask John.

> So if the goal is a microkernel from which one can bootstrap an image
> one needs a compiler, a file system interface, and for sanity a
> minimal error-reporting framework right?

Sure, but that doesn't help much with questions like "Should I
include method X or not?". The necessity of several things in the system
is rather subtle. :) I like to be able to point to any byte in an
object memory and give a simple explanation as to why it's there, and
have performed a straightforward process to get that explanation.

-C

--
Craig Latta
www.netjam.org
next show: 2009-03-13 (www.thishere.org)

Igor Stasenko

Re: [squeak-dev] re: MicroSqueak

2009/3/2 Craig Latta <[hidden email]>:

>
>> You should ask John, but the aim was to get a minimal Hello World so
>> there isn't a compiler for example. But surely the answer is
>> "whatever you need to get what you want done", right?
>
> Heh, sure... but you're not answering my question. :) I'm wondering how
> others have gone about deriving the complete set of classes and methods a
> system needs to perform a particular task (so that I can compare to how I do
> it). Indeed, I should ask John.
>
>> So if the goal is a microkernel from which one can bootstrap an image
>> one needs a compiler, a file system interface, and for sanity a
>> minimal error-reporting framework right?
>
> Sure, but that doesn't help much with questions like "Should I include
> method X or not?". The necessity of several things in the system is rather
> subtle. :) I like to be able to point to any byte in an object memory and
> give a simple explanation as to why it's there, and have performed a
> straightforward process to get that explanation.
>

I think I can give you an idea:
any object/class which interfaces with core VM functionality should
be in that hierarchy.
A starting point: the VM requires a special objects array, properly
filled with certain objects with certain expected properties/slots.
There is no way to avoid providing this information in the image
without the risk of breaking everything.
Going further, we could identify all methods which use the core set of
primitives (mainly the numeric ones) and add them as well.
Next, add basic I/O (make the file plugin work).
And finally, make the compiler work (in non-interactive mode).
The rest is optional, since with a compiler we can file in any
code we want.

>
> -C
>
> --
> Craig Latta
> www.netjam.org
> next show: 2009-03-13 (www.thishere.org)
>

--
Best regards,
Igor Stasenko AKA sig.

Stephen Pair

Re: [squeak-dev] re: MicroSqueak

In reply to this post by ccrraaiigg

On Sun, Mar 1, 2009 at 10:04 PM, Craig Latta <[hidden email]> wrote:

> You should ask John, but the aim was to get a minimal Hello World so
> there isn't a compiler for example. But surely the answer is
> "whatever you need to get what you want done", right?

Heh, sure... but you're not answering my question. :) I'm wondering how others have gone about deriving the complete set of classes and methods a system needs to perform a particular task (so that I can compare to how I do it). Indeed, I should ask John.

Hmm, I think the key question here is: what do you want to be able to do with the image you create? I've no idea how John did this, but I think I would start by writing a method that did all the things I'd want to be able to do in the image I ultimately create. This might be things like compiling a string of code, loading something from a file (or socket), etc...the bare minimum of things you'd absolutely need to do in a minimal image. Then I'd guess you could create a shadow M* class with this method in it and some sort of proxies for literals and global references. With a little doesNotUnderstand: magic I'd think you could run that method, hit those proxies and copy over other classes and methods as you hit them. I'm sure the devil is in the details, but this top level method that expresses exactly the things that you'd want to be able to do in this minimal image seems like it would be an interesting artifact in and of itself. It would be in a sense a specification of this image.
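
The proxy idea in the paragraph above can be sketched in a few lines of Python, with `__getattr__` playing the part of doesNotUnderstand:. This is only an illustration under stated assumptions: `ImageProxy`, `main`, and the contents of the "full image" dictionary are hypothetical names, and ordinary Python types stand in for Smalltalk classes.

```python
class ImageProxy:
    """Stands in for the full image and copies each class over on first use."""

    def __init__(self, full_image):
        self._full = full_image
        self.kept = {}                 # what the minimal image would contain

    def __getattr__(self, name):
        # Only reached for names that are not ordinary attributes, much like
        # doesNotUnderstand: is only reached for unimplemented messages.
        try:
            value = self._full[name]
        except KeyError:
            raise AttributeError(name)
        self.kept[name] = value
        return value


def main(image):
    """Top-level 'specification' method: everything the minimal image must do."""
    words = image.OrderedCollection()
    words.append("hello")
    return image.String(words[0])


# Hypothetical full image; Morph is never touched by main.
full_image = {"OrderedCollection": list, "String": str, "Morph": object}
proxy = ImageProxy(full_image)
result = main(proxy)
```

After the run, `proxy.kept` holds only the entries that `main` actually touched, so `main` serves as the specification and `kept` as the derived contents.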

- Stephen

Igor Stasenko

Re: [squeak-dev] re: MicroSqueak

2009/3/2 Stephen Pair <[hidden email]>:

> On Sun, Mar 1, 2009 at 10:04 PM, Craig Latta <[hidden email]> wrote:
>>
>> > You should ask John, but the aim was to get a minimal Hello World so
>> > there isn't a compiler for example. But surely the answer is
>> > "whatever you need to get what you want done", right?
>>
>> Heh, sure... but you're not answering my question. :) I'm wondering
>> how others have gone about deriving the complete set of classes and methods
>> a system needs to perform a particular task (so that I can compare to how I
>> do it). Indeed, I should ask John.
>
> Hmm, I think the key question here is: what do you want to be able to do
> with the image you create? I've no idea how John did this, but I think I
> would start by writing a method that did all the things I'd want to be able
> to do in the image I ultimately create. This might be things like compiling
> a string of code, loading something from a file (or socket), etc...the bare
> minimum of things you'd absolutely need to do in a minimal image. Then I'd
> guess you could create a shadow M* class with this method in it and some
> sort of proxies for literals and global references. With a little
> doesNotUnderstand: magic I'd think you could run that method, hit those
> proxies and copy over other classes and methods as you hit them. I'm sure
> the devil is in the details, but this top level method that expresses
> exactly the things that you'd want to be able to do in this minimal image
> seems like it would be an interesting artifact in and of itself. It would
> be in a sense a specification of this image.
> - Stephen
>
>

I have a similar idea: capture methods/classes while running code
to discover which objects I need to clone into a separate heap to make it
run under another interpreter instance in Hydra.

Squeak already has facilities which can be used for this:
MessageTally tallySends: aBlock .

We just need to make some modifications to it and, as you pointed out,
since we need a bare minimum, capturing things while running:

FileStream fileIn: 'somecode.st'

is a good starting point.
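
For readers who want to see the shape of this approach outside Squeak, here is a rough Python analogue that records which functions are entered while a task runs. It uses the standard `sys.setprofile` hook as a stand-in for MessageTally; the function name `functions_called_by` is hypothetical, and the sketch records function names only, not receivers or classes.

```python
import sys


def functions_called_by(task):
    """Run `task` and return the names of the Python functions it entered."""
    seen = set()

    def profiler(frame, event, arg):
        # "call" fires whenever a Python-level function is entered.
        if event == "call":
            seen.add(frame.f_code.co_name)

    sys.setprofile(profiler)
    try:
        task()
    finally:
        sys.setprofile(None)           # always remove the hook again
    return seen


def helper():
    return 1


def unused():
    return 2


def task():
    return helper()


seen = functions_called_by(task)
```

Anything that never shows up in `seen` for the chosen task, such as `unused` here, is a candidate for leaving out of the cloned heap.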

--
Best regards,
Igor Stasenko AKA sig.

Eliot Miranda-2

Re: [squeak-dev] re: MicroSqueak

In reply to this post by ccrraaiigg

On Sun, Mar 1, 2009 at 7:04 PM, Craig Latta <[hidden email]> wrote:

> You should ask John, but the aim was to get a minimal Hello World so
> there isn't a compiler for example. But surely the answer is
> "whatever you need to get what you want done", right?

Heh, sure... but you're not answering my question. :) I'm wondering how others have gone about deriving the complete set of classes and methods a system needs to perform a particular task (so that I can compare to how I do it). Indeed, I should ask John.

Ah, well the way John does it in MicroSqueak is that one can run whatever expression one wants the new image to evaluate in the simulation hierarchy MObject. So one can experiment. There are limitations as I mentioned (one is using the host system's Booleans and Numbers, and Array from brace constructs, and Message from doesNotUnderstand:) and one still has to figure out what needs to be in the specialObjectsArray. But as far as what "main" does, one can test that.

> So if the goal is a microkernel from which one can bootstrap an image
> one needs a compiler, a file system interface, and for sanity a
> minimal error-reporting framework right?

Sure, but that doesn't help much with questions like "Should I include method X or not?". The necessity of several things in the system is rather subtle. :) I like to be able to point to any byte in an object memory and give a simple explanation as to why it's there, and have performed a straightforward process to get that explanation.

Right. The simulation of "main" (plus your coverage tool?) answers that.

-C

--
Craig Latta
www.netjam.org
next show: 2009-03-13 (www.thishere.org)

Andreas.Raab

[squeak-dev] Re: MicroSqueak

In reply to this post by Stephen Pair

Stephen Pair wrote:

> Hmm, I think the key question here is: what do you want to be able to do
> with the image you create? I've no idea how John did this, but I think
> I would start by writing a method that did all the things I'd want to be
> able to do in the image I ultimately create. This might be things like
> compiling a string of code, loading something from a file (or socket),
> etc...the bare minimum of things you'd absolutely need to do in a
> minimal image. Then I'd guess you could create a shadow M* class with
> this method in it and some sort of proxies for literals and global
> references. With a little doesNotUnderstand: magic I'd think you could
> run that method, hit those proxies and copy over other classes and
> methods as you hit them. I'm sure the devil is in the details, but this
> top level method that expresses exactly the things that you'd want to be
> able to do in this minimal image seems like it would be an interesting
> artifact in and of itself. It would be in a sense a specification of
> this image.

That is one way to do it. The alternative (which I used a couple of
years back) is to say: Everything in Kernel-* should make for a
self-contained kernel image. So I started by writing a script which
would copy all the classes and while doing so rename all references to
classes (regardless of whether defined in kernel or not).

At the end of the copying process you end up with a huge number of
Undeclared variables. This is your starting point. Go in and add, remove
or rewrite classes and methods so that they do not refer to entities
outside of your environment. This requires applying some judgement
calls, for example I had a category Kernel-Graphics which included
Color, Point, and Rectangle. Then I did another pass removing lots of
methods which I had determined to be unused.

At the end of the process I wrote a script that (via some arcane means)
did a self-transformation of the image I was running and magically
dropped from 30MB to 400k in size. Then I had a hard disk crash and most
of the means that I've been using in this work were lost :-(((

I still have the resulting image but there is really no realistic way of
recovering the process. Which is why I would argue that the better way
to go is to write an image compiler that takes packages and compiles
them into a new object memory. That way you are concentrating on the
process rather than on the artifact (in my experience all the shrinking
processes end up with nonrepeatable one-offs).
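
The "huge number of Undeclared variables" step can be illustrated with a short Python sketch that lists, for each class in a candidate kernel, the names it refers to that the kernel does not define. The function name, the data layout, and the example references are all hypothetical; a real tool would extract the references from compiled methods.

```python
def undeclared_references(kernel, references):
    """Names referenced from kernel classes that the kernel does not define."""
    missing = {}
    for cls in sorted(kernel):
        outside = sorted(set(references.get(cls, ())) - set(kernel))
        if outside:
            missing[cls] = outside
    return missing


# Hypothetical kernel and reference data for illustration.
kernel = {"Object", "Color", "Point", "Rectangle"}
references = {
    "Object": ["Point", "Transcript"],
    "Color": ["Point", "Form"],
    "Point": ["Rectangle"],
    "Rectangle": ["Point"],
}
todo = undeclared_references(kernel, references)
```

Each entry in `todo` is a judgement call of the kind described above: add the missing class to the kernel, or rewrite or remove the code that refers to it. Keeping this as a script that can be rerun is one small way to make the process repeatable.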

Cheers,
- Andreas

ccrraaiigg

[squeak-dev] re: MicroSqueak

In reply to this post by Igor Stasenko

Hi--

Igor writes:

> ...any object/class which interfacing with core VM functionality
> should be in that hierarchy.

Heh, but that just shifts the question over to what exactly "core
VM functionality" is. There's a lot of cruft in the VM, too. And I happen
to agree about starting from a desired high-level task (that's exactly
what I did for Spoon). I'm suspicious of any process which doesn't start
from there.

> The rest is optional, since by having a compiler we could file in any
> code we want to.

Well, I prefer to install compiled methods directly without
recompiling anything, but sure.

Stephen Pair writes:

> Hmm, I think the key question here is: what do you want to be able to
> do with the image you create?

Sure, I personally think that should be where the process starts
(otherwise I suspect unnecessary things get included), but I'm
interested in approaches from that point that differ from mine.

> I've no idea how John did this, but I think I would start by writing a
> method that did all the things I'd want to be able to do in the image
> I ultimately create. This might be things like compiling a string of
> code, loading something from a file (or socket), etc...the bare
> minimum of things you'd absolutely need to do in a minimal image. Then
> I'd guess you could create a shadow M* class with this method in it
> and some sort of proxies for literals and global references. With a
> little doesNotUnderstand: magic I'd think you could run that method,
> hit those proxies and copy over other classes and methods as you hit
> them. I'm sure the devil is in the details, but this top level method
> that expresses exactly the things that you'd want to be able to do in
> this minimal image seems like it would be an interesting artifact in
> and of itself. It would be in a sense a specification of this image.
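
The proxy idea quoted above can be sketched in a few lines. This is a Python model rather than Squeak code: `__getattr__` stands in for doesNotUnderstand:, and the names `Proxy`, `full_image`, `minimal_image` and `specification` are all hypothetical, chosen only for illustration.

```python
class Proxy:
    """Stands in for a global; copies the real object across on first use."""

    def __init__(self, name, source, target):
        self._name = name
        self._source = source
        self._target = target

    def __getattr__(self, attr):
        # Only reached when normal lookup fails, much like doesNotUnderstand:.
        if self._name not in self._target:
            self._target[self._name] = self._source[self._name]
        return getattr(self._target[self._name], attr)


class Compiler:
    @staticmethod
    def evaluate(text):
        return f"compiled {text!r}"


class FileLoader:
    @staticmethod
    def load(path):
        return f"loaded {path}"


class Morph:
    @staticmethod
    def draw():
        return "drawn"


full_image = {"Compiler": Compiler, "FileLoader": FileLoader, "Morph": Morph}
minimal_image = {}


def specification(g):
    """The top-level method: everything the minimal image must be able to do."""
    g["Compiler"].evaluate("3 + 4")
    g["FileLoader"].load("somecode.st")


proxies = {name: Proxy(name, full_image, minimal_image) for name in full_image}
specification(proxies)
copied = sorted(minimal_image)
```

After the run, `copied` lists only the classes the specification actually touched; `Morph` is never copied because nothing sent it a message.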

This is roughly what I did with Spoon, although my tactic was to
mark everything in a normal object memory involved in a particular task,
then use the garbage collector to throw away everything else
atomically[1]. I like to have a known-working object memory at every
point in the process, by dealing with a running memory as much as
possible (rather than creating one in situ and hoping that it works when
resumed).
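
As a rough illustration of the mark-then-discard tactic described above, here is a small Python model. It is a simplification under stated assumptions, not how Spoon or the Squeak VM is implemented: `ObjectMemory`, `send` and `collect_unmarked` are hypothetical names, and the "objects" are plain callables.

```python
class ObjectMemory:
    """Toy object memory: named objects plus a set of marks."""

    def __init__(self, objects):
        self.objects = dict(objects)
        self.marked = set()

    def send(self, name, *args):
        # Using an object marks it, as a marking VM might do on each send.
        self.marked.add(name)
        return self.objects[name](*args)

    def collect_unmarked(self):
        # Replace the whole table in one assignment, keeping only marked objects.
        self.objects = {n: o for n, o in self.objects.items() if n in self.marked}


memory = ObjectMemory({
    "parse": lambda s: s.split(),
    "count": len,
    "render": lambda s: f"<{s}>",
})


def task(m):
    """The high-level task the shrunken memory must still be able to perform."""
    return m.send("count", m.send("parse", "load a module"))


result = task(memory)            # the memory is running and usable throughout
memory.collect_unmarked()
survivors = sorted(memory.objects)
still_works = task(memory)       # the same task still runs after shrinking
```

The point of the model is that the memory is never rebuilt from scratch: the same running instance performs the task before and after everything unmarked is dropped.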

Igor responds:

> I have a similar idea to capture methods/classes while running code
> to discover what objects i need to clone into separate heap to make it
> running under another interpreter instance in Hydra.
>
> Squeak already has facilities which can be used for this
> (MessageTally>>tallySends:). We just need to make some modifications
> to it, and as you pointed out, since we need a bare minimum, then
> capture things while running:
>
> FileStream fileIn: 'somecode.st'
>
> is a good starting point.

Right, although I think using the VM to do the marking is more
convenient, and faster.
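
Capturing what runs during a task, from the language side rather than in the VM, can be sketched in Python with `sys.setprofile`. This is only a loose analogue of the send tallying Igor mentions; the functions `record_calls`, `file_in`, `compile_source`, `parse` and `draw_window` are made up for the example.

```python
import sys


def record_calls(task):
    """Run task() and return the names of the Python functions it entered."""
    called = set()

    def profiler(frame, event, arg):
        if event == "call":              # entry into a Python-level function
            called.add(frame.f_code.co_name)

    sys.setprofile(profiler)
    try:
        task()
    finally:
        sys.setprofile(None)             # always remove the hook again
    return called


def parse(text):
    return text.split()


def compile_source(text):
    return len(parse(text))


def draw_window():
    return "unused"


def file_in():
    # Stand-in for the "file in some code" starting point.
    return compile_source("3 + 4")


needed = record_calls(file_in)
```

Here `needed` contains `file_in`, `compile_source` and `parse`, while `draw_window` never appears because the task never calls it.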

Andreas writes:

> The alternative (which I used a couple of years back) is to say:
> Everything in Kernel-* should make for a self-contained kernel image.

Aha, yeah; I'm not that trusting. :)

> So I started by writing a script which would copy all the classes and
> while doing so rename all references to classes (regardless of whether
> defined in kernel or not).
>
> At the end of the copying process you end up with a huge number of
> Undeclared variables. This is your starting point. Go in and add,
> remove or rewrite classes and methods so that they do not refer to
> entities outside of your environment. This requires applying some
> judgment calls, for example I had a category Kernel-Graphics which
> included Color, Point, and Rectangle. Then I did another pass removing
> lots of unused methods which I had determined to be unused.
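
The "huge number of Undeclared variables" step quoted above amounts to listing the references that leave the chosen kernel. A minimal Python sketch, with an invented reference table (the dependencies shown are hypothetical, not taken from a real Squeak image):

```python
def undeclared_references(references, kernel):
    """Map each kernel class to the names it references outside the kernel."""
    missing = {}
    for cls in sorted(kernel):
        outside = sorted(set(references.get(cls, ())) - set(kernel))
        if outside:
            missing[cls] = outside
    return missing


# Hypothetical class -> referenced-class table.
references = {
    "Object": ["Symbol"],
    "Symbol": ["Object", "String"],
    "String": ["Object", "Color"],
    "Color": ["Object"],
    "Morph": ["Color", "Object"],
}

kernel = {"Object", "Symbol", "String"}
report = undeclared_references(references, kernel)

# One possible judgment call: pull Color into the kernel instead of rewriting String.
report_after = undeclared_references(references, kernel | {"Color"})
```

Each entry in `report` is a decision point: add the missing class, remove the referring code, or rewrite it. The process is done when the report comes back empty.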

Yeah, that's a lot of work; perhaps on the order of work I was
doing earlier in the project, when I was removing things manually with
remote tools[2].

> At the end of the process I wrote a script that (via some arcane
> means) did a self-transformation of the image I was running and
> magically dropped from 30MB to 400k in size. Then I had a hard disk
> crash and most of the means that I've been using in this work were
> lost :-(((

Ouch! I'm sorry to hear that. That actually happened to me too (in
2005), but through a total coincidence I had a sufficiently-recent
backup to keep going. Several nice minutes of panic...

> I still have the resulting image but there is really no realistic way
> of recovering the process. Which is why I would argue that the better
> way to go is to write an image compiler that takes packages and
> compiles them into a new object memory. That way you are concentrating
> on the process rather than on the artifact (in my experience all the
> shrinking processes end up with nonrepeatable one-offs)

Oh, I agree that shrinking is not something one should do to
produce deployment artifacts. I think it should be done to get a truly
minimal memory that can load modules, and then never done again
(although the way I do it is repeatable, for the sake of review).

As for whether to produce an object memory statically and then set
it running, or transform an object memory which is always running... I
think the resulting memory will need to load modules live anyway, so one
might as well do all the transformations that way. Perhaps this is
simply an aesthetic choice.

thanks,

-C

[1] http://tinyurl.com/2gbext (lists.squeakfoundation.org)
[2] http://tinyurl.com/bdtdlb (lists.squeakfoundation.org)

--
Craig Latta
www.netjam.org
next show: 2009-03-13 (www.thishere.org)

Eliot Miranda-2

Re: [squeak-dev] re: MicroSqueak

On Sun, Mar 1, 2009 at 9:20 PM, Craig Latta <[hidden email]> wrote:

[snip]
As for whether to produce an object memory statically and then set it running, or transform an object memory which is always running... I think the resulting memory will need to load modules live anyway, so one might as well do all the transformations that way. Perhaps this is simply an aesthetic choice.

Surely repeatability mandates that one produce an object memory statically and then set it running? Because of things like delays, the always-running memory is almost never in a predictable state, so one always ends up with different bits even if they represent the same functionality.
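
The repeatability argument can be made concrete with a toy Python comparison. Everything here is an assumption for illustration: the "image" is just concatenated bytes, and `pending_delay_ms` is an invented stand-in for whatever transient state a live memory holds at snapshot time.

```python
import hashlib


def build_static(packages):
    """Compile packages into image bytes in a fixed order: same input, same bits."""
    image = bytearray()
    for name in sorted(packages):
        image += name.encode() + b"\0" + packages[name].encode() + b"\0"
    return bytes(image)


def snapshot_running(packages, pending_delay_ms):
    """Snapshot of a live memory: the same code plus transient runtime state."""
    return build_static(packages) + pending_delay_ms.to_bytes(4, "big")


def digest(image):
    return hashlib.sha256(image).hexdigest()


packages = {"Kernel": "Object subclass: #Foo", "Collections": "Object subclass: #Bar"}
reordered = dict(reversed(list(packages.items())))

static_a = digest(build_static(packages))
static_b = digest(build_static(reordered))

live_a = digest(snapshot_running(packages, pending_delay_ms=120))
live_b = digest(snapshot_running(packages, pending_delay_ms=87))

static_is_repeatable = static_a == static_b
live_is_repeatable = live_a == live_b
```

The static builds hash identically regardless of input order, while the two live snapshots differ even though they contain the same code.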

Göran Krampe

Re: [squeak-dev] re: MicroSqueak

In reply to this post by Igor Stasenko

Igor Stasenko wrote:
> I have a similar idea to capture methods/classes while running code
> to discover what objects i need to clone into separate heap to make it
> running under another interpreter instance in Hydra.

Mmm, you are aware of this stuff that Craig has in Spoon right? The
"imprinting" stuff IIRC.

regards, Göran

Edgar J. De Cleene

Re: [squeak-dev] Re: Burn the Squeak Image! (Why I am running for board)

In reply to this post by keith1y

On 2/28/09 9:28 PM, "Keith Hodges" <[hidden email]> wrote:

> 2) From what we have,
> and are using in production, downwards, in a carefully planned,
> engineered effort.

This was 3.10, and now my SqueakLightII, but it seems that instead of having
less code to deal with, you prefer to write more new code...

Edgar

Janko Mivšek

Re: [squeak-dev] re: MicroSqueak

In reply to this post by ccrraaiigg

Hi Craig,

A question from someone without deep knowledge of Smalltalk internals:
is Spoon compatible with the proposed MicroSqueak?

That is, can Spoon be based on top of MicroSqueak? And further, what
preconditions would MicroSqueak need to meet to run Spoon on top?

Best regards
Janko


--
Janko Mivšek
AIDA/Web
Smalltalk Web Application Server
http://www.aidaweb.si
