Traits or not Traits that is the question

Michael van der Gulik-2

Re: Complexity and starting over on the JVM (ideas)

On Feb 11, 2008 11:42 AM, tim Rowledge <[hidden email]> wrote:

On 10-Feb-08, at 2:36 PM, Michael van der Gulik wrote:
>
>
> How does having blocks make the stack non-linear? If I pop open a
> debugger, the stack looks pretty linear to me.

A block's lifetime is not constrained by the stack. You can pass a
block as an argument.

But the execution of that block (a BlockContext?) will be added to the top of the current execution stack thus preserving the linearity of the stack, right?

At this stage, I'm not sure what "stack linearity" is either... I'm assuming that Paul was referring to a stack being a linked list rather than a tree?

Gulik.

--
http://people.squeakfoundation.org/person/mikevdg
http://gulik.pbwiki.com/

timrowledge

Re: Complexity and starting over on the JVM (ideas)

On 10-Feb-08, at 2:53 PM, Michael van der Gulik wrote:

But the execution of that block (a BlockContext?) will be added to the top of the current execution stack thus preserving the linearity of the stack, right?

Nope.

A stack is linear last-in first-out. In C you branch to a subroutine, a stackframe is built on the stack and you execute code based on that. When you return from the subroutine all the memory in that stackframe is known to be free to reuse immediately if wanted by the next subroutine call.

If you can get a handle to the actual stackframe and pass it to some other code then you *cannot* reuse that memory until you have some way of knowing it is no longer needed. So now you have a sandbar in your stack and a tricky problem to solve. In Squeak we do it the simple-minded way and don't have a contiguous stack but instead have actual explicitly allocated objects in the heap. That completely avoids the problem, at a cost in performance. VW uses a lot of very clever code to allow the system to live with a linearised machine-compatible stack and yet present a 'proper' object to the programmer. It works, it's fast, but the complexity is at times bewildering.
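A minimal sketch of the lifetime problem, written in Python rather than Smalltalk (the names make_counter and increment are made up for illustration). The inner function plays the role of a block: it is handed back to the caller after the call that created it has returned, so the variable it captured cannot live in memory that is reused the moment that call returns.

```python
def make_counter():
    # 'count' belongs to this activation of make_counter.
    count = 0

    def increment():
        # The inner function (the "block") still needs 'count'
        # after make_counter has returned.
        nonlocal count
        count += 1
        return count

    return increment


counter = make_counter()   # make_counter's activation has returned by now
first = counter()
second = counter()

# The captured variable survived the return of its creating call, so it
# cannot have lived in strictly last-in first-out stack memory.
print(first, second)
```

Python keeps the captured variable in a heap-allocated cell, which is one way of side-stepping the "sandbar" described above.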

I'm reasonably sure this is explained in the Blue Book and certainly in various seminal papers such as the Deutsch-Schiffman '84 paper 'Efficient Implementation of the Smalltalk-80 System' (http://portal.acm.org/citation.cfm?id=800542) and the Miranda '87 paper 'BrouHaHa - A Portable Smalltalk Interpreter'.

tim

tim Rowledge; [hidden email]; http://www.rowledge.org/tim

A hacker does for love what others would not do for money.

Michael van der Gulik-2

Re: Complexity and starting over on the JVM (ideas)

On Feb 11, 2008 12:29 PM, tim Rowledge <[hidden email]> wrote:

On 10-Feb-08, at 2:53 PM, Michael van der Gulik wrote:

But the execution of that block (a BlockContext?) will be added to the top of the current execution stack thus preserving the linearity of the stack, right?

Nope.
A stack is linear last-in first-out. In C you branch to a subroutine, a stackframe is built on the stack and you execute code based on that. When you return from the subroutine all the memory in that stackframe is known to be free to reuse immediately if wanted by the next subroutine call.

If you can get a handle to the actual stackframe and pass it to some other code then you *cannot* reuse that memory until you have some way of knowing it is no longer needed. So now you have a sandbar in your stack and a tricky problem to solve. In squeak we do it the simple minded way and don't have contiguous stack but instead have actual explicitly allocated objects in the heap. That completely avoids the problem, at a cost in performance. VW uses a lot of very clever code to allow the system to live with a linearised machine compatible stack and yet present a 'proper' object to the programmer. It works, it's fast, but the complexity is at times bewildering.

I'm reasonably sure this is explained in the Blue Book and certainly in various seminal papers such as the Deutsch-Schiffman 84 paper 'Efficient implementation of the smalltalk-80 system' (http://portal.acm.org/citation.cfm?id=800542) and the Miranda '87 paper 'BrouHaHa- A portable Smalltalk Interpreter'.

Oh - that's what he meant by a linear stack - stack frames are contiguous in memory.

Gulik.

--
http://people.squeakfoundation.org/person/mikevdg
http://gulik.pbwiki.com/

Paul D. Fernhout

multiple versions of same package vs. mini-images (Was: Re: Guaging & Squeak/JVM)

In reply to this post by Igor Stasenko

Igor-

You suggested "enable multiple versions of same package in same image and
keep track of package dependency". That's been an inspirational suggestion
for me, and I've been thinking about how to implement it for a Squeak/JVM.

I don't have a definite solution yet, but here are some thoughts on it.

I feel it may come down to picking one of two paths.

We could make a complex system for supporting multiple global system
dictionaries (or the equivalent) to allow multiple applications with
different dependencies to live together in one memory image. That's really
just an extension of the status-quo in some ways, packing ever more stuff
into one bigger and bigger image.

Or, we can break the monolithic image into small images which each just
support one application well (call them "mini-images"). Each mini-image
might in turn depend upon some other common mini-images for defining common
classes. This alternative would probably require Spoon-like
http://netjam.org/spoon/
remote development and remote-debugging support to work best (but it doesn't
absolutely have to, as there easily could be a development tools mini-image
included by reference even in the tiniest mini-image).

Personally, I think the second approach is ultimately simpler and more
elegant, and does a better job of bringing Smalltalk forward in a now
network-oriented world. See:
"Principles of Design -- Tim Berners-Lee "
http://www.w3.org/DesignIssues/Principles.html
"Principles such as simplicity and modularity are the stuff of software
engineering; decentralization and tolerance are the life and breath of
Internet."

You may well know all these issues, but I just thought I'd put it down for
others' comments as I understand it (in case I was wrong or missed
something). Probably I'll have outlined some approaches people here know
about already created for Squeak or other systems, and anyone should feel
free to point me to them.

Anyway, feel free to stop reading here, but what follows is more details on
how I came to think about this and arrive at those two possible paths.

=============== how it is now, and a simple approach

The biggest aspect of this is resolving globals. For review, if I recall
correctly this is traditionally done in Squeak by the VM knowing about a
SystemDictionary called "Smalltalk" (the VM needs to know about it
absolutely to resolve a circular dependency of not being able to look up the
global "Smalltalk". :-). When a CompiledMethod being executed does
something like make a new instance of a class, it fetches the current
instance (typically of a class) associated with the name of the global and
sends it a message or stores it in a variable. Using named globals allows
late binding of classes by the compiled method.

If you didn't care about late binding, like in Forth referring to a
previously defined word, you could just make a hard link as a pointer to the
class at compile time in the compiled method. But then you could not replace
or remove the class in its entirety later.

There is room for only one version of a class at a time this normal way --
just one key in the Smalltalk system dictionary with one value.

The simplest way around this might be to have system dictionary values for
keys be dictionaries. Then you could tag each item with a version. But the
executing code would still need to resolve which one it wanted. And I don't
see how that would be easy. But maybe it might be?
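A rough sketch of that "dictionary of dictionaries" idea, in Python for brevity. Everything here (VersionedGlobals, define, lookup, the Clock classes and version strings) is invented for illustration and is not how Squeak's SystemDictionary works. It mainly shows where the difficulty lands: the caller has to supply a version from somewhere.

```python
class VersionedGlobals:
    """Maps a global name to a dictionary of {version: value}."""

    def __init__(self):
        self._table = {}

    def define(self, name, version, value):
        self._table.setdefault(name, {})[version] = value

    def lookup(self, name, version):
        # The caller must say which version it wants; plain late binding
        # by name alone does not carry that information.
        return self._table[name][version]

    def versions(self, name):
        return sorted(self._table.get(name, {}))


class ClockV1:
    def tick(self):
        return "tick"


class ClockV2:
    def tick(self):
        return "tock"


smalltalk = VersionedGlobals()
smalltalk.define("Clock", "1.0", ClockV1)
smalltalk.define("Clock", "2.0", ClockV2)

old_clock = smalltalk.lookup("Clock", "1.0")()
new_clock = smalltalk.lookup("Clock", "2.0")()
print(old_clock.tick(), new_clock.tick())
```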

And then there is a deeper problem related to composites of objects which
might include instances pointing to two or more different versions of the
same class. But we can ignore that for now. :-)

== A deeper analysis (or, "owww, my brain hurts". :-)

Python has a straightforward way to resolve this -- it supports a sea of
objects, and when you load code, the old classes get overridden in the
equivalent of a system dictionary with new classes, but the existing
instances still point to the old classes so those still hang around but are
not accessible by name. This makes it difficult to do development in a live
system, and you end up issuing special code to load things in differently
(not making new classes) if you want to do Smalltalk-style dynamic
development. But there is no reason you cannot simply load two versions of
the same module (source file) and hang on to them somehow. Squeak could
certainly do something similar if it had modules or classes which could
exist without names.
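The Python behaviour described here can be seen in a few lines. This is only a sketch: a plain dictionary stands in for a module's namespace, and exec stands in for loading a source file twice; the Clock class is made up.

```python
# A dictionary standing in for a module's global namespace.
namespace = {"__name__": "reloaded_module"}

exec("class Clock:\n    def face(self):\n        return 'analog'\n", namespace)
old_class = namespace["Clock"]
old_instance = old_class()

# "Reloading" the code rebinds the name Clock to a brand-new class object.
exec("class Clock:\n    def face(self):\n        return 'digital'\n", namespace)
new_class = namespace["Clock"]
new_instance = new_class()

# The old instance still points at the old class, which is no longer
# reachable by name but stays alive because the instance refers to it.
print(old_instance.face(), new_instance.face())
print(type(old_instance) is new_class)
```

Holding on to old_class explicitly, as this sketch does, is all it takes to keep two versions of the same class around at once.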

When I try to generalize this global idea, there are other approaches. In
PataPata (in Python/Jython, trying to retrofit them with Squeak-like
capacities) I gave each object (typically Morphic-like GUI components) a
"world" instance variable. That pointed to what was essentially the
equivalent of a Smalltalk system dictionary to store globals or key
functionality. In practice, each major window was in its own world, although
that wasn't strictly required. Then I could have several worlds in the same
process, where each was somewhat self-consistent.

But objects could still slip from one to another, typically when opening an
inspector/browser tool (itself in its own world) on another world and maybe
copying an object from one place to another. Beyond globals, another reason
for each object to have a pointer to its "world" was that when I serialized
a world I just wanted the objects from that world to be written out and no
others, so I could check that pointer to make sure the serialization wasn't
wandering into writing out objects from other worlds (I didn't pursue the
concept of nested worlds, which might have been possible).
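A small sketch of the "world pointer" check during serialization. This is not PataPata's actual code; World, Thing, links and collect_for_serialization are names invented for the example. The walk follows references but refuses to cross into an object whose world differs from the root's.

```python
class World:
    """Holds the globals for one self-consistent group of objects."""

    def __init__(self, name):
        self.name = name
        self.globals = {}


class Thing:
    def __init__(self, world, label, links=()):
        self.world = world          # every object knows its world
        self.label = label
        self.links = list(links)    # references to other objects


def collect_for_serialization(root):
    """Walk references from root, keeping only objects in root's world."""
    seen, kept, pending = set(), [], [root]
    while pending:
        obj = pending.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        if obj.world is not root.world:
            continue                # do not wander into another world
        kept.append(obj.label)
        pending.extend(obj.links)
    return sorted(kept)


garden = World("garden")
tools = World("tools")
inspector = Thing(tools, "inspector")
flower = Thing(garden, "flower", links=[inspector])
window = Thing(garden, "window", links=[flower])
inspector.links.append(window)      # a tool pointing into the other world

print(collect_for_serialization(window))
```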

I was planning to use unnamed references to parents from prototypes (for
inherited behavior and constants) in PataPata, based on how Self did
prototypes and links, but I decided in the end to reference prototypes
representing parents by name, for the purpose of documenting intent. But
that left a global lookup problem, resolved by having *every* prototype have
a "World" pointer. And there were predictable problems when worlds pointed
to themselves which I had to work around (especially when loading worlds).
[Self has a fancier way of getting names for unnamed prototypes, based on
determining paths from a root, which I did not want to try pursuing.]

Anyway, generalizing on this "object-focused late binding lookup" approach,
objects can point to a global system dictionary, or they can point to other
objects in some consistently structured way (typically "parent" or
"container" or "class") which might in turn allow a path to find a global
(that process might even percolate up and then back down, say to *search*
for an object with a certain value; I supported this in PataPata to find
widgets with a certain name in the same window as a widget executing some
behavior).

But there is another way to do this, which is to have the thread, process,
stack frame, or virtual machine hold onto a global system dictionary object
somehow. This is closer to how Squeak does it with a system dictionary,
except there might be one system dictionary per process or thread or stack
frame. The difference is that the entity executing the code knows where to
look for globals even if the objects being used for executing code do not
(which presumably saves on memory, and provides a more consistent notion of
what versions of classes a process wants to see, assuming that is a good idea
:-). In a most extreme case, the user running the program might know the
object ID or memory location of the global system dictionary and pass it in
as needed (this might happen in a debugger session). I might call this an
"execution-focused late binding lookup" approach.
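One way to picture the execution-focused approach is with thread-local storage, sketched here in Python. The function names (run_with_globals, resolve, make_greeting) and the "Greeter" global are assumptions made up for the example. The same code runs in two threads and resolves the same global name to different values, because each thread carries its own system dictionary.

```python
import threading

_current = threading.local()   # one independent slot per thread


def run_with_globals(system_dictionary, body, results, key):
    # Install the dictionary for this thread only, then run the code.
    _current.system_dictionary = system_dictionary
    results[key] = body()


def resolve(name):
    # Late-bound lookup: the executing thread, not the object,
    # decides where globals are found.
    return _current.system_dictionary[name]


def make_greeting():
    # Identical code in both threads; only the lookup context differs.
    return resolve("Greeter")()


globals_v1 = {"Greeter": lambda: "hello from v1"}
globals_v2 = {"Greeter": lambda: "hello from v2"}

results = {}
threads = [
    threading.Thread(target=run_with_globals,
                     args=(globals_v1, make_greeting, results, "a")),
    threading.Thread(target=run_with_globals,
                     args=(globals_v2, make_greeting, results, "b")),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(results["a"], "|", results["b"])
```

The objects themselves carry no extra pointer in this arrangement, which is the memory saving mentioned above.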

For completeness, there is another approach which is to have globals stored
in relation to the memory where the objects are stored (or processes
executed) if memory is partitioned somehow. So if you have an object or
process memory location, you can find the global system dictionary that goes
with it by looking somewhere special in that memory chunk (beginning, end,
standard offset). Deep in the reality of a virtual machine, it might even be
using this approach in various ways (like making sure the pointer to the
system dictionary is, say, the first handle in an object memory table).

Probably someone who has a PhD in computer science could tell me the proper
terms for these approaches towards late binding? :-)

And of course, you can use more than one at a time. NewtonScript, for
example, found variables by having two different types of lookup, based on a
parent slot and visual containment. Maybe you could use all of the
approaches at once in some system just for fun. I don't think I'd want to
debug anything in it though. :-)

Anyway, this doesn't answer how specifically to do what you propose, but it
does suggest some possible points of intervention -- mainly instances or
processes.

But this leads to a deeper point. A Smalltalk VM (or any OO VM system like
it, like the JVM objects or Python objects) has a problem with multiple global
objects if objects sharing the same VM in different global spaces can point
at each other directly.

Essentially, if you can have multiple global system dictionaries, you end up
in a situation where an object from a "module" in one set of interconnected
versions of modules can be referenced by an object in a "module" in another
interconnected set of different module versions. At that point, what governs
the object's behavior, especially late binding lookup of globals? Should it be
governed by the module the object came from? Or should it be governed by the
module which it is now connected to? Or should it be governed by the process
executing and calling a method of the object (and that process might look up
its globals in yet another way)? And similarly, when you absorb an instance
from another module, should its class still point to the old class or should
it point to the class in the new module?

In general, this issue is a variant of a deeper problem related to OO:
http://mail.python.org/pipermail/edu-sig/2007-April/007852.html
as I feel the idea, that objects can stand alone and be somehow meaningful,
is at the root of a lot of evil in the Smalltalk universe (e.g. "bitrot". :-)

Anyway, just from random comments here over the years, I get the feeling
that in their hearts the original Squeak Central people (Dan Ingalls
especially) understand this and use heavily customized images in practice as
coherent wholes, but perhaps they have never had the time to generalize this
idea to a philosophical principle. Certainly just fighting for objects at
all, as well as messages and VMs and good tools must have taken up lots of
energy.

Part of this issue may depend on whether you think of an object like a
single-celled creature like an amoeba, or whether instead you think of an
object as part of a biological entirety, like a protein molecule in a
cell, or a highly regulated cell in a large multi-cellular entity. If objects
can't meaningfully stand alone, then it seems like we need some coherent
philosophical approach to how they fit together into modules or images.

Loading multiple versions of the same classes seems to strain this possible
coherence, as useful as it might be. It's not that it won't work, it's just
that the mental complexity starts increasing to the point where you may have
to be really clever (and really alert) to keep track of it all. :-)

=== two competing approaches

Because of all these difficulties and complexity, I'm inclined to lean
towards suggesting that images should be smaller, :-) and VMs could
either be lightweight or perhaps could support multiple open images at once.
Then you can load one version of a module into a larger set of other
modules, and maintain that set for one application. This total image defines
an ecology of objects, and the objects and their classes all make sense in
relation to each other (as well as whatever I/O they choose to do through
the VM to the rest of the world). This is sort of like a living cell. And
you could then load a different version of code modules into another
*different* image and maintain that set for a different application. And
when these applications want to communicate, it will be from one image to
another, through their different VMs, presumably via sockets or shared
memory or files or whatever, via some common serialization process. There
are already several approaches for distributed objects in Smalltalk, so I
doubt this will be much of a problem, and the JVM and Java offer other
possibilities for remote procedure calls and such. I think that a minimal
image ("mini-image") approach might come closest to bringing some sanity to
the idea of personal images (like Dan Ingalls seems to like). Every image
would be a custom mix of module versions and hacked up base class code. The
image would know with a little developer help which objects belonged to
which modules. To help with this, one would need easy tools to export module
versions and configurations. An important aspect of such an approach might
be Spoon-like remote debugging, and remote development of minimal images so
you could have, say, one image open with your favorite debugging tools and
over a socket just plug those tools into other images you wanted to modify
or debug; this isn't strictly necessary -- but conceptually it makes things
more elegant, especially since then the development tools can have
different versions of base classes than the system being debugged or
developed. I get the feeling the Squeak ecosystem has most of the parts of
all of this, they just haven't been all put together and polished toward
this end.

Still, for the JVM, which is what interests me right now, all the objects do
live in one world, and the JVM has a big memory footprint. So, given memory
footprint and startup time, even with the newer JVMs sharing some memory
across VM instances, I think we might have to end up living with multiple
system dictionaries in one JVM unless JVMs improve further? Or maybe if we
discover they are good enough now? In that case, I end up wondering if a
"world" instance variable added to every underlying Java object is such a
bad idea after all. :-) Or the alternative of a "world" instance variable
stored in each thread (or process) is also possible. Of course, globals are
rarely looked up, so more indirect ways of storing them might be more
efficient trading off time for memory. So this is a second alternative
approach which is closer to the direction you outline.

== best solution long term?

After considering two paths in the previous two paragraphs, I think using
lightweight images with only one system dictionary is a better way to go
long term. They are just simpler and already well understood.

If you, say, want a little clock up on your screen implemented in Squeak
(instead of Lively Kernel :-), you just have a clock image. Ideally, that's
all it does -- it's a clock. If you want to inspect the clock, you fire up
your development image in another JVM and connect to that clock JVM (maybe
using a universal debugging registry service). Maybe your development image
even gives you a copy of the image of the clock window with drag-and-drop
overlays on another screen. Or it might put annotations over the original
window by temporarily inserting a "glasspane" if the clock application was
using Swing widgets, or by the usual Squeak ways if the Clock application
used Morphic widgets.

To save space and maybe help with upgrades, perhaps the Clock application
image depends on another larger base image. I did that in PataPata where
worlds could require other worlds to be loaded first. Since I stored images
as textual Python code which could rebuild a world of objects procedurally,
that worked out OK. Here is an example of a simple PataPata world; I would
expect a Squeak clock image built in a similar fashion would be about the
same tiny size and also written out as textual source:
http://patapata.svn.sourceforge.net/viewvc/patapata/tags/PataPata_v204/WorldDandelionGarden.py?revision=315&view=markup
(One fudge, the bitmap was stored outside the image in a file.)
Note the line:
world.worldLibraries = [world.newWorldFromFile("WorldCommon.py")]
which is what defines the other worlds this world depends on. So, for
Squeak, this would be like saying your small image depends on other images
which load first.
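A sketch of what "supporting images load first" could look like as a load-order calculation. The table of image names is hypothetical (only WorldCommon appears in the PataPata line above), and load_order is an invented helper, not anything from PataPata or Squeak.

```python
# Hypothetical dependency table: image name -> images it requires.
REQUIRES = {
    "Clock": ["WorldCommon"],
    "WorldCommon": ["BaseClasses"],
    "BaseClasses": [],
}


def load_order(name, requires, loaded=None, in_progress=None):
    """Return the images to load, supporting images first."""
    loaded = [] if loaded is None else loaded
    in_progress = set() if in_progress is None else in_progress
    if name in loaded:
        return loaded
    if name in in_progress:
        raise ValueError(f"circular dependency involving {name}")
    if name not in requires:
        raise KeyError(f"supporting image {name} is not available")
    in_progress.add(name)
    for dependency in requires[name]:
        load_order(dependency, requires, loaded, in_progress)
    in_progress.discard(name)
    loaded.append(name)
    return loaded


print(load_order("Clock", REQUIRES))
```

The explicit cycle check is there because worlds that end up pointing at themselves are exactly the kind of thing that bites at load time.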

Obviously you have to have any supporting images around or you can't load
your dependent one, but for the most part you just typically depend on
common downloaded images. If images are stored as text (essentially, a
Smalltalk program needed to rebuild the image) dependencies are a lot less
scary since you could always just go in and start cutting and pasting in a
text editor (but hopefully there would be better tools for this).

How to track and merge changes to base classes in supporting images is
obviously an issue, and it is not one PataPata tried to solve (beyond the
fact that prototypes made it easy to override base class behavior for most
things). But, since at runtime the supporting packages will be loaded, you
can easily modify it in the live image and then write out a modified version
of the base image again with a different version number, and hope somebody
down the road can reconcile your changes if you want them to move forward
with the supporting image.

In this lightweight approach, images might also become modules stored in
some source code repository if desired, or really, they might become more
like (ENVY-ish?) configuration maps on top of available stored modules. So,
to try to provide an example, you might save your running Clock image as
module Clock-1.1.4 which also depends on BaseClasses-3.4.2. (This would
require a worldwide way to identify Squeak modules uniquely.) Of course you
might not store Clock-1.1.4 on a server; it might be stored on a local drive
(perhaps in a Jar file, leading to Java classpath problems, but nothing is
perfect :-). You might open up Clock-1.1.4, modify it using Spoon-like
remote tools, and maybe even save it back under the same version number if
no one else depended on it (perhaps with an automatic minor sequential
internal revision number bump just in case). These names and version numbers
might also be more like human readable suggestions than absolutes -- for
example each "image" "module" could have a unique UUID (plus perhaps save
sequence) and dependencies could be expressed as lists of acceptable UUIDs
as well as names, with some sort of sophisticated matching algorithm to
try to resolve dependency issues and search for modules in various places.
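A deliberately unsophisticated sketch of such matching, to make the idea concrete. Module, Dependency and resolve are invented names, the UUIDs are generated on the spot, and the "newest dotted version wins" fallback is an assumption for the example rather than a proposal.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    name: str
    version: str
    uid: uuid.UUID


@dataclass(frozen=True)
class Dependency:
    name: str
    acceptable_uids: frozenset = frozenset()


def version_key(module):
    # Assumption: versions are dotted integers such as "3.4.2".
    return tuple(int(part) for part in module.version.split("."))


def resolve(dependency, available):
    """Prefer an exact UUID match; otherwise fall back to the name."""
    for module in available:
        if module.uid in dependency.acceptable_uids:
            return module
    candidates = [m for m in available if m.name == dependency.name]
    if not candidates:
        return None
    return max(candidates, key=version_key)


base_342 = Module("BaseClasses", "3.4.2", uuid.uuid4())
base_350 = Module("BaseClasses", "3.5.0", uuid.uuid4())
available = [base_342, base_350]

pinned = Dependency("BaseClasses", frozenset({base_342.uid}))
by_name = Dependency("BaseClasses")

print(resolve(pinned, available).version)
print(resolve(by_name, available).version)
```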

For this Clock example, when you work on the clock you might pull up another
image of development tools (browser, debugger, inspector, and so on). But
the versions of these (or the base classes they depend on) don't really
matter to the clock application. All that matters is that somehow the two
JVMs (or JVM processes) agree on how to talk to each other to add new
methods, return results, single step code, follow object references, and so
on. Presumably one could have a fairly standard protocol for that -- maybe
even an extensible one (perhaps Spoon has this?). Let's say something odd is
happening with the Clock. You want to see how an older version works. Well,
you just open up that older clock image. Then you might even open up an
"image comparing" utility image :-) which lets you connect to both the
running Clock images simultaneously and compare versions of all the classes
looking for differences. Still unsatisfied, maybe you clone the older image
(to start a third clock running) and bit by bit copy classes or modules from
the new image to the copy of the old until you find where the clock starts
to behave oddly. Then you make a change (remotely) to the first clock image
and see if it fixes the problem. Perhaps it turns out your code is perfect
but the anomaly is due to a really deep problem in code supporting
Squeak/JVM -- so you drop down a level conceptually and pull up a JVM
debugger image, or maybe even just Eclipse, :-)
http://www.eclipsezone.com/eclipse/forums/t53459.html
connect to the JVM supporting that Clock image directly, and start swearing
as you try to figure out what the Squeak/JVM maintainers did wrong this
time. :-) If you wish, all of your actions with the multiple Squeak-ish VMs
could have been logged to some common history repository somewhere to replay
the entire multi-VM development session back to everyone who doesn't believe
you that it's a JVM level issue. :-) Presumably one could build testing
tools for this architecture as well.

And Squeak in C could go down this mini-image route too.

As I think a little more about this, I am still perhaps stuck with the
problem that even in these mini-images, there would need to be some way to
link specific objects back to specific modules so a modified module could be
written back out with all its related objects. This is because a mini-image
is not just code, it is code plus live objects. And so when objects are
created, they would have to be assigned somehow to a specific module or
source mini-image. So, perhaps this mini-image solution needs to have a
"world" field (or "module" or "segment") in every object anyway, just so the
modified objects can be written back out into the right mini-image or
module? Or, if this was implemented in C, the image would be carved up into
memory segments, with new objects allocated to the chunk of memory going
with the specific mini-image that was loaded.

Squeak already has an image segment effort:
http://wiki.squeak.org/squeak/1213
"ImageSegments and project swapping are still in the experimental stage"
But it is binary, not textual source. And it is based on specific roots, not
some sort of tag for each object. I guess both might take about the same
amount of space -- instead of tagging each item with its segment (world),
you have a big array which points to each object in the segment. Maybe you
might want both? So objects know their segment and segments know their
objects? And I find it a little amusing I am putting up windows in PataPata
defined by textual mini-image files of 3688 bytes (assuming a bitmap loads
off the network or from a local file :-) while they are talking about binary
image segments of 10s of megabytes.

And as I read more on modular Squeak, I'm realizing that with mini-images
the idea of a "project" would probably go away entirely.

And any tool which compared mini-images would have to have some way of
representing objects in two different mini-images so it could look for
similarities and differences. At the very least, maybe like Les Tyrrell's
OASIS project:
http://wiki.squeak.org/squeak/1056
But there is a big difference between loading representations of objects
(instances or their classes) to look at them and loading objects to use them.

Anyway, no easy solution. But I still think this second mini-image approach
is simpler conceptually than attempting to keep different versions of the
same things in the same VM. Both are possible, of course.

===

Anyway, maybe someone reading this might have a better suggestion or a
better (simpler, clearer) way of looking at this issue.

--Paul Fernhout

Igor Stasenko wrote:

> Ken Causey wrote:
>> [snip]
>> Within this community I've come to feel that the only day to day
>> practical solution is to do it and then ask for forgiveness when it goes
>> all pear shaped (badly). Of course when that happens it really helps
>> when it is something that can be readily reversed with no harm done.
>> And that's where it seems we have a problem because the current release
>> management schemes don't well-support removing something readily and in
>> such a way that few if any are inconvenienced. I don't have a ready
>> solution to that, it is something I find myself thinking about more and
>> more.
>> [snip]
>
> There is a solution: enable multiple versions of same package in same
> image and keep track of package dependency.
> So, when you load an updated package, all code which worked before
> continues to work in the same way as it did before.
> We need a way for a developer to choose which parts of the system can
> use the new version and which should use the older version for
> incompatibility reasons, simply by checking dependencies and updating
> dependency links.
>
> Also, this would help a lot in maintaining packages: a package author
> can easily keep track of his package dependencies, and may or may not
> wish to release his package with updated dependencies, which use the
> latest versions of the packages his package depends on.
>
> Of course, this is somewhat idealistic, and there are many caveats, but
> if done well, it will allow us to mix things without fear that something
> will not work due to incompatibilities.

Hans-Martin Mosner

Re: Complexity and starting over on the JVM (ideas)

In reply to this post by Michael van der Gulik-2

Michael van der Gulik schrieb:
>
>
> On Feb 11, 2008 12:29 PM, tim Rowledge <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>
>
...
>
> A stack is linear last-in first-out. In C you branch to a
> subroutine, a stackframe is built on the stack and you execute code
> based on that. When you return from the subroutine all the memory
> in that stackframe is known to be free to reuse immediately if
> wanted by the next subroutine call.
>
...
>
>
> Oh - that's what he meant by a linear stack - stack frames are
> contiguous in memory.
Actually, it's pretty simple to use a linear stack in memory without
giving up the semantics of Smalltalk blocks.
There are two uses of the creating context in a block:
1. Access to variables shared between the block and the context or other
blocks.
2. Non-local return (i.e. a method return from a block)
For the first kind of use, variables could be placed into a separate
array whose lifetime is independent of that of the creating context.
The non-local return is a bit more tricky. One possibility is to keep a
pointer to the stack segment and the position of the frame within that
segment in the array of variables, and point to that array from the
stack frame. When a non-local return is attempted, the return code must
check whether the block is executed in the same process as the creating
context, and whether the frame corresponding to the creating context
still exists (by just checking whether it points to the array of variables).

VA Smalltalk does something like that.
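A toy model of the scheme just described, in Python, only to make the bookkeeping concrete. It is not how VA Smalltalk or any real VM does it, and every name here (call_method, block_return, DeadHomeContext and so on) is invented. Frames live in one linear list, shared variables live in a separate record that remembers its home frame's position, and a non-local return first checks that the frame at that position still points back at the record. The same-process check is left out.

```python
class NonLocalReturn(Exception):
    def __init__(self, position, value):
        super().__init__(position, value)
        self.position = position
        self.value = value


class DeadHomeContext(Exception):
    pass


stack = []   # the linear stack: one entry per active frame


def call_method(body):
    """Push a frame, run body(variables, block_return), pop the frame."""
    position = len(stack)
    # Shared variables live outside the frame and remember where it is.
    shared = {"home_position": position, "vars": {}}
    stack.append({"shared": shared})

    def block_return(value):
        home = shared["home_position"]
        # The home frame is alive only if the slot still points at us.
        if home >= len(stack) or stack[home]["shared"] is not shared:
            raise DeadHomeContext("home context has already returned")
        raise NonLocalReturn(home, value)

    try:
        return body(shared["vars"], block_return)
    except NonLocalReturn as unwinding:
        if unwinding.position != position:
            raise                      # not ours; keep unwinding
        return unwinding.value
    finally:
        del stack[position:]           # pop this frame (and any above it)


def find_first_even(numbers):
    def body(variables, block_return):
        variables["checked"] = 0

        def block(n):
            variables["checked"] += 1  # variable shared with the method
            if n % 2 == 0:
                block_return(n)        # return from find_first_even itself

        for n in numbers:
            call_method(lambda v, r: block(n))   # block runs in a deeper frame
        return None

    return call_method(body)


def make_escaping_block():
    # Hands out a block whose home frame will be gone when it is called.
    return call_method(lambda variables, block_return:
                       (lambda: block_return("too late")))


print(find_first_even([3, 5, 8, 9]))
stale_block = make_escaping_block()
try:
    stale_block()
except DeadHomeContext as error:
    print("refused:", error)
```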
For non-reflective block semantics, that's enough. I think it is not too
difficult to do continuations as well using such a stack architecture.
In the case of VA, things get complicated mostly because it is not
possible to manipulate the stack frame array directly without causing
all sorts of havoc. I think Instantiations is working on this to allow
Seaside to run on VA ST.

One nice side effect of a "linear" stack without separate context
objects is that you can overlap stack frames, i.e. objects pushed onto
the sender's stack frame in preparation can become part of the called
method's stack frame without copying. Of course, you need to get the
frame linkage slots out of the way somehow, but that's not too difficult
either.

The implementations in BrouHaHa and VW which try to hide the "cheating"
are ingenious, but I think that a linear stack implementation can be
exposed to the image without causing too much trouble, making the VM
design much simpler.

Cheers,
Hans-Martin

Michael van der Gulik-2

Re: multiple versions of same package vs. mini-images (Was: Re: Guaging & Squeak/JVM)

In reply to this post by Paul D. Fernhout

On Feb 11, 2008 7:14 PM, Paul D. Fernhout <[hidden email]> wrote:

Igor-

You suggested "enable multiple versions of same package in same image and
keep track of package dependency". That's been an inspirational suggestion
for me, and I've been thinking about how to implement it for a Squeak/JVM.

<sniiiiiiiiiiiiiiiiiip>

For the record, that email was 382 lines long or about 7 pages if printed, and it's just one of many of your posts!

How /do/ you manage to write so much stuff in a day, Paul!? I wish I could be as productive.

Gulik.

stephane ducasse

Re: multiple versions of same package vs. mini-images (Was: Re: Guaging & Squeak/JVM)

In reply to this post by Paul D. Fernhout

paul

cut your emails into chunks, otherwise very few people will read them.

Stef

On Feb 11, 2008, at 7:14 AM, Paul D. Fernhout wrote:

> Igor-
>
> You suggested "enable multiple versions of same package in same
> image and
> keep track of package dependency". That's been an inspirational
> suggestion
> for me, and I've been thinking about how to implement it for a
> Squeak/JVM.
>
> I don't have a definite solution yet, but here are some thoughts on
> it.
>
> I feel it may come down to either picking one of two paths.
>
> We could make a complex system for supporting multiple global system
> dictionaries (or the equivalent) to allow multiple applications with
> different dependencies to live together in one memory image. That's
> really
> just an extension of the status-quo in some ways, packing ever more
> stuff
> into one bigger and bigger image.
>
> Or, we can break the monolithic image into small images which each
> just
> support one application well (call them "mini-images"). Each mini-
> image
> might in turn depend upon some other common mini-images for defining
> common
> classes. This alternative would probably require Spoon-like
> http://netjam.org/spoon/
> remote development and remote-debugging support to work best (but it
> doesn't
> absolutely have to, as there easily could be a development tools
> mini-image
> included by reference even in the tiniest mini-image).
>
> Personally, I think the second approach is ultimately simpler and more
> elegant, and does a better job of bringing Smalltalk forward in a now
> network-oriented world. See:
> "Principles of Design -- Tim Berners-Lee "
> http://www.w3.org/DesignIssues/Principles.html
> "Principles such as simplicity and modularity are the stuff of
> software
> engineering; decentralization and tolerance are the life and breath of
> Internet."
>
> You may well know all these issues, but I just thought I'd put it
> down for
> others comments as I understand it (in case I was wrong or missed
> something). Probably I'll have outlined some approaches people here
> know
> about already created for Squeak or other systems, and anyone should
> feel
> free to point me to them.
>
> Anyway, feel free to stop reading here, but what follows is more
> details on
> how I came to think about this and arrive at those two possible paths.
>
> =============== how it is now, and a simple approach
>
> The biggest aspect of this is resolving globals. For review, if I
> recall
> correctly this is traditionally done in Squeak by the VM knowing
> about a
> SystemDictionary called "Smalltalk" (the VM needs to know about it
> absolutely to resolve a circular dependency of not being able to
> look up the
> global "Smalltalk". :-). When a CompiledMethod being executed does
> something like make a new instance of a class, it fetches the current
> instance (typically of a class) associated with the name of the
> global and
> sends it a message or stores it in a variable. Using named globals
> allows
> late binding of classes by the compiled method.
>
> If you didn't care about late binding, like in Forth referring to a
> previously defined word, you could just make a hard link as a
> pointer to the
> class at compile time in the compiled method. But then you could not
> replace
> or remove the class in its entirety later.
>
> There is room for only one version of a class at a time this normal
> way --
> just one key in the Smalltalk system dictionary with one value.
>
> The simplest way around this might be to have system dictionary
> values for
> keys be dictionaries. Then you could tag each item with a version.
> But the
> executing code would still need to resolve which one it wanted. And
> I don't
> see how that would be easy. But maybe it might be?
>
> And then there is a deeper problem related to composites of objects
> which
> might include instances pointing to two or more different versions
> of the
> same class. But we can ignore that for now. :-)
>
> == A deeper analysis (or, "owww, my brain hurts". :-)
>
> Python has a straightforward way to resolve this -- it supports a
> sea of
> objects, and when you load code, the old classes get overridden in the
> equivalent of a system dictionary with new classes, but the existing
> instances still point to the old classes so those still hang around
> but are
> not accessible by name. This makes it difficult to do development in
> a live
> system, and you end up issuing special code to load things in
> differently
> (not making new classes) if you want to do Smalltalk-style dynamic
> development. But there is no reason you cannot simply load two
> versions of
> the same module (source file) and hang on to them somehow. Squeak
> could
> certainly do something similar if it had modules or classes which
> could
> exist without names.
>
> When I try to generalize this global idea, there are other
> approaches. In
> PataPata (in Python/Jython, trying to retrofit them with Squeak-like
> capacities) I gave each object (typically Morphic-like GUI
> components) a
> "world" instance variable. That pointed to what was essentially the
> equivalent of a Smalltalk system dictionary to store globals or key
> functionality. In practice, each major window was in its own world,
> although
> that wasn't strictly required. Then I could have several worlds in
> the same
> process, where each was somewhat self-consistent.
>
> But objects could still slip from one to another, typically when
> opening an
> inspector//browser tool (itself in its own world) on another world
> and maybe
> copying an object from one place to another. Beyond globals, another
> reason
> for each object to have a pointer to its "world" was that when I
> serialized
> a world I just wanted the objects from that world to be written out
> and no
> others, so I could check that pointer to make sure the serialization
> wasn't
> wandering into writing out objects from other worlds (I didn't
> pursue the
> concept of nested worlds, which might have been possible).
>
> I was planning to use unnamed references to parents from prototypes
> (for
> inherited behavior and constants) in PataPata, based on how Self did
> prototypes and links, but I decided in the end to reference prototypes
> representing parents by name, for the purpose of documenting
> intent. But
> that left a global lookup problem, resolved by having *every*
> prototype have
> a "World" pointer. And there were predictable problems when worlds
> pointed
> to themselves which I had to work around (especially when loading
> worlds).
> [Self has a fancier way of getting names for unnamed prototypes I
> did not
> want to try pursuing based on determining paths from a root.]
>
> Anyway, generalizing on this "object-focused late binding lookup"
> approach,
> objects can point to a global system dictionary, or they can point
> to other
> objects in some consistently structured way (typically "parent" or
> "container" or "class") which might in turn allow a path to find a
> global
> (that process might even percolate up and then back down, say to
> *search*
> for an object with a certain value; I supported this in PataPata to
> find
> widgets with a certain name in the same window as a widget executing
> some
> behavior).
>
> But there is another way to do this, which is to have the thread,
> process,
> stack frame, or virtual machine hold onto a global system dictionary
> object
> somehow. This is closer to how Squeak does it with a system
> dictionary,
> except there might be one system dictionary per process or thread or
> stack
> frame. The difference is that the entity executing the code knows
> where to
> look for globals even if the objects being used for executing code
> do not
> (which presumably saves on memory, and provides a more consistent
> notion of
> what versions of classes a process wants to see, assuming that is a
> good idea
> :-). In a most extreme case, the user running the program might know
> the
> object ID or memory location of the global system dictionary and
> pass it in
> as needed (this might happen in a debugger session). I might call
> this an
> "execution-focused late binding lookup" approach.
>
> For completeness, there is another approach which is to have globals
> stored
> in relation to the memory where the objects are stored (or processes
> executed) if memory is partitioned somehow. So if you have an object
> or
> process memory location, you can find the global system dictionary
> that goes
> with it by looking somewhere special in that memory chunk
> (beginning, end,
> standard offset). Deep in the reality of a virtual machine, it might
> even be
> using this approach in various ways (like making sure the pointer to
> the
> system dictionary is, say, the first handle in an object memory
> table).
>
> Probably someone who has a PhD in computer science could tell me the
> proper
> terms for these approaches towards late binding? :-)
>
> And of course, you can use more than one at a time. NewtonScript, for
> example, found variables by having two different types of lookup,
> based on a
> parent slot and visual containment. Maybe you could use all of the
> approaches at once in some system just for fun. I don't think I'd
> want to
> debug anything in it though. :-)
>
> Anyway, this doesn't answer how specifically to do what you propose,
> but it
> does suggest some possible points of intervention -- mainly
> instances or
> processes.
>
> But this leads to a deeper point. A Smalltalk VM (or any OO VM
> system like
> it, like the JVM objects or Python objects) has a problem with
> multiple global
> objects if objects sharing the same VM in different global spaces
> can point
> at each other directly.
>
> Essentially, if you can have multiple global system dictionaries,
> you end up
> in a situation where an object from a "module" in one set of
> interconnected
> versions of modules can be referenced by an object in a "module" in
> another
> interconnected set of different module versions. At that point, what
> governs
> the object's behavior, especially late binding lookup of globals?
> Should it be
> governed by the module the object came from? Or should it be
> governed by the
> module which it is now connected to? Or should it be governed by the
> process
> executing and calling a method of the object (and that process might
> lookup
> its globals in yet another way)? And similarly, when you absorb an
> instance
> from another module, should its class still point to the old class
> or should
> it point to the class in the new module?
>
> In general, this issue is a variant of a deeper problem related to OO:
> http://mail.python.org/pipermail/edu-sig/2007-April/007852.html
> as I feel the idea, that objects can stand alone and be somehow
> meaningful,
> is at the root of a lot of evil in the Smalltalk universe (e.g.
> "bitrot". :-)
>
> Anyway, just from random comments here over the years, I get the
> feeling
> that in their hearts the original Squeak Central people (Dan Ingalls
> especially) understand this and use heavily customized images in
> practice as
> coherent wholes, but perhaps they have never had the time to
> generalize this
> idea to a philosophical principle. Certainly just fighting for
> objects at
> all, as well as messages and VMs and good tools must have taken up
> lots of
> energy.
>
> Part of this issue may depend on whether you think of an object like a
> single-celled creature like an Amoeba, or whether instead you think
> of an
> object as part of a biological entirety, like as a protein molecule
> in a
> cell, or a highly regulated cell in a large multicellular entity. If
> objects
> can't meaningfully stand alone, then it seems like we need some
> coherent
> philosophical approach to how they fit together into modules or
> images.
>
> Loading multiple versions of the same classes seems to strain this
> possible
> coherence, as useful as it might be. It's not that it won't work,
> it's just
> that the mental complexity starts increasing to the point where you
> may have
> to be really clever (and really alert) to keep track of it all. :-)
>
> === two competing approaches
>
> Because of all these difficulties and complexity, I'm inclined to lean
> towards suggesting that images should be smaller, :-) and VMs could
> either be lightweight or perhaps could support multiple open images
> at once.
> Then you can load one version of a module into a larger set of other
> modules, and maintain that set for one application. This total image
> defines
> an ecology of objects, and the objects and their classes all make
> sense in
> relation to each other (as well as whatever I/O they choose to do
> through
> the VM to the rest of the world). This is sort of like a living
> cell. And
> you could then load a different version of code modules into another
> *different* image and maintain that set for a different application.
> And
> when these applications want to communicate, it will be from one
> image to
> another, through their different VMs, presumably via sockets or shared
> memory or files or whatever, via some common serialization process.
> There
> are already several approaches for distributed objects in Smalltalk,
> so I
> doubt this will be much of a problem, and the JVM and Java offer other
> possibilities for remote procedure calls and such. I think that a
> minimal
> image ("mini-image") approach might come closest to bringing some
> sanity to
> the idea of personal images (like Dan Ingalls seems to like). Every
> image
> would be a custom mix of module versions and hacked up base class
> code. The
> image would know with a little developer help which objects belonged
> to
> which modules. To help with this, one would need easy tools to
> export module
> versions and configurations. An important aspect of such an approach
> might
> be Spoon-like remote debugging, and remote development of minimal
> images so
> you could have, say, one image open with your favorite debugging
> tools and
> over a socket just plug those tools into other images you wanted to
> modify
> or debug; this isn't strictly necessary -- but conceptually it makes
> things
> more elegant, especially since then the development tools can have
> different versions of base classes than the system being debugged
> or
> developed. I get the feeling the Squeak ecosystem has most of the
> parts of
> all of this, they just haven't been all put together and polished
> toward
> this end.
>
> Still, for the JVM, which is what interests me right now, all the
> objects do
> live in one world, and the JVM has a big memory footprint. So, given
> memory
> footprint and startup time, even with the newer JVM's sharing some
> memory
> across VM instances, I think we might have to end up living with
> multiple
> system dictionaries in one JVM unless JVMs improve further? Or maybe
> if we
> discover they are good enough now? In that case, I end up wondering
> if a
> "world" instance variable added to every underlying Java object is
> such a
> bad idea after all. :-) Or the alternative of a "world" instance
> variable
> stored in each thread (or process) is also possible. Of course,
> globals are
> rarely looked up, so more indirect ways of storing them might be more
> efficient trading off time for memory. So this is a second alternative
> approach which is closer to the direction you outline.
>
> == best solution long term?
>
> After considering two paths in the previous two paragraphs, I think
> using
> lightweight images with only one system dictionary is a better way
> to go
> long term. They are just simpler and already well understood.
>
> If you, say, want a little clock up on your screen implemented in
> Squeak
> (instead of Lively Kernel :-), you just have a clock image. Ideally,
> that's
> all it does -- it's a clock. If you want to inspect the clock, you
> fire up
> your development image in another JVM and connect to that clock JVM
> (maybe
> using a universal debugging registry service). Maybe your
> development image
> even gives you a copy of the image of the clock window with drag-and-
> drop
> overlays on another screen. Or it might put annotations over the
> original
> window by temporarily inserting a "glasspane" if the clock
> application was
> using Swing widgets, or by the usual Squeak ways if the Clock
> application
> used Morphic widgets.
>
> To save space and maybe help with upgrades, perhaps the Clock
> application
> image depends on another larger base image. I did that in PataPata
> where
> worlds could require other worlds to be loaded first. Since I stored
> images
> as textual Python code which could rebuild a world of objects
> procedurally,
> that worked out OK. Here is an example of a simple PataPata world; I
> would
> expect a Squeak clock image built in a similar fashion would be
> about the
> same tiny size and also written out as textual source:
> http://patapata.svn.sourceforge.net/viewvc/patapata/tags/PataPata_v204/WorldDandelionGarden.py?revision=315&view=markup
> (One fudge, the bitmap was stored outside the image in a file.)
> Note the line:
> world.worldLibraries = [world.newWorldFromFile("WorldCommon.py")]
> which is what defines the other worlds this world depends on. So, for
> Squeak, this would be like saying your small image depends on other
> images
> which load first.
>
> Obviously you have to have any supporting images around or you can't
> load
> your dependent one, but for the most part you just typically depend on
> common downloaded images. If images are stored as text (essentially, a
> Smalltalk program needed to rebuild the image) dependencies are a
> lot less
> scary since you could always just go in and start cutting and
> pasting in a
> text editor (but hopefully there would be better tools for this).
>
> How to track and merge changes to base classes in supporting images is
> obviously an issue, and it is not one PataPata tried to solve
> (beyond the
> fact that prototypes made it easy to override base class behavior
> for most
> things). But, since at runtime the supporting packages will be
> loaded, you
> can easily modify it in the live image and then write out a modified
> version
> of the base image again with a different version number, and hope
> somebody
> down the road can reconcile your changes if you want them to move
> forward
> with the supporting image.
>
> In this lightweight approach, images might also become modules
> stored in
> some source code repository if desired, or really, they might become
> more
> like (ENVY-ish?) configuration maps on top of available stored
> modules. So,
> to try to provide an example, you might save your running Clock
> image as
> module Clock-1.1.4 which also depends on BaseClasses-3.4.2. (This
> would
> require a worldwide way to identify Squeak modules uniquely.) Of
> course you
> might not store Clock-1.1.4 on a server; it might be stored on a
> local drive
> (perhaps in a Jar file, leading to Java classpath problems, but
> nothing is
> perfect :-). You might open up Clock-1.1.4, modify it using Spoon-like
> remote tools, and maybe even save it back under the same version
> number if
> no one else depended on it (perhaps with an automatic minor sequential
> internal revision number bump just in case). These names and version
> numbers
> might also be more like human readable suggestions than absolutes --
> for
> example each "image" "module" could have a unique UUID (plus perhaps
> save
> sequence) and dependencies could be expressed as lists of acceptable
> UUIDs
> as well as names, with some sort of sophisticated matching algorithm
> to
> try to resolve dependency issues and search for modules in various
> places.
>
> For this Clock example, when you work on the clock you might pull up
> another
> image of development tools (browser, debugger, inspector, and so
> on). But
> the versions of these (or the base classes they depend on) don't
> really
> matter to the clock application. All that matters is that somehow
> the two
> JVMs (or JVM processes) agree on how to talk to each other to add new
> methods, return results, single step code, follow object references,
> and so
> on. Presumably one could have a fairly standard protocol for that --
> maybe
> even an extensible one (perhaps Spoon has this?). Let's say
> something odd is
> happening with the Clock. You want to see how an older version
> works. Well,
> you just open up that older clock image. Then you might even open up a
> "image comparing" utility image :-) which lets you connect to both the
> running Clock images simultaneously and compare versions of all the
> classes
> looking for differences. Still unsatisfied, maybe you clone the
> older image
> (to start a third clock running) and bit by bit copy classes or
> modules from
> the new image to the copy of the old until you find where the clock
> starts
> to behave oddly. Then you make a change (remotely) to the first
> clock image
> and see if it fixes the problem. Perhaps it turns out your code is
> perfect
> but the anomaly is due to a really deep problem in code supporting
> Squeak/JVM -- so you drop down a level conceptually and pull up a JVM
> debugger image, or maybe even just Eclipse, :-)
> http://www.eclipsezone.com/eclipse/forums/t53459.html
> connect to the JVM supporting that Clock image directly, and start
> swearing
> as you try to figure out what the Squeak/JVM maintainers did wrong
> this
> time. :-) If you wish, all of your actions with the multiple Squeak-
> ish VMs
> could have been logged to some common history repository somewhere
> to replay
> the entire multi-VM development session back to everyone who doesn't
> believe
> you that it's a JVM level issue. :-) Presumably one could build
> testing
> tools for this architecture as well.
>
> And Squeak in C could go down this mini-image route too.
>
> As I think a little more about this, I am still perhaps stuck with the
> problem that even in these mini-images, there would need to be some
> way to
> link specific objects back to specific modules so a modified module
> could be
> written back out with all its related objects. This is because a
> mini-image
> is not just code, it is code plus live objects. And so when objects
> are
> created, they would have to be assigned somehow to a specific module
> or
> source mini-image. So, perhaps this mini-image solution needs to
> have a
> "world" field (or "module" or "segment") in every object anyway,
> just so the
> modified objects can be written back out into the right mini-image or
> module? Or, if this was implemented in C, the image would be carved
> up into
> memory segments, with new objects allocated to the chunk of memory
> going
> with the specific mini-image that was loaded.
>
> Squeak already has an image segment effort:
> http://wiki.squeak.org/squeak/1213
> "ImageSegments and project swapping are still in the experimental
> stage"
> But it is binary, not textual source. And it is based on specific
> roots, not
> some sort of tag for each object. I guess both might take about the
> same
> amount of space -- instead of tagging each item with its segment
> (world),
> you have a big array which points to each object in the segment.
> Maybe you
> might want both? So objects know their segment and segments know their
> objects? And I find it a little amusing I am putting up windows in
> PataPata
> defined by textual mini-image files of 3688 bytes (assuming a bitmap
> loads
> off the network or from a local file :-) while they are talking
> about binary
> image segments of 10s of megabytes.
>
> And as I read more on modular Squeak, I'm realizing that with mini-
> images
> the idea of a "project" would probably go away entirely.
>
> And any tool which compared mini-images would have to have some way of
> representing objects in two different mini-images so it could look for
> similarities and differences. At the very least, maybe like Les
> Tyrrell's
> OASIS project:
> http://wiki.squeak.org/squeak/1056
> But there is a big difference between loading representations of
> objects
> (instances or their classes) to look at them and loading objects to
> use them.
>
> Anyway, no easy solution. But I still think this second mini-image
> approach
> is simpler conceptually than attempting to keep different versions
> of the
> same things in the same VM. Both are possible, of course.
>
> ===
>
> Anyway, maybe someone reading this might have a better suggestion or a
> better (simpler, clearer) way of looking at this issue.
>
> --Paul Fernhout
>
> Igor Stasenko wrote:
>> Ken Causey wrote:
>>> [snip]
>>> Within this community I've come to feel that the only day to day
>>> practical solution is to do it and then ask for forgiveness when
>>> it goes
>>> all pear shaped (badly). Of course when that happens it really
>>> helps
>>> when it is something that can be readily reversed with no harm done.
>>> And that's where it seems we have a problem because the current
>>> release
>>> management schemes don't well-support removing something readily
>>> and in
>>> such a way that few if any are inconvenienced. I don't have a ready
>>> solution to that, it is something I find myself thinking about
>>> more and
>>> more.
>>> [snip]
>>
>> There is a solution: enable multiple versions of same package in same
>> image and keep track of package dependency.
>> So, when you loading an updated package, all code which worked
>> before,
>> continues to work in same way as it was before.
>> We need a way to be able developer to choose, what parts of system
>> can
>> use new version and what should use older version due to
>> incompatibility reasons by simply checking dependencies and updating
>> dependency links.
>>
>> Also, this would help a lot in maintaining packages: a package author
>> can easily keep track of his package dependencies, and may or may not
>> wish to release his package with updated dependencies, which use
>> latest versions of packages, his package depends from.
>>
>> Of course, this is somewhat idealistic, and there is many caveats,
>> but
>> if done well, will allow us to mix things without fear that something
>> will not work due to incompatibilities.
>
>

Igor Stasenko

Re: multiple versions of same package vs. mini-images (Was: Re: Guaging & Squeak/JVM)

I'll try to be short.
1. No, the Smalltalk VM (at least Squeak's) doesn't care about globals (in
most cases). It uses a special objects table, which can be replaced on
the fly.
That is simply because the VM doesn't need to access globals when doing
method lookup. All objects refer to their classes directly.
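
To illustrate point 1, here is a toy model in Python (not Squeak's VM code; Klass, Obj, send and the smalltalk dictionary are names invented for the sketch). Method lookup starts from the class pointer stored in the receiver and never consults the global dictionary, so rebinding a global name does not affect existing instances.

```python
class Klass:
    """A class object: a method dictionary plus an optional superclass."""
    def __init__(self, name, methods, superclass=None):
        self.name = name
        self.methods = methods
        self.superclass = superclass


class Obj:
    """An instance: it holds a direct reference to its class."""
    def __init__(self, klass):
        self.klass = klass


def send(receiver, selector, *args):
    """Method lookup: walk the class chain from the receiver; no globals involved."""
    klass = receiver.klass
    while klass is not None:
        method = klass.methods.get(selector)
        if method is not None:
            return method(receiver, *args)
        klass = klass.superclass
    raise AttributeError(f"doesNotUnderstand: {selector}")


# The global dictionary is only consulted when code asks for a class by name.
smalltalk = {}
smalltalk["Point"] = Klass("Point", {"describe": lambda self: "Point v1"})
old_point = Obj(smalltalk["Point"])

# Rebinding the global name leaves the existing instance bound to the old class.
smalltalk["Point"] = Klass("Point", {"describe": lambda self: "Point v2"})
new_point = Obj(smalltalk["Point"])
```

Both instances keep answering according to the class they were created from, which is why two versions of a class can coexist as far as message sending is concerned.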

2. To get rid of globals you only have to change a few lines in the compiler
code :) Of course, you should provide something else in exchange.
Btw, if you search the mail archives, you'll find a discussion about that.

3. The main barrier to letting multiple versions of the same class/package
coexist is support in the dev tools (browser/compiler). The VM doesn't require
groundbreaking changes to support this.
The exceptions are tagged oops (SmallIntegers) and the well-known
singletons: the nil/true/false objects. Even if you have multiple
SmallInteger classes, instances will be able to use only one of them.
This is a sacrifice. Well, you can always make boxed integers :)
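
The SmallInteger limitation can be shown with a small model, offered only as a sketch: a tagged integer carries no class pointer, so its class has to come from a single VM-wide slot, whereas a boxed integer is an ordinary object that can point at whichever class version it likes. In this Python illustration a plain int stands in for a tagged oop, and special_objects, Boxed and class_of are hypothetical names.

```python
class Klass:
    def __init__(self, name):
        self.name = name


small_integer_v1 = Klass("SmallInteger (package version 1)")
small_integer_v2 = Klass("SmallInteger (package version 2)")

# One VM-wide slot decides the class of every tagged integer (assumption:
# modelled here as a dictionary entry).
special_objects = {"SmallInteger": small_integer_v1}


class Boxed:
    """A heap-allocated integer: it has room for its own class pointer."""
    def __init__(self, klass, value):
        self.klass = klass
        self.value = value


def class_of(oop):
    if isinstance(oop, int):      # stand-in for "the tag bit is set"
        return special_objects["SmallInteger"]
    return oop.klass              # ordinary objects refer to their class directly


tagged_class_a = class_of(3)
tagged_class_b = class_of(4)
boxed_class = class_of(Boxed(small_integer_v2, 3))
```

Every tagged integer ends up with the same class, while the boxed one can belong to the second version, at the cost of an allocation per integer.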

To Paul: most of these ideas may find their way into the world when Michael
van der Gulik releases his SecureSqueak project.
So I suggest you discuss the details with him first,
since he is the person most interested in this area. My idea of having
multiple versions of packages was just a fruit of discussion with him
:)
Also, I noticed that Mike's views on many things in different areas are
very similar to mine, which is good :) Who knows, maybe we'll join our
efforts someday.

--
Best regards,
Igor Stasenko AKA sig.

Paul D. Fernhout

Re: multiple versions of same package vs. mini-images

In reply to this post by Michael van der Gulik-2

Michael-

I guess it is a matter of priorities. This is important to me. Plus we don't
have broadcast TV. :-)

Also, which is more productive? Writing a seven page email which gets
ignored by most and *hopefully* trashed as advocating a design which is
uninformed or redundant or massively incomplete by a few people who are
really clued in on these issues (like yourself or Igor or whoever), [thanks
for your comments, Igor] or spending a person-year making such a system and
only then finding out after the fact it is uninformed, redundant, or
massively incomplete? After a dozen years, the Squeak ecosystem of projects
and people is so diverse it is hard to know what everyone is up to or has
done (and I've been away from it for quite a while in Python-land).

Obviously, writing code is more productive, if it gets used. But the problem
I am concerned about here is in part people writing code for Squeak (e.g.
with or without traits) and it being lost. You can have a lot of time to
write and read long emails if you don't write a lot of code which just ends
up getting thrown away instead. :-)

Of course, for most programmers reading and writing code is more enjoyable
than reading and writing design documents (or related documents).

I do write code (eventually. :-) And I am at this point of needing design
feedback precisely because of some code (PataPata) I was ultimately unhappy
with (though I thought it was a productive experiment, since you learn from
experiments whether they succeed or fail).

Still, I'll concede as in a previously supplied link relating to Chandler
that designs are always fraught with the peril that they missed some key
idea you only find out deep in implementation which makes the whole project
pointless. Still, an experienced designer is able to a limited extent to
simulate a paper design in his or her head and get a feel for it, at least
to the point of seeing obvious incompletenesses. But ultimately, it is true,
the proof of a design idea is in working and useful code.

--Paul Fernhout

Michael van der Gulik wrote:
> For the record, that email was 382 lines long or about 7 pages if printed,
> and it's just one of many of your posts!
>
> How /do/ you manage to write so much stuff in a day, Paul!? I wish I could
> be as productive.

Paul D. Fernhout

Re: multiple versions of same package vs. mini-images (Was: Re: Guaging & Squeak/JVM)

In reply to this post by Igor Stasenko

Igor-

Thanks for the feedback. I'll look more into what Mike is doing with
SecureSqueak.

Your thoughts also help me clarify something for myself as to VMs. I realize
now that I am mostly concerned with writing out sets of code and objects
(like in PataPata) whether they are called "image segments", "modules",
"parcels", or whatever, as opposed to writing out images.

And much of this issue of globals for me revolves around how to make it
possible to track what objects go in what image segment (or module or
whatever). And when an object is created is the most obvious time to make
that assignment to a segment/module/parcel/whatever.
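
A minimal sketch of what "assign the module at creation time" could look like, in Python as PataPata was. The names Module, new and Widget are hypothetical; the sketch keeps both directions of the relationship, so objects know their module and modules know their objects.

```python
class Module:
    """A unit of code plus live objects that is written out together."""
    def __init__(self, name):
        self.name = name
        self.objects = []          # the module knows its objects

    def new(self, factory, *args):
        """Create an object and assign it to this module at creation time."""
        obj = factory(*args)
        obj.module = self          # the object knows its module
        self.objects.append(obj)
        return obj


class Widget:
    def __init__(self, label):
        self.label = label


clock = Module("Clock-1.1.4")
tools = Module("DevTools")

face = clock.new(Widget, "clock face")
hands = clock.new(Widget, "clock hands")
inspector = tools.new(Widget, "inspector")

# Writing out a module then only needs to walk its own objects.
clock_labels = [w.label for w in clock.objects]
```

Note that this tagging only works for objects with room for an extra field; values such as small integers cannot carry a module reference, so they would have to be written out with whichever object refers to them.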

I'll agree good tool support for this whole process is essential. For
example, if you were using an inspector, you may often want to know what
module the object was supposed to belong with (or even what image if it was
browsed remotely like with Spoon), as well as maybe change that
relationship. This is probably true as well for loading multiple versions of
the same class into the same image (assuming you wanted to write the class
and its instances back out later). And this class/instance relationship is
another way to keep objects (or some classes of objects) separated into
modules. But ultimately, if I write out a live window along with the class
which defines it, I need to write out all the small integers or collections
which define the window and its behavior and state as well, and those
instances of core classes will usually be defined elsewhere. As I wrote here:
http://patapata.sourceforge.net/critique.html
there were many disappointments with PataPata, but the small text based
images were something I was pleased with (like the 4K example I linked to
which defines a live window). See also:
"Power Of Plain Text"
http://www.c2.com/cgi/wiki/quickDiff?PowerOfPlainText

Anyway, I see as a matter of emphasis what I should be focusing on in a
Squeak/JVM is indeed good tool support as well as whatever it takes within
the underlying infrastructure to be able to round up objects and say they
belong together in some package to be written out. And naturally, the
objects might want to know what module they belong to too, if they need to
use that information in some way (like module-specific globals).

--Paul Fernhout

Igor Stasenko wrote:

> I'll try to be short.
> 1. No, smalltalk VM (at least squeak) doesn't care about globals (in
> most cases). It uses a special objects table, which can be replaced on
> the fly.
> It simply because VM don't need to access globals when doing method
> lookup. All objects refer to its classes directly.
>
> 2. To get rid of globals you have to change only few lines in compiler
> code :) Of course, you should provide something another in exchange.
> Btw, if you search mail archives, you'll find a discussion about that.
>
> 3. The main barrier in making multiple versions of same class/package
> to live is support of dev tools (browser/compiler). VM don't require
> groundbreaking changes to support this.
> The exception is tagged oops (smallintegers) and well known
> singletons: nil/true/false objects. Even if you will have multiple
> SmallInteger classes, instances will be able to use only one of them.
> This is a sacrifice.. Well, but you can always make boxed integers :)
>
> 2 Paul: most of these ideas can find a way into world, when Michael
> van der Gulik will release his SecureSqueak project.
> So, i suggest, you better discuss details with him in first place,
> since he is the most interested person in this area. My idea of having
> multiple versions of packages was just a fruit of discussion with him
> :)
> Also, i noticed that Mike's view on many things in different areas are
> very similar to mine, which is good :) Who knows, maybe we'll join our
> efforts someday.
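
Igor's point 3 about tagged oops can be illustrated with a toy model, sketched here in Python. It assumes the classic 32-bit Squeak convention that an oop with its low bit set is an immediate SmallInteger; the `Heap` class, its fields, and the class names used as strings are all hypothetical and only stand in for what a real VM does.

```python
SMALL_INTEGER_TAG = 1  # assumption: low bit set marks an immediate SmallInteger


def encode_small_integer(value):
    """Pack an integer into a tagged oop; no heap object is allocated."""
    return (value << 1) | SMALL_INTEGER_TAG


def decode_small_integer(oop):
    """Recover the integer from a tagged oop (arithmetic shift keeps the sign)."""
    return oop >> 1


class Heap:
    """Hypothetical object memory: heap objects carry a class, immediates do not."""

    def __init__(self, small_integer_class):
        # One VM-wide slot: every tagged oop resolves to this single class.
        self.small_integer_class = small_integer_class
        self.objects = {}      # address -> (class name, payload)
        self._next = 8         # even addresses, so the low bit is always 0

    def allocate(self, class_name, payload):
        address = self._next
        self._next += 8
        self.objects[address] = (class_name, payload)
        return address

    def class_of(self, oop):
        if oop & SMALL_INTEGER_TAG:
            return self.small_integer_class
        return self.objects[oop][0]


heap = Heap("SmallInteger")
tagged = encode_small_integer(42)
boxed = heap.allocate("MyPackage.BoxedInteger", 42)
```

In this model a tagged value has nowhere to record which version of SmallInteger it belongs to, while the boxed integer carries its own class reference, which is the trade-off Igor describes.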

Michael van der Gulik-2

Re: multiple versions of same package vs. mini-images (Was: Re: Guaging & Squeak/JVM)

In reply to this post by Igor Stasenko

On Feb 12, 2008 12:38 AM, Igor Stasenko <[hidden email]> wrote:

> 3. The main barrier in making multiple versions of same class/package
> to live is support of dev tools (browser/compiler). VM don't require
> groundbreaking changes to support this.
> The exception is tagged oops (smallintegers) and well known
> singletons: nil/true/false objects. Even if you will have multiple
> SmallInteger classes, instances will be able to use only one of them.
> This is a sacrifice.. Well, but you can always make boxed integers :)

Sorry, I didn't catch the parent post to this.

My namespaces design[1] will allow the developer to load different versions of the same package into the image, and instantiate classes from either.

The issues are:
- Class comparison, because although two classes might have the same name, they'll be different classes.
- Objects in the VM's special objects array. These are discussed in the last section of [1].
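
The class comparison issue can be shown by analogy in Python (a sketch only, not the Namespaces design itself; `make_point_class`, `PointV1` and `PointV2` are hypothetical names). Two versions of a class loaded side by side share a name but are distinct class objects, so identity and instance checks between them fail.

```python
def make_point_class():
    """Stand-in for loading one version of a package that defines Point."""

    class Point:
        def __init__(self, x, y):
            self.x = x
            self.y = y

    return Point


PointV1 = make_point_class()   # "version 1" of the package
PointV2 = make_point_class()   # "version 2", loaded alongside it

same_name = PointV1.__name__ == PointV2.__name__   # names match
same_class = PointV1 is PointV2                    # but the classes are distinct
p = PointV1(1, 2)
crosses_versions = isinstance(p, PointV2)          # instance of V1 is not a V2
```

Code that compares classes by name would treat the two versions as equal, while code that compares by identity would not, so a design has to pick one and stick to it.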

In a few weeks once I've got my package management system[2] working, I'll be making a new Metaclass hierarchy which will exist in the image alongside the existing Metaclass hierarchy. This will allow me to make radical changes to it without worrying about breaking the image.

Note that these are brain dumps rather than documentation, and that they constantly change:
[1] http://gulik.pbwiki.com/Namespaces
[2] http://gulik.pbwiki.com/Packages

> Also, i noticed that Mike's view on many things in different areas are
> very similar to mine, which is good :) Who knows, maybe we'll join our
> efforts someday.

My plans are to design a secure kernel myself and then invite other developers to help when it is usable. I'm only working on this a couple of hours a week as a hobby, so don't hold your breath waiting. I'm usually on IRC when I'm working on it (yay broadband!).

Gulik.

--
http://people.squeakfoundation.org/person/mikevdg
http://gulik.pbwiki.com/
