Re: [Pharo-project] [squeak-dev] Smalltalk vs Eclipse

Eliot Miranda-2
Hi Ralph,

On Thu, Jan 21, 2010 at 10:40 PM, Ralph Boland <[hidden email]> wrote:
Lest I confuse you let me state that I STRONGLY prefer using the
Smalltalk (Squeak in my case) development environment to using
Eclipse.  It is also worth pointing out that Eclipse is a development
environment that works with multiple languages whereas in
Smalltalk the development tool and development language are
intimately connected so to compare the two is to compare apples
and oranges.

Nevertheless there are advantages to both and I wonder to what
extent I can have my cake and eat it too  (that would be an orapple
cake I suppose).  I should also point out that my discussion is abstract;
I am asking if it would have been better if Smalltalk had
been designed the way I propose and not should some Smalltalk,
say Squeak, now be modified to be so.

The thing, particular to this discussion, about Eclipse that I like
is that the Eclipse (image I will call it) is separate from the
application (image I will call it).  In Smalltalk the two images are
combined.  This means that a lot of code is incorporated
into your application image that you may not want.
Of course, before you release your application you can run a
stripping program or process that strips out code that is not
used in your application. (I have never done this but my
understanding is that this is possible.)

Never having done this, I assume that stripping is not that
satisfactory because there is much code that cannot be
stripped out: certainly unused methods and likely unused
classes as well.  The dynamic nature of Smalltalk means that
much of the information needed to determine that a method or
class is unused is not available, so such methods and classes
must be included in the application release.
Some may argue that, with all the memory and processing power
available today, who cares; but I care.  For example, I would
like to be able to put a Smalltalk application into a linux pipe
as in:

a | b > c

where a or b or both are applications written in Smalltalk.
Similarly I would like to use Smalltalk applications in bash (Linux)
shell scripts or use them in other scripting languages. So it
may be important that my image not be unnecessarily
large, as getting it started may take longer than what it
does once it is started.
Of course I could use GNU Smalltalk for this but is it really
necessary that I know/use two Smalltalks?

I think the Eclipse model suggests an improvement to Smalltalk.
It seems to me that when you run Smalltalk ideally you would
run two images.  One would be the development environment image
and the other would be the application image.  When you started
a new project you would choose from a selection of start images
each including some subset of the development classes you
expect to need. There would also be a capability to copy entire
methods/classes/applications/other from the development image to the
application image.  Note that in this model there would be a need to search
the code base of either image  (or both).   For example, I may want to find
implementors/senders of method  #doSomething  in the development image,
the application image, or both.  I think being able to search the application
code base only is especially useful.

An interesting example is if you were working on the development image
itself.  In this case you would of course make as your application a copy
of the development image (and its changes file).  Now, in principle you need
not worry about many of the situations in which you get into an infinite loop
because you have broken the development environment itself in some way
such as putting a halt message somewhere that gets called every time the
debugger is called.
Yes, but what if you are modifying the debugger itself?  Well you need some
kind of command such as:

    #runAsDevelopmentEnvironmentOn: image changesFile: file   "A metacircular interpreter anybody?"

I am not sure who this message is sent to.

I am not actually sure if this is a good idea either.  Perhaps if you
want to work on the development environment you start your image with
a special flag or simply no application image.  The development image
then becomes editable and there is no application image; the
development image and application image become one, as is the
situation now.

There are doubtless many problems I am not seeing here and even
many ways this may be a good idea that I have not considered.

Either way I would like to hear what they are.

How much of an effort would be involved in modifying some version
of Smalltalk (say Squeak) to work this way?   (I have NO plans to do this!)

How much of a benefit would such a version of Smalltalk really be?

Has this been done before, perhaps in some other language, and what
were the results?

I can't speak in detail to the Firewall work done at Digitalk by people like Steve Messick, but they did get stuff done in the late 90's, although I don't think anything was productised.  I think they made the mistake of focussing on being able to produce very small executables rather than just separating the development and deployment images.  I think Steve was able to produce a tiny executable that only contained SmallInteger and hence was able to eliminate tagged pointers and method lookup.  But the focus on optimization was I think a step too far.  One thing I remember was that the communications stub in the deployment image was in the VM, not in Smalltalk code, again I think a mistake. Can anyone from Digitalk put flesh on these bones?

We spent a lot of time discussing doing this in the VisualWorks team about 10 years ago but as far as I'm aware none of it was realised.  Splitting tools UIs from the image under development (the target image) was a major design goal behind OpenTalk, the distributed messaging framework that Xu Wang did.  We had the idea of the Model in an MVC triad being the interface between the UI and target images, the natural point at which one would define an API narrow enough to be efficiently distributed.  If the API is designed appropriately one would be exchanging symbols (class names, selectors etc) not complex objects (classes, compiled methods), and because symbols and numbers are immutable they can be passed by value.

We had the idea of the model being split in two, one half being specific to a tool and the same whether one was in a single image or in the ui/target pair, and the other half being an adaptor to either the local image or the remote target.  This adaptor would have sat between a model and some complex object graph (like the class hierarchy) and defined the entire API through which the object graph was accessed, narrowing the interface to make it suitable for distribution, and documenting the interface to make it easier to understand.  Vassili named these things between the model and an object graph TTIBs for "The Things In Between".
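
To make the split concrete, here is a minimal sketch of the two halves. It is written in Python as a stand-in for Smalltalk, and it is not the OpenTalk design: the names LocalAdaptor, RemoteAdaptor, BrowserModel, make_target_stub and the TARGET_CLASSES data are all hypothetical, and the "wire" is just an in-process function call carrying JSON text. The point it illustrates is that only names (strings) cross the interface, never classes or compiled methods.

```python
import json

# Hypothetical stand-in for the target image's class hierarchy:
# class name -> selectors. Only names appear here, never live objects.
TARGET_CLASSES = {
    "Account": ["balance", "deposit:"],
    "Ledger": ["post:", "total"],
}


class LocalAdaptor:
    """Adaptor half for the single-image case: reads the graph directly."""

    def __init__(self, classes):
        self._classes = classes

    def class_names(self):
        return sorted(self._classes)

    def selectors_of(self, class_name):
        return sorted(self._classes[class_name])


class RemoteAdaptor:
    """Same API, but every call crosses a simulated wire as JSON text."""

    def __init__(self, send):
        self._send = send  # callable: request text -> reply text

    def _call(self, op, *args):
        reply = self._send(json.dumps({"op": op, "args": list(args)}))
        return json.loads(reply)

    def class_names(self):
        return self._call("class_names")

    def selectors_of(self, class_name):
        return self._call("selectors_of", class_name)


def make_target_stub(classes):
    """Stand-in for the small communications module in the target image."""
    local = LocalAdaptor(classes)
    allowed = {"class_names", "selectors_of"}  # the whole narrow API

    def handle(request_text):
        request = json.loads(request_text)
        if request["op"] not in allowed:
            raise ValueError("operation outside the narrow API")
        return json.dumps(getattr(local, request["op"])(*request["args"]))

    return handle


class BrowserModel:
    """Tool-specific half of the model; it cannot tell local from remote."""

    def __init__(self, adaptor):
        self._adaptor = adaptor

    def listing(self):
        return {name: self._adaptor.selectors_of(name)
                for name in self._adaptor.class_names()}


local_view = BrowserModel(LocalAdaptor(TARGET_CLASSES)).listing()
remote_view = BrowserModel(
    RemoteAdaptor(make_target_stub(TARGET_CLASSES))).listing()
```

Both views come out identical, which is the property one would want from the adaptor: the browser model is written once and pointed at either a local or a remote target.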

We very much wanted to produce a set of tools in a tools UI image that could be targeted either at itself or at a remote target, and wanted to be able to attach the tools image to a remote target dynamically (which we called "dynamic capitation", for dynamically gaining a head) so that one could have full debuggability on a remote image deployed in the field.  Alas we never managed to get the priority of the project high enough to actually commit major resources to it.  There were always more pressing, more mundane things to do.  Frustration at not doing things like this was in part behind my leaving Cincom.  Xu got quite far with a prototype, demonstrating things like a distributed inspector that allowed one to walk through an object graph spanning more than one image.  The inspector would change colour to indicate which machine the current object was on.



Personally I think that Smalltalk is the ideal vehicle for this kind of approach.  We have lightweight remote messaging which is relatively easy to implement.  We have UI architectures which are relatively well-decomposed and amenable to distribution.  We have a reified exception system which is not pushed down into hidden execution machinery and which has resume semantics.  We now have a much faster, much more reliable internet.  We could relatively easily add things like delegation which can enable interesting interleaving of tools and target images (see below).  I think we could have something truly revolutionary, the ability to interact with a remote headless image that contains no development tools or UI frameworks and only a relatively small communications module (remote messaging interface).  This would enable us to get to images of a few 10s or 100s of kilobytes.  I fervently hope you work on this.  Lots of people want this.  I was reading Colin's Thin Air pages on wiresong yesterday and what he wants is realised by this.

If you do work on this remember your goals.  The most important thing is incremental deliverables and useful progress.  Don't focus on whizzy optimisations or cool demos (such as the inspector changing colour).  Don't focus on arbitrarily complex topologies of remote images.  Instead focus on building a robust, well-engineered remote tools framework that just allows one to do what one can do today in a browser and debugger, but remotely between two images.  I would start with browser, debugger and stripper, where stripper would be something that allowed you to eliminate code from the target image using the UI image to discover what was there and what was needed or not, i.e. remote stripping, not the much trickier auto-stripping.

On delegation I think one really cool thing is to wrap remote objects with local development-only code.  In a deployed image I would expect there to be no inspectorClass methods or debugPrintOn: messages etc.  The target is lean and mean.  Anything not needed for deployment except that which supports adequate remote messaging is absent.  So one wants to be able to add methods that add debugging features to remote objects, and delegation is the way.  If the local handle on a remote object is wrapped by a delegate, the delegate can implement the debugging methods.  Catching doesNotUnderstand: in the target image can allow one to intercept computations in the target and resume them in the development image.  So when connecting a tools image to a target one would layer tools and UI code upon objects in the target but that code would reside only in the tools image.
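
As a rough illustration of the delegate idea, here is a sketch in Python standing in for Smalltalk. Python's `__getattr__` hook plays roughly the role doesNotUnderstand: would play in a Smalltalk proxy. LeanPoint, RemoteHandle, DebugDelegate and debug_print are hypothetical names, and the "remote" send is simulated in-process. It covers only the forwarding direction, not intercepting a failed send in the target and resuming it in the development image.

```python
class LeanPoint:
    """Object in the target image: deployment behaviour only."""

    def __init__(self, x, y):
        self._x, self._y = x, y

    def x(self):
        return self._x

    def y(self):
        return self._y


class RemoteHandle:
    """Local handle on a target object; the send is simulated in-process."""

    def __init__(self, target_object):
        self._target = target_object

    def send(self, selector, *args):
        # A real system would do a remote message send here.
        return getattr(self._target, selector)(*args)


class DebugDelegate:
    """Wraps a handle and adds development-only methods on the tools side."""

    def __init__(self, handle):
        self._handle = handle

    def debug_print(self):
        # Lives only in the tools image; the target has no such method.
        return f"<remote point x={self.x()} y={self.y()}>"

    def __getattr__(self, selector):
        # Invoked only when normal lookup fails on the delegate, so every
        # selector not implemented here is forwarded to the target.
        handle = object.__getattribute__(self, "_handle")
        return lambda *args: handle.send(selector, *args)


point = DebugDelegate(RemoteHandle(LeanPoint(3, 4)))
description = point.debug_print()
```

Here `point.x()` is answered by the target object, while `debug_print` is answered locally by the delegate, so the debugging code never has to be loaded into the target.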

Naming conventions in packaging, like splitting a Foo package into Foo-Deployment & Foo-Development could help in making this kind of split easy to navigate and conceptualise.  So after producing the tools/target split I would next focus on packaging tools, including analysis of call graphs etc to help automate the deployment/development split.
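
One simple form such call-graph analysis could take is reachability from the deployment entry points. The sketch below is in Python rather than Smalltalk, and everything in it is an assumption made for illustration: the CALL_GRAPH data, the method names, and the helpers reachable and split_packages. A static graph like this misses dynamic sends such as perform:, so the result is a suggestion for a person to review, not a safe automatic strip.

```python
from collections import deque

# Hypothetical static call graph: method -> methods it sends to.
CALL_GRAPH = {
    "Foo>>run": ["Foo>>step", "Foo>>log:"],
    "Foo>>step": ["Foo>>log:"],
    "Foo>>log:": [],
    "Foo>>inspect": ["Foo>>debugPrintOn:"],
    "Foo>>debugPrintOn:": ["Foo>>log:"],
}


def reachable(graph, roots):
    """Breadth-first search: every method reachable from the roots."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        method = queue.popleft()
        for callee in graph.get(method, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def split_packages(graph, roots, package="Foo"):
    """Propose a Deployment/Development split using the naming convention."""
    needed = reachable(graph, roots)
    return {
        f"{package}-Deployment": sorted(needed),
        f"{package}-Development": sorted(set(graph) - needed),
    }


split = split_packages(CALL_GRAPH, ["Foo>>run"])
```

With Foo>>run as the only entry point, the inspect and debugPrintOn: methods land in Foo-Development because nothing on the deployment path sends them.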

Anyway, enough waffle.  It's great to hear this idea again.  Go for it!

(Constructive) comments most welcome.

Regards,

Ralph Boland


