[squeak-dev] Perl is to CPAN as Squeak is to (what)?

Michael Haupt-3

Re: [squeak-dev] Bootstrapping (Subversion

Hi Michael,

On Sun, Jun 29, 2008 at 6:22 PM, Michael Rueger <[hidden email]> wrote:
> After that probably because it's not Python...

ha! http://pypysqueak.blogspot.com/

scnr,

Michael

Bert Freudenberg

Re: [squeak-dev] Bootstrapping

In reply to this post by Randal L. Schwartz

Am 29.06.2008 um 16:51 schrieb Randal L. Schwartz:

>>>>>> "Bert" == Bert Freudenberg <[hidden email]> writes:
>
> Bert> No it would not. The main issue for them is that you have to
> start from what
> Bert> they perceive as "binary blob" which is monkey-patched into
> newer versions.
>
> The C compiler would fit the same definition, by that reasoning.

No. You need a C compiler, true, but it builds the next C compiler
from text sources only, it does not clone itself.

This is also the difference between using SystemTracer to clone an
image into a new format vs. what Ralph suggested, using an image to
assemble a new image from scratch, containing only an explicitly
defined set of objects.

- Bert -

Randal L. Schwartz

Re: [squeak-dev] Bootstrapping

>>>>> "Bert" == Bert Freudenberg <[hidden email]> writes:

Bert> No. You need a C compiler, true, but it builds the next C compiler from text
Bert> sources only, it does not clone itself.

Bert> This is also the difference between using SystemTracer to clone an image
Bert> into a new format vs. what Ralph suggested, using an image to assemble a
Bert> new image from scratch, containing only an explicitly defined set of
Bert> objects.

Sorry, I was inadvertently confusing what Ralph suggested (which seems
like the C compiler technique) with the objection to the SystemTracer.

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<[hidden email]> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion

Nicolas Cellier-3

[squeak-dev] Re: Bootstrapping (Subversion

In reply to this post by timrowledge

tim Rowledge a écrit :

>
>
> Alejandro Reimondo's Fenix stuff read (through some extraordinarily
> 'creative' and Windows-dependent code) source snippets from files and
> built a parallel object hierarchy and then wrote an image out via a
> modified tracer. It's a very simple idea but rather less simple to
> implement correctly. It is however certainly doable.
>
> In principle one could certainly keep a tree of SVN like form, read in
> the code whilst traversing the tree, create objects to suit and then
> trace only those objects to create an 'image created from source'.
>
> And you know what? If we put together a system to do that, there would
> still be objectors; there will always be objectors.
>
> tim
> --
>

This is the idea of replacing a "binary blob" (the image) with an
automatically generated "textual blob" (a textual description of the
object graph, or a script to reconstruct object graph).

Is it usable? A maintainer will use diff tools to check the differences
between two versions.
Unfortunately, a small difference in an object graph (think graph, not
tree) might result in a big difference in the textual representation. Such
a generator should take care to generate minimal textual diffs... That
means diffing two general graphs, analyzing the previous textual
representation and producing a textual representation closest to the
previous one.
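
To make the churn problem concrete, here is a minimal Python sketch (Python
is used only for illustration; the toy graph, the dump helper and the idea
of a per-object stable key are assumptions, not an existing Squeak tool).
It dumps the same small graph twice, once labelling objects by traversal
position and once by a stable key, and counts the changed lines after a
single object is added.

```python
import difflib

def dump(graph, root, stable_keys):
    """Return one text line per object reachable from root."""
    order, seen, stack = [], set(), [root]
    while stack:                      # depth-first walk of the graph
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        stack.extend(reversed(graph[node]))
    if stable_keys:
        # Label each object by its own key and emit in sorted order.
        label = {node: node for node in order}
        order = sorted(order)
    else:
        # Label each object by its position in the traversal.
        label = {node: f"#{i}" for i, node in enumerate(order)}
    return [f"{label[n]} -> {' '.join(label[m] for m in graph[n])}".rstrip()
            for n in order]

def changed_lines(old, new):
    """Count added plus removed lines between two dumps."""
    return sum(1 for line in difflib.ndiff(old, new)
               if line.startswith(("+ ", "- ")))

# Hypothetical object graphs: key -> keys of referenced objects.
old = {"root": ["a", "b"], "a": ["c"], "b": ["c"], "c": []}
new = {"root": ["x", "a", "b"], "x": [], "a": ["c"], "b": ["c"], "c": []}

naive = changed_lines(dump(old, "root", False), dump(new, "root", False))
stable = changed_lines(dump(old, "root", True), dump(new, "root", True))
print(naive, stable)
```

With position-based labels one new object renumbers everything visited
after it, so nearly every line differs; with stable keys only the lines
that mention the new object change. A real image would still need some way
to assign such keys, which is the hard part.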

What a maintainer would need is rather a new diff tool able to perform
the above operation and find the diffs between two images...

A tool that analyzes a foreign image and generates a script able to
convert this image to the foreign one...

Not simple: a naive implementation would not work, for example because
of sourcePointers; an image is not independent from its change files...

Nicolas

Rob Rothwell

Re: [squeak-dev] Re: Bootstrapping (Subversion

On Sun, Jun 29, 2008 at 1:35 PM, nicolas cellier <[hidden email]> wrote:

> This is the idea of replacing a "binary blob" (the image) with an automatically generated "textual blob" (a textual description of the object graph, or a script to reconstruct object graph).
>
> Is it usable? A maintainer will use diff tools to check the differences between two versions.
> Unfortunately, a small difference in an object graph (think graph, not tree) might result in a big difference in the textual representation. Such a generator should take care to generate minimal textual diffs... That means diffing two general graphs, analyzing the previous textual representation and producing a textual representation closest to the previous one.
>
> What a maintainer would need is rather a new diff tool able to perform the above operation and find the diffs between two images...
>
> A tool that analyzes a foreign image and generates a script able to convert this image to the foreign one...
>
> Not simple: a naive implementation would not work, for example because of sourcePointers; an image is not independent from its change files...

Not simple at all! I know this isn't exactly the same as what you are talking about, but I saw some presentations at Smalltalk Solutions dealing with file-based code storage and/or analysis:

http://www.stic.st/stic?content=sts08Detail#june18Detail

Using VisualWorks Store for GemStone Code Management

Porting experience report

Smalltalk Development Tools: Bringing Smalltalk to Eclipse

I think the consensus was "it's hard!"

Rob

Yoshiki Ohshima-2

Re: [squeak-dev] Bootstrapping

In reply to this post by Bert Freudenberg

At Sun, 29 Jun 2008 18:48:01 +0200,
Bert Freudenberg wrote:

>
> Am 29.06.2008 um 16:51 schrieb Randal L. Schwartz:
>
> >>>>>> "Bert" == Bert Freudenberg <[hidden email]> writes:
> >
> > Bert> No it would not. The main issue for them is that you have to
> > start from what
> > Bert> they perceive as "binary blob" which is monkey-patched into
> > newer versions.
> >
> > The C compiler would fit the same definition, by that reasoning.
>
>
> No. You need a C compiler, true, but it builds the next C compiler
> from text sources only, it does not clone itself.

Come to think of it, we don't have to write the bootstrapper in
another language. They already accept the Squeak VM, so we can write the
Smalltalk-to-CompiledMethod compiler in Slang (even better: write it
in OMeta and generate Slang), and generate a modified VM that has a
different main(). That main() reads the "source" files, compiles them
(using the memory management in ObjectMemory), and writes the result
to a file. (Yes, you need a C compiler ^^;)

Doesn't that sound a bit more doable?

-- Yoshiki

Bert Freudenberg

Re: [squeak-dev] Bootstrapping

Am 29.06.2008 um 20:08 schrieb Yoshiki Ohshima:

> At Sun, 29 Jun 2008 18:48:01 +0200,
> Bert Freudenberg wrote:
>>
>> Am 29.06.2008 um 16:51 schrieb Randal L. Schwartz:
>>
>>>>>>>> "Bert" == Bert Freudenberg <[hidden email]> writes:
>>>
>>> Bert> No it would not. The main issue for them is that you have to
>>> start from what
>>> Bert> they perceive as "binary blob" which is monkey-patched into
>>> newer versions.
>>>
>>> The C compiler would fit the same definition, by that reasoning.
>>
>>
>> No. You need a C compiler, true, but it builds the next C compiler
>> from text sources only, it does not clone itself.
>
> Come to think of it, we don't have to write the bootstrapper in
> another language. They already accept the Squeak VM, so we can write the
> Smalltalk-to-CompiledMethod compiler in Slang (even better: write it
> in OMeta and generate Slang), and generate a modified VM that has a
> different main(). That main() reads the "source" files, compiles them
> (using the memory management in ObjectMemory), and writes the result
> to a file. (Yes, you need a C compiler ^^;)
>
> Doesn't that sound a bit more doable?

A bit. But I suspect getting classes and methods assembled into an
image is not even half the work. We'll have to create objects, too,
that were manually assembled (I'm thinking of the PaintBox prototype
for example). Recreating a full Etoys image would still be a major
effort, since many parts would have to be rewritten to actually be
bootstrappable.

- Bert -

K. K. Subramaniam

Re: [squeak-dev] Bootstrapping (Subversion (was: Re: Perl is to CPAN as Squeak is to (what)?))

In reply to this post by Yoshiki Ohshima-2

On Sunday 29 Jun 2008 2:39:42 pm Yoshiki Ohshima wrote:
> BTW, there was a discussion about a month ago (I basically read it
> just recently), and Bert was asking how hard it is to
> bootstrap from source. I know many of you have thought about the
> actual bootstrapping.
The OOPSLA Squeak paper refers to "Design a new ObjectMemory and image file
format". Were these design notes ever published anywhere? They could be
included in the Squeak package along with the description of the chunk file
format used for sources and changes.

Subbu

Jecel Assumpcao Jr

Re: [squeak-dev] Bootstrapping

In reply to this post by Bert Freudenberg

Bert Freudenberg wrote:
> > The C compiler would fit the same definition, by that reasoning.
>
> No. You need a C compiler, true, but it builds the next C compiler
> from text sources only, it does not clone itself.

Jim Gettys pointed out Ken Thompson's "Reflections on Trusting Trust" paper
when this thread was started in the olpc/education lists:

http://cm.bell-labs.com/who/ken/trust.html

Even though you carefully examine the sources for the next version of
the C compiler, you can't know for sure what the current binary of the C
compiler will do with them. It can insert code not seen in the sources.

The solution is to use more than one C compiler. Use your current gcc
binary to compile the sources for lcc. Then use the resulting binary to
compile the sources for gcc. A Thompson-style Trojan that can handle
this situation is still possible, but exponentially more complex than
one designed for a single set of sources.

One equivalent in Squeak would be to have image manipulation tools
written in some other language, but a simpler alternative would be to
port the Smalltalk based tools we are already using to VisualWorks or
(better yet) GNU Smalltalk. But since I don't see anyone bothering to do
the gcc->lcc->gcc dance I don't see why Squeak should be held to such a
high standard.

The binary blob thing is a normal problem for Linux distributions. If I
give you the complete sources for some C application but also include
some PNG files for button images and a splash screen, it would take
about as much effort for me to hide nasty stuff in these as it would to
do the same in a Squeak image. If you were aware that I had done this
you would easily find the place in the C code where I was using the
images as I shouldn't, but otherwise I bet any number of people could
look right at the spot and not notice the evil intent.

In the end it is a matter of trusting some people, as Ken pointed out in
his paper. There is no way for me to know what the Intel or AMD people
put in the processors I am using. I might get the full sources for some
Linux system but don't have enough seconds left in my life to read it
all myself (I did it in 1994 when it was orders of magnitude smaller).
So I have to trust my processor company and I have to trust my software
suppliers. The only alternative is to build my own processor from TTLs
and do all the software:

http://www.homebrewcpu.com/

-- Jecel

timrowledge

Re: [squeak-dev] Bootstrapping

In reply to this post by Bert Freudenberg

On 29-Jun-08, at 9:48 AM, Bert Freudenberg wrote:
>
>
> No. You need a C compiler, true, but it builds the next C compiler
> from text sources only, it does not clone itself.
>
> This is also the difference between using SystemTracer to clone an
> image into a new format vs. what Ralph suggested, using an image to
> assemble a new image from scratch, containing only an explicitly
> defined set of objects.

This turns out not to be the case. I did not suggest using the tracer to
clone an image into a new format; I suggested building a clump of new
objects in an existing image's memory and then tracing out that clump
only. How would one distinguish this from the operation of a typical C
compiler: read source, generate stuff into memory, write it out?
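
A rough Python sketch of "tracing out that clump only" (for illustration;
the heap dictionary and the object names are hypothetical, and a real
tracer works on object memory, not on strings): start from the roots of the
newly built objects and write only what is reachable from them.

```python
from collections import deque

def trace_clump(heap, roots):
    """Return the ids of objects reachable from roots, in discovery order."""
    written, seen, pending = [], set(), deque(roots)
    while pending:
        oid = pending.popleft()
        if oid in seen:
            continue
        seen.add(oid)
        written.append(oid)
        pending.extend(heap[oid])   # follow every outgoing reference
    return written

# Hypothetical heap: object id -> ids it references.
heap = {
    "HostCompiler": ["HostParser"],
    "HostParser": [],
    "NewSystemDictionary": ["NewObjectClass", "NewNil"],
    "NewObjectClass": ["NewNil"],
    "NewNil": [],
}
new_image = trace_clump(heap, ["NewSystemDictionary"])
print(new_image)
```

The host image's own objects are never written, provided the new clump
holds no reference back into the host; a single stray pointer would drag
host objects into the output, which is one reason this is harder to get
right than it looks.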

But it really doesn't matter. Nothing we can do will stop some people
from making pointless objections and raising a ruckus. They just don't
like Smalltalk.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
A computer scientist is someone who fixes things that aren't broken.

keith1y

Re: [squeak-dev] Bootstrapping

>
> But it really doesn't matter. Nothing we can do will stop some people
> from making pointless objections and raising a ruckus. They just don't
> like Smalltalk.
>
> tim
> -
Their loss, as far as I am concerned.

Keith

Igor Stasenko

Re: [squeak-dev] Bootstrapping (Subversion (was: Re: Perl is to CPAN as Squeak is to (what)?))

In reply to this post by K. K. Subramaniam

What the Linux people are proposing looks controversial to me.

Why should bits that were written in the early '80s and then forgotten,
because they weren't needed anymore, now become important? :)
The proposal sounds like: let's forget what we've done in the last decades
and start over again. And for what?
A Smalltalk image is a living system. Try loading an image and saving a
new version of it. We could say it's the same image, but we all know that
many objects and states within the newly saved image will differ from the
original image. So how do you expect to convince the Linux people that a
given image is the product of a build from text sources? After a few
iterations (loading, saving, and loading new code) it can be nearly
impossible to state that clearly.

--
Best regards,
Igor Stasenko AKA sig.

Jason Johnson-5

Re: [squeak-dev] Subversion (was: Re: Perl is to CPAN as Squeak is to (what)?)

In reply to this post by Andreas.Raab

On Sat, Jun 28, 2008 at 9:45 PM, Andreas Raab <[hidden email]> wrote:

> Colin Putney wrote:
>>
>> On 28-Jun-08, at 5:27 AM, Claus Kick wrote:
>>
>>> If push comes to shove, I would even say, let's ditch them all and just
>>> use SVN like the rest of the planet (if that is possible). It is hard enough
>>> to sell an image-based language with a real IDE to the C-style crowd, the
>>> package management systems should not add their grain of salt to the soup.
>>
>> Been there, done that... <shudder/>
>>
>> Monticello was created because this turned out not to be feasible in
>> practice.
>
> Can you say something more about that? A couple of weeks ago I saw a demo at
> HPI in Potsdam where students used SVN down to the method level, and it
> seemed to me that this approach might very well work because the SVN
> granularity is the same as the in-image granularity. It may also be
> interesting that this wasn't even trying to deal with source files of any
> sort - it retained the nature of the image and simply hooked it up directly
> with SVN. From my perspective this looked like an extraordinarily
> interesting approach that I am certain to try out as soon as it is
> available.
>
> Cheers,
> - Andreas

Are you sure that was SVN and not something more modern like git,
Mercurial, darcs or the like? I can't imagine SVN being seen as
anything but legacy by anyone but the most die-hard of fans. I
suspect integrating with a more modern system would be easier and it
would certainly make repositories better since SVN can't even do one
of the more common actions on a repository: merging [1].

[1] Well, they do a hack using comments to simulate merging with some
of the SVN bolt-on tools, but these days there is just no reason to
use a hack when you can just use one of many properly designed
systems.

Michael Haupt-3

Re: [squeak-dev] Subversion (was: Re: Perl is to CPAN as Squeak is to (what)?)

Hi Jason,

On Mon, Jun 30, 2008 at 2:10 PM, Jason Johnson
<[hidden email]> wrote:
> Are you sure that was SVN

sure it was.

Best,

Michael

Damien Pollet

Re: [squeak-dev] Bootstrapping (Subversion (was: Re: Perl is to CPAN as Squeak is to (what)?))

In reply to this post by Igor Stasenko

On Mon, Jun 30, 2008 at 5:54 AM, Igor Stasenko <[hidden email]> wrote:
> What the Linux people are proposing looks controversial to me.

Well, being able to redo the bootstrap from scratch or just to edit an
image from a separate tool would also make it simpler to throw stuff
away and get *really* minimal image for special purposes. You could
generate an image with just kernel plus application code, no UI, no
compiler, no reflective stuff...

--
Damien Pollet
type less, do more [ | ] http://people.untyped.org/damien.pollet

Randal L. Schwartz

Re: [squeak-dev] Subversion

In reply to this post by Jason Johnson-5

>>>>> "Jason" == Jason Johnson <[hidden email]> writes:

Jason> Are you sure that was SVN and not something more modern like git,
Jason> Mercurial, darcs or the like? I can't imagine SVN being seen as
Jason> anything but legacy by anyone but the most die-hard of fans.

As a big proponent of git, I can tell you that the number of people and
companies who are just *now* considering the move from CVS(!) to "something
modern" like *SVN* is still staggering.

Colin Putney

Re: [squeak-dev] Subversion (was: Re: Perl is to CPAN as Squeak is to (what)?)

In reply to this post by Andreas.Raab

On 28-Jun-08, at 12:45 PM, Andreas Raab wrote:

> Colin Putney wrote:
>> On 28-Jun-08, at 5:27 AM, Claus Kick wrote:
>>> If push comes to shove, I would even say, let's ditch them all and
>>> just use SVN like the rest of the planet (if that is possible). It
>>> is hard enough to sell an image-based language with a real IDE to
>>> the C-style crowd, the package management systems should not add
>>> their grain of salt to the soup.
>> Been there, done that... <shudder/>
>> Monticello was created because this turned out not to be feasible
>> in practice.
>
> Can you say something more about that? A couple of weeks ago I saw a
> demo at HPI in Potsdam where students used SVN down to the method
> level, and it seemed to me that this approach might very well work
> because the SVN granularity is the same as the in-image granularity.
> It may also be interesting that this wasn't even trying to deal with
> source files of any sort - it retained the nature of the image and
> simply hooked it up directly with SVN. From my perspective this
> looked like an extraordinarily interesting approach that I am
> certain to try out as soon as it is available.

DVS, the precursor to Monticello, stored all the source code to each
package in a single text file. Those files were then versioned using
CVS. The file format was a modified chunk format, with the chunks
sorted to prevent unnecessary textual churn. The usage pattern was to
file out, commit, update and file in.
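
As an illustration of why sorting the chunks limits churn, here is a small
Python sketch. It is not the actual DVS format: the methods dictionary and
the simplified chunk header with a fixed category are assumptions. Chunks
are written in a fixed order, so the file does not depend on the order in
which methods were added to the image.

```python
def file_out(methods):
    """Render methods as chunks, sorted by class name and then selector."""
    chunks = []
    for (class_name, selector), source in sorted(methods.items()):
        body = source.replace("!", "!!")   # a literal "!" is doubled in a chunk
        # Simplified, hypothetical header; real file-outs carry more detail.
        header = f"!{class_name} methodsFor: 'as yet unclassified'!"
        chunks.append(f"{header}\n{body}! !\n")
    return "\n".join(chunks)

# The same two methods, entered in a different order in each image.
image_a = {("Point", "x"): "x\n\t^x",
           ("Account", "balance"): "balance\n\t^balance"}
image_b = {("Account", "balance"): "balance\n\t^balance",
           ("Point", "x"): "x\n\t^x"}

print(file_out(image_a) == file_out(image_b))
```

Because both images produce byte-identical files, a commit only shows the
methods that really changed.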

A large part of the problem came from this two step process for
dealing with CVS. It was a hassle to keep track of the state of the
image relative to the state of the CVS working copy. It was easy to
make mistakes - commit when the wc wasn't up to date, develop when the
image wasn't up to date, etc. That would lead to weirdness in the code
that had to be manually sorted out.

Merge conflicts were another problem. The textual merging done by CVS
wasn't smart enough to deal with a lot of the changes that would
happen in development. For example, if two developers each added a
method that sorted similarly, they'd get a textual conflict even
though there was no conflict at the Smalltalk level.
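
A merge done at the Smalltalk level can be sketched as a three-way merge
keyed by (class, selector) instead of by line position. The following
Python is only an illustration under that assumption; the function and the
sample data are hypothetical and this is not Monticello's implementation.

```python
def merge_methods(base, ours, theirs):
    """Three-way merge of {(class, selector): source} dictionaries."""
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:          # both sides agree (including both deleted)
            result = o
        elif o == b:        # only their side changed this method
            result = t
        elif t == b:        # only our side changed this method
            result = o
        else:               # both changed it, and differently
            conflicts.append(key)
            continue
        if result is not None:
            merged[key] = result
    return merged, conflicts

base = {("Account", "balance"): "balance\n\t^balance"}
ours = dict(base)
ours[("Account", "debit:")] = "debit: n\n\tbalance := balance - n"
theirs = dict(base)
theirs[("Account", "deposit:")] = "deposit: n\n\tbalance := balance + n"

merged, conflicts = merge_methods(base, ours, theirs)
print(sorted(merged), conflicts)
```

The two new methods would land next to each other in a sorted text file and
collide in a line-based merge, but here they are independent keys, so both
are kept and no conflict is reported.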

As DVS developed we added functionality to minimize or work around
these issues, until it became clear that it would be less effort to
just keep our own version history and do our own merges. At that point
we ditched CVS and renamed DVS to Monticello.

Now, this idea of using one file per method has come up before, and I
believe it would eliminate many of the difficulties we had with DVS.
Merging methods would get better, for sure. Merging class definitions
would still be a hassle, unless each instance variable, class variable,
and pool import were defined in separate files. If the sources and
changes files were eliminated, that would fix many of the
synchronization problems that we had with DVS, since there would be no
need to manually decide when to synchronize.

Still, I see two big problems with this approach. One is that the
synchronization problems don't entirely go away. What if some other
process modifies the files on disk? How does the image find out about
the change, and what should it do in response? What if the
modification happens while the image isn't running? There are probably
answers to these questions, but I doubt they'll be *good* answers.

The other big problem is that tens of thousands of tiny files is a
horribly inefficient way to store source code. Yes, disk is cheap. But
disk IO is not. I discovered this early in the development of MC2,
when I implemented a type of repository that stored each method in a
separate file. Loading OmniBrowser from that repository involved
opening, reading, and closing over 600 files, and was very slow. I
don't remember the exact timing, but I think it was like 5 to 10
minutes, and in any case it was far too slow. Avi wrote a repository
that stored everything in a single indexed file, and now load time is
dominated by compilation.
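
To show the general shape of a single indexed file, here is a minimal
Python sketch. The layout (a length-prefixed JSON index followed by the
concatenated sources) and all names are assumptions made for illustration;
the actual repository format is not described in this thread.

```python
import json
import os
import struct
import tempfile

def pack(methods, path):
    """Write all method sources into one file, preceded by an offset index."""
    index, blobs, offset = {}, [], 0
    for name in sorted(methods):
        data = methods[name].encode("utf-8")
        index[name] = [offset, len(data)]
        blobs.append(data)
        offset += len(data)
    header = json.dumps(index).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack(">Q", len(header)))   # 8-byte index length
        f.write(header)
        f.write(b"".join(blobs))

def load(path, name):
    """Read one method source with a single open and a single seek."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack(">Q", f.read(8))
        index = json.loads(f.read(header_len))
        offset, length = index[name]
        f.seek(8 + header_len + offset)
        return f.read(length).decode("utf-8")

methods = {"Account>>balance": "balance\n\t^balance",
           "Account>>deposit:": "deposit: n\n\tbalance := balance + n"}
with tempfile.TemporaryDirectory() as tmp:
    store = os.path.join(tmp, "methods.idx")
    pack(methods, store)
    loaded = load(store, "Account>>deposit:")
print(loaded)
```

Loading a whole package this way costs one open and one sequential read
rather than hundreds of open/read/close cycles. A real repository would
presumably keep the index in memory instead of re-reading it on every call.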

A quick doIt in my working image shows 44682 methods. Now imagine that
on start up, the image scans all those files to make sure that all its
compiled methods are up to date. That will take a very, very long time.

Colin

Claus Kick

Re: [squeak-dev] Bootstrapping

In reply to this post by timrowledge

tim Rowledge wrote:

*snip*

> But it really doesn't matter. Nothing we can do will stop some people
> from making pointless objections and raising a ruckus. They just don't
> like Smalltalk.

As my former boss (at a Smalltalk software development shop) put it:
Their loss - why should I help them see the light?

From the list, I gather that they like GNU/Smalltalk, probably because
of the prefix. Other than that: Is it that important to get the Debian
crowd to accept Etoys?

Claus

Jecel Assumpcao Jr

Re: [squeak-dev] Bootstrapping

Claus Kick wrote:
> tim Rowledge wrote:
>
> *snip*
>
> > But it really doesn't matter. Nothing we can do will stop some people
> > from making pointless objections and raising a ruckus. They just don't
> > like Smalltalk.

I call this the "Mac slots" syndrome. Back when the Mac was first
introduced I was amazed that the most popular excuse by far for
rejecting it was "it doesn't have slots and I can't buy a computer that
I can't expand as time goes on". I would have expected to hear that it
didn't have needed applications or that it was too expensive, but these
became common later (and are used to this day). When the Mac II and Mac
SE came out in 1987, guess how many of these complainers bought one?

Changing yourself to please people who currently don't like you doesn't
always get results.

> As my former boss (at a Smalltalk software development shop) put it:
> Their loss - why should I help them see the light?

This is a very important point - they think it is our loss, that their
suggestions will add good things to Squeak and won't hurt anything we
already have. But as you mention below, other Smalltalks like
GNU/Smalltalk already have these features. So an alternative would be to
add EToys to one of them.

How about Self? Its VM is hand written C++ code. It includes the
compiler so it can build a new image entirely from a set of source files
(which happen to be already organized nicely into one-module-per-file
chunks divided in several subdirectories). It includes a version of
Morphic, which I personally find nicer than Squeak's (but far less
complete since it is older). The VM has advanced adaptive compilation
technology and performs very well.

So I nominate Self for Debian! Oh... right... the Linux port is very
outdated and was never complete in the first place. Why is that? Because
of all the "advantages" I listed above. Because of doing everything
right from the Linux viewpoint.

There are many things that are just fine in theory but never happen in
real life: a good Linux port of Self, an implementation of EToys in
Python, Squeak running on the Strongtalk VM and so on. Some of the
"flaws" that have pointed out in Squeak are exactly what have made
possible in practice for it to have great ports to many OSes and to
serve as the platform for EToys.

> From the list, I gather that they like GNU/Smalltalk, probably because
> of the prefix. Other than that: Is it that important to get the Debian
> crowd to accept Etoys?

Do they want to reject it entirely or just want to lump it in the
"non-free" repositories? Having Squeak live next to Adobe Acrobat Reader
rather than beside GIMP isn't something that worries me very much.

-- Jecel

Avi Bryant-2

Re: [squeak-dev] Subversion (was: Re: Perl is to CPAN as Squeak is to (what)?)

In reply to this post by Colin Putney

On Mon, Jun 30, 2008 at 7:57 AM, Colin Putney <[hidden email]> wrote:

> The other big problem is that tens of thousands of tiny files is a horribly
> inefficient way to store source code. Yes, disk is cheap. But disk IO is
> not. I discovered this early in the development of MC2, when I implemented a
> type of repository that stored each method in a separate file. Loading
> OmniBrowser from that repository involved opening, reading, and closing over
> 600 files, and was very slow. I don't remember the exact timing, but I think
> it was like 5 to 10 minutes, and in any case it was far too slow. Avi wrote
> a repository that stored everything in a single indexed file, and now load
> time is dominated by compilation.

It's worth pointing out that file-based version control has advanced
significantly since we did this work - CVS and SVN are now far from
the state of the art. I haven't used git much, for example, but it
seems to be a well layered system, and it may be that we can build an
alternative front end to its database which is image-based rather than
working directory based. For example, imagine comparing an image
directly to this index file rather than to a directory full of files
on disk:

http://www.kernel.org/pub/software/scm/git/docs/user-manual.html#the-index

And look at this description of the workflow:

http://www.kernel.org/pub/software/scm/git/docs/user-manual.html#the-workflow
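
A sketch of what comparing an image directly against the index could look
like, in Python for illustration. The blob id computation follows git's
SHA-1 object format; everything else is an assumption: the index is stood
in for by a plain dictionary of path to blob id, and the one-file-per-method
paths are hypothetical.

```python
import hashlib

def git_blob_id(data: bytes) -> str:
    """Object id git assigns to a blob with this content (SHA-1 format)."""
    return hashlib.sha1(b"blob %d\0" % len(data) + data).hexdigest()

def changed_methods(image_methods, index):
    """Paths whose in-image source does not match the id recorded in index."""
    return sorted(path for path, source in image_methods.items()
                  if index.get(path) != git_blob_id(source.encode("utf-8")))

# Hypothetical layout: one path per method, sources held in the image.
image_methods = {
    "Account/balance.st": "balance\n\t^balance",
    "Account/deposit.st": "deposit: n\n\tbalance := balance + n",
}
# Stand-in for the git index: only balance has been staged so far.
index = {"Account/balance.st": git_blob_id(b"balance\n\t^balance")}

print(changed_methods(image_methods, index))
```

No working directory is touched: the image hashes its own method sources
and compares ids, so the thousands of tiny files never need to exist on
disk.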

I personally believe that we're better off with Smalltalk-specific
version control, but if someone *is* looking at integration with more
mainstream tools, I would strongly suggest they start with git rather
than SVN.

Avi
