Monticello 2 - request for information

Monticello 2 - request for information

Damien Cassou-3
Hi Avi and Colin,

Google will pay me this summer to work on packages and
Squeak. I would really like to work on Monticello 2. My aim is that by
September, people will be able to start using it. But to do that, I
need some cooperation from you. Can you please answer my questions?

Where can I find documentation about what you have done?

What are the most important hierarchies? What are their roles?

What are the semantics behind the categories MC2-Squeak and MC2-SqueakPlatform?

What is the status of MC2? What has been done and what needs to be done?


Thank you very much

--
Damien Cassou

RE: Monticello 2 - request for information

J J-6
Any interest in making a separate list or something for MC 2 discussion?  I
have several things I would like to see happen with it and would be prepared
(interested in fact) to contribute time on the project.

The main goal I have for it is that the system should use change sets as
the base unit and in fact push more functionality down into change sets and
change set tools.  I have heard that there are bugs with change sets, but
that seems like a reason to fix the bugs, not avoid the use of change sets.

Some of the benefits of a change set based approach are:
Presumably faster load/save times since much less of the system must be
checked
Don't have to play funny tricks with protocols to get changes included in
your packages
Easier to differentiate local changes from package changes (see below [1])
Easier to separate major revisions after the fact (see below also [2] :)

[1] This is something I am dealing with now:  I am finishing up on adding
recurrence rules to the ICal package, but I switched to 3.10 because I
wanted to use the latest of Damien's dev images.  The problem is one of the
classes in ICal calls #raiseSignal: which doesn't seem to work in 3.10 for
some reason.  So to get my tests to pass (or at least test *my* code :) I
just changed the #raiseSignal: call to use #signal:.

But this isn't part of the work I am doing, and shouldn't be part of the
changes I upload.  In change sets, it is easy, I just create a new change
set and push the unrelated methods into it.  If MC (2) used change sets then
I could just tell the system that that set isn't part of my RR package.  
Then I can forget about it, and just do my saves, knowing that MC won't look
at that set.  The way it is now, I have to revert that method for every
check in.

The other option for me atm would be to just publish it with one of my other
changes, but then it is buried deep down in an unrelated change (similar to
how expensive and useless bridges sometimes get built in the US).

[2] And speaking of keeping related changes together, it is nice to be able
to do this after the fact.  One of the things I really like about darcs is
that I can code away, making as many changes as I want, and then when I
commit them I specify which changes go in which change sets so that they can
be grouped in some logical way.  This makes it a lot easier if a person
wants to apply one set of changes but not another.

Change sets have this ability right now, but no collaboration tool (afaik)
to manage the change sets themselves, once grouped.

Anyway, I would like to discuss this and more so I hope there is interest in
setting up a list or something.

Thanks,
Jason

(In reply to Damien Cassou's message "Monticello 2 - request for
information" of Fri, 18 May 2007 13:20:08 +0200.)

Re: Monticello 2 - request for information

Ralph Johnson
On 5/18/07, J J <[hidden email]> wrote:
> Any interest in making a separate list or something for MC 2 discussion?  I
> have several things I would like to see happen with it and would be prepared
> (interested in fact) to contribute time on the project.
>
> The main goal I have for it is that the system should use change sets as
> the base unit and in fact push more functionality down into change sets, and
> change set tools.  I have heard that there are bugs with change sets but
> that seems like a reason to fix the bugs, not avoid the use of change sets.

The problem is not that there are bugs with change sets.  Change sets
are mature, well tested technology.  I have been using them for 20
years and understand them very well.  The problem with change sets is
that they are fundamentally a bad idea.  Change sets are programs
which, when run, modify objects.  The objects that they modify are
classes, i.e. programs.  The purpose of MC is to provide a package
system and to make program definition more declarative, to make it
safer to merge versions, to see the difference between versions, and
so on.  MC is relatively immature and is full of bugs, but it is the
right way to go in the long run.

> [1] This is something I am dealing with now:  I am finishing up on adding
> recurrence rules to the ICal package, but I switched to 3.10 because I
> wanted to use the latest of Damien's dev images.  The problem is one of the
> classes in ICal calls #raiseSignal: which doesn't seem to work in 3.10 for
> some reason.  So to get my tests to pass (or at least test *my* code :) I
> just changed the #raiseSignal: call to use #signal:.

This was a mistake.  You are supposed to complain loudly on the 3.10
mailing list.  This is probably a bug, and the whole purpose of
releasing early and often is to find bugs.  But if people don't
complain about them, the process doesn't work.  We shouldn't provide
workarounds for bugs, we should fix them!

You are right that MC needs to allow for variations on packages, and
for changes that go across packages.  I suppose you could call these
"change sets", but they are not traditional Smalltalk change sets.

One of the problems with MC is that it doesn't have an explicit
representation of what goes in to a package, but instead relies on
names.  The name of a class's category determines which package the
class is in, and if a method is going to belong to a package other
than the package of its class, it must be in a protocol whose name is
'*' followed by the name of the package.  This was an expedient hack,
but it causes lots of trouble in the long run.  A package should just
be a list of classes and methods.  Perhaps the default is for all the
classes in one particular category to be in the same package, but that
shouldn't be the rule.
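
A minimal workspace sketch of the naming convention described above. It is a simplification for illustration, not the PackageInfo source: the block names matches and belongsTo, and the example package, category, and protocol names, are assumptions.

```smalltalk
| matches belongsTo |
"A name matches a prefix if it equals it or extends it with '-'."
matches := [:name :prefix |
    name = prefix or: [name beginsWith: prefix , '-']].

"Answer whether a method belongs to packageName, given the category
 of its class and the protocol the method is filed under."
belongsTo := [:packageName :classCategory :protocol |
    (protocol beginsWith: '*')
        ifTrue: [matches
                    value: protocol asLowercase
                    value: '*' , packageName asLowercase]
        ifFalse: [matches value: classCategory value: packageName]].

belongsTo value: 'ICal' value: 'ICal-Core' value: 'accessing'.          "true: decided by the class category"
belongsTo value: 'ICal' value: 'Kernel-Exceptions' value: '*ical'.      "true: decided by the extension protocol"
belongsTo value: 'ICal' value: 'Kernel-Exceptions' value: 'signaling'.  "false"
```

Everything here is derived from names alone, which is the point being criticised: renaming a category or a protocol silently changes what the package contains.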

It should be possible to have "overlay" packages, which override
certain methods in other packages.  Then you could put your local
changes in an "overlay" package.

Perhaps this is what you are calling a change set, but to me, a change
set is a file format that is a program that modifies classes.  MC
represents a package with a data structure that it can manipulate
safely.  We need to fix MC to do the things we need it to do, not go
back to change sets.

-Ralph Johnson

Re: Monticello 2 - request for information

J J-6
Ralph, thanks for the response.

Let me start by defining just what I mean by "Change Sets", to be as clear
as I can.

I am talking from the point of view of how I use the darcs system, which is
by far the best general revision system I have ever used.

In darcs, when you make changes to your repository (files, directories,
whatever) and then try to commit, you have the option at that point to specify
what literal changes (in darcs terminology: "chunks") are part of this
commit (what I'm calling a "change set").  After a "darcs change set" is
made, those "chunks" are now effectively bound to that "change set" and are
treated as one unit.  If you wish to break a "chunk" out you would have to
undo the creation of the "change set" and redo it.

The rest of the system does things based on this "change set" unit.  When I
commit to another repository (e.g. the main repo), I can specify which of
the new "change sets" I wish to commit (this makes it easy when you have a
batch of "change sets" where some are emergency fixes that must go to prod,
but need to be in dev as well, and you also have dev-only patches) to bring
the repo in sync.

Please see below for individual responses:

On Fri, 18 May 2007 09:06:45 -0500, Ralph Johnson <[hidden email]> wrote:
>
>The problem is not that there are bugs with change sets.  Change sets
>are mature, well tested technology.  I have been using them for 20
>years and understand them very well.  The problem with change sets is
>that they are fundamentally a bad idea.  Change sets are programs
>which, when run, modify objects.  The objects that they modify are
>classes, i.e. programs.

At the end of the day, that's what MC does as well, no? :)

>  The purpose of MC is to provide a package
>system and to make program definition more declarative, to make it
>safer to merge versions, to see the difference between versions, and
>so on.  MC is relatively immature and is full of bugs, but it is the
>right way to go in the long run.

The darcs model I mention above is, afaik, the best system around for safe
merges.  The whole system is based on a thesis of a "theory of patches". :)

>This was a mistake.  You are supposed to complain loudly on the 3.10
>mailing list.  This is probably a bug, and the whole purpose of
>releasing early and often is to find bugs.  But if people don't
>complain about them, the process doesn't work.  We shouldn't provide
>workarounds for bugs, we should fix them!

Yea, ok, but the way my time is divided at the moment, I can
send/read/manage email or I can write code.  That day I chose to patch and
move on. :)

But anyway, that was just an illustration of the fact that sometimes we will
need local patches (at least temporarily) that have to have some way of not
being considered part of the changes that are to be published.

>One of the problems with MC is that it doesn't have an explicit
>representation of what goes in to a package, but instead relies on
>names.  The name of a class's category determines which package the
>class is in, and if a method is going to belong to a package other
>than the package of its class, it must be in a protocol whose name is
>'*' followed by the name of the package.  This was an expedient hack,
>but it causes lots of trouble in the long run.  A package should just
>be a list of classes and methods.  Perhaps the default is for all the
>classes in one particular category to be in the same package, but that
>shouldn't be the rule.

That is one way to look at it, and indeed workable, if the system could just
make its choices and then give us a way to go back and make authoritative
manual corrections later.

But another way to look at it is: a module is the sum of the changes it
makes to the system.  If the working unit is (at least what I am calling) a
change set, then when you created a new "project" you could specify what
change sets apply to this package (and the system could take a cut
automatically using rules much like MC's to build a best guess).

I feel this would aid in documentation, as "when and why was this changed?"
is easier to answer when we have the full history of methods.  One problem I
see in MC is the disconnect between my method history and MC's.

If a method gets changed 20 times in my image and then published, when I
download it in a new image I only get the diff in MC, not those 20
iterations, right?  It may be that you want this behavior sometimes, but in MC
it is the only option, no?

>Perhaps this is what you are calling a change set, but to me, a change
>set is a file format that is a program that modifies classes.  MC
>represents a package with a data structure that it can manipulate
>safely.

Well, I am not thinking about a file format, but any change system is, at
the end of the day, going to modify the system.  The difference between the
Smalltalk system and what general change management systems have to deal
with is this: in general systems (e.g. darcs) the smallest unit of change can
be as little as a byte (maybe even a bit), while in Smalltalk it is a method.
There isn't much point in recording that one letter changed in some string
since the whole method has to be recompiled anyway.

But as I understand it from what I have looked at in the classes and read on
the wiki, Squeak change sets *do* keep data about what was changed, etc.  I
think you can apply this as safely as anything else.

For example, a revision system that used change sets as a base unit would be
responsible for tracking dependencies, since the change sets themselves
wouldn't know what other change sets they depend on.  An individual change
set could also be thought of as a transaction.  If any part of it fails to
apply, none of the change set is applied.
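
A minimal workspace sketch of the all-or-nothing idea above, assuming a deliberately tiny model: a Dictionary stands in for the methods of the system, and each change records the source it expects to replace. None of these names (methods, changeSet, applies, applyAtomically) come from Squeak's ChangeSet class; they are hypothetical.

```smalltalk
| methods changeSet applies applyAtomically |
"methods maps a selector to its current source.
 Each change is {selector. source it expects to replace. new source}."
methods := Dictionary new.
methods
    at: #foo put: 'foo ^ 1';
    at: #bar put: 'bar ^ 2'.
changeSet := {
    {#foo. 'foo ^ 1'. 'foo ^ 10'}.
    {#bar. 'bar ^ 99'. 'bar ^ 20'}}.

"A change applies only if the method still has the source the change expects."
applies := [:target :change |
    (target at: change first ifAbsent: [nil]) = change second].

"Check every change first; modify the target only if all of them apply."
applyAtomically := [:target :changes | | failed |
    failed := changes
        detect: [:each | (applies value: target value: each) not]
        ifNone: [nil].
    failed isNil
        ifTrue: [changes do: [:each | target at: each first put: each third]].
    failed isNil].

applyAtomically value: methods value: changeSet.  "false: the #bar change does not apply"
methods at: #foo.                                 "still 'foo ^ 1'"
```

Validating before applying is only one way to get transactional behaviour; applying to a copy and swapping it in on success would be another.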

I fail to see how this is more or less dangerous than what MC is doing,
though it could be that we just have a terminology mismatch.  I am talking
about objects in the Squeak image that represent changes, not some file
format.

>We need to fix MC to do the things we need it to do, not go
>back to change sets.
>
>-Ralph Johnson

"Go back" does not (always) equal "regress".  After all, some of us are
"going back" to Smalltalk and I would consider that a huge step forward.  :)

Re: Monticello 2 - request for information

Bert Freudenberg
In reply to this post by Ralph Johnson
> On 5/18/07, J J <[hidden email]> wrote:
>> Any interest in making a separate list or something for MC 2  
>> discussion?

I'd prefer the discussion to take place on squeak-dev.

On May 18, 2007, at 16:06 , Ralph Johnson wrote:

> One of the problems with MC is that it doesn't have an explicit
> representation of what goes in to a package, but instead relies on
> names.  The name of a class's category determines which package the
> class is in, and if a method is going to belong to a package other
> than the package of its class, it must be in a protocol whose name is
> '*' followed by the name of the package.  This was an expedient hack,
> but it causes lots of trouble in the long run.  A package should just
> be a list of classes and methods.  Perhaps the default is for all the
> classes in one particular category to be in the same package, but that
> shouldn't be the rule.

Actually, MC does *not* define what goes into a package. It leaves  
this to PackageInfo. The default package info provides "virtual"  
packages based on naming conventions in the image. This solved the  
chicken-and-egg problem of how to introduce a packaging system  
without having package tool support in all the coding tools.

But it's very possible to have your own PackageInfo that just has a  
list of classes and methods which is not category-based.  I believe  
someone actually started working on this, but can't quite remember  
the state of this effort.

- Bert -



Re: Monticello 2 - request for information

Philippe Marschall
2007/5/20, Bert Freudenberg <[hidden email]>:

> But it's very possible to have your own PackageInfo that just has a
> list of classes and methods which is not category-based.  I believe
> someone actually started working on this, but can't quite remember
> the state of this effort.

Seaside has a hack for this. The category is Seaside but the package
is Seaside2.

Cheers
Philippe

Re: Monticello 2 - request for information

J J-6
I happened on something by accident today that I didn't know about before:

http://www.wiresong.ca/air/articles/category/monticello

From this it sounds like Monticello 2 is quite far along; in fact it sounds like
it may even be close to ready to replace MC 1 already.


(In reply to Philippe Marschall's message of Sun, 20 May 2007 12:58:48 +0200.)

Re: Monticello 2 - request for information

Avi Bryant-2
On 5/20/07, J J <[hidden email]> wrote:
> I happened on something by accident today that I didn't know about before:
>
> http://www.wiresong.ca/air/articles/category/monticello
>
> From this it sounds like Monticello 2 is quite far, in fact it sounds like
> it may even be close to ready to replace MC 1 already.

MC2 tried a lot of new things, which are at different degrees of
completion and usefulness.  Here's my take on them.

The basic underlying semantic model - storing sets of ancestors for
each method version rather than a tree for each package version - is
well developed, solidly implemented, and much more scalable than the
MC1 approach.  The key advantage, I would say, is that it allows
cherry picking individual methods when merging without causing
problems further down the line.  It also doesn't share MC's
performance problems when you get into the thousands of ancestor
versions.  So, new ancestry model: in good shape, and definitely a
good thing.
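
A minimal workspace sketch of why per-method ancestor sets make cherry picking cheap. It is not MC2 code: the Dictionary layout, the block names newVersion and decide, and the 'v1'...'v5' ids are assumptions for illustration.

```smalltalk
| newVersion decide v3 v4 v5 |
"A method version is modelled as a Dictionary holding an #id
 and the Set of ids of every version it descends from."
newVersion := [:id :ancestorIds |
    Dictionary new
        at: #id put: id;
        at: #ancestors put: (Set withAll: ancestorIds);
        yourself].

"Decide what to do with one method when merging theirs into mine."
decide := [:mine :theirs |
    ((mine at: #id) = (theirs at: #id)
        or: [(mine at: #ancestors) includes: (theirs at: #id)])
        ifTrue: [#keepMine]
        ifFalse: [((theirs at: #ancestors) includes: (mine at: #id))
                    ifTrue: [#takeTheirs]
                    ifFalse: [#conflict]]].

v3 := newVersion value: 'v3' value: #('v1' 'v2').
v4 := newVersion value: 'v4' value: #('v1' 'v2' 'v3').
v5 := newVersion value: 'v5' value: #('v1' 'v2').

decide value: v3 value: v4.  "#takeTheirs: v4 descends from v3"
decide value: v4 value: v3.  "#keepMine: v3 is already in v4's history"
decide value: v3 value: v5.  "#conflict: neither descends from the other"
```

Because the decision uses only the two versions of that one method, picking a single method from another branch does not require reasoning about the history of the whole package.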

The new ancestry model also allows a new repository model, where you
commit and request individual method versions separately rather than
entire package versions.  This should have benefits for low bandwidth
connections, because the total amount of data transferred is
potentially much less.  However, it leads to a much chattier protocol,
which makes it infeasible to use the HTTP and FTP repositories we use
for MC1, or even the FileDirectory repositories, because creating and
transferring that many tiny files is too slow.  Although we've
experimented with using things like PostgreSQL as the repository, I
think in practice we'd want a custom server written in Smalltalk and a
custom protocol, probably using the binary serialization format
defined in MC2.  New repository model: harder to deploy, probably a
good thing anyway, but needs a lot of work.

It also allows a new distribution model, which isn't tied to packages
in the PackageInfo sense.  That is, you don't have to bundle up
changes in package units - you could in theory use any arbitrary way
of "slicing" the image, and mix and match these at will (so you could
merge a version that covered all of the Squeak Kernel with one that
covered only Collections with one that covered only the
implementations of #do:, for example).  This is appealing in theory,
but it makes managing the metadata for these bundles very difficult:
the scope and ancestry of things like commit logs, branch names, and
version numbers become much fuzzier.  So far, I'd be inclined to
call the Slice aspect of MC2 a useful but failed experiment, and go
back to the assumption that all changes will be bundled in
package-sized units.

Finally, and orthogonally to the rest of this, MC2 did a lot of work
to allow it to be more portable across Smalltalk dialects.  So, there
are very few assumptions made about exactly what data makes up a
method and a class and so on (does a class have a namespace? pool
variables? a category? etc), how this data gets compiled into an
artifact in the image, and how it gets retrieved from the image later.
This leads to a lot of indirection and a lot of complicated class
hierarchies because of the need to separate the portable interfaces
from the Squeak-specific implementations.  I think the interesting
data point here is the recent port GemStone did of MC1 - on the one
hand, it's an obvious demonstration of how important cross dialect
portability is, but on the other hand, it's possibly also a
demonstration that MC2's elaborate portability scheme is overkill.  I
haven't looked into it deeply enough to know which side it argues for,
so I'm reserving judgement.

Personally, what I think I'd be most interested in seeing going
forward would be something that used the core ancestry model and
possibly portability model from MC2, but provided an MC1-workalike UI,
packaging model, and repository model.  Beyond just the technical
issues, that's what I think would stand the best chance of surviving
socially.

Avi

Re: Monticello 2 - request for information

Damien Cassou-3
(In reply to Avi Bryant's message of 2007/5/20.)

Hi Avi,

Thank you for your answer. Do you think your suggestion can be
implemented in 3 months by someone who does not have knowledge about
MC2?

If yes, can you give me an overview of the current MC2 hierarchy? This
would help me start in good conditions and maximize the chance to have
a usable result at the end.

Google is paying me to do that work, so you should not think it's a waste
of time to help me get started.

--
Damien Cassou

Re: Monticello 2 - request for information

Colin Putney

On May 22, 2007, at 4:22 AM, Damien Cassou wrote:

> If yes, can you give me an overview of the current MC2 hierarchy? This
> would help me start in good conditions and maximize the chance to have
> a usable result at the end.
>

I'll take a stab at that. As Avi mentioned, the design of MC2 follows  
from the idea that revision history would be attached to individual  
methods, rather than entire packages. The key concepts in MC2 are these:

Elements - these play the same role as MethodReferences. They are  
references to specific parts of a Smalltalk program. They're more  
fine-grained than Definitions in MC1 - you'll see separate elements  
for each instance variable, for example, so that they can be referred to  
directly, rather than by implication in a class reference.

Variants - a variant describes the state of a particular element.  
That state might be "not present in the image," in the case of a  
RemovalVariant, or the properties of the element, in the case of a  
DefinitionVariant. One of the things that we noticed with MC1 is that  
the exact components of a definition are most likely to change, so we  
encode them using a PropertyDictionary instead of dedicated objects.

ImageProxy - this is a facade for the class structures of the image.  
It presents a protocol for querying the state of the image or  
modifying it using variants.

Version - this is the central class of MC2. It represents the state  
of an element at a particular point in the evolution of the system.  
It combines a variant with a list of versions that have come before it.  
Each version has what we call a Hashstamp - a combination of a SHA1 hash  
and a timestamp which uniquely identifies the version. A version's  
ancestry is a set of hashstamps specifying which versions are part of  
this version's history.

Slice - A slice is a subset of the elements in the image. Its main  
responsibility is to define which elements are part of the slice and  
which are not. Slices can be based on any set of criteria -  
PackageInfo, ChangeSets, an explicit collection, whatever. Slices can  
overlap. Elements can belong to more than one slice at the same time.

Snapshot - this is a set of Versions which "go together." A snapshot  
is equivalent to an MCVersion in MC1. It's the unit at which code is  
moved between images.

WorkingCopy - I'm not terribly happy with this name, but I can't  
think of a better one. It keeps track of the state of the image with  
respect to versioning. It knows, for example, which versions the image is
based on, and can perform tasks like creating new versions based on  
the state of the image, or loading existing versions into the image.

Repositories - Repositories are similar to those in MC1, but have a  
different protocol and different performance characteristics.  
Conceptually, repositories are dictionaries, mapping hashstamps to  
versions or snapshots. This makes for a very fine-grained protocol -  
versions are loaded individually. That makes directory or HTTP  
repositories impractical, since loading a slice for a medium-size  
package would involve opening and reading hundreds of files. (A
snapshot of OB, for example, runs to over 600 versions.)
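
The cost argument can be seen in a small Python model of a repository as a
dictionary. This is only a sketch: the class DictRepository, the function
load_snapshot, and the use of plain strings as keys (in place of real
hashstamps) are assumptions. The read counter stands in for "one file
opened" or "one HTTP request" in a naive backend.

```python
from typing import Dict, List


class DictRepository:
    """Maps keys (stand-ins for hashstamps) to versions or snapshots."""

    def __init__(self) -> None:
        self._store: Dict[str, object] = {}
        self.reads = 0  # each read models one file open or one HTTP GET

    def put(self, key: str, obj: object) -> None:
        self._store[key] = obj

    def get(self, key: str) -> object:
        self.reads += 1
        return self._store[key]


def load_snapshot(repo: DictRepository, snapshot_key: str) -> List[object]:
    """Fetch the snapshot, then fetch each of its versions individually."""
    version_keys = repo.get(snapshot_key)
    return [repo.get(key) for key in version_keys]


repo = DictRepository()
keys = [f"version-{i}" for i in range(600)]
for key in keys:
    repo.put(key, f"contents of {key}")
repo.put("snapshot-1", keys)

versions = load_snapshot(repo, "snapshot-1")
total_reads = repo.reads  # one read for the snapshot plus one per version
```

In memory this is cheap, but if every read were a separate file or request,
a 600-version snapshot would cost 601 round trips.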

Hope this helps.

Colin



Re: Monticello 2 - request for information

Bergel, Alexandre
In reply to this post by Bert Freudenberg
> But it's very possible to have your own PackageInfo that just has a  
> list of classes and methods which is not category-based.  I believe  
> someone actually started working on this, but can't quite remember  
> the state of this effort.

I am the one who worked on this (back in 2005).
However it does not load in a Squeak-dev 123 :-(

I attached Package-alexandrebergel.56.mcz to this email, if you want  
to browse the code. You can also browse it online at
http://www.squeaksource.com/PackagesForSqueak.html

Here is the definition of the class Package:
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Object subclass: #Package
        instanceVariableNames: 'name classes methods description cache'
        classVariableNames: ''
        poolDictionaries: ''
        category: 'Package-Base'
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

I extended the MC metamodel with
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
MCDefinition subclass: #MCPackageCommentDefinition
        instanceVariableNames: 'packageName comment'
        classVariableNames: ''
        poolDictionaries: ''
        category: 'Package-Base'
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
in order to have package comments...

Alexandre

--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.







Attachment: Package-alexandrebergel.56.mcz (38K)

Re: Monticello 2 - request for information

Bergel, Alexandre
In reply to this post by J J-6
> Any interest in making a separate list or something for MC 2  
> discussion?  I have several things I would like to see happen with  
> it and would be prepared (interested in fact) to contribute time on  
> the project.

Please, no more mailing lists :-)
squeak-dev seems appropriate for this.

Alexandre

--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.





Re: Monticello 2 - request for information

Damien Pollet
In reply to this post by Colin Putney
On 23/05/07, Colin Putney <[hidden email]> wrote:
> versions or snapshots. This makes for a very fine-grained protocol -
> versions are loaded individually. That makes directory or HTTP
> repositories impractical, since loading a slice for a medium-size
> package would involve opening and reading hundreds of files. (A
> snapshot of OB, for example runs at over 600 versions).

It seems this could easily be optimized by using aggregate requests:
ask for a cluster of versions, download them all at once, then load
them as usual?

--
Damien Pollet
type less, do more [ | ] http://typo.cdlm.fasmz.org


Re: Monticello 2 - request for information

Colin Putney

On May 24, 2007, at 12:51 PM, Damien Pollet wrote:


> On 23/05/07, Colin Putney <[hidden email]> wrote:
>
>> versions or snapshots. This makes for a very fine-grained protocol -
>> versions are loaded individually. That makes directory or HTTP
>> repositories impractical, since loading a slice for a medium-size
>> package would involve opening and reading hundreds of files. (A
>> snapshot of OB, for example runs at over 600 versions).
>>
>
> It seems this could easily be optimized by using aggregate requests:
> ask for a cluster of versions, download them all at once, then load
> them as usual ?
>

Sure. The client could post a list of versions that it wants and the  
server could gather them, bundle them up and send them in the  
response. But there are a couple of problems here. One is that if  
each version is stored in a separate file, we have the same problem  
of opening and closing 600 files, but on the server. Apache might do  
it faster than Squeak, but it's still going to be slow. So the  
versions have to be stored in some kind of indexed store. Also, for
transport back to the client, they have to be combined into some kind
of file format that can hold multiple versions.

So yes, we could use HTTP, but we'd still have to write a protocol  
for requesting and returning aggregated versions. We'd also have to  
come up with some kind of database or other fast random-access  
storage system. At that point, why not just write a custom server?

It's very different from MC1, where the HTTP repository can be any  
off-the-shelf HTTP server that supports DAV.
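
For illustration, the request/response shape described above might look
something like the Python sketch below. Nothing here is an existing MC2
protocol: the JSON bundle format, the function names serve_bundle and
fetch_versions, and the in-memory dictionary standing in for the indexed
store are all assumptions. The "send" callable stands in for whatever
transport carries the request.

```python
import json
from typing import Callable, Dict, Iterable, List, Tuple


def serve_bundle(store: Dict[str, str], request_body: bytes) -> bytes:
    """Server side: gather the requested versions into a single response."""
    wanted = json.loads(request_body)
    found = {key: store[key] for key in wanted if key in store}
    missing = [key for key in wanted if key not in store]
    return json.dumps({"versions": found, "missing": missing}).encode("utf-8")


def fetch_versions(send: Callable[[bytes], bytes],
                   keys: Iterable[str]) -> Tuple[Dict[str, str], List[str]]:
    """Client side: post the list of wanted keys, unpack the bundle."""
    response = send(json.dumps(list(keys)).encode("utf-8"))
    payload = json.loads(response)
    return payload["versions"], payload["missing"]


# In-memory stand-ins for the indexed store and the network transport.
store = {f"version-{i}": f"contents {i}" for i in range(600)}
requests_made: List[bytes] = []


def send(body: bytes) -> bytes:
    requests_made.append(body)
    return serve_bundle(store, body)


wanted = [f"version-{i}" for i in range(600)] + ["version-unknown"]
versions, missing = fetch_versions(send, wanted)
```

The whole set travels in one round trip, but both pieces that make this
work (the bundle format and the keyed store behind serve_bundle) are things
an off-the-shelf file server does not provide.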

Colin



Re: Monticello 2 - request for information

Bert Freudenberg
On May 25, 2007, at 7:13 , Colin Putney wrote:

> It's very different from MC1, where the HTTP repository can be any  
> off-the-shelf HTTP server that supports DAV.

Not even full DAV but simply HTTP PUT.

OTOH, other SCM systems require custom servers, too. Many provide  
additional read-only check-out over HTTP though, which is very handy  
in locked-down environments.

- Bert -