The perfect revision control system

The perfect revision control system

Jason Johnson-5
Hello all,

I am considering starting on a project idea I had for a darcs-like
version control system in Squeak (based on the low level Delta streams
classes), but I thought I would poll for a little input first.  The main
thing I want to know is, what is the consensus on the scope of
changes?  Should it stop at method, or should it go all the way into
the source code?  Since everything, including the parsing, is done in
Smalltalk we can go down to the AST level for changes.

Stated another way, if you have a method:

SomeClass>>someMethod: value
      |i|
      i := value * 10

Later modified to:

SomeClass>>someMethod: value
      |i|
      i := value * 10
      Transcript show: i

Then if the change scope goes only to the method level then there is
no dependency between these two versions, the second one completely
replaces the first [1].  But if we model down to the AST then the
second version clearly depends on the first.

Using the AST, and since the system would be specialized for
Smalltalk, it can be much more advanced in tracking actual changes
than any other revision system (since unlike darcs, which works
with anything, this system can know if new code uses variables added
earlier and so on), but my concern is that this might produce some
behaviors or conflicts that users don't expect.  Well, that and it's
harder for me to implement at the AST level (since the Delta stream
classes don't store this level of information afaik).

[1] Well, the only dependency is the last one must be last, but if one
has 10 such changes the first 9 could be applied in any order or
completely ignored.  Of course this seems like an irrelevant issue for
a one man repository, but it becomes more important in the case of
merges from various distributed sources.
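
A minimal sketch of the method-level case in footnote [1], written in Python purely as a language-neutral illustration rather than Squeak code. `MethodChange` and `apply_all` are hypothetical names, not part of Delta streams or darcs. Because each change replaces the whole method source, the final state depends only on which change is applied last:

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class MethodChange:
    """Hypothetical method-level change: replaces the whole source of one method."""
    selector: str
    source: str


def apply_all(changes):
    """Apply changes in order; each one overwrites the previous source of its method."""
    image = {}
    for change in changes:
        image[change.selector] = change.source
    return image


earlier = [MethodChange("someMethod:", f"version {n}") for n in range(1, 4)]
last = MethodChange("someMethod:", "final version")

# Every ordering of the earlier changes, or skipping them entirely,
# gives the same result as long as `last` is applied last.
results = [apply_all(list(order) + [last]) for order in permutations(earlier)]
results.append(apply_all([last]))
all_same = all(r == {"someMethod:": "final version"} for r in results)
```

With a finer-grained scope the earlier changes would no longer be interchangeable, which is the trade-off the question is about.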

Re: The perfect revision control system

Michael van der Gulik-2
On 2/15/08, Jason Johnson <[hidden email]> wrote:
> Hello all,
>
> I am considering starting on a project idea I had for a darcs-like
> version control system in Squeak (based on the low level Delta streams
> classes), but I thought I would pull a little input first.  The main
> thing I want to know is, what is the consensus on the scope of
> changes?  Should it stop at method, or should it go all the way into
> the source code?  Since everything, including the parsing, is done in
> Smalltalk we can go down to the AST level for changes.


I don't see any benefit in processing changes at the AST level. The
same results can be achieved by comparing the source of two methods,
perhaps pre-compiling them and comparing their ASTs if you want. Plus,
by having the method source, the VC system is more robust.

I was hoping you were going to ask a more general question, such as
"is this list of features sufficient for a VC system?".

Gulik.

--
http://people.squeakfoundation.org/person/mikevdg
http://gulik.pbwiki.com/

Re: The perfect revision control system

Jason Johnson-5
On Feb 16, 2008 4:57 AM, Michael van der Gulik <[hidden email]> wrote:
>
> I don't see any benefit in processing changes at the AST level. The
> same results can be achieved by comparing the source of two methods,

Not without knowledge of the source language, and if you have that
then you are almost certainly building your own AST of some sort.  I
suppose the question is moot for now anyway, as I can just use the
method level we already have and switch to something more fine
grained later.

> perhaps pre-compiling them and comparing their ASTs if you want. Plus,
> by having the method source, the VC system is more robust.

I'm by no means talking about throwing away the source code.  I'm
purely talking about the mechanism for determining dependencies, doing
commuting [1], undos, etc., etc., in the face of multiple concurrent
branches.

> I was hoping you were going to ask a more general question, such as
> "is this list of features sufficient for a VC system?".

This you're always free to list.  Of course I'm interested to know if
there are any features people would really like that aren't currently
covered.

The features of the system I'm planning will be:

*) Based on change sets (well, actually the more robust Delta stream
implementation) instead of snapshotting.  A consequence of this is
that one is no longer required to use *categories to associate your
changes into a package, but rather something a bit more sophisticated
than the current change sorter can be used to manage them.

*) "Cherry picking" of changes.  Smalltalk, with its simple syntax and
keyword arguments, is the best language I know of for writing self
documenting code.  But what is still missing is the *why* and *how did
we get here*.  Intelligent use of change sets can go further to answer
those questions.  When one makes a series of changes to fix multiple
bugs, they can after-the-fact move the changes into a separate set for
each bug so that later maintainers of the software have more
information to determine if the existing code is still relevant, etc.

*) Labeling.  In big companies using sophisticated revision systems
(i.e. not obsolete stuff like SVN), people are branching, merging and
conflicting all the time.  Then when it comes time to release their
software they apply a "label".  This tags the latest version of all
data managed by the system so that in an audit, they can conclusively
prove what the state of the software was for any version.  This also
gives benefits to the system by allowing it to ignore everything
previous to the latest label (since the current state of the system is
simply: the latest label and all changes after that).

*) Fully distributed.  Anyone can make a copy of a given repository at
any time.  They can make changes that stay only in their own copy, or
push them up if they wish.  To make a branch, just make another copy
of the repository.  The repositories are updated strictly through the
mechanism of applying patches so you're totally free to "cherry pick"
changes out of someone else's repository (and the patches don't have
to be applied in order) or sync up if you wish.

*) Multiple ways of managing changes.  Since the system will live in
the normal Squeak system, it has access to all the subsystems in the
image and can leverage them to apply or forward patches.  One example
of this is darcs' cool feature of letting a remote user make a change
and use one command to have darcs package up the change in the correct
diff format and forward it to the package maintainer for peer review.

*) Compatible with other systems.  Of course no new revision system is
going to have a chance if it can't deal with packages from all the
other existing systems.  It may even be interesting to generate
packages from these other systems.

Well, that's the list off the top of my head.  I know MC2, and even
MC1 can do much of this (or probably all in the case of MC2), but MC1
is based on snapshotting which I disagree with and I think MC2 tries
to do too much.  I believe in the "theory of patches" underlying darcs,
that any system is simply the sum of applying all its patches, and
therefore I don't think anything more than this is needed.
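
To make the "sum of patches" idea and the labeling feature concrete, here is a small sketch in Python (chosen only as a neutral illustration language). The patch representation, `apply_patch`, and `replay` are assumptions made for this example and do not reflect how darcs or Delta streams store anything. It shows that replaying the full history and replaying only the patches after a label give the same state:

```python
from functools import reduce


def apply_patch(state, patch):
    """A patch here is a dict of selector -> new source (None removes the method)."""
    new_state = dict(state)
    for selector, source in patch.items():
        if source is None:
            new_state.pop(selector, None)
        else:
            new_state[selector] = source
    return new_state


def replay(patches, start=None):
    """The state of a repository is the result of applying its patches in order."""
    return reduce(apply_patch, patches, start or {})


history = [
    {"foo": "foo ^1"},
    {"bar": "bar ^2"},
    {"foo": "foo ^42"},
    {"bar": None},
]

# A label records the state after the first three patches.
label = replay(history[:3])

# Current state: either replay everything, or start from the latest label.
full = replay(history)
from_label = replay(history[3:], start=label)
```

The label acts as a checkpoint: everything before it can be ignored when computing the current state.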

[1] In darcs if two changes (or patches) "commute" then they can be
applied in any order.  The above example would commute under darcs
because the lines modified in the second change don't touch the
modified lines from the first.  If we use a more sophisticated AST
based technique the two patches would not commute because the second
change depends on data from the first.
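
The difference between the two notions of commuting in footnote [1] can be sketched as follows. This is a deliberately simplified Python model, not darcs' actual algorithm (which also has to adjust line offsets when patches are reordered); the `Patch` fields and both function names are hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    lines: frozenset                  # line numbers the patch modifies
    defines: frozenset = frozenset()  # variable names the patch assigns
    uses: frozenset = frozenset()     # variable names the patch reads


def commutes_by_lines(a, b):
    """Line-based view: patches commute if they touch different lines."""
    return a.lines.isdisjoint(b.lines)


def commutes_by_names(a, b):
    """AST-like view: additionally, neither patch may read what the other defines."""
    return (
        commutes_by_lines(a, b)
        and a.defines.isdisjoint(b.uses)
        and b.defines.isdisjoint(a.uses)
    )


# "i := value * 10" on line 3, then "Transcript show: i" added on line 4.
first = Patch(lines=frozenset({3}), defines=frozenset({"i"}), uses=frozenset({"value"}))
second = Patch(lines=frozenset({4}), uses=frozenset({"i"}))

line_level = commutes_by_lines(first, second)
name_level = commutes_by_names(first, second)
```

Under these assumptions `line_level` is true while `name_level` is false, matching the two outcomes described in the footnote.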

Re: The perfect revision control system

Colin Putney

A highly abridged version of what Jason wrote:

> The features of the system I'm planning will be:
>
> *) Based on change sets
> *) "Cherry picking" of changes.
> *) Labeling.
> *) Fully distributed.
> *) Multiple ways of managing changes.
> *) Compatible with other systems.
>
> Well, that's the list off the top of my head.  I know MC2, and even
> MC1 can do much of this (or probably all in the case of MC2), but MC1
> is based on snapshotting which I disagree with and I think MC2 tries
> to do too much.  I believe the "theory of patches" underlying darcs
> that any system is simply the sum of applying all it's patches, and
> therefor I don't think anything more then this is needed.

This should be interesting. You're right, MC2 does all this stuff  
(except the first item, obviously). In fact, the design of MC2 came  
out of a discussion that Avi and I had about the darcs theory of  
patches. We concluded that we didn't actually need patch theory in  
Smalltalk, because code is objects rather than text. We figured we  
could achieve all the nice features of darcs just by attaching  
revision history to methods rather than packages.

If you look at discussions of versioning systems around the net, you'll
run across bitter arguments about change-sets-vs-snapshots. As far as  
I can tell, the two are information-equivalent. I think it boils down  
to how you prefer to think about the problem.

That said, I'm glad you've decided to take this on. There may be some
real advantages to combining the benefits of code-as-objects, which we  
take for granted in Smalltalk, with a theory of patches. One that leaps
to mind rather quickly is that you can have a much richer set of  
operations than darcs. Darcs has a token-replacement operation, which  
allows patches to be more easily commuted. But token replacement is  
basically the simplest, crudest refactoring imaginable. If your system  
included operations based on refactorings, you'd gain a lot of power  
over both darcs and MC.
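
As a rough illustration of why an operation like token replacement (or a rename refactoring) helps patches commute, here is a toy Python sketch. `Rename`, `AddLine`, and `commute` are hypothetical names, and the commutation rule is a simplification that assumes the new token is not already in use; it is not darcs' implementation:

```python
import re
from dataclasses import dataclass


def replace_token(text, old, new):
    """Replace whole-word occurrences of `old` with `new`."""
    return re.sub(rf"\b{re.escape(old)}\b", new, text)


@dataclass(frozen=True)
class Rename:
    old: str
    new: str

    def apply(self, lines):
        return [replace_token(line, self.old, self.new) for line in lines]


@dataclass(frozen=True)
class AddLine:
    text: str

    def apply(self, lines):
        return lines + [self.text]


def commute(add, rename):
    """Turn 'add then rename' into an equivalent 'rename then add'."""
    return rename, AddLine(replace_token(add.text, rename.old, rename.new))


def run(patches, lines):
    for patch in patches:
        lines = patch.apply(lines)
    return lines


base = ["total := count * 10"]
add = AddLine("Transcript show: count")
rename = Rename("count", "size")

original_order = run([add, rename], base)
commuted_order = run(list(commute(add, rename)), base)
```

Because the rename is recorded as an operation rather than as a textual diff, the added line can be rewritten mechanically when the order is swapped, and both orders end in the same source.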

One word of caution: the word "perfect" in the subject of this mail  
worries me a bit. Versioning is a black art, and perfection is a cruel  
god. Please, don't strive for perfection. Just make something better  
than the alternatives for a particular purpose.

Anyway, I'm looking forward to seeing what you come up with.

Colin

Re: The perfect revision control system

Tapple Gao
In reply to this post by Jason Johnson-5
I wrote a long reply. It contains some relevant information
about what DeltaStreams currently has to offer for your
proposal, and what I'm working on.

On Sat, Feb 16, 2008 at 02:49:57PM +0100, Jason Johnson wrote:
> The features of the system I'm planning will be:
>
> *) Based on change sets (well, actually the more robust Delta stream
> implementation) instead of snapshotting.  A consequence of this is
> that one is no longer required to use *categories to associate your
> changes into a package, but rather something a bit more sophisticated
> then the current change sorter can be used to manage them.

I have a pretty sophisticated UI for DeltaStreams, but it is
missing a lot of commands. You can read about it and see an old
screenshot at http://wiki.squeak.org/squeak/6014

> *) "Cherry picking" of changes.  Smalltalk, with its simple syntax and
> keyword arguments, is the best language I know of for writing self
> documenting code.  But what is still missing is the *why* and *how did
> we get here*.  Intelligent use of change sets can go further to answer
> those questions.  When one makes a series of changes to fix multiple
> bugs, they can after-the-fact move the changes into a separate set for
> each bug so that later maintainers of the software have more
> information to determine if the existing code is still relevant, etc.

DeltaStreams can do this, but I haven't made the UI yet to be
able to copy changes between deltas.

> *) Labeling.  In big companies using sophisticated revisions systems
> (i.e. not obsolete stuff like SVN), people are branching, merging and
> conflicting all the time.  Then when it comes time to release their
> software they apply a "label".  This tags the latest version of all
> data managed by the system so that in an audit, they can conclusively
> prove what the state of the software was for any version.  This also
> gives benefits to the system by allowing it to ignore everything
> previous to the latest label (since the current state of the system is
> simply: the latest label and all changes after that).

DeltaStreams does not do this.

> *) Fully distributed.  Anyone can make a copy of a given repository at
> any time.  They can make changes that stay only in their own copy, or
> push them up if they wish.  To make a branch, just make another copy
> of the repository.  The repositories are updated strictly through the
> mechanism of applying patches so your totally free to "cherry pick"
> changes out of someone else's repository (and the patches don't have
> to be applied in order) or sync up if you wish.

DeltaStreams currently has no concept of remote streams
or repositories. Since DeltaStreams is alpha software, it has
known missing features. This is one of them.

Currently, I'm bootstrapping this by hooking onto the existing
update stream supported by Squeak (See class Utilities, category
updates, for how it works). The file format I'm currently
writing interleaves a binary tree serializer with a normal
change set. http://www.squeaksource.com/InterleavedChangeSet .
In this way, you could build a delta with all the metadata
available and file it out. The receiver would make use of all
that metadata if DeltaStreams is loaded; otherwise it sees a
regular change set.
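
The idea of interleaving metadata so that an unaware reader still sees a plain change set can be sketched with a toy text format. This is only an illustration in Python: the real InterleavedChangeSet format uses a binary serializer and is not shown here, and the `#meta ` prefix, the one-line-per-change layout, and all function names are assumptions made for the example:

```python
import json

META_PREFIX = "#meta "  # hypothetical marker; a legacy reader treats "#" lines as comments


def write_interleaved(changes):
    """Emit one metadata line followed by one code line per change."""
    out = []
    for change in changes:
        out.append(META_PREFIX + json.dumps(change["meta"], sort_keys=True))
        out.append(change["code"])
    return "\n".join(out)


def read_plain(text):
    """Legacy reader: skips comment lines, so it only sees the code."""
    return [line for line in text.split("\n") if not line.startswith("#")]


def read_with_metadata(text):
    """Metadata-aware reader: pairs each code line with the metadata before it."""
    result, meta = [], None
    for line in text.split("\n"):
        if line.startswith(META_PREFIX):
            meta = json.loads(line[len(META_PREFIX):])
        else:
            result.append({"meta": meta, "code": line})
            meta = None
    return result


changes = [
    {"meta": {"author": "someone", "kind": "add"}, "code": "Object subclass: #Foo"},
    {"meta": {"author": "someone", "kind": "modify"}, "code": "Foo compile: 'bar ^1'"},
]
text = write_interleaved(changes)
plain_view = read_plain(text)
rich_view = read_with_metadata(text)
```

The same file yields `plain_view` for a receiver without the extra tooling and `rich_view`, with all metadata restored, for one that has it.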

It is still alpha stage; the writer is about 70% complete, and
the reader hasn't been started yet. DeltaStreams doesn't yet
know how to use it either.

> *) Multiple ways of managing changes.  Since the system will live in
> the normal Squeak system, it has access to all the subsystems in the
> image and can leverage them to apply or forward patches.  One example
> of this is darcs cool feature of letting a remote user make a change
> and use one command to have darcs package up the change in the correct
> diff format and forward it to the package maintainer for peer review.

No idea what you're talking about here

> *) Compatible with other systems.  Of course no new revision system is
> going to have a chance if it can't deal with packages from all the
> other existing systems.  It may even be interesting to generate
> packages from these other systems.

This is the feature I'm working on next in DeltaStreams.
Specifically, I want to be able to convert Installer scripts
into deltas. Installer is a great way to specify what goes into
a certain release and what does not. I think so, at least. Maybe
you have a better idea with your darcs knowledge.

> Well, that's the list off the top of my head.  I know MC2, and even
> MC1 can do much of this (or probably all in the case of MC2), but MC1
> is based on snapshotting which I disagree with and I think MC2 tries
> to do too much.  

I don't see how "doing too much" can be a very serious problem.
Colin has great ideas about MC2, and I'm curious to see if MC2
and DeltaStreams will work together independently, if one will
kill the other, or if there will be a need to merge them.
Impossible to say right now.

> I believe the "theory of patches" underlying darcs
> that any system is simply the sum of applying all it's patches, and
> therefor I don't think anything more then this is needed.

Goran wanted deltas to work just like darcs patches when he came
up with the idea. However, I don't know anything about darcs,
and I've written most of the DeltaStreams code so far, so I have
no idea if what I am building lines up with that original goal
or not.

Also, Goran has since discovered Bazaar revision control, and (I
believe) found it superior to darcs. I know nothing about
either; just mentioning it for completeness.

My current goal is to get delta streams working well enough to
be of assistance while I try to merge the separate versions of
Morphic in the various forks, and assist Edgar with Ladrillos:
http://www.squeaksource.com/Ladrillos

How I plan to use it is as a shared log that could reliably
replay a set of changes made between a set of packages.
Monticello has issues when moving classes between packages,
which I envision needing to do a lot in dividing up Morphic. So,
I want to use deltas to keep track of what I do while moving
things around, since that is the most reliable order of making
the changes. Eventually, Morphic should be divided sanely enough
to use plain Monticello.

The features I will need for this to work are:
- a reliable file-out format, in progress
- Integration with Monticello, so that package saves and loads
  are correctly stored in the log

So, this is all I'll be working on before I stop work for a
while and concentrate on the release process.

I haven't made a script that will install the latest version of
DeltaStreams, so it requires a bit of work to install. If you
want to discuss revision control systems, install DeltaStreams,
get help understanding the code, or have an idea of which parts
should be written first, sign on to #squeak and we can discuss it.

If there is any interest in seeing DeltaStreams in the basic,
not-really-usable state that it is in, I can make an Installer
script for your squeak version. Also, I would definitely
appreciate any help coding this thing, or understanding what
DeltaStreams should be. I'll be happy to help you with it.

--
Matthew Fulmer -- http://mtfulmer.wordpress.com/
Help improve Squeak Documentation: http://wiki.squeak.org/squeak/808

Re: The perfect revision control system

Igor Stasenko
Guys, wouldn't it be better for you and the rest of us to join efforts and make
DS + MC2?
It's really a pain to see that you are both duplicating work in multiple areas.

Hoping you will not consider my 2 cents an offensive act :)

--
Best regards,
Igor Stasenko AKA sig.

Re: The perfect revision control system

Colin Putney

On 16-Feb-08, at 6:46 PM, Igor Stasenko wrote:

> Guys, wouldn't it better for you and rest of us all to join efforts,  
> and make
> DS + MC2 ?
> It's really pain to see, that you both duplicating work in multiple  
> areas.

I don't really think that would be possible. The two systems are too
different to be combined into a single system. They might be
complementary, though we won't know how that works out in practice for
a while.

> Hoping you will not consider my 2 cents as offensive act :)

Not at all. I can understand the community being frustrated with me. I
got MC2 to alpha-quality, made all sorts of promises, then let it stall
for 2 years, while I put my effort into my day job. Göran and Matthew
got tired of waiting for something better than MC1 and set out to do
it themselves. I've also resumed work on MC2 and now we've got two
apparently competing projects.

But actually, I think this is a good thing. MC2 is much better than  
MC1, but it won't be the last word in versioning systems. DeltaStreams  
and Jason's new project pursue a completely different philosophy on  
versioning, and I'll be interested to see what they come up with.

In the meantime, I'll keep doing what I'm doing.

Colin

Morphic partitioning (was Re: The perfect revision control system)

Edgar J. De Cleene
In reply to this post by Tapple Gao



On 2/16/08 6:36 PM, "Matthew Fulmer" <[hidden email]> wrote:

> My current goal is to get delta streams working well enough to
> be of assistance while I try to merge the separate versions of
> Morphic in the various forks, and assist Edgar with Ladrillos:
> http://www.squeaksource.com/Ladrillos
>
> How I plan to use it is as a shared log that could reliably
> replay a set of changes made to a between a set of packages.
> Monticello has issues when moving classes between packages,
> which I envision needing to do a lot in dividing up Morphic. So,
> I want to use deltas to keep track of what I do while moving
> things around, since that is the most reliable order of making
> the changes. Eventually, Morphic should be divided sanely enough
> to use plain Monticello.

I wish everyone knew the current state of the divide-and-conquer effort on
the Morphic monster, and the ground still to cover for a future merge of
3dot11 and MinimalMorphic.

In Ladrillos I put several regular Morphic and non-Morphic .mcz files, plus
some weird .mcz files like the Rompecabezas-related ones.

Why weird? Because Monticello has no way to deal with anything other than
code, and many "apps", "packages", etc. need .mp3, .midi, .gif/.png/.jpg and
other non-code files to work.
We can do this now with .sar, which is IMHO the best single way to assemble
everything you need into one logical unit.
A long time ago I cooked up a modified Monticello to deal with this (a
Monticello kind of .sar), and also modified PackageInfo.

Later I learned the Impara guys had done the same (preamble/postscript),
earlier and better, but life is not perfect :=)

Also, Ralph says that if Monticello has owners and maintainers, we must not
try to mess with it (team work).

For 3dot11 I completed the reshaping of Nebraska and BookMorphandFriends so
they can be unloaded and loaded.
I also have some partitioning for loading and unloading Etoys, but I still
need to adjust it to MinimalMorphic, taking Pavel's magic and testing several
practical troubles MinimalMorphic has.

The goal is to advance as far as possible without upsetting old and new
Squeakers (this is the trickiest part :=)

For a different way of thinking about text, classes and packages, I have this
link:

ftp://edgardec:[hidden email]/
Navigate to the /CompressedClasses directory to see the 2005 organization,
and to the /3dot11 folder for the recent one.

In case the ftp server asks, the user is edgardec and the password is squeak.
Squeak can handle this ftp without trouble and it is very reliable.
The only drawback is a 100 k limit on files (we need to be efficient).

The classes were gzip-compressed from Squeak, using CodeLoader to produce the
".sqz" files.

I have completed the Kafkaesque paperwork for the new flat and am beginning
the move, so if I disappear for some days don't miss me much :=)

Edgar





Re: The perfect revision control system

Göran Krampe
In reply to this post by Colin Putney
Hi Colin and all!

Colin Putney <[hidden email]> wrote:

> On 16-Feb-08, at 6:46 PM, Igor Stasenko wrote:
> > Guys, wouldn't it better for you and rest of us all to join efforts,  
> > and make
> > DS + MC2 ?
> > It's really pain to see, that you both duplicating work in multiple  
> > areas.
>
> I don't really think that would be possible. The two systems are two  
> different to be combined into a single system. They might be  
> complimentary, though we won't know how that works out in practice for  
> a while.

That would also be my guess.

> > Hoping you will not consider my 2 cents as offensive act :)
>
> Not at all.

No, we take no offense!

> I can understand the community being frustrated with me. I  

Nah, there is no such frustration - at least not from me. :) Sometimes
we have time, and at other times we do not - it is the way of life.

> got MC2 to alpha-quality, made all sort of promises, then let it stall  
> for 2 years, while I put my effort into my day job. Göran and Matthew  
> got tired of waiting for something better than MC1 and set out to do  
> it themselves.

Mmmm, that was actually not how it went - the reason I started DS was
not because I wanted something "better" than MC1 - in my mind DS is a
different beast for altogether different use cases and the word "better"
doesn't cut it.

> I've also resumed work on MC2 and now we've got two, apparently competing projects.

Personally I don't think they compete. :)

> But actually, I think this is a good thing. MC2 is much better than  
> MC1, but it won't be the last word in versioning systems. DeltaStreams  
> and Jason's new project pursue a completely different philosophy on  
> versioning, and I'll be interested to see what they come up with.
>
> In the meantime, I'll keep doing what I'm doing.

I sure hope so! I want BOTH MC2 and DS in the end - and using lots of
common code like for example the brilliant SE base (SystemEditor) which
we hopefully have fixed quite a few things in while working on DS.

As a final word, this is how I see it:

- MC1 is great for team work and for maintaining packages over time even
if you are a single dev.

- MC2 will presumably be MC1 ++, I mean, that is what I expect. Smarter
and better in most respects.

DS is mainly aimed at being a flexible tool for moving changes between
*radically different* branches with no common history - or in other
words - between different Squeak "distros" like say Croquet, Sophie,
Squeak vanilla, Squeakland etc.

It tries to enable this by recording MORE state about the actual changes
instead of relying on advanced history analysis (like the MC-family
does).

So a Delta is a ChangeSet on steroids or a patch or whatever you like to
call it. Besides fixing lots of deficiencies with the original
ChangeSet implementation, it is also much more "detailed" in its
recordings.

So a few use cases DS tries to address:

- Bug fixes. Instead of ChangeSets we want to use Deltas. They should be
just as simple to use - but much more reliable and useful.

- Update streams. Like the original stream that feeds an image, but with
the ability to have multiple streams.

- The changelog. Eventually we should be able to use Deltas as a base
for a new .changes system.

A few use cases DS does not try to address *primarily*:

- Team development. We expect Deltas to be able to form a nice *base*
for this, but it is likely MC will be able to make much more educated
merges than Deltas will be able to do, well, it depends on how much
history analysis you put on top of it.

- Package maintenance. Similar argument.


Three really important features with Deltas are:

- The ability to apply/unapply! Excellent undo which enables stuff like
patch queues etc.

- Fully reified. This means a Delta is a self contained object when
loaded into the image. Then you *apply* it or *unapply* it. A changeset
in comparison is NOT self contained, it only refers to the methods it
changes without containing the actual code itself (highly confusing).

- More info! A modified method does not only contain the NEW source but
also the OLD source it replaces. This simple scheme of keeping "before
state" makes applying them in "foreign images" very easy to do - while
still being able to warn the user about possible conflicts.
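
The three features above can be sketched together in a few lines. This is a Python illustration under stated assumptions, not the DeltaStreams implementation: the class names `MethodModified` and `Delta`, the dictionary standing in for an image, and the warning list are all hypothetical. Each change carries both the old and the new source, so the delta is self contained, can be unapplied, and can detect that a foreign image does not match its "before state":

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethodModified:
    """One low-level change that carries both the before and after source."""
    selector: str
    old_source: str
    new_source: str


class Delta:
    """An ordered sequence of change objects, self contained once loaded."""

    def __init__(self, changes):
        self.changes = list(changes)

    def apply(self, image):
        """Return (new_image, warnings); warn where the image differs from old_source."""
        image = dict(image)
        warnings = []
        for change in self.changes:
            if image.get(change.selector) != change.old_source:
                warnings.append(change.selector)
            image[change.selector] = change.new_source
        return image, warnings

    def unapply(self, image):
        """Undo the changes in reverse order by restoring each old source."""
        image = dict(image)
        for change in reversed(self.changes):
            image[change.selector] = change.old_source
        return image


delta = Delta([MethodModified("foo", "foo ^1", "foo ^2")])

home_image = {"foo": "foo ^1"}
applied, warnings = delta.apply(home_image)
restored = delta.unapply(applied)

# A foreign image whose "foo" differs from the recorded old source.
foreign_image = {"foo": "foo ^0"}
foreign_applied, foreign_warnings = delta.apply(foreign_image)
```

In the home image the delta applies cleanly and `unapply` restores the original; in the foreign image it still applies, but the mismatch is reported so the user can be warned.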


And above all - they are still very easy to reason about and manipulate.
The model is trivial - just an ordered sequence of low level change
objects. If something goes wrong you will always be able to figure it
out.

Darcs was mentioned in another post as a "source of inspiration" and
yes, that is true. Darcs is remarkable in its "smartness", BUT... lots
of people feel the downside is that when stuff goes wrong you are more
or less toast in trying to understand why.

So DS tries to give us some of these features WITHOUT going overboard on
the complexity side. Without actually knowing what I am talking about
this MAY be a plus for DS compared to MC2. :)

regards, Göran

Re: The perfect revision control system

Göran Krampe
In reply to this post by Tapple Gao
Hi all!

Matthew Fulmer <[hidden email]> wrote:

> > *) "Cherry picking" of changes.  Smalltalk, with its simple syntax and
> > keyword arguments, is the best language I know of for writing self
> > documenting code.  But what is still missing is the *why* and *how did
> > we get here*.  Intelligent use of change sets can go further to answer
> > those questions.  When one makes a series of changes to fix multiple
> > bugs, they can after-the-fact move the changes into a separate set for
> > each bug so that later maintainers of the software have more
> > information to determine if the existing code is still relevant, etc.
>
> DeltaStreams can do this, but I haven't made the UI yet to be
> able to copy changes between deltas.

Yes, the model with Deltas is very similar to ChangeSets in this respect
and the only thing missing is good UIs on top.

> > *) Labeling.  In big companies using sophisticated revisions systems
> > (i.e. not obsolete stuff like SVN), people are branching, merging and
> > conflicting all the time.  Then when it comes time to release their
> > software they apply a "label".  This tags the latest version of all
> > data managed by the system so that in an audit, they can conclusively
> > prove what the state of the software was for any version.  This also
> > gives benefits to the system by allowing it to ignore everything
> > previous to the latest label (since the current state of the system is
> > simply: the latest label and all changes after that).
>
> DeltaStreams does not do this.

Right, the concept of tags is... hard to see how it would work in DS
given its model.
I haven't thought much about it, honestly.

> > *) Fully distributed.  Anyone can make a copy of a given repository at
> > any time.  They can make changes that stay only in their own copy, or
> > push them up if they wish.  To make a branch, just make another copy
> > of the repository.  The repositories are updated strictly through the
> > mechanism of applying patches so your totally free to "cherry pick"
> > changes out of someone else's repository (and the patches don't have
> > to be applied in order) or sync up if you wish.
>
> DeltaStreams as yet currently has no concept of remote streams
> or repositories. Since DeltaStreams is alpha software, it has
> known missing features. This is one of them.

Given the approach I would say DS is distributed. But it is hard to
compare - the "commands" you mention that are "standard" in tools like
Darcs/BazaarNG/Mercurial etc don't really have a counterpart in DS yet
since we haven't attacked the S-part (streams of Deltas) yet.

We have been developing bottom-up and thus we will first have a full
ChangeSet replacement before attacking the concept of streams of Deltas.
So the field is open but again, DS does not really *aim* to give the
same work model as Darcs/BazaarNG/Mercurial etc. That doesn't mean it is
not possible, though; we just haven't been thinking about it like that.

> Currently, I'm bootstrapping this by hooking onto the existing
> update stream supported by Squeak (See class Utilities, category
> updates, for how it works). The file format I'm currently
> writing interleaves a binary tree serializer with a normal
> change set. http://www.squeaksource.com/InterleavedChangeSet .
> In this way, you could build a delta with all the metadata
> available and file it out. The receiver would make use of all
> that metadata if DeltaStreams is loaded; otherwise it sees a
> regular change set.

Which is bloody genius. :) Sorry, I haven't been able to code on DS for
a long while, but it is great to see Matthew pushing it forward!

> It is still alpha stage; the writer is about 70% complete, and
> the reader hasn't been started yet. DeltaStreams doesn't yet
> know how to use it either.
>
> > *) Multiple ways of managing changes.  Since the system will live in
> > the normal Squeak system, it has access to all the subsystems in the
> > image and can leverage them to apply or forward patches.  One example
> > of this is darcs cool feature of letting a remote user make a change
> > and use one command to have darcs package up the change in the correct
> > diff format and forward it to the package maintainer for peer review.
>
> No idea what you're talking about here

I understand it, I think. The "mail bomb" thingy that Darcs had early on
and that the other tools have added. I would say this particular feature
is *inherent* in DS - since a Delta is a self contained thingy and yes,
feel free to send it through different means of communication.

Other areas of integration can be discussed - for example, if Deltas can
replace ChangeSet fully - should MC1/2 use Deltas (when doing the final
apply-operation) instead of ChangeSets in the future? And so on.

> > *) Compatible with other systems.  Of course no new revision system is
> > going to have a chance if it can't deal with packages from all the
> > other existing systems.  It may even be interesting to generate
> > packages from these other systems.
>
> This is the feature I'm working on next in DeltaStreams.
> Specifically, I want to be able to convert Installer scripts
> into deltas. Installer is a great way to specify what goes into
> a certain release and what does not. I think so, at least. Maybe
> you have a better idea with your darcs knowledge.

I am unsure what Matthew means when he says "convert" - would it be
simply "recording" all changes being done into a single Delta or?

Otherwise I would say that DS is the "tool" in this area that has the best
chance of coexistence given its very trivial model and design. It is
something we really aim for.

> > Well, that's the list off the top of my head.  I know MC2, and even
> > MC1 can do much of this (or probably all in the case of MC2), but MC1
> > is based on snapshotting which I disagree with and I think MC2 tries
> > to do too much.  
>
> I don't see how "doing too much" can be a very serious problem.
> Colin has great ideas about MC2, and I'm curious to see if MC2
> and DeltaStreams will work together independently, if one will
> kill the other, or if there will be a need to merge them.
> Impossible to say right now.

Right. My bet is that MC2 will (if it gets to that state) eventually
replace MC1. If we bring DS to its full potential I hope it will replace
ChangeSets with all that comes with that - and also work as "glue" in
the world of Squeak code sharing.

I want people to feel that, "ok, hmmm, let me just file this out as a
Delta and ftp it up there/email it to you/drag and drop it to that other
image" or whatever. I want them to feel that when all else fails Deltas
are there to work as a very nice fallback mechanism.

And I also hope the apply/unapply feature will open up new ways of
working.
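
The apply/unapply idea can be illustrated with a small Python sketch (Python is used only for brevity; this is not DeltaStreams code). It assumes a Delta records both the old and the new source of every method it touches, which is what makes reversal possible. The names MethodChange and Delta as written here, and the dictionary standing in for the image, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MethodChange:
    class_name: str
    selector: str
    old_source: Optional[str]  # None: the method did not exist before
    new_source: Optional[str]  # None: the method is removed


def _store(image, change, source):
    """Write (or remove) one method in the dict that stands in for the image."""
    key = (change.class_name, change.selector)
    if source is None:
        image.pop(key, None)
    else:
        image[key] = source


@dataclass
class Delta:
    changes: List[MethodChange] = field(default_factory=list)

    def apply(self, image):
        for change in self.changes:
            _store(image, change, change.new_source)

    def unapply(self, image):
        # Undo in reverse order, restoring the recorded old state.
        for change in reversed(self.changes):
            _store(image, change, change.old_source)


image = {("SomeClass", "someMethod:"): "i := value * 10"}
delta = Delta([MethodChange("SomeClass", "someMethod:",
                            "i := value * 10",
                            "i := value * 10.\nTranscript show: i")])
delta.apply(image)    # image now holds the new source
delta.unapply(image)  # image is back to the original source
```

Because the delta carries everything it needs, the same object could be filed out and applied in another image.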
 
> > I believe the "theory of patches" underlying darcs
> > that any system is simply the sum of applying all its patches, and
> > therefore I don't think anything more than this is needed.
>
> Goran wanted deltas to work just like darcs patches when he came
> up with the idea. However, I don't know anything about darcs,
> and I've written most of the DeltaStreams code so far, so I have
> no idea if what I am building lines up with that original goal
> or not.

Let me just note that while being *impressed* with Darcs I feel that
route is simply too complicated. I am inspired by Darcs, but I don't
want to copy it. I want to enable some similar feel of cherry picking -
but the idea was to make DS handle, say, 80% of the cases by using a MUCH
simpler trick (recording more state) rather than highly advanced
algorithms that only 2-3 people even understand.
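
One possible reading of "recording more state" is sketched below in Python. This is an interpretation, not the DS design: it assumes each change remembers the source it was recorded against, so a change can be cherry picked whenever the target image still holds that source, and is reported as a conflict otherwise. The tuple layout and function names are hypothetical.

```python
def can_apply(change, image):
    """A change applies cleanly if the image still holds its recorded old state."""
    key, old, _new = change
    return image.get(key) == old


def cherry_pick(changes, image):
    """Apply every change whose precondition holds; collect the rest as conflicts."""
    applied, conflicts = [], []
    for change in changes:
        key, _old, new = change
        if can_apply(change, image):
            if new is None:
                image.pop(key, None)  # the change removes the method
            else:
                image[key] = new
            applied.append(change)
        else:
            conflicts.append(change)
    return applied, conflicts
```

No patch commutation is needed here; the price is that anything not matching the recorded state is simply flagged for a human to sort out.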
 
> Also, Goran has since discovered Bazaar revision control, and (I
> believe) found it superior to darcs. I know nothing about
> either; just mentioning it for completeness.

"Superior" only in the sense that it seems to do much the same but using
simpler means. And having tons of other features and a lot of developer
momentum.

Darcs still probably has the most advanced cherry picking out there -
but since the others are very good too - it doesn't outweigh the other
stuff.

> My current goal is to get delta streams working well enough to
> be of assistance while I try to merge the separate versions of
> Morphic in the various forks, and assist Edgar with Ladrillos:
> http://www.squeaksource.com/Ladrillos

Good, real usage is a great way to see if it holds.

> How I plan to use it is as a shared log that could reliably
> replay a set of changes made between a set of packages.
> Monticello has issues when moving classes between packages,
> which I envision needing to do a lot in dividing up Morphic. So,
> I want to use deltas to keep track of what I do while moving
> things around, since that is the most reliable order of making
> the changes. Eventually, Morphic should be divided sanely enough
> to use plain Monticello.

Very interesting use case.

> The features I will need for this to work are:
> - a reliable file-out format, in progress
> - Integration with Monticello, so that package saves and loads
>   are correctly stored in the log
>
> So, this is all I'll be working on before I stop work for a
> while and concentrate on the release process.

Sounds great.

Hopefully I will be able to at least catch up a bit. :)

regards, Göran


Re: The perfect revision control system

Tapple Gao
On Mon, Feb 18, 2008 at 11:28:31AM +0200, [hidden email] wrote:

> > > *) Compatible with other systems.  Of course no new revision system is
> > > going to have a chance if it can't deal with packages from all the
> > > other existing systems.  It may even be interesting to generate
> > > packages from these other systems.
> >
> > This is the feature I'm working on next in DeltaStreams.
> > Specifically, I want to be able to convert Installer scripts
> > into deltas. Installer is a great way to specify what goes into
> > a certain release and what does not. I think so, at least. Maybe
> > you have a better idea with your darcs knowledge.
>
> I am unsure what Matthew means when he says "convert" - would it be
> simply "recording" all changes being done into a single Delta or?

Yes. By default, DS will log all changes being applied to the
image with as much metadata as possible. However, each tool
could add a lot of very useful metadata to the log that DS alone
would never know about:

- This next group of changes comes from loading Monticello-kph.273
- This next group of changes fixes bug 6342
- I am now running a test suite; ignore changes for now; they're
  not important
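
As a rough illustration of tools adding metadata to the log, here is a Python sketch. It is not the DS API: the ChangeLog class and its annotate/ignoring/record methods are invented names, and the log is just a list of dictionaries.

```python
from contextlib import contextmanager


class ChangeLog:
    """Hypothetical change log that tools can annotate while changes stream in."""

    def __init__(self):
        self.entries = []
        self._notes = []   # annotations currently in effect
        self._muted = 0    # greater than zero while changes should be ignored

    @contextmanager
    def annotate(self, note):
        """Attach a note to every change recorded inside the with-block."""
        self._notes.append(note)
        try:
            yield
        finally:
            self._notes.pop()

    @contextmanager
    def ignoring(self):
        """Drop changes recorded inside the with-block, e.g. during a test run."""
        self._muted += 1
        try:
            yield
        finally:
            self._muted -= 1

    def record(self, change):
        if self._muted:
            return
        self.entries.append({"change": change, "notes": list(self._notes)})


log = ChangeLog()
with log.annotate("loading Monticello-kph.273"):
    log.record("MCVersion>>load")
with log.ignoring():
    log.record("TestCase>>tempMethod")  # not logged
```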

In the future, I think it may be useful to try to get this data
'without' loading the changes; as a "what if" mechanism.
However, the undoability of deltas may make it not worth
making a special case out of. Things like:

- Show me what would have happened if I ran
  "Installer install: 'Clean'"
- I have a SystemEditor with a modified system state; give me a
  delta from the current system to this new, unapplied system
  state
- Show me all the differences between Croquet and 3.11

None of this is implemented yet.
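
The "what if" queries above amount to computing a delta between two system states without touching the live one. A minimal Python sketch, assuming a system state can be modelled as a dictionary from (class, selector) to source; the function names are hypothetical and nothing here reflects existing DS code:

```python
def delta_between(current, target):
    """List (key, old, new) for every method that differs between two states."""
    changes = []
    for key in sorted(current.keys() | target.keys()):
        old, new = current.get(key), target.get(key)
        if old != new:
            changes.append((key, old, new))
    return changes


def what_if(image, action):
    """Run an action against a copy of the image and report what it would change."""
    trial = dict(image)
    action(trial)
    return delta_between(image, trial)
```

The same delta_between call would serve for comparing two whole distributions, given a state dictionary for each.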

--
Matthew Fulmer -- http://mtfulmer.wordpress.com/
Help improve Squeak Documentation: http://wiki.squeak.org/squeak/808


Re: The perfect revision control system

Colin Putney
In reply to this post by Göran Krampe

On 18-Feb-08, at 1:07 AM, [hidden email] wrote:

>> got MC2 to alpha-quality, made all sorts of promises, then let it stall
>> for 2 years, while I put my effort into my day job. Göran and Matthew
>> got tired of waiting for something better than MC1 and set out to do
>> it themselves.
>
> Mmmm, that was actually not how it went - the reason I started DS was
> not because I wanted something "better" than MC1 - in my mind DS is a
> different beast for altogether different use cases and the word "better"
> doesn't cut it.

Heh, I should know better than to put words in somebody else's mouth.

Thanks for the info on DeltaStreams - can't wait to play with it.

Colin

[squeak-dev] Re: The perfect revision control system

Jason Johnson-5
In reply to this post by Colin Putney
On Feb 16, 2008 10:00 PM, Colin Putney <[hidden email]> wrote:
>
> This should be interesting. You're right, MC2 does all this stuff
> (except the first item, obviously). In fact, the design of MC2 came
> out of a discussion that Avi and I had about the darcs theory of
> patches.

Yea, I remembered that from earlier discussions.

> If you look at discussions of versioning systems around the net, you'll
> run across bitter arguments about change-sets-vs-snapshots. As far as
> I can tell, the two are information-equivalent. I think it boils down
> to how you prefer to think about the problem.

I agree, they probably are equivalent.  But the issue I see is that we
already record the information of what is changing in the system, so
we have for free what Darcs (the most advanced system in this regard)
is simulating with its cherry picking [1].  It seems a shame to not
use that information.

> That said, I'm glad you've decided to take this on. There may be some
> real advantages to combining the benefits of code-as-objects, which we
> take for granted in Smalltalk, with a theory of patches. One that leaps
> to mind rather quickly is that you can have a much richer set of
> operations than darcs.

Thanks.  For sure we can.

> Darcs has a token-replacement operation, which
> allows patches to be more easily commuted. But token replacement is
> basically the simplest, crudest refactoring imaginable. If your system
> included operations based on refactorings, you'd gain a lot of power
> over both darcs and MC.

You mean to do the commuting?  I.e. to truly understand how the code
changed, to be able to make the "A-1" type patch more accurate than
Darcs would be able to?
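
To show why a token-replacement operation makes patches easier to commute, here is a toy Python sketch. It is not Darcs's algorithm: the two patch types, the commute function, and the whole-token matching rule are simplifications chosen for illustration. The point is that a rename followed by an added line can be reordered by rewriting the added line in terms of the old name.

```python
import re


def _token(tok):
    """Match tok only as a whole token, not as part of a longer identifier."""
    return re.compile(rf"(?<!\w){re.escape(tok)}(?!\w)")


def replace_token(text, old, new):
    """Toy 'rename' patch: replace every whole-token occurrence of old."""
    return _token(old).sub(lambda m: new, text)


def append_line(text, line):
    """Toy 'hunk' patch: add one line at the end."""
    return text + line + "\n"


def commute(line, old, new):
    """Given 'rename old->new, then append line', return the line to append
    when the order is swapped to 'append, then rename'."""
    if _token(old).search(line):
        # The rename would rewrite this line, changing the final result.
        raise ValueError("patches do not commute")
    return replace_token(line, new, old)


base = "count := 0\n"
# Order 1: rename first, then add a line that uses the new name.
one = append_line(replace_token(base, "count", "total"), "total := total + 1")
# Order 2: add the commuted line first, then rename.
two = replace_token(append_line(base, commute("total := total + 1", "count", "total")),
                    "count", "total")
# Both orders end in the same text.
```

A refactoring-aware system could in principle commute much richer operations the same way, since it knows which names a change actually refers to.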

> One word of caution: the word "perfect" in the subject of this mail
> worries me a bit. Versioning is a black art, and perfection is a cruel
> god. Please, don't strive for perfection. Just make something better
> than the alternatives for a particular purpose.

Ah, no worries about that.  I like to know what a perfect world looks
like so I can point my Smalltalk-mobile in that direction.  I don't
expect to ever get there and I certainly wouldn't hold back a release
until I did.  I'm an extremely incremental developer (a big reason I
love this language).  My projects probably look bad at first, before
the code/problem shows me where the synergies are.  I have been doing
heavy "refactoring" long before people started calling it that. :)

> Anyway, I'm looking forward to seeing what you come up with.
>
> Colin

Thanks, so am I. :)

[1]  That is, any non-Smalltalk system is doing snap-shotting in a
sense, as they are "dead" while all changes are happening and when
reinvoked must take a "snapshot" of the current state of the world and
compare it to the last "snapshot" (Note that even Darcs keeps a
"pristine tree" for this purpose).