[squeak-dev] Our process, some loose ideas regarding DS + MC

Göran Krampe
Hi folks!

As we all know, we are in the middle of a heated discussion about our
release process etc. I haven't read all the messages, nor do I think that
either of the two "sides" (Andreas and Keith, basically) is 100% wrong or
100% right.

Anyway, disregarding all that :) - I would like to sketch an "ideal"
daily process for development. I would appreciate it if this thread did
NOT turn into the current "war" between "Trunk" and "Bob", ok?

MC is a very good tool and Keith/Matthew and others have turned it into
an even better tool AFAIK using SystemEditor as a base etc. It does have
its limitations though:

1. It tends to be more coarse-grained than other tools. "Commits" tend
to be less frequent, at least when I work in it. Probably due to less
than stellar performance, which might have been fixed.

2. It needs history to do its merge magic. Thus it doesn't play well
between forks. It plays very well within a package or within a group of
packages (a fork perhaps).

3. It is centered around packages defined by PackageInfo which more or
less means "a group of class categories + class extensions". It does
have MC configs now, and I haven't used them myself yet so I can't
really comment.

The above three bits are different with Deltas.

Anyway, I am trying to envision how an MC-based approach like "trunk"
could work together with DeltaStreams. DS is meant to replace Changesets
and could complement MC in the above three departments (and more).

For example, we could create some kind of "commit tool" that operates
above MC/DS which would use Delta recording (exactly like Changesets
record today) to catch current image modifications and to offer a nice
UI to select a subset of those modifications (or all), enter commit
comment, classify it and push "commit".

This commit tool could have some checkboxes and drop down menus to
classify this commit and also tell where to "send it":

- Also snapshot MC packages. This would cause an MC commit as well as a
Delta to be produced. If not checked we only produce a Delta.

- Bug fix to send to the proper streams based on touched PIs
(PackageInfos). The tool could then select the proper Delta *stream* in
which to publish the Delta based on which PIs it touches.
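
To make the idea a bit more concrete, here is a minimal sketch in Python (not Squeak code, and not how DeltaStreams is implemented). The names Change, Delta and commit, and the "package-kind" stream naming, are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Change:
    package: str   # hypothetical: name of the PackageInfo the change touches
    selector: str  # hypothetical label such as "Morph>>drawOn:"


@dataclass
class Delta:
    comment: str
    kind: str  # classification chosen in the tool, e.g. "bugfix"
    changes: list = field(default_factory=list)

    def touched_packages(self):
        # Sorted, de-duplicated package names touched by this Delta.
        return sorted({c.package for c in self.changes})


def commit(recorded, selected, comment, kind, snapshot_mc=False):
    """Build a Delta from the selected subset of recorded changes.

    Returns the Delta, the stream names it would be published to, and
    the packages that would also get an MC snapshot (empty if unchecked).
    """
    delta = Delta(comment, kind, [c for c in recorded if c.selector in selected])
    streams = [f"{pkg}-{kind}" for pkg in delta.touched_packages()]
    mc_snapshots = delta.touched_packages() if snapshot_mc else []
    return delta, streams, mc_snapshots
```

Selecting two of three recorded changes with kind "bugfix" would route the Delta to one stream per touched package, and with the "snapshot" box checked the same packages would also be snapshotted in MC.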

What would this achieve? Well, first of all we might be able to work
more completely in Squeak. :) We would above all be able to get "cross
fork pollination" using the Deltas that are being published alongside
the MC snapshots. Pharo could be "listening" to our bug fix stream for
example. And vice versa if Pharo decides to use this. And well, it opens
up lots of other possibilities I think.

Anyway, would be interested in all feedback on this idea.

regards, Göran



Re: [squeak-dev] Our process, some loose ideas regarding DS + MC

Juan Vuletich-4
Hi Göran!

Göran Krampe wrote:

> Hi folks!
>
> ...
> MC is a very good tool and Keith/Matthew and others have turned it
> into an even better tool AFAIK using SystemEditor as a base etc. It
> does have its limitations though:
>
> 1. It tends to be more coarse granular than other tools. "commits"
> tend to be more seldom, at least when I work in it. Probably due to
> less stellar performance, might have been fixed.
>
> 2. It needs history to do its merge magic. Thus it doesn't play well
> between forks. It plays very well within a package or within a group
> of packages (a fork perhaps).
>
> 3. It is centered around packages defined by PackageInfo which more or
> less means "a group of class categories + class extensions". It does
> have MC configs now, and I haven't used them myself yet so I can't
> really comment.
>
> The above three bits are different with Deltas.
>
> ...
>
> What would this achieve? Well, first of all we might be able to work
> more completely in Squeak. :) We would above all be able to get "cross
> fork pollination" using the Deltas that are being published alongside
> the MC snapshots. Pharo could be "listening" to our bug fix stream for
> example. And vice versa if Pharo decides to use this. And well, it
> opens up lots of other possibilities I think.
>
> Anyway, would be interested in all feedback on this idea.
>
> regards, Göran

You're doing great work with Deltas!

I agree that Deltas can fix the problems of both ChangeSets and MC. And
that they might ease sharing code between forks. I want to support them
in Cuis. And I hope they become a central piece of the update processes
of Squeak and Pharo.

Cheers,
Juan Vuletich


Re: [squeak-dev] Our process, some loose ideas regarding DS + MC

Giuseppe
In reply to this post by Göran Krampe

On 15/08/2009, at 10:10, Göran Krampe wrote:

> Hi folks!
>
> ...
>
> Anyway, would be interested in all feedback on this idea.

I'm a little disconnected from the Squeak world, so I ask: are
DeltaStreams working, stable, and functional?

Cheers.

Giuseppe Luigi Punzi Ruiz
Blog: http://www.lordzealon.com
Twitter & Skype & GoogleTalk accounts: glpunzi







[squeak-dev] Re: Our process, some loose ideas regarding DS + MC

Andreas.Raab
In reply to this post by Göran Krampe
Hi Göran -

It'd be easier to give feedback if you had something to try out ;-)
Abstractly speaking this all sounds cool, but we'll only know when we
actually try it. One thing that's not quite clear in this picture is
where the "canonical" source for patches would be and how it would be
produced. In most of what you wrote you've concentrated more on the
harvesting aspect (i.e., being able to cherry-pick contributions from
elsewhere), but what does an actual development flow look like?

Cheers,
   - Andreas




Re: [squeak-dev] Our process, some loose ideas regarding DS + MC

Göran Krampe
In reply to this post by Giuseppe
Hi!

Giuseppe Luigi Punzi Ruiz wrote:
> I'm a little disconnected from the Squeak world, so I ask: are
> DeltaStreams working, stable, and functional?

Nope! :) But I think we have reached a point where we could use more
people helping with it. So I am trying to generate some interest and
also to discuss how it can fit together with MC.

Regarding the status and the words "working, stable and functional":

In short, DS consists of two parts: Deltas and Streams. The Streams part
has not even been started and it would be awesome to get some help there.

The Delta part is working to a very large extent but is not yet ready to
be used. Deltas do however support all kinds of source code changes
(AFAIK) with full revert ability. So the domain model is perhaps 98%
complete and there are lots of tests so the stuff that is there is quite
stable.

So kinda working and kinda stable :). The functional bit on the other
hand suffers because of two missing pieces that we really need:

1. External file format. Working on it.

2. Tools with UI. Will start working on that soon.

Also, working on #1 I am refactoring, simplifying and cleaning, so the
code is a bit in flux.

regards, Göran



Re: [squeak-dev] Our process, some loose ideas regarding DS + MC

keith1y
In reply to this post by Göran Krampe
Göran Krampe wrote:
> Hi folks!
>
> As we all know we are in the middle of a heated discussion about our
> release process etc. I haven't read all messages nor do I think that
> any of the two "sides" (Andreas and Keith basically) are 100% wrong
> nor 100% right.
There are not two sides to this; there is one side: the board either
acted rightly or wrongly. When that issue has been sorted out, then you
can debate the details of the actual proposals.

Currently when you discuss this with board members, the response, due to
the fact that new interest has been generated in making patches, is that
"the end justifies the means".

Once I pointed out that the same ends could have been achieved by
different means, that we already had 98 patches manually included in
3.11 as of December last year, and that with the automatic process we
anticipated harvesting about 300 per release cycle (i.e. we already had
more patches than we knew what to do with), Randal's interest returned
and a civil, positive conversation ensued.

I still think that the board is supposed to be hands off and not boots on.

I still think that everyone should be treated fairly.

If Andreas wants to volunteer to be on the release team then that is
cool, but to purposefully initiate a process that is destined to cause
conflict is really out of order.

Igor was at pains to point out that he is contributing as a volunteer to
the DeltaStreams initiative - fantastic. Someone famous once said that
if you want to be a leader then to do so you must be a servant to all.
This is what I have been trying to do all along, but even that famous
person ran into trouble when faced with an autocratic regime.

Keith


Re: [squeak-dev] Re: Our process, some loose ideas regarding DS + MC

Göran Krampe
In reply to this post by Andreas.Raab
Hi!

Andreas Raab wrote:
> Hi Göran -
>
> It'd be easier to give feedback if you had something to try out ;-)

I know, it is a bit unfair of me to "push" it given that it is not yet
usable. I just want to perhaps find someone to help :). So forgive me
for the "DS spam".

> Abstractly speaking this sounds all cool but we'll only know when we
> actually try it.

Yup.

> One thing that's not quite clear in this picture is
> where the "canonical" source for patches would be and how it would be

Not sure I understand what you mean here.

> produced. In most of what you wrote you've concentrated more on the
> harvesting aspect (i.e., being able to cherry-pick contributions from
> elsewhere) but how does an actual development flow look like?

Well, I can't really say although I have some ideas, let me again
mention some things Deltas should make possible and then we could try to
think together what that means for the "flow":

- Deltas can be loaded into an image "en masse" because loading a Delta
simply deserializes it into a fully self contained object. This means we
can read, edit and analyse them without affecting the image.

- Deltas can be applied and unapplied.

- Deltas can "reason" about how well they can be applied. This is
because each change within the Delta has information about the state
"before" the change. For example, a ModifyMethodChange has both new and
old source (and stamps etc).
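
As a rough illustration of the three points above, here is a minimal Python sketch (not the actual DeltaStreams classes). The "image" is modelled as a plain dict from (class, selector) to source, and the field and method names are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModifyMethodChange:
    class_name: str
    selector: str
    old_source: str  # the "before" state captured when the change was made
    new_source: str  # the "after" state

    def key(self):
        return (self.class_name, self.selector)

    def can_apply_cleanly(self, image):
        # The change can reason about the target: does the image still
        # hold exactly the source this change was made against?
        return image.get(self.key()) == self.old_source

    def apply(self, image):
        image[self.key()] = self.new_source

    def unapply(self, image):
        # Reverting is trivial because the old source travels with the change.
        image[self.key()] = self.old_source
```

Because the change object is self-contained, it can be created, inspected and compared without touching the image; only apply and unapply modify it.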

Some things we already know:

- Deltas in fact implement a more advanced changelog. It could be used
to replace the .changes file.

- Since they can be applied/unapplied Deltas can easily be used to build
a "Quilt"-like system.

- Deltas are "rich" in state and can be analysed/manipulated in a lot of
different ways.

So... post is getting long. :) I am all ears for observations/ideas at
this point. And yes, I will try to get DS into such a shape that we can
actually start playing more seriously with them.

Fully replacing Changesets would be a great start IMHO. To do that we
need a file format and a UI. And a few other odd bits, but not much.

regards, Göran



[squeak-dev] Re: Our process, some loose ideas regarding DS + MC

Andreas.Raab
Göran Krampe wrote:
>> One thing that's not quite clear in this picture is where the
>> "canonical" source for patches would be and how it would be
>
> Not sure I understand what you mean here.

Simply the question of where/how DS are stored. Monticello has
repositories that store full source for a package. As a consequence
packages are large, but they allow you to see what's different in your
image, what a patch does, tracking history etc. You can agree on a place
and say "this is THE repository for package X" etc. I'm wondering where
DS stand with regards to these issues.

>> produced. In most of what you wrote you've concentrated more on the
>> harvesting aspect (i.e., being able to cherry-pick contributions from
>> elsewhere) but how does an actual development flow look like?
>
> Well, I can't really say although I have some ideas, let me again
> mention some things Deltas should make possible and then we could try to
> think together what that means for the "flow":
>
> - Deltas can be loaded into an image "en masse" because loading a Delta
> simply deserializes it into a fully self contained object. This means we
> can read, edit and analyse them without affecting the image.
>
> - Deltas can be applied and unapplied.
>
> - Deltas can "reason" about how well they can be applied. This is
> because each change within the Delta has information about the state
> "before" the change. For example, a ModifyMethodChange has both new and
> old source (and stamps etc).

That's *very* useful. One of my favorite features when using MC is that
it can tell us if there is a conflict in a merge and that this method
requires special attention. If DS can do something similar by telling us
that the base version of a method is different from when the DS was
created this will be hugely helpful.

> Some things we already know:
>
> - Deltas in fact implement a more advanced changelog. It could be used
> to replace the .changes file.
>
> - Since they can be applied/unapplied Deltas can easily be used to build
> a "Quilt"-like system.

I don't know what that is. Any references?

> - Deltas are "rich" in state and can be analysed/manipulated in a lot of
> different ways.
>
> So... post is getting long. :) I am all ears for observations/ideas at
> this point. And yes, I will try to get DS into such a shape that we can
> actually start playing more seriously with them.
>
> Fully replacing Changesets would be a great start IMHO. To do that we
> need fileformat and UI. And a few other odd bits, but not much.

Keep me in the loop if you need help. I probably won't be touching the
domain model and stuff but if you need help on the surface and for
workflows I might be able to help out.

Cheers,
   - Andreas


Re: [squeak-dev] Re: Our process, some loose ideas regarding DS + MC

Göran Krampe
Hi!

Andreas Raab wrote:

> Göran Krampe wrote:
>>> One thing that's not quite clear in this picture is where the
>>> "canonical" source for patches would be and how it would be
>>
>> Not sure I understand what you mean here.
>
> Simply the question of where/how DS are stored. Monticello has
> repositories that store full source for a package. As a consequence
> packages are large, but they allow you to see what's different in your
> image, what a patch does, tracking history etc. You can agree on a place
> and say "this is THE repository for package X" etc. I'm wondering where
> DS stand with regards to these issues.

Since the Stream part is not yet started I am all ears. But apart from
the obvious fact that Deltas can serve the same roles as Changesets can
(only better), I was envisioning chronological "streams" (as in
continuous flows) of Deltas associated with individual developers,
forks, packages and branches of packages etc.

For example, I would gladly suck up all Deltas people produce that touch
any of my packages. :) Thus, if people would submit their Deltas to some
huge searchable archive - or if we had at least some simple "upstream"
hooked to packages so that when people make fixes they can easily just
shoot them "upstream" at me, it would be great.

>>> produced. In most of what you wrote you've concentrated more on the
>>> harvesting aspect (i.e., being able to cherry-pick contributions from
>>> elsewhere) but how does an actual development flow look like?
>>
>> Well, I can't really say although I have some ideas, let me again
>> mention some things Deltas should make possible and then we could try
>> to think together what that means for the "flow":
>>
>> - Deltas can be loaded into an image "en masse" because loading a
>> Delta simply deserializes it into a fully self contained object. This
>> means we can read, edit and analyse them without affecting the image.
>>
>> - Deltas can be applied and unapplied.
>>
>> - Deltas can "reason" about how well they can be applied. This is
>> because each change within the Delta has information about the state
>> "before" the change. For example, a ModifyMethodChange has both new
>> and old source (and stamps etc).
>
> That's *very* useful. One of my favorite features when using MC is that
> it can tell us if there is a conflict in a merge and that this method
> requires special attention. If DS can do something similar by telling us
> that the base version of a method is different from when the DS was
> created this will be hugely helpful.

This is in fact the *core idea*. The idea came about after watching
Linus's thoughts on git and thinking about how MC and most SCMs work. They
all get their "merge magic" from extensive knowledge of history back to a
common base. But that is something we don't have between forks.

Thus, could we get 80% of the magic using simpler tricks? The trick is to
let the Changeset contain more info - especially info about the "before"
state. This of course enables unapply, but more essentially it
enables much smarter apply-logic.

A "perfectly clean" Delta is one that sees the exact same "before" state
as it captured when it was created in the source image. A "dirty" Delta
is one that can be applied (classes are not missing etc) but it will
overwrite for example methods that look different than expected.

Also, a Delta can do other smart things since it has captured class
definition changes in more detail - say you add an ivar "c" and when it
checks the destination image it finds other additional ivars but no "c"
- then it can merge by just adding "c". Nice eh? :)
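
The clean/dirty distinction and the ivar example can be sketched roughly like this in Python. This is only an illustration under assumptions: the image is a dict of class name to {selector: source}, a change is a plain tuple, and the function names are invented here.

```python
def classify(changes, image):
    """Classify a Delta against a target image.

    changes: list of (class_name, selector, old_source, new_source)
    image:   dict mapping class_name -> {selector: source}
    """
    status = "clean"
    for class_name, selector, old_source, _new_source in changes:
        if class_name not in image:
            # Something the Delta needs is missing entirely.
            return "not applicable"
        if image[class_name].get(selector) != old_source:
            # Applying would overwrite code that differs from the
            # "before" state captured in the source image.
            status = "dirty"
    return status


def merge_added_ivar(current_ivars, added):
    """Merge an 'add ivar' change into whatever ivars the target class has."""
    if added in current_ivars:
        return list(current_ivars)
    return list(current_ivars) + [added]
```

With this, a class that has gained other ivars in the destination image still merges: merge_added_ivar(["a", "b", "x"], "c") simply appends "c" and leaves the rest alone.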

>> Some things we already know:
>>
>> - Deltas in fact implement a more advanced changelog. It could be used
>> to replace the .changes file.
>>
>> - Since they can be applied/unapplied Deltas can easily be used to
>> build a "Quilt"-like system.
>
> I don't know what that is. Any references?

Well, Mercurial queues are similar, see:

http://mercurial.selenic.com/wiki/MqExtension#Introduction


>> - Deltas are "rich" in state and can be analysed/manipulated in a lot
>> of different ways.
>>
>> So... post is getting long. :) I am all ears for observations/ideas at
>> this point. And yes, I will try to get DS into such a shape that we
>> can actually start playing more seriously with them.
>>
>> Fully replacing Changesets would be a great start IMHO. To do that we
>> need fileformat and UI. And a few other odd bits, but not much.
>
> Keep me in the loop if you need help. I probably won't be touching the
> domain model and stuff but if you need help on the surface and for
> workflows I might be able to help out.

I will get back as soon as I have something to show.

regards, Göran



[squeak-dev] Re: Our process, some loose ideas regarding DS + MC

Andreas.Raab
Göran Krampe wrote:

>> Simply the question of where/how DS are stored. Monticello has
>> repositories that store full source for a package. As a consequence
>> packages are large, but they allow you to see what's different in your
>> image, what a patch does, tracking history etc. You can agree on a
>> place and say "this is THE repository for package X" etc. I'm
>> wondering where DS stand with regards to these issues.
>
> Since the Stream part is not yet started I am all ears. But apart from
> the obvious fact that Deltas can serve the same roles as Changesets can
> (only better), I was envisioning chronological "streams" (as in
> continuous flows) of Deltas associated with individual developers,
> forks, packages and branches of packages etc.

A thing to keep in mind is that there is huge value in having a
definitive version of the code that you can compare against somewhere.
Being able to say "oh, these are my local changes" is a large part of
what makes working with MC superior to working with change sets. So it
would be really useful if one could say "compared with this repository,
you have applied delta x, y, and z, and in addition you have modified
methods foo and bar".
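
A minimal Python sketch of such a report, assuming a repository can be reduced to a baseline dict of method name to source and each delta to a dict of method name to new source (all names here are hypothetical, and neither MC nor DS is claimed to work this way):

```python
def describe_image(baseline, deltas, image):
    """Report which deltas look applied and which methods differ locally.

    baseline: dict method -> source (the definitive repository version)
    deltas:   list of (name, {method: new_source}) in publication order
    image:    dict method -> source (what the local image contains)
    """
    expected = dict(baseline)
    applied = []
    for name, changes in deltas:
        # Treat a delta as applied only if every method it touches has
        # exactly the delta's "after" source in the image.
        if all(image.get(m) == src for m, src in changes.items()):
            applied.append(name)
            expected.update(changes)
    # Whatever still differs from baseline + applied deltas is local work.
    local = sorted(
        m for m in set(image) | set(expected) if image.get(m) != expected.get(m)
    )
    return applied, local
```

One limitation of this simple check: a delta whose methods were modified again locally after being applied is reported as not applied, and its methods show up as local modifications instead.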

>> That's *very* useful. One of my favorite features when using MC is
>> that it can tell us if there is a conflict in a merge and that this
>> method requires special attention. If DS can do something similar by
>> telling us that the base version of a method is different from when
>> the DS was created this will be hugely helpful.
>
> This is in fact the *core idea*. The idea came about after watching
> Linus thoughts on git and to think about how MC and most SCMs work. They
> all get their "merge magic" from extensive knowledge of history to a
> common base. But that is something we don't have between forks.

Why not? Actually we do. MC will search any repository you add and if it
finds any common ancestor in any of the repositories it will use that.
I've done some pretty extensive merges that way and the only thing MC
lacks in this area is explicit support for cherry-picking (i.e., to
accept or reject changes even when they don't conflict).
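
The general idea of finding a common ancestor across repositories can be sketched as below. This is a generic breadth-first search in Python, not Monticello's actual algorithm; the parents mapping stands in for the ancestry information gathered from all added repositories, and the version names in the usage note are made up.

```python
from collections import deque


def ancestors_in_order(version, parents):
    """Return version and its ancestors, nearest first (breadth-first)."""
    seen, order, queue = {version}, [version], deque([version])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


def common_ancestor(mine, theirs, parents):
    """Nearest ancestor of `mine` that is also an ancestor of `theirs`."""
    theirs_all = set(ancestors_in_order(theirs, parents))
    for version in ancestors_in_order(mine, parents):
        if version in theirs_all:
            return version
    return None  # no shared history: a three-way merge has no base
```

For example, if two hypothetical forks "Kernel-ar.5" and "Kernel-kph.7" both descend from "Kernel-base.3", the search returns "Kernel-base.3"; if the ancestry never meets, it returns None, which is the situation a history-free approach has to cope with.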

> Thus, could we get 80% of magic using simpler tricks? The trick is to
> let the Changeset contain more info - especially info about the "before"
> state. This of course both enables unapply, but more essentially it
> enables much smarter apply-logic.

Interesting. I had thought that keeping the stamp or a hash would be
enough but you're right, having the actual previous version does allow
you to trivially revert to it. Very clever.

> Also, a Delta can do other smart things since it has captured class
> definition changes in more detail - say you add an ivar "c" and when it
> checks the destination image it finds other additional ivars but no "c"
> - then it can merge by just adding "c". Nice eh? :)

Nice. But more a sign of weakness in Monticello ;-) But it's definitely
a good idea since this can cover additions that come from "other"
deltas. Which reminds me: Where are the deltas stored and how big is the
space overhead for keeping them?

Cheers,
   - Andreas


Re: [squeak-dev] Re: Our process, some loose ideas regarding DS + MC

Igor Stasenko
2009/8/16 Andreas Raab <[hidden email]>:

> Göran Krampe wrote:
>>>
>>> Simply the question of where/how DS are stored. Monticello has
>>> repositories that store full source for a package. As a consequence packages
>>> are large, but they allow you to see what's different in your image, what a
>>> patch does, tracking history etc. You can agree on a place and say "this is
>>> THE repository for package X" etc. I'm wondering where DS stand with regards
>>> to these issues.
>>
>> Since the Stream part is not yet started I am all ears. But apart from the
>> obvious fact that Deltas can serve the same roles as Changesets can (only
>> better), I was envisioning chronological "streams" (as in continuous flows)
>> of Deltas associated with individual developers, forks, packages and
>> branches of packages etc.
>
> A thing to keep in mind is that there is huge value in having a definitive
> version of the code that you can compare against somewhere. Being able to
> say "oh, these are my local changes" is a large part of what makes working
> with MC superior to working with change sets. So it would be really useful
> if one could say "compared with this repository, you have applied delta x,
> y, and z, and in addition you have modified methods foo and bar".
>
>>> That's *very* useful. One of my favorite features when using MC is that
>>> it can tell us if there is a conflict in a merge and that this method
>>> requires special attention. If DS can do something similar by telling us
>>> that the base version of a method is different from when the DS was created
>>> this will be hugely helpful.
>>
>> This is in fact the *core idea*. The idea came about after watching Linus
>> thoughts on git and to think about how MC and most SCMs work. They all get
>> their "merge magic" from extensive knowledge of history to a common base.
>> But that is something we don't have between forks.
>
> Why not? Actually we do. MC will search any repository you add and if it
> finds any common ancestor in any of the repositories it will use that. I've
> done some pretty extensive merges that way and the only thing MC lacks in
> this area is explicit support for cherry-picking (i.e., to accept or reject
> changes even when they don't conflict).
>
>> Thus, could we get 80% of magic using simpler tricks? The trick is to let
>> the Changeset contain more info - especially info about the "before" state.
>> This of course both enables unapply, but more essentially it enables much
>> smarter apply-logic.
>
> Interesting. I had thought that keeping the stamp or a hash would be enough
> but you're right, having the actual previous version does allow you to
> trivially revert to it. Very clever.
>
>> Also, a Delta can do other smart things since it has captured class
>> definition changes in more detail - say you add an ivar "c" and when it
>> checks the destination image it finds other additional ivars but no "c" -
>> then it can merge by just adding "c". Nice eh? :)
>
> Nice. But more a sign of weakness in Monticello ;-) But it's definitely a
> good idea since this can cover additions that come from "other" deltas.
> Which reminds me: Where are the deltas stored and how big is the space
> overhead for keeping them?
>
I think the most common overhead is 2x the space compared to the analogous
changeset record, since you have to keep the 'before' version in addition
to the 'after' version. And the most common change is changing a method's
source a little bit.

Oh, and did Göran mention that Deltas could also be 'compressed' in
the same way as a changeset?
I mean, you have versions A -> .... -> Z
in uncompressed form,
but you may decide to obliterate the intermediate versions (....),
so the final delta will be
A -> Z
You lose the ability to roll back to any intermediate version, but you
can still roll back/forward between A<->Z
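
A tiny Python sketch of that compression, assuming each change to one method is represented as a (before, after) pair; the function name and representation are made up for illustration and are not the DeltaStreams API:

```python
def compress(chain):
    """Collapse a chain of (before, after) changes to one method.

    [(A, B), (B, C), (C, Z)] becomes (A, Z). The intermediate versions
    are dropped, so only A <-> Z roll back/forward remains possible.
    """
    if not chain:
        return None
    # Each change must start from the state the previous one produced.
    for (_, after), (before, _) in zip(chain, chain[1:]):
        if after != before:
            raise ValueError("chain is not contiguous")
    return (chain[0][0], chain[-1][1])
```

Note that the compressed change still carries two versions (first 'before', last 'after'), which matches the roughly 2x estimate above for a single modified method.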

> Cheers,
>  - Andreas
>
>



--
Best regards,
Igor Stasenko AKA sig.


Re: [squeak-dev] Re: Our process, some loose ideas regarding DS + MC

Göran Krampe
In reply to this post by Andreas.Raab
Hi!

Andreas Raab wrote:

> Göran Krampe wrote:
>>> Simply the question of where/how DS are stored. Monticello has
>>> repositories that store full source for a package. As a consequence
>>> packages are large, but they allow you to see what's different in
>>> your image, what a patch does, tracking history etc. You can agree on
>>> a place and say "this is THE repository for package X" etc. I'm
>>> wondering where DS stand with regards to these issues.
>>
>> Since the Stream part is not yet started I am all ears. But apart from
>> the obvious fact that Deltas can serve the same roles as Changesets
>> can (only better), I was envisioning chronological "streams" (as in
>> continuous flows) of Deltas associated with individual developers,
>> forks, packages and branches of packages etc.
>
> A thing to keep in mind is that there is huge value in having a
> definitive version of the code that you can compare against somewhere.
> Being able to say "oh, these are my local changes" is a large part of
> what makes working with MC superior to working with change sets. So it
> would be really useful if one could say "compared with this repository,
> you have applied delta x, y, and z, and in addition you have modified
> methods foo and bar".

Yes, I agree - and DS was not meant to replace MC. MC does snapshots and
maintains their history and DS captures "developer changes" in a fine
granular fashion. But a combo of MC and DS would probably be very
interesting.

>>> That's *very* useful. One of my favorite features when using MC is
>>> that it can tell us if there is a conflict in a merge and that this
>>> method requires special attention. If DS can do something similar by
>>> telling us that the base version of a method is different from when
>>> the DS was created this will be hugely helpful.
>>
>> This is in fact the *core idea*. The idea came about after watching
>> Linus thoughts on git and to think about how MC and most SCMs work.
>> They all get their "merge magic" from extensive knowledge of history
>> to a common base. But that is something we don't have between forks.
>
> Why not? Actually we do. MC will search any repository you add and if it
> finds any common ancestor in any of the repositories it will use that.

Yes, I know. But I still think we will end up with situations where the
forks don't share enough history in order to do this. I may be wrong.

> I've done some pretty extensive merges that way and the only thing MC
> lacks in this area is explicit support for cherry-picking (i.e., to
> accept or reject changes even when they don't conflict).
>
>> Thus, could we get 80% of magic using simpler tricks? The trick is to
>> let the Changeset contain more info - especially info about the
>> "before" state. This of course both enables unapply, but more
>> essentially it enables much smarter apply-logic.
>
> Interesting. I had thought that keeping the stamp or a hash would be
> enough but you're right, having the actual previous version does allow
> you to trivially revert to it. Very clever.

Also, a "perfect revert" is only possible if the Delta was "perfectly
clean" when being applied. BUT... the cool trick is that if you are
applying a Delta which is NOT perfectly clean (let's say a method being
changed does not match the "before" state) then you just record a NEW
Delta when applying the Delta and that new Delta will be able to do a
perfect revert.
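
The trick can be sketched roughly as follows, in Python for illustration only. The toy "image" is a plain dict from selector to source, and `MethodChange`, `apply_recording` and `revert` are hypothetical names; nothing here is the actual DS or SystemEditor API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MethodChange:
    """Hypothetical change record; before=None means the method was absent."""
    selector: str
    before: Optional[str]
    after: str


def apply_recording(image, changes):
    """Apply changes to a toy image and record what was really replaced.

    Returns (clean, recorded). `clean` is False if any 'before' state did
    not match the image. `recorded` is a new list of changes whose 'before'
    is the state actually found, so it can always be reverted exactly.
    """
    clean = True
    recorded = []
    for change in changes:
        actual = image.get(change.selector)
        if actual != change.before:
            clean = False
        recorded.append(MethodChange(change.selector, actual, change.after))
        image[change.selector] = change.after
    return clean, recorded


def revert(image, recorded):
    """Undo a recorded application, newest change first."""
    for change in reversed(recorded):
        if change.before is None:
            image.pop(change.selector, None)
        else:
            image[change.selector] = change.before
```

Reverting with the incoming Delta would restore its own 'before' state, which is not what the image held; reverting with the recorded one restores the image as it really was.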

One interesting aspect here: say you record a Delta with a single
change, a class delete. The Delta will then create a composite
change which contains all the changes needed to recreate that class
(class creation, method additions etc).

So if you then load that into a different image it can check that the
class to be deleted is exactly the same as it was in the source image.

>> Also, a Delta can do other smart things since it has captured class
>> definition changes in more detail - say you add an ivar "c" and when
>> it checks the destination image it finds other additional ivars but no
>> "c" - then it can merge by just adding "c". Nice eh? :)
>
> Nice. But more a sign of weakness in Monticello ;-) But it's definitely

Yes, but since DS captures "developer actions" it can in theory capture
more info than MC can. For example, a class rename is captured as a
class rename change. It can never be confused for a remove class + add
another class.
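
To make the "developer action" idea concrete, here is a small Python sketch (illustrative only). `AddInstVar`, `RenameClass` and `apply_change` are hypothetical names, and the toy image is just a dict from class name to its list of ivars.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddInstVar:
    """Hypothetical record of the action 'add ivar `name` to a class'."""
    class_name: str
    name: str


@dataclass(frozen=True)
class RenameClass:
    """Hypothetical record of the action 'rename a class'."""
    old_name: str
    new_name: str


def apply_change(classes, change):
    """Apply one recorded action to a toy image (class name -> ivar list)."""
    if isinstance(change, AddInstVar):
        ivars = classes[change.class_name]
        # Only the recorded addition is merged; other ivars are left alone.
        if change.name not in ivars:
            ivars.append(change.name)
    elif isinstance(change, RenameClass):
        # The same definition moves to the new name; nothing is removed
        # and re-added.
        classes[change.new_name] = classes.pop(change.old_name)
    else:
        raise TypeError(f"unknown change: {change!r}")
```

Because the change says what was done rather than what the class looked like afterwards, a destination class that already has extra ivars simply gains "c" as well.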

> a good idea since this can cover additions that come from "other"
> deltas. Which reminds me: Where are the deltas stored and how big is the
> space overhead for keeping them?

You mean in the image? Since a Delta is a "fully self contained" object
with no references to anything outside it and only contains "simple
data" - we are quite free to do what we want with them. We could for
example store them in a database and easily load/edit/save/search them.

Some simple file based repository scheme is of course needed. Perhaps
just some kind of file naming convention to get a sort order.

One small idea I have is to perhaps use CouchDB as a repository option,
would fit quite well and since CouchDB has inter db replication built in
we could get a very nice base for hooking our streams together into
larger and larger streams all the way to the sea :)

I have not measured the in-image space overhead. Tirade, which I am
hooking in, is quite fast at loading them:

http://goran.krampe.se/blog/Squeak/Tirade2.rdoc

Sidenote: One other strong advantage of using DS instead of changesets
is that the DS default applier is SystemEditor which makes applying
fully atomic.

regards, Göran



Re: [squeak-dev] Re: Our process, some loose ideas regarding DS + MC

Göran Krampe
In reply to this post by Igor Stasenko
Hi!

Igor Stasenko wrote:
> 2009/8/16 Andreas Raab <[hidden email]>:
>> Which reminds me: Where are the deltas stored and how big is the space
>> overhead for keeping them?
>
> i think most common overhead is 2x more space comparing to analoguous
> changeset record, since you have to keep 'before' version in addition
> to 'after' version. And most common change is changing the method(s)
> source a little bit.

Well, in fact it is more than 2x: a Changeset does not hold a copy of
the new method source, just a pointer to the modified method (IIRC).

Thus Changesets are sneaky beasts, most people get the wrong impression.

> Oh, and did Goran mentioned that Deltas could be also 'compressed' in
> same way as changeset?

Yes, I have called it "normalized" but compressed may be a better term.

That code is not yet finished though, but a fun coding task for anyone
interested. :) There are two scenarios:

- Compressing a single Delta into its non-redundant form. This makes it
look like a Changeset. Before compression it is a "perfect" log.

- Compression of multiple Deltas into one. This one is trivial once you
have implemented the first one.

For bug fixes and patches I believe one would naturally want to compress
before distributing of course.

regards, Göran



Re: [squeak-dev] Re: Our process, some loose ideas regarding DS + MC

Igor Stasenko
2009/8/16 Göran Krampe <[hidden email]>:

> Hi!
>
> Igor Stasenko wrote:
>>
>> 2009/8/16 Andreas Raab <[hidden email]>:
>>>
>>> Which reminds me: Where are the deltas stored and how big is the space
>>> overhead for keeping them?
>>
>> i think most common overhead is 2x more space comparing to analoguous
>> changeset record, since you have to keep 'before' version in addition
>> to 'after' version. And most common change is changing the method(s)
>> source a little bit.
>
> Well, in fact it is more - a Changeset does in fact not hold a copy of the
> new method source - just a pointer to the method modified (IIRC).
>
Oh, I meant not in-image space, but the storage space required for the
interchange format.

> Thus Changesets are sneaky beasts, most people get the wrong impression.
>
Yes, and we could do the same with deltas: keep them in external
file(s), while in-image they hold only pointers.
But I think it's too early to think about it.
Or.. why too early?
You already mentioned different ways to persist the deltas.. so
a delta 'pointer' could be represented by a tuple: (adaptor, id)

where adaptor could be: file, CouchDB, URL or anything else,
and id is additional info identifying the given delta.

Or, just use a common denominator of all of this stuff - a URI.
I don't think it's hard to introduce a new URI scheme which identifies
CouchDB storage :)


>> Oh, and did Goran mentioned that Deltas could be also 'compressed' in
>> same way as changeset?
>
> Yes, I have called it "normalized" but compressed may be a better term.
>
> That code is not yet finished though, but a fun coding task for anyone
> interested. :) There are two scenarios:
>
> - Compressing a single Delta into its non redundant form. This makes it look
> like a Changeset. Before compression it is a "perfect" log.
>
> - Compression of multiple Deltas into one. This one is trivial once you
> implemented first one.
>
> For bug fixes and patches I believe one would naturally want to compress
> before distributing of course.
>

Right

> regards, Göran
>
>
>



--
Best regards,
Igor Stasenko AKA sig.


Re: [squeak-dev] Re: Our process, some loose ideas regarding DS + MC

Igor Stasenko
2009/8/16 Igor Stasenko <[hidden email]>:

> 2009/8/16 Göran Krampe <[hidden email]>:
>> Hi!
>>
>> Igor Stasenko wrote:
>>>
>>> 2009/8/16 Andreas Raab <[hidden email]>:
>>>>
>>>> Which reminds me: Where are the deltas stored and how big is the space
>>>> overhead for keeping them?
>>>
>>> i think most common overhead is 2x more space comparing to analoguous
>>> changeset record, since you have to keep 'before' version in addition
>>> to 'after' version. And most common change is changing the method(s)
>>> source a little bit.
>>
>> Well, in fact it is more - a Changeset does in fact not hold a copy of the
>> new method source - just a pointer to the method modified (IIRC).
>>
> oh, i meant not in-image space, but storage space required for
> interchange format.
>
>> Thus Changesets are sneaky beasts, most people get the wrong impression.
>>
> yes, and we could do same with deltas - use external storage for
> keeping them in file(s), while in-image
> they hold only a pointers.
> But i think its too early to think about it.
> Or.. why too early?
> You already mentioned different ways to persist the deltas.. so
> a delta 'pointer' could be represented by a tuple: (adaptor , id)
>
> where adaptor could be: file, couch DB, url or anything else,
> and id is an additional info , identifying given delta.
>
> Or, just use a common denominator of all of this stuff - URI.
> I don't think that its hard to introduce a new URI, which identifies a
> Couch DB storage :)
>

Thinking a bit more about it, I think it's a great thing which we
should employ early!

Consider people exchanging deltas not as files, but as URLs:

http://my.site.foo.bar/squeak/myfixToSomeStuff.ds#100

or:

ftp://usr:[hidden email]/myfixToSomeStuff.ds#1234

and locally, one of course could use:

file:///usr/home/geek/squeak/myfixToSomeStuff.ds#1234


and so on..

Simple and powerful, and no early binding to any kind of specific
storage currently existing in the world.
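
As a rough illustration (Python, standard library only), such a reference splits naturally into a storage location and a delta id. The function name `parse_delta_reference` and the reading of the `#...` fragment as the delta id are assumptions made for this sketch.

```python
from urllib.parse import urldefrag, urlsplit


def parse_delta_reference(uri):
    """Split a delta URL into (scheme, location, delta_id).

    Assumption for this sketch: the part after '#' identifies one delta
    inside the resource, and the scheme selects the storage adaptor.
    """
    location, delta_id = urldefrag(uri)
    scheme = urlsplit(location).scheme
    if not scheme or not delta_id:
        raise ValueError(f"not a delta reference: {uri!r}")
    return scheme, location, delta_id
```

The scheme plays the role of the adaptor in the (adaptor, id) tuple, so supporting a new kind of storage only means handling one more scheme.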

>
>>> Oh, and did Goran mentioned that Deltas could be also 'compressed' in
>>> same way as changeset?
>>
>> Yes, I have called it "normalized" but compressed may be a better term.
>>
>> That code is not yet finished though, but a fun coding task for anyone
>> interested. :) There are two scenarios:
>>
>> - Compressing a single Delta into its non redundant form. This makes it look
>> like a Changeset. Before compression it is a "perfect" log.
>>
>> - Compression of multiple Deltas into one. This one is trivial once you
>> implemented first one.
>>
>> For bug fixes and patches I believe one would naturally want to compress
>> before distributing of course.
>>
>
> Right
>
>> regards, Göran
>>
>>
>>
>
>
>
> --
> Best regards,
> Igor Stasenko AKA sig.
>



--
Best regards,
Igor Stasenko AKA sig.


Re: [squeak-dev] Re: Our process, some loose ideas regarding DS + MC

Juan Vuletich-4
In reply to this post by Göran Krampe
Hi Göran,

Göran Krampe wrote:
> ...
> Yes, I agree - and DS was not meant to replace MC. MC does snapshots
> and maintains their history and DS captures "developer changes" in a
> fine granular fashion. But a combo of MC and DS would probably be very
> interesting.

If possible try to keep DS usable even without MC. Some people might
choose to use DS and not MC...

>>>> That's *very* useful. One of my favorite features when using MC is
>>>> that it can tell us if there is a conflict in a merge and that this
>>>> method requires special attention. If DS can do something similar
>>>> by telling us that the base version of a method is different from
>>>> when the DS was created this will be hugely helpful.
>>>
>>> This is in fact the *core idea*. The idea came about after watching
>>> Linus thoughts on git and to think about how MC and most SCMs work.
>>> They all get their "merge magic" from extensive knowledge of history
>>> to a common base. But that is something we don't have between forks.
>>
>> Why not? Actually we do. MC will search any repository you add and if
>> it finds any common ancestor in any of the repositories it will use
>> that.
>
> Yes, I know. But I still think we will end up with situations where
> the forks don't share enough history in order to do this. I may be wrong.

I agree with you. For example Cuis is not based on MC packages. So
having "merge magic" without needing a common ancestor in MC would be
wonderful.

Cheers,
Juan Vuletich


Re: [squeak-dev] Re: Our process, some loose ideas regarding DS + MC

Göran Krampe
In reply to this post by Igor Stasenko
Hi!

Igor Stasenko wrote:

> 2009/8/16 Göran Krampe <[hidden email]>:
>> Hi!
>>
>> Igor Stasenko wrote:
>>> 2009/8/16 Andreas Raab <[hidden email]>:
>>>> Which reminds me: Where are the deltas stored and how big is the space
>>>> overhead for keeping them?
>>> i think most common overhead is 2x more space comparing to analoguous
>>> changeset record, since you have to keep 'before' version in addition
>>> to 'after' version. And most common change is changing the method(s)
>>> source a little bit.
>> Well, in fact it is more - a Changeset does in fact not hold a copy of the
>> new method source - just a pointer to the method modified (IIRC).
>
> oh, i meant not in-image space, but storage space required for
> interchange format.

The domain model in DS is more or less done. Matthew implemented a file
format but I am not pursuing that format for various reasons - instead I
am working on hooking Tirade into DS to use it as the file format.

For info about Tirade, see my blog articles:

http://goran.krampe.se/blog/Squeak/Tirade.rdoc
http://goran.krampe.se/blog/Squeak/Tirade2.rdoc
http://goran.krampe.se/blog/Squeak/Tirade3.rdoc

Basically Tirade looks like Smalltalk message sends so it will be in
"low level syntax" similar to the chunk format used for Changesets but
it does not use Compiler to load.

>> Thus Changesets are sneaky beasts, most people get the wrong impression.
>>
> yes, and we could do same with deltas - use external storage for
> keeping them in file(s), while in-image
> they hold only a pointers.

Outside of the image I would primarily use the Tirade format. In image a
Delta does NOT use pointers to live code, and that is by design. A Delta
is a "fully self contained" object graph with no outgoing references. It
is totally independent of current image state.

This is highly different from a Changeset. A Changeset is typically
"filed in" and that means you change the image.

A Delta is handled in two steps: First you load it. This just means
deserializing it from Tirade into an object. Then you apply it. This is
the step that affects the image, atomically, using SystemEditor.
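
The two steps can be sketched like this in Python (illustrative only). JSON stands in for Tirade, whose syntax is not reproduced here, and a dict stands in for the image; `load_delta` and `apply_delta` are hypothetical names. Refusing to apply on a 'before' mismatch is just a convenient way to trigger a failure in the sketch, not a statement about what DS does.

```python
import json


def load_delta(text):
    """Step 1: deserialize into plain objects. No image state is touched.

    JSON is a stand-in for the Tirade format in this sketch.
    """
    return [tuple(entry) for entry in json.loads(text)]


def apply_delta(image, delta):
    """Step 2: apply every change or none of them.

    All changes are staged on a copy; the image is only updated once the
    whole Delta has been staged without error.
    """
    staged = dict(image)
    for selector, before, after in delta:
        if staged.get(selector) != before:
            raise ValueError(f"before-state mismatch for {selector}")
        staged[selector] = after
    image.clear()
    image.update(staged)
```

If the second change of a Delta fails, the first one never becomes visible in the image, which is the property the atomic apply is meant to give.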

> But i think its too early to think about it.
> Or.. why too early?

Not too early at all, in fact almost too late. :)

> You already mentioned different ways to persist the deltas.. so
> a delta 'pointer' could be represented by a tuple: (adaptor , id)
>
> where adaptor could be: file, couch DB, url or anything else,
> and id is an additional info , identifying given delta.
>
> Or, just use a common denominator of all of this stuff - URI.
> I don't think that its hard to introduce a new URI, which identifies a
> Couch DB storage :)

The CouchDB API is purely HTTP restful. Each document in CouchDB is
accessed simply by a URL. Fits nicely in this area.

regards, Göran



Re: [squeak-dev] Re: Our process, some loose ideas regarding DS + MC

Igor Stasenko
2009/8/16 Göran Krampe <[hidden email]>:

> Hi!
>
> Igor Stasenko wrote:
>>
>> 2009/8/16 Göran Krampe <[hidden email]>:
>>>
>>> Hi!
>>>
>>> Igor Stasenko wrote:
>>>>
>>>> 2009/8/16 Andreas Raab <[hidden email]>:
>>>>>
>>>>> Which reminds me: Where are the deltas stored and how big is the space
>>>>> overhead for keeping them?
>>>>
>>>> i think most common overhead is 2x more space comparing to analoguous
>>>> changeset record, since you have to keep 'before' version in addition
>>>> to 'after' version. And most common change is changing the method(s)
>>>> source a little bit.
>>>
>>> Well, in fact it is more - a Changeset does in fact not hold a copy of
>>> the
>>> new method source - just a pointer to the method modified (IIRC).
>>
>> oh, i meant not in-image space, but storage space required for
>> interchange format.
>
> The domain model in DS is more or less done. Matthew implemented a file
> format but I am not pursuing that format for various reasons - I instead am
> working on hooking Tirade into DS to use Tirade as fileformat.
>
> For info about Tirade, see my blog articles:
>
> http://goran.krampe.se/blog/Squeak/Tirade.rdoc
> http://goran.krampe.se/blog/Squeak/Tirade2.rdoc
> http://goran.krampe.se/blog/Squeak/Tirade3.rdoc
>
> Basically Tirade looks like Smalltalk message sends so it will be in "low
> level syntax" similar to the chunk format used for Changesets but it does
> not use Compiler to load.
>

I am well aware of Tirade; other readers may not be, of course. :)

>>> Thus Changesets are sneaky beasts, most people get the wrong impression.
>>>
>> yes, and we could do same with deltas - use external storage for
>> keeping them in file(s), while in-image
>> they hold only a pointers.
>
> Outside of the image I would primarily use the Tirade format. In image a
> Delta does NOT use pointers to live code, and that is by design. A Delta is
> a "fully self contained" object graph with no outgoing references. It is
> totally independent of current image state.
>
> This is highly different from a Changeset. A Changeset is typically "filed
> in" and that means you change the image.
>

I have nothing against one data format or another. If it is designed well
to serve its purpose, no one will set you on fire for that. :)
My concern is different: the ability to 'offload' the bulk data from the
image to persistent storage and to reload it when needed.
This is what the .sources & .changes files do, with more or less success.
So, in this respect, do you see how we could improve the state of
affairs, taking my proposal into account? Let's call it a 'hibernated'
Delta: one which has offloaded its data from the image to some storage
and carries only a pointer to the resource holding all the information
needed to 'unhibernate' it. A URI/URL mechanism..
The same, btw, as Monticello does with its different repositories, e.g.
MCHttpRepository
    location: 'http://www.squeaksource.com/DeltaStreams'
    user: ''
    password: ''
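
A hibernated Delta could look roughly like the following Python sketch (illustrative only). `HibernatedDelta`, `hibernate` and `unhibernate` are hypothetical names, only `file:` URIs are handled, and JSON again stands in for the real serialization format.

```python
import json
from pathlib import Path
from urllib.parse import urlsplit
from urllib.request import url2pathname


class HibernatedDelta:
    """Stub left in memory: it holds only the URI of the offloaded data."""

    def __init__(self, uri):
        self.uri = uri


def hibernate(changes, directory, name):
    """Write the bulk data to a file and return a stub pointing at it."""
    path = Path(directory) / name
    path.write_text(json.dumps(changes), encoding="utf-8")
    return HibernatedDelta(path.resolve().as_uri())


def unhibernate(stub):
    """Reload the bulk data the stub points to (file: URIs only here)."""
    parts = urlsplit(stub.uri)
    if parts.scheme != "file":
        raise NotImplementedError(f"no adaptor for scheme {parts.scheme!r}")
    path = Path(url2pathname(parts.path))
    return json.loads(path.read_text(encoding="utf-8"))
```

Other storage kinds (HTTP, CouchDB and so on) would slot in as further branches on the URI scheme without changing the stub itself.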

> A Delta is handled in two steps: First you load it. This just means
> deserializing it from Tirade into an object. Then you apply it. This is the
> step that affects the image, atomically, using SystemEditor.
>
>> But i think its too early to think about it.
>> Or.. why too early?
>
> Not too early at all, in fact almost too late. :)
>
It's never too late to improve things :) But can you explain what you
have in mind?

>> You already mentioned different ways to persist the deltas.. so
>> a delta 'pointer' could be represented by a tuple: (adaptor , id)
>>
>> where adaptor could be: file, couch DB, url or anything else,
>> and id is an additional info , identifying given delta.
>>
>> Or, just use a common denominator of all of this stuff - URI.
>> I don't think that its hard to introduce a new URI, which identifies a
>> Couch DB storage :)
>
> The CouchDB API is purely HTTP restful. Each document in CouchDB is accessed
> simply by a URL. Fits nicely in this area.
>
Perfect.

> regards, Göran
>
>
>



--
Best regards,
Igor Stasenko AKA sig.


Re: [squeak-dev] Re: Our process, some loose ideas regarding DS + MC

Ralph Johnson
Göran Krampe explained how DS gets loaded in two phases.  First, the
Delta gets read from a file and turned into an object in the image, but
no classes are changed.  Second, the Delta is applied, i.e. it changes
the classes.

Igor Stasenko replied

> My concern is different - the ability to 'offload' the bulk data from
> image to persistent storage and be able to reload it
> back in case of need.
> This is what .source & .changes files serving for with more or less success.
> So, in this respect, do you see how we could improve the state of
> affairs , taking in account my proposal, that
> lets call it a 'hibernated' Delta (one which offloaded data from image
> to some storage) could carry a pointer to resource which holding all
> the information which needed to 'unhibernate' it. URI/URL mechanism..

Is DS going to replace .sources and .changes?  I did not think it was.
I thought it was competing more directly with .cs files and, to a
certain extent, with Monticello.    Thus, if you only want to load a
program and run it, you won't need to keep the Deltas around once you
have applied them.

-Ralph


Re: [squeak-dev] Re: Our process, some loose ideas regarding DS + MC

Göran Krampe
In reply to this post by Igor Stasenko
Igor Stasenko wrote:
> I am well aware about Tirade, other readers may not, of course. :)

Exactly. :)

> I have nothing against one or another data format. If it designed well
> to serve its purpose - no-one would put you on fire for that. :)
> My concern is different - the ability to 'offload' the bulk data from
> image to persistent storage and be able to reload it
> back in case of need.
> This is what .source & .changes files serving for with more or less success.
> So, in this respect, do you see how we could improve the state of
> affairs , taking in account my proposal, that
> lets call it a 'hibernated' Delta (one which offloaded data from image
> to some storage) could carry a pointer to resource which holding all
> the information which needed to 'unhibernate' it. URI/URL mechanism..

That seems like a good idea, a "smart stub" you mean? Would be
nice to have, yes.

> Same, btw, as Monticello doing with different repositories i.e.
> MCHttpRepository
>     location: 'http://www.squeaksource.com/DeltaStreams'
>     user: ''
>     password: ''
>
>> A Delta is handled in two steps: First you load it. This just means
>> deserializing it from Tirade into an object. Then you apply it. This is the
>> step that affects the image, atomically, using SystemEditor.
>>
>>> But i think its too early to think about it.
>>> Or.. why too early?
>> Not too early at all, in fact almost too late. :)
>>
> its never too late to improve things :) But can you explain what you
> have in mind?

I think we misunderstood each other: nothing has been done yet in the
area you describe, so it is not too late there. I was referring to the
domain model and Tirade.

>>> You already mentioned different ways to persist the deltas.. so
>>> a delta 'pointer' could be represented by a tuple: (adaptor , id)
>>>
>>> where adaptor could be: file, couch DB, url or anything else,
>>> and id is an additional info , identifying given delta.
>>>
>>> Or, just use a common denominator of all of this stuff - URI.
>>> I don't think that its hard to introduce a new URI, which identifies a
>>> Couch DB storage :)
>> The CouchDB API is purely HTTP restful. Each document in CouchDB is accessed
>> simply by a URL. Fits nicely in this area.
>>
> Perfect.

Yes.

regards, Göran

