Hi folks!
As we all know we are in the middle of a heated discussion about our release process etc. I haven't read all messages nor do I think that any of the two "sides" (Andreas and Keith basically) are 100% wrong nor 100% right.

Anyway, disregarding all that :) - I would like to do a sketch on an "ideal" daily process for development. I would appreciate this thread to NOT turn into the current "war" between "Trunk" and "Bob", ok?

MC is a very good tool and Keith/Matthew and others have turned it into an even better tool AFAIK using SystemEditor as a base etc. It does have its limitations though:

1. It tends to be more coarse granular than other tools. "commits" tend to be more seldom, at least when I work in it. Probably due to less stellar performance, might have been fixed.

2. It needs history to do its merge magic. Thus it doesn't play well between forks. It plays very well within a package or within a group of packages (a fork perhaps).

3. It is centered around packages defined by PackageInfo which more or less means "a group of class categories + class extensions". It does have MC configs now, and I haven't used them myself yet so I can't really comment.

The above three bits are different with Deltas.

Anyway, I am trying to envision how an MC based approach like "trunk" could work together with Deltastreams. DS is meant to replace Changesets and could complement MC in the above three departments (and more).

For example, we could create some kind of "commit tool" that operates above MC/DS which would use Delta recording (exactly like Changesets record today) to catch current image modifications and to offer a nice UI to select a subset of those modifications (or all), enter commit comment, classify it and push "commit".

This commit tool could have some checkboxes and drop down menus to classify this commit and also tell where to "send it":

- Also snapshot MC packages. This would cause an MC commit as well as a Delta to be produced. If not checked we only produce a Delta.

- Bug fix to send to proper streams based on touched PIs. The tool could then select proper Delta *stream* in which to publish the Delta based on what PIs it touches.

What would this achieve? Well, first of all we might be able to work more completely in Squeak. :) We would above all be able to get "cross fork pollination" using the Deltas that are being published alongside the MC snapshots. Pharo could be "listening" to our bug fix stream for example. And vice versa if Pharo decides to use this. And well, it opens up lots of other possibilities I think.

Anyway, would be interested in all feedback on this idea.

regards, Göran
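The stream selection described above ("select proper Delta *stream* ... based on what PIs it touches") could be sketched roughly as follows. This is only an illustration in Python, not DeltaStreams code: the `Delta` class, the `kind` classification, the stream names and the shape of `subscriptions` are all hypothetical, and the Streams part of DS does not exist yet.

```python
from dataclasses import dataclass, field


@dataclass
class Delta:
    """Hypothetical stand-in for a recorded Delta (not the real DS class)."""
    comment: str
    kind: str  # assumed classification chosen in the commit tool, e.g. "bugfix"
    touched_packages: set = field(default_factory=set)  # PackageInfo names


def select_streams(delta, subscriptions):
    """Return the sorted names of the streams that should receive the delta.

    `subscriptions` maps a stream name to a (kind, package) pair. A stream
    matches when the kinds agree and the delta touches that package.
    """
    return sorted(
        name
        for name, (kind, package) in subscriptions.items()
        if kind == delta.kind and package in delta.touched_packages
    )


# Hypothetical stream layout: one stream per (classification, package).
subscriptions = {
    "squeak-bugfix-Kernel": ("bugfix", "Kernel"),
    "squeak-bugfix-Collections": ("bugfix", "Collections"),
    "squeak-feature-Kernel": ("feature", "Kernel"),
}

fix = Delta("Fix off-by-one in copyFrom:to:", "bugfix", {"Collections"})
streams = select_streams(fix, subscriptions)
print(streams)
```

A fork such as Pharo "listening" to the bug fix stream would then simply be another consumer of the streams returned here.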
Hi Göran!
Göran Krampe wrote:
> Hi folks!
>
> ...
> MC is a very good tool and Keith/Matthew and others have turned it
> into an even better tool AFAIK using SystemEditor as a base etc. It
> does have its limitations though:
>
> 1. It tends to be more coarse granular than other tools. "commits"
> tend to be more seldom, at least when I work in it. Probably due to
> less stellar performance, might have been fixed.
>
> 2. It needs history to do its merge magic. Thus it doesn't play well
> between forks. It plays very well within a package or within a group
> of packages (a fork perhaps).
>
> 3. It is centered around packages defined by PackageInfo which more or
> less means "a group of class categories + class extensions". It does
> have MC configs now, and I haven't used them myself yet so I can't
> really comment.
>
> The above three bits are different with Deltas.
>
> ...
>
> What would this achieve? Well, first of all we might be able to work
> more completely in Squeak. :) We would above all be able to get "cross
> fork pollination" using the Deltas that are being published alongside
> the MC snapshots. Pharo could be "listening" to our bug fix stream for
> example. And vice versa if Pharo decides to use this. And well, it
> opens up lots of other possibilities I think.
>
> Anyway, would be interested in all feedback on this idea.
>
> regards, Göran

You're doing great work with Deltas! I agree that Deltas can fix the problems of both ChangeSets and MC, and that they might ease sharing code between forks. I want to support them in Cuis. And I hope they become a central piece of the update processes of Squeak and Pharo.

Cheers,
Juan Vuletich
In reply to this post by Göran Krampe
On 15/08/2009, at 10:10, Göran Krampe wrote:
> Hi folks!
>
> ...
>
> Anyway, would be interested in all feedback on this idea.

I'm a little disconnected from the Squeak world, so I have to ask: are DeltaStreams working, stable, and functional?

Cheers.

Giuseppe Luigi Punzi Ruiz
Blog: http://www.lordzealon.com
Twitter & Skype & GoogleTalk accounts: glpunzi
In reply to this post by Göran Krampe
Hi Göran -
It'd be easier to give feedback if you had something to try out ;-) Abstractly speaking this sounds all cool but we'll only know when we actually try it.

One thing that's not quite clear in this picture is where the "canonical" source for patches would be and how it would be produced. In most of what you wrote you've concentrated more on the harvesting aspect (i.e., being able to cherry-pick contributions from elsewhere) but how does an actual development flow look like?

Cheers,
  - Andreas

Göran Krampe wrote:
> Hi folks!
>
> ...
>
> Anyway, would be interested in all feedback on this idea.
>
> regards, Göran
In reply to this post by Giuseppe
Hi!
Giuseppe Luigi Punzi Ruiz wrote:
> I'm a little disconnected from the Squeak world, so I have to ask: are
> DeltaStreams working, stable, and functional?

Nope! :) But I think we have reached a point where we could use more people helping with it. And thus I am trying to get some interest and also try to discuss how it can fit together with MC.

Regarding the status and the words "working, stable and functional":

In short, DS consists of two parts: Deltas and Streams. The Streams part has not even been started and it would be awesome to get some help there. The Delta part is working to a very large extent but is not yet ready to be used. Deltas do however support all kinds of source code changes (AFAIK) with full revert ability. So the domain model is perhaps 98% complete and there are lots of tests so the stuff that is there is quite stable.

So kinda working and kinda stable :). The functional bit on the other hand suffers because of two missing pieces that we really need:

1. External file format. Working on it.

2. Tools with UI. Will start working on that soon.

Also, working on #1 I am refactoring, simplifying and cleaning, so the code is a bit in flux.

regards, Göran
In reply to this post by Göran Krampe
Göran Krampe wrote:
> Hi folks!
>
> As we all know we are in the middle of a heated discussion about our
> release process etc. I haven't read all messages nor do I think that
> any of the two "sides" (Andreas and Keith basically) are 100% wrong
> nor 100% right.

There are not two sides to this; there is one side: the board either acted rightly or wrongly. When that issue has been sorted out then you can debate the details of the actual proposals.

Currently when you discuss this with board members, the response, due to the fact that new interest has been generated in making patches, is that "the end justifies the means". Once I pointed out that the same ends could have been achieved by different means, and that we already had 98 patches manually included in 3.11 as of December last year, and that with the automatic process we anticipated harvesting about 300 per release cycle (i.e. we already had more patches than we knew what to do with), Randal's interest returned and a civil, positive conversation ensued.

I still think that the board is supposed to be hands off and not boots on. I still think that everyone should be treated fairly. If Andreas wants to volunteer to be on the release team then that is cool, but to purposefully initiate a process that is destined to cause conflict is really out of order.

Igor was at pains to point out that he is contributing as a volunteer to the DeltaStreams initiative - fantastic.

Someone famous once said that if you want to be a leader then to do so you must be a servant to all. This is what I have been trying to do all along, but even that famous person ran into trouble when faced with an autocratic regime.

Keith
In reply to this post by Andreas.Raab
Hi!
Andreas Raab wrote:
> Hi Göran -
>
> It'd be easier to give feedback if you had something to try out ;-)

I know, it is a bit unfair of me to "push" it given that it is not yet usable. I just want to perhaps find someone to help :). So forgive me for the "DS spam".

> Abstractly speaking this sounds all cool but we'll only know when we
> actually try it.

Yup.

> One thing that's not quite clear in this picture is
> where the "canonical" source for patches would be and how it would be

Not sure I understand what you mean here.

> produced. In most of what you wrote you've concentrated more on the
> harvesting aspect (i.e., being able to cherry-pick contributions from
> elsewhere) but how does an actual development flow look like?

Well, I can't really say although I have some ideas, let me again mention some things Deltas should make possible and then we could try to think together what that means for the "flow":

- Deltas can be loaded into an image "en masse" because loading a Delta simply deserializes it into a fully self contained object. This means we can read, edit and analyse them without affecting the image.

- Deltas can be applied and unapplied.

- Deltas can "reason" about how well they can be applied. This is because each change within the Delta has information about the state "before" the change. For example, a ModifyMethodChange has both new and old source (and stamps etc).

Some things we already know:

- Deltas in fact implement a more advanced changelog. It could be used to replace the .changes file.

- Since they can be applied/unapplied Deltas can easily be used to build a "Quilt"-like system.

- Deltas are "rich" in state and can be analysed/manipulated in a lot of different ways.

So... post is getting long. :) I am all ears for observations/ideas at this point. And yes, I will try to get DS into such a shape that we can actually start playing more seriously with them.

Fully replacing Changesets would be a great start IMHO. To do that we need fileformat and UI. And a few other odd bits, but not much.

regards, Göran
Göran Krampe wrote:
>> One thing that's not quite clear in this picture is where the
>> "canonical" source for patches would be and how it would be
>
> Not sure I understand what you mean here.

Simply the question of where/how DS are stored. Monticello has repositories that store full source for a package. As a consequence packages are large, but they allow you to see what's different in your image, what a patch does, tracking history etc. You can agree on a place and say "this is THE repository for package X" etc. I'm wondering where DS stand with regards to these issues.

>> produced. In most of what you wrote you've concentrated more on the
>> harvesting aspect (i.e., being able to cherry-pick contributions from
>> elsewhere) but how does an actual development flow look like?
>
> Well, I can't really say although I have some ideas, let me again
> mention some things Deltas should make possible and then we could try to
> think together what that means for the "flow":
>
> - Deltas can be loaded into an image "en masse" because loading a Delta
> simply deserializes it into a fully self contained object. This means we
> can read, edit and analyse them without affecting the image.
>
> - Deltas can be applied and unapplied.
>
> - Deltas can "reason" about how well they can be applied. This is
> because each change within the Delta has information about the state
> "before" the change. For example, a ModifyMethodChange has both new and
> old source (and stamps etc).

That's *very* useful. One of my favorite features when using MC is that it can tell us if there is a conflict in a merge and that this method requires special attention. If DS can do something similar by telling us that the base version of a method is different from when the DS was created this will be hugely helpful.

> Some things we already know:
>
> - Deltas in fact implement a more advanced changelog. It could be used
> to replace the .changes file.
>
> - Since they can be applied/unapplied Deltas can easily be used to build
> a "Quilt"-like system.

I don't know what that is. Any references?

> - Deltas are "rich" in state and can be analysed/manipulated in a lot of
> different ways.
>
> So... post is getting long. :) I am all ears for observations/ideas at
> this point. And yes, I will try to get DS into such a shape that we can
> actually start playing more seriously with them.
>
> Fully replacing Changesets would be a great start IMHO. To do that we
> need fileformat and UI. And a few other odd bits, but not much.

Keep me in the loop if you need help. I probably won't be touching the domain model and stuff but if you need help on the surface and for workflows I might be able to help out.

Cheers,
  - Andreas
Hi!
Andreas Raab wrote:
> Göran Krampe wrote:
>>> One thing that's not quite clear in this picture is where the
>>> "canonical" source for patches would be and how it would be
>>
>> Not sure I understand what you mean here.
>
> Simply the question of where/how DS are stored. Monticello has
> repositories that store full source for a package. As a consequence
> packages are large, but they allow you to see what's different in your
> image, what a patch does, tracking history etc. You can agree on a place
> and say "this is THE repository for package X" etc. I'm wondering where
> DS stand with regards to these issues.

Since the Stream part is not yet started I am all ears. But apart from the obvious fact that Deltas can serve the same roles as Changesets can (only better), I was envisioning chronological "streams" (as in continuous flows) of Deltas associated with individual developers, forks, packages and branches of packages etc.

For example, I would gladly suck up all Deltas people produce that touch any of my packages. :) Thus, if people would submit their Deltas to some huge searchable archive - or if we had at least some simple "upstream" hooked to packages so that when people make fixes they can easily just shoot them "upstream" at me, it would be great.

>>> produced. In most of what you wrote you've concentrated more on the
>>> harvesting aspect (i.e., being able to cherry-pick contributions from
>>> elsewhere) but how does an actual development flow look like?
>>
>> Well, I can't really say although I have some ideas, let me again
>> mention some things Deltas should make possible and then we could try
>> to think together what that means for the "flow":
>>
>> - Deltas can be loaded into an image "en masse" because loading a
>> Delta simply deserializes it into a fully self contained object. This
>> means we can read, edit and analyse them without affecting the image.
>>
>> - Deltas can be applied and unapplied.
>>
>> - Deltas can "reason" about how well they can be applied. This is
>> because each change within the Delta has information about the state
>> "before" the change. For example, a ModifyMethodChange has both new
>> and old source (and stamps etc).
>
> That's *very* useful. One of my favorite features when using MC is that
> it can tell us if there is a conflict in a merge and that this method
> requires special attention. If DS can do something similar by telling us
> that the base version of a method is different from when the DS was
> created this will be hugely helpful.

This is in fact the *core idea*. The idea came about after watching Linus thoughts on git and to think about how MC and most SCMs work. They all get their "merge magic" from extensive knowledge of history to a common base. But that is something we don't have between forks.

Thus, could we get 80% of magic using simpler tricks? The trick is to let the Changeset contain more info - especially info about the "before" state. This of course both enables unapply, but more essentially it enables much smarter apply-logic.

A "perfectly clean" Delta is one that sees the exact same "before" state as it captured when it was created in the source image. A "dirty" Delta is one that can be applied (classes are not missing etc) but it will overwrite for example methods that look different than expected.

Also, a Delta can do other smart things since it has captured class definition changes in more detail - say you add an ivar "c" and when it checks the destination image it finds other additional ivars but no "c" - then it can merge by just adding "c". Nice eh? :)

>> Some things we already know:
>>
>> - Deltas in fact implement a more advanced changelog. It could be used
>> to replace the .changes file.
>>
>> - Since they can be applied/unapplied Deltas can easily be used to
>> build a "Quilt"-like system.
>
> I don't know what that is. Any references?

Well, Mercurial queues are similar, see:

http://mercurial.selenic.com/wiki/MqExtension#Introduction

>> - Deltas are "rich" in state and can be analysed/manipulated in a lot
>> of different ways.
>>
>> So... post is getting long. :) I am all ears for observations/ideas at
>> this point. And yes, I will try to get DS into such a shape that we
>> can actually start playing more seriously with them.
>>
>> Fully replacing Changesets would be a great start IMHO. To do that we
>> need fileformat and UI. And a few other odd bits, but not much.
>
> Keep me in the loop if you need help. I probably won't be touching the
> domain model and stuff but if you need help on the surface and for
> workflows I might be able to help out.

I will get back as soon as I have something to show.

regards, Göran
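The "clean" versus "dirty" distinction in the post above can be illustrated with a small sketch. It is written in Python only for illustration and is not the DS implementation: the image is modelled as a plain dictionary from (class, selector) to source, and everything except the name ModifyMethodChange (which is mentioned in the thread) is an assumption, including the "missing" status.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModifyMethodChange:
    """Toy model of a method change that carries both old and new source."""
    class_name: str
    selector: str
    old_source: str  # the "before" state captured in the source image
    new_source: str

    def key(self):
        return (self.class_name, self.selector)

    def status(self, image):
        """Compare the captured "before" state with the destination image."""
        current = image.get(self.key())
        if current is None:
            return "missing"  # nothing there to modify
        return "clean" if current == self.old_source else "dirty"

    def apply(self, image):
        image[self.key()] = self.new_source

    def unapply(self, image):
        image[self.key()] = self.old_source


change = ModifyMethodChange("Point", "x", "x ^ x", "x ^ x ifNil: [0]")
same_image = {("Point", "x"): "x ^ x"}
forked_image = {("Point", "x"): "x ^ self basicX"}

print(change.status(same_image))    # matches the captured "before" state
print(change.status(forked_image))  # method exists but looks different
```

No shared history is consulted anywhere: the check relies only on the "before" source stored in the change itself, which is the point made in the post.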
Göran Krampe wrote:
>> Simply the question of where/how DS are stored. Monticello has
>> repositories that store full source for a package. As a consequence
>> packages are large, but they allow you to see what's different in your
>> image, what a patch does, tracking history etc. You can agree on a
>> place and say "this is THE repository for package X" etc. I'm
>> wondering where DS stand with regards to these issues.
>
> Since the Stream part is not yet started I am all ears. But apart from
> the obvious fact that Deltas can serve the same roles as Changesets can
> (only better), I was envisioning chronological "streams" (as in
> continuous flows) of Deltas associated with individual developers,
> forks, packages and branches of packages etc.

A thing to keep in mind is that there is huge value in having a definitive version of the code that you can compare against somewhere. Being able to say "oh, these are my local changes" is a large part of what makes working with MC superior to working with change sets. So it would be really useful if one could say "compared with this repository, you have applied delta x, y, and z, and in addition you have modified methods foo and bar".

>> That's *very* useful. One of my favorite features when using MC is
>> that it can tell us if there is a conflict in a merge and that this
>> method requires special attention. If DS can do something similar by
>> telling us that the base version of a method is different from when
>> the DS was created this will be hugely helpful.
>
> This is in fact the *core idea*. The idea came about after watching
> Linus thoughts on git and to think about how MC and most SCMs work. They
> all get their "merge magic" from extensive knowledge of history to a
> common base. But that is something we don't have between forks.

Why not? Actually we do. MC will search any repository you add and if it finds any common ancestor in any of the repositories it will use that. I've done some pretty extensive merges that way and the only thing MC lacks in this area is explicit support for cherry-picking (i.e., to accept or reject changes even when they don't conflict).

> Thus, could we get 80% of magic using simpler tricks? The trick is to
> let the Changeset contain more info - especially info about the "before"
> state. This of course both enables unapply, but more essentially it
> enables much smarter apply-logic.

Interesting. I had thought that keeping the stamp or a hash would be enough but you're right, having the actual previous version does allow you to trivially revert to it. Very clever.

> Also, a Delta can do other smart things since it has captured class
> definition changes in more detail - say you add an ivar "c" and when it
> checks the destination image it finds other additional ivars but no "c"
> - then it can merge by just adding "c". Nice eh? :)

Nice. But more a sign of weakness in Monticello ;-) But it's definitely a good idea since this can cover additions that come from "other" deltas. Which reminds me: Where are the deltas stored and how big is the space overhead for keeping them?

Cheers,
  - Andreas
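The ivar example quoted above (add "c" even though the destination class has gained other ivars) might look something like this. It is a hedged Python sketch, not how DS or Monticello actually handle class definitions; `merge_added_ivars` and its list-of-names representation are hypothetical.

```python
def merge_added_ivars(before, after, destination):
    """Add the ivars a delta introduced, keeping the destination's own extras.

    `before` and `after` are the ivar names captured by the delta in the
    source image; `destination` is the ivar list found in the target image.
    """
    added = [name for name in after if name not in before]
    return list(destination) + [name for name in added if name not in destination]


# The delta recorded "a b" -> "a b c"; the destination meanwhile gained "d".
merged = merge_added_ivars(["a", "b"], ["a", "b", "c"], ["a", "b", "d"])
print(merged)
```

Because the delta records the addition itself instead of a full replacement definition, the extra ivar "d" that came from some "other" delta survives the merge.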
2009/8/16 Andreas Raab <[hidden email]>:
> Göran Krampe wrote:
> ...
>> Also, a Delta can do other smart things since it has captured class
>> definition changes in more detail - say you add an ivar "c" and when it
>> checks the destination image it finds other additional ivars but no "c" -
>> then it can merge by just adding "c". Nice eh? :)
>
> Nice. But more a sign of weakness in Monticello ;-) But it's definitely a
> good idea since this can cover additions that come from "other" deltas.
> Which reminds me: Where are the deltas stored and how big is the space
> overhead for keeping them?

[...] changeset record, since you have to keep the 'before' version in addition to the 'after' version. And the most common change is changing the method(s) source a little bit.

Oh, and did Göran mention that Deltas could also be 'compressed' in the same way as a changeset? I mean you have versions A -> ... -> Z in uncompressed form, but you may decide to obliterate the intermediate versions (...), so the final delta will be A -> Z. You lose the ability to roll back to any intermediate version, but you can still roll back/forward between A <-> Z.

--
Best regards,
Igor Stasenko AKA sig.
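The 'compression' Igor describes (collapsing A -> ... -> Z into A -> Z) can be sketched in a few lines. This is an illustrative Python sketch under the assumption that a change is just a (key, old, new) triple; the `compress` function is hypothetical and not part of DS.

```python
def compress(changes):
    """Collapse a chronological list of (key, old, new) changes.

    For every key, keep the first recorded "before" state and the last
    recorded "after" state, dropping all intermediate versions.
    """
    first_old = {}
    last_new = {}
    for key, old, new in changes:
        first_old.setdefault(key, old)  # only the earliest "before" is kept
        last_new[key] = new             # later changes overwrite the "after"
    return [(key, first_old[key], last_new[key]) for key in first_old]


history = [
    ("Point>>x", "A", "B"),
    ("Point>>y", "P", "Q"),
    ("Point>>x", "B", "Z"),
]
compressed = compress(history)
print(compressed)
```

The compressed delta can still be applied or unapplied as a whole, but the intermediate version "B" is gone, which matches the trade-off described above.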
In reply to this post by Andreas.Raab
Hi!
Andreas Raab wrote:
> Göran Krampe wrote:
>>> Simply the question of where/how DS are stored. Monticello has
>>> repositories that store full source for a package. As a consequence
>>> packages are large, but they allow you to see what's different in
>>> your image, what a patch does, tracking history etc. You can agree on
>>> a place and say "this is THE repository for package X" etc. I'm
>>> wondering where DS stand with regards to these issues.
>>
>> Since the Stream part is not yet started I am all ears. But apart from
>> the obvious fact that Deltas can serve the same roles as Changesets
>> can (only better), I was envisioning chronological "streams" (as in
>> continuous flows) of Deltas associated with individual developers,
>> forks, packages and branches of packages etc.
>
> A thing to keep in mind is that there is huge value in having a
> definitive version of the code that you can compare against somewhere.
> Being able to say "oh, these are my local changes" is a large part of
> what makes working with MC superior to working with change sets. So it
> would be really useful if one could say "compared with this repository,
> you have applied delta x, y, and z, and in addition you have modified
> methods foo and bar".

Yes, I agree - and DS was not meant to replace MC. MC does snapshots and maintains their history and DS captures "developer changes" in a fine granular fashion. But a combo of MC and DS would probably be very interesting.

>>> That's *very* useful. One of my favorite features when using MC is
>>> that it can tell us if there is a conflict in a merge and that this
>>> method requires special attention. If DS can do something similar by
>>> telling us that the base version of a method is different from when
>>> the DS was created this will be hugely helpful.
>>
>> This is in fact the *core idea*. The idea came about after watching
>> Linus thoughts on git and to think about how MC and most SCMs work.
>> They all get their "merge magic" from extensive knowledge of history
>> to a common base. But that is something we don't have between forks.
>
> Why not? Actually we do. MC will search any repository you add and if it
> finds any common ancestor in any of the repositories it will use that.

Yes, I know. But I still think we will end up with situations where the forks don't share enough history in order to do this. I may be wrong.

> I've done some pretty extensive merges that way and the only thing MC
> lacks in this area is explicit support for cherry-picking (i.e., to
> accept or reject changes even when they don't conflict).
>
>> Thus, could we get 80% of magic using simpler tricks? The trick is to
>> let the Changeset contain more info - especially info about the
>> "before" state. This of course both enables unapply, but more
>> essentially it enables much smarter apply-logic.
>
> Interesting. I had thought that keeping the stamp or a hash would be
> enough but you're right, having the actual previous version does allow
> you to trivially revert to it. Very clever.

Also, a "perfect revert" is only possible if the Delta was "perfectly clean" when being applied. BUT... the cool trick is that if you are applying a Delta which is NOT perfectly clean (let's say a method being changed does not match the "before" state) then you just record a NEW Delta when applying the Delta and that new Delta will be able to do a perfect revert.

One interesting aspect here is that say you record a Delta with one single change - a class delete. The Delta will then create a composite change which contains all changes needed to recreate that class (class creation, method additions etc). So if you then load that into a different image it can check that the class to be deleted is exactly the same as it was in the source image.
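To make the "before state" trick concrete, here is a minimal sketch in Python. It is not DeltaStreams code: the image is modelled as a plain dictionary from selector to source, and `MethodChange` and `apply_change` are hypothetical names invented for the illustration. It shows an apply that detects an unclean "before" state and records a new change whose inverse still gives a perfect revert.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MethodChange:
    """Hypothetical stand-in for one method edit that remembers what it replaced."""
    selector: str
    before: Optional[str]  # None means the method did not exist
    after: Optional[str]   # None means the method is removed

    def inverse(self) -> "MethodChange":
        return MethodChange(self.selector, self.after, self.before)


def apply_change(image: dict, change: MethodChange):
    """Apply change to image and return (clean, recorded).

    clean is True when the image matched the recorded 'before' state.
    recorded captures what was really overwritten, so its inverse restores
    the image exactly even when the apply was not clean.
    """
    actual = image.get(change.selector)
    clean = actual == change.before
    recorded = MethodChange(change.selector, actual, change.after)
    if change.after is None:
        image.pop(change.selector, None)
    else:
        image[change.selector] = change.after
    return clean, recorded


# The destination image has drifted: 'foo' is not what the change expects.
image = {"foo": "foo ^1"}
fix = MethodChange("foo", before="foo ^0", after="foo ^2")

clean, recorded = apply_change(image, fix)
after_apply = dict(image)
apply_change(image, recorded.inverse())  # perfect revert of the unclean apply
```

The only design point is that the change recorded at apply time, not the incoming one, is what gets kept for undo.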
>> Also, a Delta can do other smart things since it has captured class
>> definition changes in more detail - say you add an ivar "c" and when
>> it checks the destination image it finds other additional ivars but no
>> "c" - then it can merge by just adding "c". Nice eh? :)
>
> Nice. But more a sign of weakness in Monticello ;-) But it's definitely

Yes, but since DS captures "developer actions" it can in theory capture more info than MC can. For example, a class rename is captured as a class rename change. It can never be confused for a remove class + add another class.

> a good idea since this can cover additions that come from "other"
> deltas. Which reminds me: Where are the deltas stored and how big is the
> space overhead for keeping them?

You mean in the image? Since a Delta is a "fully self contained" object with no references to anything outside it and only contains "simple data" - we are quite free to do what we want with them. We could for example store them in a database and easily load/edit/save/search them.

Some simple file based repository scheme is of course needed. Perhaps just some kind of file naming convention to get a sort order.

One small idea I have is to perhaps use CouchDB as a repository option, would fit quite well and since CouchDB has inter db replication built in we could get a very nice base for hooking our streams together into larger and larger streams all the way to the sea :)

I have not measured the in image space overhead. Tirade that I am hooking in is quite fast in loading them:

http://goran.krampe.se/blog/Squeak/Tirade2.rdoc

Sidenote: One other strong advantage of using DS instead of changesets is that the DS default applier is SystemEditor which makes applying fully atomic.

regards, Göran
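The ivar example from this message can be sketched as follows. This is only an illustration in Python under assumed names (`IvarsChange`, `merge_ivars`), not how DeltaStreams represents class definition changes: the change records the ivar list before and after, and the merge replays only the difference onto whatever the destination class currently has.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IvarsChange:
    """Hypothetical record of a class's instance variables before and after an edit."""
    before: tuple
    after: tuple


def merge_ivars(current, change):
    """Replay only what the developer did (adds/removes) onto the current ivars."""
    added = [v for v in change.after if v not in change.before]
    removed = [v for v in change.before if v not in change.after]
    merged = [v for v in current if v not in removed]
    merged += [v for v in added if v not in merged]
    return merged


# Source image went from (a b) to (a b c); the destination already has x and y.
change = IvarsChange(before=("a", "b"), after=("a", "b", "c"))
merged = merge_ivars(["a", "b", "x", "y"], change)
```

Here `merged` keeps the destination's extra ivars and simply gains "c".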
In reply to this post by Igor Stasenko
Hi!
Igor Stasenko wrote:
> 2009/8/16 Andreas Raab <[hidden email]>:
>> Which reminds me: Where are the deltas stored and how big is the space
>> overhead for keeping them?
>
> i think most common overhead is 2x more space comparing to analoguous
> changeset record, since you have to keep 'before' version in addition
> to 'after' version. And most common change is changing the method(s)
> source a little bit.

Well, in fact it is more - a Changeset does in fact not hold a copy of the new method source - just a pointer to the method modified (IIRC). Thus Changesets are sneaky beasts, most people get the wrong impression.

> Oh, and did Goran mentioned that Deltas could be also 'compressed' in
> same way as changeset?

Yes, I have called it "normalized" but compressed may be a better term.

That code is not yet finished though, but a fun coding task for anyone interested. :) There are two scenarios:

- Compressing a single Delta into its non redundant form. This makes it look like a Changeset. Before compression it is a "perfect" log.

- Compression of multiple Deltas into one. This one is trivial once you implemented first one.

For bug fixes and patches I believe one would naturally want to compress before distributing of course.

regards, Göran
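The two compression scenarios can be sketched in a few lines. This is a Python illustration under assumptions, not the unfinished DS code: a Delta is modelled as a list of hypothetical `Change` records, each holding a selector with its before and after source. Compressing keeps the first "before" and the last "after" per selector (A -> ... -> Z becomes A -> Z) and drops changes that cancel out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    """Hypothetical method change; None means 'method absent'."""
    selector: str
    before: Optional[str]
    after: Optional[str]


def compress(changes):
    """Collapse a full log into its non-redundant form, one change per selector."""
    first_before = {}
    last_after = {}
    for c in changes:
        first_before.setdefault(c.selector, c.before)  # keep the earliest 'before'
        last_after[c.selector] = c.after               # keep the latest 'after'
    return [
        Change(sel, first_before[sel], last_after[sel])
        for sel in first_before
        if first_before[sel] != last_after[sel]        # drop net no-ops
    ]


def compress_many(deltas):
    """Several Deltas in chronological order: concatenate, then compress once."""
    return compress([c for delta in deltas for c in delta])


log = [
    Change("foo", "A", "B"),
    Change("bar", None, "X"),   # method added ...
    Change("foo", "B", "Z"),
    Change("bar", "X", None),   # ... and removed again
]
compressed = compress(log)
combined = compress_many([[Change("foo", "A", "B")], [Change("foo", "B", "Z")]])
```

The second function is a one-liner on top of the first, which matches the remark that compressing multiple Deltas is trivial once the single-Delta case exists.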
2009/8/16 Göran Krampe <[hidden email]>:
> Hi!
>
> Igor Stasenko wrote:
>>
>> 2009/8/16 Andreas Raab <[hidden email]>:
>>>
>>> Which reminds me: Where are the deltas stored and how big is the space
>>> overhead for keeping them?
>>
>> i think most common overhead is 2x more space comparing to analoguous
>> changeset record, since you have to keep 'before' version in addition
>> to 'after' version. And most common change is changing the method(s)
>> source a little bit.
>
> Well, in fact it is more - a Changeset does in fact not hold a copy of the
> new method source - just a pointer to the method modified (IIRC).
>

Oh, I meant not in-image space, but the storage space required for the interchange format.

> Thus Changesets are sneaky beasts, most people get the wrong impression.
>

Yes, and we could do the same with deltas - use external storage for keeping them in file(s), while in the image they hold only pointers. But I think it's too early to think about it.
Or... why too early?
You already mentioned different ways to persist the deltas, so a delta 'pointer' could be represented by a tuple: (adaptor, id)

where adaptor could be: file, CouchDB, URL or anything else, and id is additional info identifying the given delta.

Or, just use a common denominator of all of this stuff - a URI.
I don't think that it's hard to introduce a new URI which identifies a CouchDB storage :)

>> Oh, and did Goran mentioned that Deltas could be also 'compressed' in
>> same way as changeset?
>
> Yes, I have called it "normalized" but compressed may be a better term.
>
> That code is not yet finished though, but a fun coding task for anyone
> interested. :) There are two scenarios:
>
> - Compressing a single Delta into its non redundant form. This makes it look
> like a Changeset. Before compression it is a "perfect" log.
>
> - Compression of multiple Deltas into one. This one is trivial once you
> implemented first one.
>
> For bug fixes and patches I believe one would naturally want to compress
> before distributing of course.
>

Right

> regards, Göran
>

--
Best regards,
Igor Stasenko AKA sig.
2009/8/16 Igor Stasenko <[hidden email]>:
> 2009/8/16 Göran Krampe <[hidden email]>:
>> Hi!
>>
>> Igor Stasenko wrote:
>>>
>>> 2009/8/16 Andreas Raab <[hidden email]>:
>>>>
>>>> Which reminds me: Where are the deltas stored and how big is the space
>>>> overhead for keeping them?
>>>
>>> i think most common overhead is 2x more space comparing to analoguous
>>> changeset record, since you have to keep 'before' version in addition
>>> to 'after' version. And most common change is changing the method(s)
>>> source a little bit.
>>
>> Well, in fact it is more - a Changeset does in fact not hold a copy of the
>> new method source - just a pointer to the method modified (IIRC).
>>
> oh, i meant not in-image space, but storage space required for
> interchange format.
>
>> Thus Changesets are sneaky beasts, most people get the wrong impression.
>>
> yes, and we could do same with deltas - use external storage for
> keeping them in file(s), while in-image
> they hold only a pointers.
> But i think its too early to think about it.
> Or.. why too early?
> You already mentioned different ways to persist the deltas.. so
> a delta 'pointer' could be represented by a tuple: (adaptor , id)
>
> where adaptor could be: file, couch DB, url or anything else,
> and id is an additional info , identifying given delta.
>
> Or, just use a common denominator of all of this stuff - URI.
> I don't think that its hard to introduce a new URI, which identifies a
> Couch DB storage :)
>

Thinking a bit more about it, I think it's a great thing which we should employ early!
Consider exchanging between people not with files, but with URLs:

http://my.site.foo.bar/squeak/myfixToSomeStuff.ds#100

or:

ftp://usr:[hidden email]/myfixToSomeStuff.ds#1234

and locally, one of course could use:

file:///usr/home/geek/squeak/myfixToSomeStuff.ds#1234

and so on.
Simple and powerful, and no early binding to any kind of specific storage currently existing in the world.
--
Best regards,
Igor Stasenko AKA sig.
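The (adaptor, id) tuple and the URL form are the same information, which a short Python sketch can show. It assumes, as the example URLs above suggest, that the URL scheme plays the role of the adaptor and the fragment after `#` is the delta id; `DeltaRef` and `parse_delta_ref` are hypothetical names, and only the standard `urllib.parse.urlsplit` is used.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class DeltaRef(NamedTuple):
    """Hypothetical 'pointer' to a stored Delta."""
    adaptor: str    # URL scheme: file, http, ftp, ...
    location: str   # where the delta file or document lives
    delta_id: str   # fragment identifying one delta at that location


def parse_delta_ref(uri: str) -> DeltaRef:
    parts = urlsplit(uri)
    return DeltaRef(parts.scheme, parts.netloc + parts.path, parts.fragment)


local = parse_delta_ref("file:///usr/home/geek/squeak/myfixToSomeStuff.ds#1234")
remote = parse_delta_ref("http://my.site.foo.bar/squeak/myfixToSomeStuff.ds#100")
```

A repository layer could then dispatch on `adaptor` without the Delta itself knowing anything about the storage behind it.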
In reply to this post by Göran Krampe
Hi Göran,
Göran Krampe wrote:
> ...
> Yes, I agree - and DS was not meant to replace MC. MC does snapshots
> and maintains their history and DS captures "developer changes" in a
> fine granular fashion. But a combo of MC and DS would probably be very
> interesting.

If possible try to keep DS usable even without MC. Some people might choose to use DS and not MC...

>>>> That's *very* useful. One of my favorite features when using MC is
>>>> that it can tell us if there is a conflict in a merge and that this
>>>> method requires special attention. If DS can do something similar
>>>> by telling us that the base version of a method is different from
>>>> when the DS was created this will be hugely helpful.
>>>
>>> This is in fact the *core idea*. The idea came about after watching
>>> Linus thoughts on git and to think about how MC and most SCMs work.
>>> They all get their "merge magic" from extensive knowledge of history
>>> to a common base. But that is something we don't have between forks.
>>
>> Why not? Actually we do. MC will search any repository you add and if
>> it finds any common ancestor in any of the repositories it will use
>> that.
>
> Yes, I know. But I still think we will end up with situations where
> the forks don't share enough history in order to do this. I may be wrong.

I agree with you. For example Cuis is not based on MC packages. So having "merge magic" without needing a common ancestor in MC would be wonderful.

Cheers,
Juan Vuletich
In reply to this post by Igor Stasenko
Hi!
Igor Stasenko wrote:
> 2009/8/16 Göran Krampe <[hidden email]>:
>> Hi!
>>
>> Igor Stasenko wrote:
>>> 2009/8/16 Andreas Raab <[hidden email]>:
>>>> Which reminds me: Where are the deltas stored and how big is the space
>>>> overhead for keeping them?
>>> i think most common overhead is 2x more space comparing to analoguous
>>> changeset record, since you have to keep 'before' version in addition
>>> to 'after' version. And most common change is changing the method(s)
>>> source a little bit.
>> Well, in fact it is more - a Changeset does in fact not hold a copy of the
>> new method source - just a pointer to the method modified (IIRC).
>
> oh, i meant not in-image space, but storage space required for
> interchange format.

The domain model in DS is more or less done. Matthew implemented a file format but I am not pursuing that format for various reasons - I instead am working on hooking Tirade into DS to use Tirade as fileformat.

For info about Tirade, see my blog articles:

http://goran.krampe.se/blog/Squeak/Tirade.rdoc
http://goran.krampe.se/blog/Squeak/Tirade2.rdoc
http://goran.krampe.se/blog/Squeak/Tirade3.rdoc

Basically Tirade looks like Smalltalk message sends so it will be in "low level syntax" similar to the chunk format used for Changesets but it does not use Compiler to load.

>> Thus Changesets are sneaky beasts, most people get the wrong impression.
>>
> yes, and we could do same with deltas - use external storage for
> keeping them in file(s), while in-image
> they hold only a pointers.

Outside of the image I would primarily use the Tirade format. In image a Delta does NOT use pointers to live code, and that is by design. A Delta is a "fully self contained" object graph with no outgoing references. It is totally independent of current image state.

This is highly different from a Changeset. A Changeset is typically "filed in" and that means you change the image.

A Delta is handled in two steps: First you load it. This just means deserializing it from Tirade into an object. Then you apply it. This is the step that affects the image, atomically, using SystemEditor.

> But i think its too early to think about it.
> Or.. why too early?

Not too early at all, in fact almost too late. :)

> You already mentioned different ways to persist the deltas.. so
> a delta 'pointer' could be represented by a tuple: (adaptor , id)
>
> where adaptor could be: file, couch DB, url or anything else,
> and id is an additional info , identifying given delta.
>
> Or, just use a common denominator of all of this stuff - URI.
> I don't think that its hard to introduce a new URI, which identifies a
> Couch DB storage :)

The CouchDB API is purely HTTP restful. Each document in CouchDB is accessed simply by a URL. Fits nicely in this area.

regards, Göran
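The load/apply split can be sketched like this. Everything here is a stand-in chosen for the illustration: JSON takes the place of Tirade, a dictionary takes the place of the image, and building the result on a copy takes the place of SystemEditor's atomic apply. The sketch also chooses to reject a Delta whose "before" state does not match, purely so that there is a failure to demonstrate; that strictness is an assumption, not DS policy.

```python
import json


def load_delta(text):
    """Step 1: deserialize only. Nothing in the image is touched."""
    return json.loads(text)


def apply_delta(image, delta):
    """Step 2: return a new image with every change applied, or raise and change nothing."""
    staged = dict(image)  # work on a copy so a failure leaves 'image' intact
    for change in delta["changes"]:
        selector = change["selector"]
        if staged.get(selector) != change["before"]:
            raise ValueError("before-state mismatch: " + selector)
        if change["after"] is None:
            staged.pop(selector, None)
        else:
            staged[selector] = change["after"]
    return staged


text = json.dumps({"changes": [
    {"selector": "foo", "before": "foo ^1", "after": "foo ^2"},
    {"selector": "bar", "before": "bar ^0", "after": "bar ^9"},
]})

delta = load_delta(text)                       # image state is irrelevant here
drifted = {"foo": "foo ^1", "bar": "bar ^7"}   # 'bar' does not match
try:
    apply_delta(drifted, delta)
    failed = False
except ValueError:
    failed = True                              # 'foo' was not half-applied

clean = apply_delta({"foo": "foo ^1", "bar": "bar ^0"}, delta)
```

Because loading never looks at the image, a loaded Delta can be inspected, edited or stored before anyone decides to apply it.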
2009/8/16 Göran Krampe <[hidden email]>:
> Hi!
>
> Igor Stasenko wrote:
>>
>> 2009/8/16 Göran Krampe <[hidden email]>:
>>>
>>> Hi!
>>>
>>> Igor Stasenko wrote:
>>>>
>>>> 2009/8/16 Andreas Raab <[hidden email]>:
>>>>>
>>>>> Which reminds me: Where are the deltas stored and how big is the space
>>>>> overhead for keeping them?
>>>>
>>>> i think most common overhead is 2x more space comparing to analoguous
>>>> changeset record, since you have to keep 'before' version in addition
>>>> to 'after' version. And most common change is changing the method(s)
>>>> source a little bit.
>>>
>>> Well, in fact it is more - a Changeset does in fact not hold a copy of the
>>> new method source - just a pointer to the method modified (IIRC).
>>
>> oh, i meant not in-image space, but storage space required for
>> interchange format.
>
> The domain model in DS is more or less done. Matthew implemented a file
> format but I am not pursuing that format for various reasons - I instead am
> working on hooking Tirade into DS to use Tirade as fileformat.
>
> For info about Tirade, see my blog articles:
>
> http://goran.krampe.se/blog/Squeak/Tirade.rdoc
> http://goran.krampe.se/blog/Squeak/Tirade2.rdoc
> http://goran.krampe.se/blog/Squeak/Tirade3.rdoc
>
> Basically Tirade looks like Smalltalk message sends so it will be in "low
> level syntax" similar to the chunk format used for Changesets but it does
> not use Compiler to load.
>

I am well aware of Tirade; other readers may not be, of course. :)

>>> Thus Changesets are sneaky beasts, most people get the wrong impression.
>>>
>> yes, and we could do same with deltas - use external storage for
>> keeping them in file(s), while in-image
>> they hold only a pointers.
>
> Outside of the image I would primarily use the Tirade format. In image a
> Delta does NOT use pointers to live code, and that is by design. A Delta is
> a "fully self contained" object graph with no outgoing references. It is
> totally independent of current image state.
>
> This is highly different from a Changeset. A Changeset is typically "filed
> in" and that means you change the image.
>

I have nothing against one or another data format. If it is designed well to serve its purpose, no one would put you on fire for that. :)
My concern is different - the ability to 'offload' the bulk data from the image to persistent storage and to be able to reload it in case of need.
This is what the .source & .changes files serve for, with more or less success.
So, in this respect, do you see how we could improve the state of affairs, taking into account my proposal? Let's call it a 'hibernated' Delta: one which has offloaded its data from the image to some storage and carries a pointer to the resource holding all the information needed to 'unhibernate' it. A URI/URL mechanism.

Monticello, by the way, does the same with its different repositories, i.e.

MCHttpRepository
    location: 'http://www.squeaksource.com/DeltaStreams'
    user: ''
    password: ''

> A Delta is handled in two steps: First you load it. This just means
> deserializing it from Tirade into an object. Then you apply it. This is the
> step that affects the image, atomically, using SystemEditor.
>
>> But i think its too early to think about it.
>> Or.. why too early?
>
> Not too early at all, in fact almost too late. :)
>

It's never too late to improve things :) But can you explain what you have in mind?

>> You already mentioned different ways to persist the deltas.. so
>> a delta 'pointer' could be represented by a tuple: (adaptor , id)
>>
>> where adaptor could be: file, couch DB, url or anything else,
>> and id is an additional info , identifying given delta.
>>
>> Or, just use a common denominator of all of this stuff - URI.
>> I don't think that its hard to introduce a new URI, which identifies a
>> Couch DB storage :)
>
> The CouchDB API is purely HTTP restful. Each document in CouchDB is accessed
> simply by a URL. Fits nicely in this area.
> > regards, Göran > > > -- Best regards, Igor Stasenko AKA sig. |
Göran Krampe explained how DS gets loaded in two phases. First, the Delta gets read from a file and turned into an object in the image, but no classes are changed. Second, the Delta is applied, i.e. it changes the classes.

Igor Stasenko replied
> My concern is different - the ability to 'offload' the bulk data from
> image to persistent storage and be able to reload it
> back in case of need.
> This is what .source & .changes files serving for with more or less success.
> So, in this respect, do you see how we could improve the state of
> affairs , taking in account my proposal, that
> lets call it a 'hibernated' Delta (one which offloaded data from image
> to some storage) could carry a pointer to resource which holding all
> the information which needed to 'unhibernate' it. URI/URL mechanism..

Is DS going to replace .source and .changes? I did not think it was. I thought it was competing more directly with .cs files and, to a certain extent, with Monticello. Thus, if you only want to load a program and run it, you won't need to keep the Deltas around once you have applied them.

-Ralph
In reply to this post by Igor Stasenko
Igor Stasenko wrote:
> I am well aware about Tirade, other readers may not, of course. :)

Exactly. :)

> I have nothing against one or another data format. If it designed well
> to serve its purpose - no-one would put you on fire for that. :)
> My concern is different - the ability to 'offload' the bulk data from
> image to persistent storage and be able to reload it
> back in case of need.
> This is what .source & .changes files serving for with more or less success.
> So, in this respect, do you see how we could improve the state of
> affairs , taking in account my proposal, that
> lets call it a 'hibernated' Delta (one which offloaded data from image
> to some storage) could carry a pointer to resource which holding all
> the information which needed to 'unhibernate' it. URI/URL mechanism..

That seems like a good idea, a "smart stub" you mean? It would be nice to have, yes.

> Same, btw, as Monticello doing with different repositories i.e.
> MCHttpRepository
> location: 'http://www.squeaksource.com/DeltaStreams'
> user: ''
> password: ''
>
>> A Delta is handled in two steps: First you load it. This just means
>> deserializing it from Tirade into an object. Then you apply it. This is the
>> step that affects the image, atomically, using SystemEditor.
>>
>>> But i think its too early to think about it.
>>> Or.. why too early?
>> Not too early at all, in fact almost too late. :)
>>
> its never too late to improve things :) But can you explain what you
> have in mind?

I think we misunderstood each other, there is nothing done yet in the area you describe, so not too late there. I was referring to domain model and Tirade.

>>> You already mentioned different ways to persist the deltas.. so
>>> a delta 'pointer' could be represented by a tuple: (adaptor , id)
>>>
>>> where adaptor could be: file, couch DB, url or anything else,
>>> and id is an additional info , identifying given delta.
>>>
>>> Or, just use a common denominator of all of this stuff - URI.
>>> I don't think that its hard to introduce a new URI, which identifies a
>>> Couch DB storage :)
>> The CouchDB API is purely HTTP restful. Each document in CouchDB is accessed
>> simply by a URL. Fits nicely in this area.
>>
> Perfect.

Yes.

regards, Göran
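Since nothing exists yet in this area, the following is only a guess at what a "smart stub" for a hibernated Delta might look like, sketched in Python. `HibernatedDelta` and the `fetch` callable are hypothetical; the in-memory `store` dictionary stands in for whatever file, HTTP or CouchDB adaptor would really resolve the URI.

```python
class HibernatedDelta:
    """Hypothetical stub: holds only a URI until the changes are actually needed."""

    def __init__(self, uri, fetch):
        self.uri = uri
        self._fetch = fetch      # callable taking a URI and returning the changes
        self._changes = None     # bulk data is not in memory yet

    @property
    def changes(self):
        if self._changes is None:            # unhibernate on first access
            self._changes = self._fetch(self.uri)
        return self._changes

    def hibernate(self):
        """Drop the bulk data again; the URI is enough to get it back."""
        self._changes = None


# Stand-in storage and adaptor for the example.
uri = "file:///usr/home/geek/squeak/myfixToSomeStuff.ds#1234"
store = {uri: [("foo", "foo ^1", "foo ^2")]}
calls = []


def fetch(u):
    calls.append(u)
    return store[u]


stub = HibernatedDelta(uri, fetch)
calls_before = len(calls)        # nothing fetched at creation
first = stub.changes             # triggers one fetch
second = stub.changes            # served from memory
calls_cached = len(calls)
stub.hibernate()
third = stub.changes             # fetched again after hibernating
```

The image keeps only the stub and its URI, which is the same trade Monticello makes by naming a repository location instead of holding its contents.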