Contributing to Pharo


Re: Contributing to Pharo

Eliot Miranda-2
Hi David,

> On Jan 29, 2016, at 2:45 PM, David Allouche <[hidden email]> wrote:
>
> Thanks Dale for all the explanations.
>
> How Monticello and version control relate in the big picture is starting to make sense for me.
>
> Now, I better understand why filetree ended up using a file-per-method format, even though that is relatively hostile to git user interfaces optimised for other languages. There is really a need for a file-per-class exchange format, because that would work a lot better with the existing VCS ecosystem.

I agree so strongly.  Class file outs which are eg sorted by selector make much more sense.  They won't hit the file name length limit.  They make it trivial to maintain method and class comment time stamps.  They're easier to construct into snapshots because it's easier to decode the file name.

And then it's easy to add files for package load/unload scripts and for the history.  And then one is much more decoupled from the specific back end.  It could be mercurial just as easily as git.
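
To make the idea concrete, here is a minimal sketch of a class file-out whose methods are sorted by selector. The `Method` record, the `file_out` function, the class name and the text layout are all hypothetical, invented for illustration; this is not the format of any existing tool. The point it shows is that sorting by selector makes the output independent of the order in which methods were defined, and that time stamps can travel inside the same file.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Method:
    selector: str
    category: str
    stamp: str   # hypothetical "initials date" time stamp
    source: str


def file_out(class_name: str, methods: List[Method]) -> str:
    """Render one class as a single text file, methods sorted by selector."""
    parts = [f"class: {class_name}"]
    for m in sorted(methods, key=lambda m: m.selector):
        # Header line carries the per-method metadata (hypothetical layout).
        parts.append(f"method: {m.selector} category: {m.category} stamp: {m.stamp}")
        parts.append(m.source)
    return "\n".join(parts) + "\n"


methods = [
    Method("size", "accessing", "xx 1/1/2016", "size\n\t^ elements size"),
    Method("add:", "adding", "xx 1/2/2016", "add: anObject\n\t^ elements add: anObject"),
]
text = file_out("ExampleBag", methods)
print(text)
```

Because the order is deterministic, two images that hold the same methods would write byte-identical files, so a diff shows only real changes.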

> I think more package-based user interfaces would indeed be a very good idea, for browsing and for source code management.
>
> Stef, I have the impression you think that git is popular because it is a new shiny toy. I disagree with this idea. Git is a typical worse-is-better tool. It's good enough for most people, but it still has many shortcomings. It is popular in spite of its shortcomings. It became popular as a destination for projects shifting from CVS and Subversion. So it is unlikely to be displaced by a newer, incrementally shinier tool. Anything that will displace it will have to provide an improvement of a similar magnitude as the jump between centralised and distributed version control.

This is a good analysis.  What's valuable to the Pharo community is not displacing an already functional dvcs (Monticello) with an ill-suited one (git), but being able to function in ecosystems like github where people can display their identity and where infrastructure for bug reports etc. exists.

> Still, I think it's a good idea not to restrict high level models to what git provides if that's a less than ideal fit to the image model.

Absolutely.  Dale's talk of ditching Monticello metadata fills me with repulsion and makes me want to ask is he trying to sabotage or what?  It seems entirely destructive.  We have a functional package manager which currently supports interchange between Pharo, Squeak and Cuis, something that I think is very important and valuable.  We should have the confidence to improve on it and work on exchanging improvements between the dialects.  For example Bert Freudenburg recently extended the commit dialog with the ability to mark changes to ignore, which are then /not/ committed, which makes it possible to maintain a few of one's favorite image mods without the tedium of reverting them (or whatever other Sisyphean means) to commit from one's work image.

> I have a lot of ideas to improve browsing and source code management in Pharo. I can make no promises, but I would like to produce something there.

Good luck in your efforts.


_,,,^..^,,,_ (phone)
Best, Eliot

Re: Contributing to Pharo

bpi
In reply to this post by Dale Henrichs-3
Dale,

Thanks for your thorough answer. I really appreciate how you include links to helpful articles.

I find the description of the workflow you actually use very enlightening. However, one thing still remains unclear. In the last step, when merging the pull request, how is the unchanged metadata reconciled with the code changes? I just realized that I don’t know what information is in the Monticello metadata that is not in the code.

Cheers,
Bernhard

> On 29.01.2016 at 19:12, Dale Henrichs <[hidden email]> wrote:
> On 01/29/2016 09:02 AM, Bernhard Pieber wrote:
>> Hi Dale,
>>
>> I am trying to understand this a little better. If a package containing metadata would be changed using a dialect which cannot interpret the metadata, wouldn’t or at least couldn’t it be broken or lost afterwards for a dialect which tries to interpret the metadata? At least if my assumption is correct that the metadata is related to the code in some way?
>>
> Bernhard,
>
> Very good question.
>
> You are absolutely correct, the act of writing a new version of a package using a different dialect would indeed destroy the platform-specific meta data.
>
> However, it is NOT expected that two different dialects (with different disk formats)  share the same commit.
>
> If dialect A has committed a package, then dialect B is expected to be able to read the package written by dialect A. Being able to read the package means that all of the information that is common between the two dialects will be preserved ... in FileTree, this means that:
>
>  - all of the class and instance method source is readable and shared
>  - the method category is readable and shared
>  - the name, superclass, instance variables, class instance variables,
>    pools, class category, and class vars for classes is readable and shared
>
> This is an awful lot of common data .... things like traits or namespaces are dialect specific and are ignored by dialects that don't have them when reading the package ...
>
> When dialect B has read, loaded/compiled and tested the code in the package, the developer has a couple of choices to be made when writing out a new package with her changes:
>
>  - use the current branch
>  - use a new branch
>
> I think that using a new branch is the cleanest option. When I port a project like Zinc[1] to GemStone[2], the master branch is preserved for the Pharo-specific code. I create a gs_master branch where the GemStone code goes. When Sven commits new code in his repo, I merge the changes from his new master branch into my gs_master and resolve conflicts if any, run tests and I'm done.
>
> The vast majority of the code base is shared between GemStone and Pharo and only the places where I made changes in porting Zinc to GemStone have the potential for conflicts. Running the tests should highlight any impacts not covered by direct conflicts and if there were some Pharo-specific meta data that I deleted on my branch that causes a test to fail then that is up to me to research and fix.
>
> When I have changes to feed back to Sven, I make those changes on a separate "topic branch"[3] that is merged into gs_master for my use. I then cherry-pick[4] the topic branch commit into a separate topic branch off of the master branch (this picks up only the changes I made on the topic branch) and open a pull request[5] against Sven's repository ... if Sven were using travis-ci, tests would be run automatically against a merge of my changes into his master branch ... if the tests are green and the code passes Sven's scrutiny in a code review he merges my proposed changes into his master branch...
>
> GemStone and Pharo share a common disk format (not 100% common), but since git only merges the deltas for a commit, it is relatively easy to keep the meta data differences isolated while still sharing the vast bulk of the code and classes ....
>
> Dale
>
> [1] https://github.com/svenvc/zinc
> [2] https://github.com/GsDevKit/zinc
> [3] https://git-scm.com/book/en/v2/Git-Branching-Branching-Workflows
> [4] http://think-like-a-git.net/sections/rebase-from-the-ground-up/cherry-picking-explained.html
> [5] https://help.github.com/articles/using-pull-requests/
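
The cherry-pick step in the quoted workflow ("this picks up only the changes I made on the topic branch") can be modeled with a small sketch. This is a toy model, not how git is implemented: a commit is reduced to a snapshot dictionary, conflicts are ignored, and the names `delta`, `cherry_pick` and the file paths are hypothetical.

```python
from typing import Dict, Optional

Snapshot = Dict[str, str]  # file path -> file content


def delta(parent: Snapshot, commit: Snapshot) -> Dict[str, Optional[str]]:
    """Changes a commit introduces relative to its parent (None means deleted)."""
    changes: Dict[str, Optional[str]] = {}
    for path in parent.keys() | commit.keys():
        if parent.get(path) != commit.get(path):
            changes[path] = commit.get(path)
    return changes


def cherry_pick(target: Snapshot, parent: Snapshot, commit: Snapshot) -> Snapshot:
    """Apply only the commit's own delta on top of another branch."""
    result = dict(target)
    for path, content in delta(parent, commit).items():
        if content is None:
            result.pop(path, None)
        else:
            result[path] = content
    return result


# Hypothetical file names; master is the Pharo branch, gs_master the port.
master = {"Zn.class/get.st": "get v1", "Zn.class/put.st": "put v1"}
gs_master = {**master, "Zn.class/gsOnly.st": "GemStone-specific code"}
topic = {**gs_master, "Zn.class/get.st": "get v2 (fix)"}

picked = cherry_pick(master, gs_master, topic)
print(picked)
```

The GemStone-only file never reaches the branch cut from master, because it is not part of the topic commit's delta; only the fix is carried over.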


Re: Contributing to Pharo

bpi
In reply to this post by David Allouche
Hi David,

Just for your information, Cuis uses a file-per-package format. It looks really nice on GitHub and on SourceTree. As far as I can tell, there has not been much merging going on, though. So it is entirely possible that it turns out not to work as well as the filetree format when merging.

Cheers,
Bernhard
 

> On 29.01.2016 at 23:45, David Allouche <[hidden email]> wrote:
>
> Thanks Dale for all the explanations.
>
> How Monticello and version control relate in the big picture is starting to make sense for me.
>
> Now, I better understand why filetree ended up using a file-per-method format, even though that is relatively hostile to git user interfaces optimised for other languages. There is really a need for a file-per-class exchange format, because that would work a lot better with the existing VCS ecosystem.
>
> I think more package-based user interfaces would indeed be a very good idea, for browsing and for source code management.
>
> Stef, I have the impression you think that git is popular because it is a new shiny toy. I disagree with this idea. Git is a typical worse-is-better tool. It's good enough for most people, but it still has many shortcomings. It is popular in spite of its shortcomings. It became popular as a destination for projects shifting from CVS and Subversion. So it is unlikely to be displaced by a newer, incrementally shinier tool. Anything that will displace it will have to provide an improvement of a similar magnitude as the jump between centralised and distributed version control.
>
> Still, I think it's a good idea not to restrict high level models to what git provides if that's a less than ideal fit to the image model.
>
> I have a lot of ideas to improve browsing and source code management in Pharo. I can make no promises, but I would like to produce something there.



Re: Contributing to Pharo

Dale Henrichs-3
In reply to this post by bpi


On 1/30/16 1:54 AM, Bernhard Pieber wrote:
> Dale,
>
> Thanks for your thorough answer. I really appreciate how you include links to helpful articles.
>
> I find the description of the workflow you actually use very enlightening. However, one thing still remains unclear. In the last step, when merging the pull request, how is the unchanged metadata reconciled with the code changes? I just realized that I don’t know what information is in the Monticello metadata that is not in the code.
>
Monticello metadata is basically the entire Monticello version history
of the package, it includes direct ancestors, commit comments, the GUID,
etc. For a FileTree repo, the meta data is stashed in a separate file ...
The form of the data is actually a serialized Smalltalk object graph - a
deeply nested set of Arrays - all written on a single line, so git has
very little chance of being able to merge two files ... The other bit of
Monticello metadata that is maintained by FileTree involves the per
method timestamp and developer initials - these are stored in a method
properties file associated with each class.

In fact it is nearly impossible for a human to properly merge two Monticello
version histories, which is why Thierry Goubier's
GitFileTree-MergeDriver[1] is especially useful. Thierry's MergeDriver
fires up a Pharo image (called by git) and performs object surgery to
merge the conflicting version histories and method properties.

Anyway, in the scenario that I described, I simplified the process a bit
and left out some of the gorier details.

When I cherry-picked the commit for my topic branch and merged that
commit into Sven's latest master branch, the Monticello metadata was
certain to conflict, but when you perform a merge on a local git
repository and you've registered Thierry's merge driver with git, the
Monticello metadata is automatically merged.

I also made the assumption that Sven would not have made any additional
commits to his master branch between the time I merged his master branch
and the time I submitted the pull request - the latest version of a file
wins in a merge as long as they share the same immediate previous ancestor.

If Sven had made a new commit, the Monticello metadata would have
conflicted ...
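
The rule described above (the changed side wins when only one side moved away from the shared ancestor, otherwise there is a conflict) can be sketched at whole-file granularity. This is a simplification for illustration: real git also attempts a line-level merge when both sides changed, and `merge_file` is a hypothetical name.

```python
from typing import Optional


def merge_file(base: str, ours: str, theirs: str) -> Optional[str]:
    """Whole-file three-way merge; None signals a conflict."""
    if ours == theirs:
        return ours      # both sides agree
    if ours == base:
        return theirs    # only their side changed
    if theirs == base:
        return ours      # only our side changed
    return None          # both sides changed, differently


base = "history: v1"
mine = "history: v2-from-topic-branch"

# Sven made no commit since the shared ancestor: my version simply wins.
print(merge_file(base, mine, base))

# Sven committed too: both metadata files differ from the ancestor.
print(merge_file(base, mine, "history: v2-from-sven"))
```

Since every Monticello commit rewrites the version-history file, the second case is what happens whenever both sides have committed.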

For the record, these "guaranteed conflicts" are the only reason that
"removing Monticello metadata" enters into the conversation .... git
maintains its own version history and method properties, so for
developers who are "exclusively using git" the Monticello metadata is an
annoying source of "meaningless conflicts." All of the information that
is present in Monticello meta data is present in git in another form.

As I said, Thierry Goubier's magical GitFileTree-MergeDriver[1]
eliminates the conflict pain when the merges are performed in a local
git repository, but on GitHub, merging the Monticello metadata is a
constant thorn in one's side, since it becomes necessary to perform a
manual merge (and leverage Thierry's Merge Driver) when Monticello
metadata is involved.

Dale

[1] https://github.com/ThierryGoubier/GitFileTree-MergeDriver


Re: Contributing to Pharo

Dale Henrichs-3
In reply to this post by David Allouche


On 1/29/16 2:45 PM, David Allouche wrote:
> Thanks Dale for all the explanations.
>
> How Monticello and version control relate in the big picture is starting to make sense for me.
>
> Now, I better understand why filetree ended up using a file-per-method format, even though that is relatively hostile to git user interfaces optimised for other languages. There is really a need for a file-per-class exchange format, because that would work a lot better with the existing VCS ecosystem.
I agree 100%, but as I've mentioned in another post, I think that
switching to a file-per-class exchange format needs to be a
cross-dialect effort. A format that accommodates the diversity of
Smalltalk class definition and even method metadata while still being
readable is a bit of a challenge - solvable, but challenging....

Personally, I have other fish to fry, but I would be willing to
participate in such an effort if someone else is willing to lead the
charge:)

Dale



Re: Contributing to Pharo

Ben Coman
In reply to this post by Dale Henrichs-3
On Sun, Jan 31, 2016 at 3:38 AM, Dale Henrichs
<[hidden email]> wrote:

>
>
> On 1/30/16 1:54 AM, Bernhard Pieber wrote:
>>
>> Dale,
>>
>> Thanks for your thorough answer. I really appreciate how you include links
>> to helpful articles.
>>
>> I find the description of the workflow you actually use very enlightening.
>> However, one thing still remains unclear. In the last step, when merging the
>> pull request, how is the unchanged metadata reconciled with the code
>> changes? I just realized that I don’t know what information is in the
>> Monticello metadata that is not in the code.
>>
> Monticello metadata is basically the entire Monticello version history of
> the package, it includes direct ancestors, commit comments, the GUID, etc.
> For a FileTree repo, the meta data is stashed in a separate file ... The form
> of the data is actually a serialized Smalltalk object graph - a deeply nested
> set of Arrays - all written on a single line, so git has very little chance
> of being able to merge two files

What would it take to have the meta data spread out over multiple
lines (if that would work better with git?)

cheers -ben


Re: Contributing to Pharo

Dale Henrichs-3
In reply to this post by Eliot Miranda-2


On 01/29/2016 05:17 PM, Eliot Miranda wrote:

> Hi David,
>
>> On Jan 29, 2016, at 2:45 PM, David Allouche <[hidden email]> wrote:
>>
>> Thanks Dale for all the explanations.
>>
>> How Monticello and version control relate in the big picture is starting to make sense for me.
>>
>> Now, I better understand why filetree ended up using a file-per-method format, even though that is relatively hostile to git user interfaces optimised for other languages. There is really a need for a file-per-class exchange format, because that would work a lot better with the existing VCS ecosystem.
> I agree so strongly.  Class file outs which are eg sorted by selector make much more sense.  They won't hit the file name length limit.  They make it trivial to maintain method and class comment time stamps.  They're easier to construct into snapshots because it's easier to decode the file name.
>
> And then it's easy to add files for package load/unload scripts and for the history.  And then one is much more decoupled from the specific back end.  It could be mercurial just as easily as git.
>
>> I think more package-based user interfaces would indeed be a very good idea, for browsing and for source code management.
>>
>> Stef, I have the impression you think that git is popular because it is a new shiny toy. I disagree with this idea. Git is a typical worse-is-better tool. It's good enough for most people, but it still has many shortcomings. It is popular in spite of its shortcomings. It became popular as a destination for projects shifting from CVS and Subversion. So it is unlikely to be displaced by a newer, incrementally shinier tool. Anything that will displace it will have to provide an improvement of a similar magnitude as the jump between centralised and distributed version control.
> This is a good analysis.  What's valuable to the Pharo community is not displacing an already functional dvcs (Monticello) with an ill-suited one (git), but being able to function in ecosystems like github where people can display their identity and where infrastructure for bug reports etc. exists.
>
>> Still, I think it's a good idea not to restrict high level models to what git provides if that's a less than ideal fit to the image model.
> Absolutely.  Dale's talk of ditching Monticello metadata fills me with repulsion and makes me want to ask is he trying to sabotage or what?
Eliot, you can't be serious - accusing me of sabotage? Ah, well.... how
about you assume that I'm doing "or what":)

The Monticello metadata in a git repository is redundant and leads to
unnecessary commit conflicts -- end of story ....

Despite the fact that the Monticello metadata is redundant, I have made
sure that the Monticello metadata was included in FileTree from the very
beginning for the very reason that I wanted developers to be able to try
out FileTree, git and github without having to burn any  Monticello
bridges .... if they didn't like FileTree, git and github, then they
would be able to back out of their use of git without losing data ...

> It seems entirely destructive.
It is not destructive ...
> We have a functional package manager which currently supports interchange between Pharo, Squeak and Cuis,
and GemStone?

I assume that you are talking about Monticello packages and Monticello
repositories ... or what?

I am really not trying to do anything but "invent the future" --- I am
not trying to destroy, I am trying to improve ... If you are not able to
see the shortcomings of Monticello repositories  (Note that I am
distinguishing between Monticello packages and Monticello repositories  
--- FileTree uses Monticello packages and replaces Monticello
repositories with git) and where git has advantages over Monticello
repositories, then you should continue to use Monticello repositories ...

Personally I don't see Monticello repositories going away anytime soon
and expect to support Monticello repositories in GsDevKit_home, tODE,
and Metacello for the rest of my life:)

Dale


Re: Contributing to Pharo

Dale Henrichs-3
In reply to this post by Ben Coman


On 01/30/2016 08:27 PM, Ben Coman wrote:

> On Sun, Jan 31, 2016 at 3:38 AM, Dale Henrichs
> <[hidden email]> wrote:
>>
>> On 1/30/16 1:54 AM, Bernhard Pieber wrote:
>>> Dale,
>>>
>>> Thanks for your thorough answer. I really appreciate how you include links
>>> to helpful articles.
>>>
>>> I find the description of the workflow you actually use very enlightening.
>>> However, one thing still remains unclear. In the last step, when merging the
>>> pull request, how is the unchanged metadata reconciled with the code
>>> changes? I just realized that I don’t know what information is in the
>>> Monticello metadata that is not in the code.
>>>
>> Monticello metadata is basically the entire Monticello version history of
>> the package, it includes direct ancestors, commit comments, the GUID, etc.
>> For a FileTree repo, the meta data is stashed in a separate file ... The form
>> of the data is actually a serialized Smalltalk object graph - a deeply nested
>> set of Arrays - all written on a single line, so git has very little chance
>> of being able to merge two files
> What would it take to have the meta data spread out over multiple
> lines (if that would work better with git?)

Hmmm I suppose that if the meta data were represented in STON (using
pretty print) the changes might be more mergeable, but I think that
Thierry's algorithm might still be needed to do a proper merge, since it
takes "object surgery" to get things right .... STON might make it possible
for a human to do the necessary edits though ...

Dale


Re: Contributing to Pharo

Eliot Miranda-2
In reply to this post by Dale Henrichs-3
Hi Dale,

On Tue, Feb 2, 2016 at 11:35 AM, Dale Henrichs <[hidden email]> wrote:


> On 01/29/2016 05:17 PM, Eliot Miranda wrote:
>> Hi David,
>>
>>> On Jan 29, 2016, at 2:45 PM, David Allouche <[hidden email]> wrote:
>>>
>>> Thanks Dale for all the explanations.
>>>
>>> How Monticello and version control relate in the big picture is starting to make sense for me.
>>>
>>> Now, I better understand why filetree ended up using a file-per-method format, even though that is relatively hostile to git user interfaces optimised for other languages. There is really a need for a file-per-class exchange format, because that would work a lot better with the existing VCS ecosystem.
>> I agree so strongly.  Class file outs which are eg sorted by selector make much more sense.  They won't hit the file name length limit.  They make it trivial to maintain method and class comment time stamps.  They're easier to construct into snapshots because it's easier to decode the file name.
>>
>> And then it's easy to add files for package load/unload scripts and for the history.  And then one is much more decoupled from the specific back end.  It could be mercurial just as easily as git.
>>
>>> I think more package-based user interfaces would indeed be a very good idea, for browsing and for source code management.
>>>
>>> Stef, I have the impression you think that git is popular because it is a new shiny toy. I disagree with this idea. Git is a typical worse-is-better tool. It's good enough for most people, but it still has many shortcomings. It is popular in spite of its shortcomings. It became popular as a destination for projects shifting from CVS and Subversion. So it is unlikely to be displaced by a newer, incrementally shinier tool. Anything that will displace it will have to provide an improvement of a similar magnitude as the jump between centralised and distributed version control.
>> This is a good analysis.  What's valuable to the Pharo community is not displacing an already functional dvcs (Monticello) with an ill-suited one (git), but being able to function in ecosystems like github where people can display their identity and where infrastructure for bug reports etc. exists.
>>
>>> Still, I think it's a good idea not to restrict high level models to what git provides if that's a less than ideal fit to the image model.
>> Absolutely.  Dale's talk of ditching Monticello metadata fills me with repulsion and makes me want to ask is he trying to sabotage or what?
> Eliot, you can't be serious - accusing me of sabotage? Ah, well.... how about you assume that I'm doing "or what":)
>
> The Monticello metadata in a git repository is redundant and leads to unnecessary commit conflicts -- end of story ....

No it's /not/ the end of the story.  The essential part of the story is how Monticello remains compatible and interoperable between dialects.  I haven't seen you account for how you maintain that compatibility.  As far as I can tell, you propose replacing the Monticello metadata with that from git.  How do I, as a Squeak user with Monticello, ever get to look at your package again?  As I understand it, moving the metadata from Monticello commit time to git means that the metadata is in a format that git determines, not Monticello.

So I don't understand how on the one hand you can say "The Monticello metadata in a git repository is redundant and leads to unnecessary commit conflicts -- end of story ....", which implies you want to eliminate the Monticello metadata, and on the other hand you say you're keeping the Monticello metadata.  I'm hopelessly confused.  How does the Monticello metadata get reconstituted if it's been thrown away?

What happens to the metadata in the following workflow?

load package P from Monticello repository R into an image
change P, commit via git to local git repository G
load P from G into an image
store P to R via Monticello


> Despite the fact that the Monticello metadata is redundant, I have made sure that the Monticello metadata was included in FileTree from the very beginning for the very reason that I wanted developers to be able to try out FileTree, git and github without having to burn any  Monticello bridges .... if they didn't like FileTree, git and github, then they would be able to back out of their use of git without losing data ...

Then forgive me.  I couldn't see the wood for the trees.  When I read your talk of eliminating the conflicts from git commits due to the Monticello metadata I infer that you're eliminating the Monticello metadata.  I'm not sure I understand the implications of this, but it seems to me that a natural consequence is that the Monticello metadata is lost and at that point merges and the like become problematic.  What am I missing?

 

>> It seems entirely destructive.
> It is not destructive ...
>> We have a functional package manager which currently supports interchange between Pharo, Squeak and Cuis,
> and GemStone?
>
> I assume that you are talking about Monticello packages and Monticello repositories ... or what?

Yes.
 

> I am really not trying to do anything but "invent the future" --- I am not trying to destroy, I am trying to improve ... If you are not able to see the shortcomings of Monticello repositories  (Note that I am distinguishing between Monticello packages and Monticello repositories  --- FileTree uses Monticello packages and replaces Monticello repositories with git) and where git has advantages over Monticello repositories, then you should continue to use Monticello repositories ...

But what does this imply to some package that starts off in a Monticello repository and then spends some time in gitland?  Can I merge again?  If I can I'm happy.  If I can't, I feel sabotaged.
 
> Personally I don't see Monticello repositories going away anytime soon and expect to support Monticello repositories in GsDevKit_home, tODE, and Metacello for the rest of my life:)

Fine.  Except merging is, IIUC, about method time stamps and ancestry.  If that gets preserved then I'm happy.  But for the life of me I haven't read an explanation that reassures me that these are being preserved.  Do you see the roots of my fear?

> Dale

_,,,^..^,,,_
confused, Eliot

Re: Contributing to Pharo

Eliot Miranda-2
In reply to this post by Dale Henrichs-3


On Tue, Feb 2, 2016 at 11:38 AM, Dale Henrichs <[hidden email]> wrote:


> On 01/30/2016 08:27 PM, Ben Coman wrote:
>> On Sun, Jan 31, 2016 at 3:38 AM, Dale Henrichs
>> <[hidden email]> wrote:
>>
>>> On 1/30/16 1:54 AM, Bernhard Pieber wrote:
>>>> Dale,
>>>>
>>>> Thanks for your thorough answer. I really appreciate how you include links
>>>> to helpful articles.
>>>>
>>>> I find the description of the workflow you actually use very enlightening.
>>>> However, one thing still remains unclear. In the last step, when merging the
>>>> pull request, how is the unchanged metadata reconciled with the code
>>>> changes? I just realized that I don’t know what information is in the
>>>> Monticello metadata that is not in the code.
>>>>
>>> Monticello metadata is basically the entire Monticello version history of
>>> the package, it includes direct ancestors, commit comments, the GUID, etc.
>>> For a FileTree repo, the meta data is stashed in a separate file ... The form
>>> of the data is actually a serialized Smalltalk object graph - a deeply nested
>>> set of Arrays - all written on a single line, so git has very little chance
>>> of being able to merge two files
>> What would it take to have the meta data spread out over multiple
>> lines (if that would work better with git?)
>
> Hmmm I suppose that if the meta data were represented in STON (using pretty print) the changes might be more mergeable, but I think that Thierry's algorithm might still be needed to do a proper merge, since it takes "object surgery" to get things right .... STON might make it possible for a human to do the necessary edits though ...

Please tell me this isn't about line endings?  Why can't the version history be written with lf line endings?  That's hardly a bone of contention is it?

_,,,^..^,,,_
best, Eliot

Re: Contributing to Pharo

Thierry Goubier
On 02/02/2016 21:56, Eliot Miranda wrote:

>
>
> On Tue, Feb 2, 2016 at 11:38 AM, Dale Henrichs
> <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>
>
>     On 01/30/2016 08:27 PM, Ben Coman wrote:
>
>         On Sun, Jan 31, 2016 at 3:38 AM, Dale Henrichs
>         <[hidden email]
>         <mailto:[hidden email]>> wrote:
>
>
>             On 1/30/16 1:54 AM, Bernhard Pieber wrote:
>
>                 Dale,
>
>                 Thanks for your thorough answer. I really appreciate how
>                 you include links
>                 to helpful articles.
>
>                 I find the description of the workflow you actually use
>                 very enlightening.
>                 However, one thing still remains unclear. In the last
>                 step, when merging the
>                 pull request, how is the unchanged metadata reconciled
>                 with the code
>                 changes? I just realized that I don’t know what
>                 information is in the
>                 Monticello metadata that is not in the code.
>
>             Monticello metadata is basically the entire Monticello
>             version history of
>             the package, it includes direct ancestors, commit comments,
>             the GUID, etc.
>             For a FileTree repo, the meta data is stashed in a separate
>             file ... The form
>             of the data is actually a serialized Smalltalk object graph -
>             a deeply nested
>             set of Arrays - all written on a single line, so git has
>             very little chance
>             of being able to merge two files
>
>         What would it take to have the meta data spread out over multiple
>         lines (if that would work better with git?)
>
>
>     Hmmm I suppose that if the meta data were represented in STON (using
>     pretty print) the changes might be more mergeable, but I think that
>     Thierry's algorithm might still be needed to do a proper merge, since it takes
>     "object surgery" to get things right .... STON might make it
>     possible for a human to do the necessary edits though ...
>
>
> Please tell me this isn't about line endings?  Why can't the version
> history be written with lf line endings?  That's hardly a bone of
> contention is it?

No, it's a lot more complex than that. It's simply hard to do a
merge-able text log format. Git also makes a mess of JSON / STON files,
simply because they are structured data in what appear to be text
files; git treats them as text files, and three-way line-by-line merging
does not work.
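
A rough sketch of why line-by-line merging struggles here, using Python's standard `difflib`. It is a simplified conflict check, not git's actual merge algorithm: `touched` and `conflicts` are hypothetical helper names, and two sides are assumed to conflict when the base-line ranges they change overlap or touch.

```python
from difflib import SequenceMatcher


def touched(base, other):
    """Base-line ranges (start, end) that `other` replaces, deletes or inserts at."""
    ops = SequenceMatcher(None, base, other, autojunk=False).get_opcodes()
    return [(i1, i2) for tag, i1, i2, _, _ in ops if tag != "equal"]


def conflicts(base, ours, theirs):
    """True when both sides change overlapping or adjacent base ranges."""
    return any(a1 <= b2 and b1 <= a2
               for a1, a2 in touched(base, ours)
               for b1, b2 in touched(base, theirs))


# 1. History serialized on one line: any two commits collide on that line.
print(conflicts(["(v1 (v0))"],
                ["(v2a (v1 (v0)))"],
                ["(v2b (v1 (v0)))"]))          # True

# 2. Ordinary multi-line text edited in different places merges cleanly.
print(conflicts(list("abcde"), list("Abcde"), list("abcdE")))  # False

# 3. One history entry per line: both sides still add their new version
#    at the same position, so the line-based check flags a conflict.
print(conflicts(["v1", "v0"],
                ["v2a", "v1", "v0"],
                ["v2b", "v1", "v0"]))          # True
```

Case 3 is the one that matters for a version history: spreading the data over several lines does not help when every commit on every branch writes its new entry in the same place.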

The only hope you can have is to make a format which is easier for a
human to correct when git creates a conflict when merging ... in short,
something which would only make it easier as long as the history is
limited to 50 versions and a single developer :(

Now, given how much cleaner, safer and sounder it is to extract that data
from the dvcs log, I'll never bother with that sort of thing again.

Thierry


Re: Contributing to Pharo

Dale Henrichs-3
In reply to this post by Eliot Miranda-2


On 02/02/2016 12:54 PM, Eliot Miranda wrote:
Hi Dale,

On Tue, Feb 2, 2016 at 11:35 AM, Dale Henrichs <[hidden email]> wrote:


> On 01/29/2016 05:17 PM, Eliot Miranda wrote:
>> Hi David,
>>
>>> On Jan 29, 2016, at 2:45 PM, David Allouche <[hidden email]> wrote:
>>>
>>> Thanks Dale for all the explanations.
>>>
>>> How Monticello and version control relate in the big picture is starting to make sense for me.
>>>
>>> Now, I better understand why filetree ended up uses a file-per-method format, even though that is relatively hostile to git user interfaces optimised for other languages. There is really a need for a file-per-class exchange format, because that would works a lot better with the existing VCS ecosystem.
>>
>> I agree so strongly.  Class file outs which are eg sorted by selector make much more sense.  They won't hit the file name length limit.  They make it trivial to maintain method and class comment time stamps.  They're easier to construct into snapshots because it's easier to decode the file name.
>>
>> And then it's easy to add files for package load/unload scripts and for the history.  And then one is much more decoupled from the specific back end.  It could be mercurial just as easily as git.
>>
>>> I think more package-based user interfaces would indeed be a very good idea, for browsing and for source code management.
>>>
>>> Stef, I have the impression you think that git is popular because it is a new shiny toy. I disagree with this idea. Git is a typical worse-is-better tool. It's good enough for most people, but it still has many shortcomings. It is popular in spite of its shortcomings. It became popular as destination for projects shifting from CVS and Subversion. So it is unlikely to be displaced by a newer, incrementally shinier tools. Anything that will displace it will have to provide an improvement of a similar magnitude as the jump between centralised and distributed version control.
>>
>> This is a good analysis.  What's valuable to the Pharo community is not displacing an already functional dvcs (Monticello) with an ill-suited one (git), but in being able to function in ecosystems like github where people can display their identity and where infrastructure for bug reports etc exist.
>>
>>> Still, I think it's a good idea not to restrict high level models to what git provides if that's a less than ideal fit to the image model.
>>
>> Absolutely.  Dale's talk of ditching Monticello metadata fills me with repulsion and makes me want to ask is he trying to sabotage or what?
>
> Eliot, you can't be serious - accusing me of sabotage? Ah, well.... how about you assume that I'm doing "or what":)
>
> The Monticello metadata in a git repository is redundant and leads to unnecessary commit conflicts -- end of story ....

No it's /not/ the end of the story.  The essential part of the story is how Monticello remains compatible and interoperable between dialects.  I haven't seen you account for how you maintain that compatibility.  As far as I can tell, you propose replacing the Monticello metadata with that from git.  How do I, as a Squeak user with Monticello, ever get to look at your package again?  As I understand it, moving the metadata from Monticello commit time to git means that the metadata is in a format that git determines, not Monticello.
Good question, FileTree has been supported on Squeak since the very beginning (I along with a small number of Squeak users have made sure of that).

So TODAY, any Squeak user can "look at, load and commit" any package that has been written using FileTree (with or without Monticello meta data).

[1] https://github.com/dalehenrich/filetree/tree/squeak4.3#squeak


So I don't understand how on the one hand you can say "The Monticello metadata in a git repository is redundant and leads to unnecessary commit conflicts -- end of story ....", which implies you want to eliminate the Monticello metadata, and on the other hand you say you're keeping the Monticello metadata.  I'm hopelessly confused.  How does the Monticello metadata get reconstituted if it's been thrown away?
Monticello meta data is not an integral part of the "package-ness" of a Monticello package ... it _is_ integral to the "repository-ness" of a Monticello package ...

If the Monticello metadata is "thrown away" then the revision history for Monticello is lost, but for a package that is "born" in a git repository, the Monticello metadata is not needed. Git has its own commit meta data and the Monticello metadata is redundant.

If you want to see the revision history from a git-based FileTree repo, then:

  1. one can include the Monticello meta data as part of the package - this is what FileTree
      currently does
  2. one can build a tool that reconstructs the Monticello version history from the git metadata
      making it possible to use "old" Monticello tools to look at the git repo - I believe that Thierry's
      GitFileTree takes this approach for metadata-less repositories
  3. one can build a new tool that presents the git metadata without reconstructing the
      Monticello metadata at all. Note that by "embracing git", it is possible to present revision
      history at the package level (current mcz technology) as well as the class and method level -
      which is what I do in tODE
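
As a rough illustration of approach 2, the Python sketch below reads `git log` for one package directory and turns each commit into a record with an id, ancestors, author, date and message. This is not how GitFileTree is implemented; `VersionInfo`, `parse_log` and `package_history` are hypothetical names, and the record is only a stand-in for whatever Monticello actually stores.

```python
import subprocess
from dataclasses import dataclass

FIELD = "\x1f"   # unit separator emitted by %x1f in the git format string
RECORD = "\x1e"  # record separator emitted by %x1e

@dataclass
class VersionInfo:
    """Hypothetical stand-in for a Monticello version-info record."""
    version_id: str
    ancestors: list
    author: str
    date: str
    message: str

def parse_log(raw: str) -> list:
    """Split the raw git log output into VersionInfo records, newest first."""
    versions = []
    for record in raw.split(RECORD):
        record = record.strip("\n")
        if not record:
            continue
        sha, parents, author, date, message = record.split(FIELD, 4)
        versions.append(VersionInfo(sha, parents.split(), author, date, message))
    return versions

def package_history(repo: str, package_dir: str) -> list:
    """Ask git for the commits that touched package_dir (needs git on PATH)."""
    fmt = "%H%x1f%P%x1f%an%x1f%aI%x1f%B%x1e"
    raw = subprocess.run(
        ["git", "-C", repo, "log", f"--format={fmt}", "--", package_dir],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_log(raw)
```

The ancestors here are plain git parent SHAs, so a merge commit naturally yields two ancestors. Mapping SHAs to Monticello-style version names and GUIDs is left open, since the thread does not say how that is done.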


What happens to the metadata in the following workflow?

load package P from Monticello repository R into an image
change P, commit via git to local git repository G
load P from G into an image
store P to R via Monticello

The above workflow can be accomplished whether or not Monticello metadata is present, however, if one does not make an effort to preserve the revision history then at the end of your workflow the Monticello metadata is lost.

If one takes the pains to preserve the Monticello metadata before committing to the git repository and the metadata is updated with each commit during the git lifetime, then the full metadata will be present at the end ...

This I think is the crux of the discussion.

There are a number of alternate schemes that can be used to preserve the metadata through this scenario:

  1. (current FileTree implementation) store the Monticello meta data in git and update on
      every git commit
  2. duplicate the existing Monticello revision history by committing in order all of the
      package ancestors and arrange for a way to reconstruct Monticello metadata from git
      meta data
  3. (variant of 1) store the original Monticello meta data in a file, do not update on every commit
      but arrange for a way to reconstruct Monticello metadata from git meta data and graft onto
      original Monticello meta data for use in mcz repository --- on demand
  4. ????

Option 3 seems to be a good compromise and is perhaps the approach that should be adopted moving forward ... we get to preserve the Monticello metadata while avoiding the messy commit conflicts in git, and we provide a (somewhat) seamless path for a package to migrate back into the mcz repository world ... if we somehow incorporate the SHA of the commit and the github/bitbucket url into the revision history, then it would be possible to perform a 3 way merge involving two mcz package versions and a common ancestor that is only present in git ...
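
The grafting step of option 3 could look roughly like the Python sketch below. It assumes a linear git history, a frozen list of Monticello entries saved when the package entered git, and a known SHA for that import commit; `graft_history` and the dict layout are hypothetical, not an existing FileTree API.

```python
def graft_history(frozen: list, git_commits: list, import_sha: str) -> list:
    """Build a combined history, newest first.

    frozen      -- Monticello entries saved untouched at import time
    git_commits -- commits for the package, newest first
    import_sha  -- commit that brought the package into git; it corresponds
                   to the newest frozen entry, so it is not repeated
    """
    grafted = []
    for commit in git_commits:
        if commit["sha"] == import_sha:
            break
        # Keep the SHA in the entry so an mcz-side tool could find the
        # matching git commit later.
        grafted.append({"id": "git:" + commit["sha"],
                        "message": commit["message"]})
    else:
        raise ValueError("import commit not found in git history")
    return grafted + list(frozen)


frozen = [{"id": "Pkg-dkh.2", "message": "fix"},
          {"id": "Pkg-dkh.1", "message": "first"}]
commits = [{"sha": "c3", "message": "more work in git"},
           {"sha": "c2", "message": "work in git"},
           {"sha": "c1", "message": "import from mcz"}]

history = graft_history(frozen, commits, "c1")
print([entry["id"] for entry in history])
```

Because the frozen file is never rewritten during the git lifetime, it cannot cause per-commit conflicts; the combined history is only computed on demand when the package goes back to an mcz repository. Branches and merges on the git side would need real ancestry handling rather than this flat list.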



> Despite the fact that the Monticello metadata is redundant, I have made sure that the Monticello metadata was included in FileTree from the very beginning for the very reason that I wanted developers to be able to try out FileTree, git and github without having to burn any  Monticello bridges .... if they didn't like FileTree, git and github, then they would be able to back out of their use of git without losing data ...

Then forgive me.  I couldn't see the wood for the trees.  When I read your talk of eliminating the conflicts from git commits due to the Monticello metadata I infer that you're eliminating the Monticello metadata.  I'm not sure I understand the implications of this, but it seems to me that a natural consequence is that the Monticello metadata is lost and at that point merges and the like become problematic.  What am I missing?

You've hit the nail on the head, but I think that option 3 above gives us a way to avoid losing the Monticello metadata without incurring a hit for a package's lifetime while in git ...

A package that starts its life in git will have empty Monticello metadata and a package that never makes its way into an mcz repository will not incur per commit penalties ...

 

>> It seems entirely destructive.
>
> It is not destructive ...
>
>> We have a functional package manager which currently supports interchange between Pharo, Squeak and Cuis,
>
> and GemStone?
>
> I assume that you are talking about Monticello packages and Monticello repositories ... or what?

Yes.
 

> I am really not trying to do anything but "invent the future" --- I am not trying to destroy, I am trying to improve ... If you are not able to see the shortcomings of Monticello repositories  (Note that I am distinguishing between Monticello packages and Monticello repositories  --- FileTree uses Monticello packages and replaces Monticello repositories with git) and where git has advantages over Monticello repositories, then you should continue to use Monticello repositories ...

But what does this imply to some package that starts off in a Monticello repository and then spends some time in gitland?  Can I merge again?  If I can I'm happy.  If I can't, I feel sabotaged.
and sabotage was never my intent ...
 
> Personally I don't see Monticello repositories going away anytime soon and expect to support Monticello repositories in GsDevKit_home, tODE, and Metacello for the rest of my life:)

Fine.  Except merging is, IIUC, about method time stamps and ancestry.  If that gets preserved then I'm happy.  But for the life of me I haven't read an explanation that reassures me that these are being preserved.  Do you see the roots of my fear?

Haha, from the very beginning back in 2012, I understood that there would be fear and resistance to change and anger and joy and excitement, but it was not clear when, if ever, we'd reach a point where a resolution for this "problem" was needed: either no one would be interested or everyone would be interested or ???

I really think that option 3 is going to be the best compromise moving forward - there is some implementation work that will be required but I really think option 3 gives you (and frankly me) a way to preserve the Monticello revision history for packages that make their way back and forth between lifetimes in git and Monticello repositories.

Eliot, I appreciate the fact that you demanded a better solution!

Dale

Re: Contributing to Pharo

Ben Coman
On Wed, Feb 3, 2016 at 7:00 AM, Dale Henrichs
<[hidden email]> wrote:

> There are a number of alternate schemes that can be used to preserve the
> metadata through this scenario:
>
>   1. (current FileTree implementation) store the Monticello meta data in git
>      and update on every git commit
>   2. duplicate the existing Monticello revision history by committing in
>      order all of the package ancestors and arrange for a way to
>      reconstruct Monticello metadata from git meta data
>   3. (variant of 1) store the original Monticello meta data in a file, do
>      not update on every commit but arrange for a way to reconstruct
>      Monticello metadata from git meta data and graft onto original
>      Monticello meta data for use in mcz repository --- on demand
>   4. ????
>
> Option 3 seems to be a good compromise solution and perhaps is the approach
> that should be adopted moving forward ... we get to preserve Monticello
> metadata while avoiding the messy commit conflicts for git while providing a
> (somewhat) seamless path for a package to migrate back into the mcz
> repository world ... if we somehow incorporate the SHA of the commit and the
> github/bitbucket url into the revision history, then it would be possible to
> perform a 3 way merge involving two mcz package versions and a common ancestor
> that is only present in git ...

I'm curious how the merge driver is implemented.  I think it was
mentioned that git calls back to Pharo to do the processing. Is it
something like this...

* How to make Git preserve specific files while merging
   https://medium.com/@porteneuve/how-to-make-git-preserve-specific-files-while-merging-18c92343826b#.raovdvj9p

* A few of my Git tricks, tips and workflows
  section under gitattributes(5)
  http://nuclearsquid.com/writings/git-tricks-tips-workflows/
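
For reference, a custom merge driver in the style those links describe could be sketched as below. This is not the FileTree or GitFileTree driver (which reportedly calls back into Pharo); it assumes a hypothetical metadata file with one entry per line, and the driver name `mcmeta` and script name `merge_metadata.py` are made up. Git would be wired to it with a `.gitattributes` line such as `*/.filetree-metadata merge=mcmeta` (path pattern is an assumption) and `git config merge.mcmeta.driver "python3 merge_metadata.py %O %A %B"`.

```python
#!/usr/bin/env python3
"""Hypothetical git merge driver: union of one-entry-per-line files."""
import sys

def union_merge(ours: list, theirs: list) -> list:
    """Keep our entries in order, then append entries only they have."""
    seen = set(ours)
    merged = list(ours)
    for line in theirs:
        if line not in seen:
            merged.append(line)
            seen.add(line)
    return merged

def main(argv: list) -> int:
    # git substitutes %O (ancestor), %A (current) and %B (other) with
    # temporary file paths; the result must be written back to %A.
    _ancestor, current, other = argv[1:4]
    with open(current, encoding="utf-8") as f:
        ours = f.read().splitlines()
    with open(other, encoding="utf-8") as f:
        theirs = f.read().splitlines()
    with open(current, "w", encoding="utf-8") as f:
        f.write("\n".join(union_merge(ours, theirs)) + "\n")
    return 0  # exit status 0 tells git the merge succeeded

if __name__ == "__main__" and len(sys.argv) >= 4:
    sys.exit(main(sys.argv))
```

A union like this only makes sense for a flat, append-only list. The real metadata is a nested object graph, so a real driver has to rebuild the ancestry properly, which is presumably why the work is handed to Smalltalk.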


Or... rather than the metadata being a specific file (e.g.
'metadata'), the file could change for each commit (e.g.
'metadata.nnn') where nnn is the revision number, so each commit
deletes the old metadata file and writes the new metadata file.
Monticello could know to use whichever single 'metadata.*' file
exists.  Would that avoid the problem of metadata merge conflict?
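
The reading side of that idea might look like the Python sketch below. `current_metadata` is a hypothetical helper, and the strictness is deliberate: I have not checked how git would treat two branches that each replaced the file under a different name, so the sketch simply refuses to guess when it does not find exactly one candidate.

```python
from pathlib import Path

def current_metadata(package_dir: str) -> Path:
    """Return the single 'metadata.*' file in package_dir, or fail loudly."""
    candidates = sorted(Path(package_dir).glob("metadata.*"))
    if len(candidates) != 1:
        names = [c.name for c in candidates]
        raise RuntimeError(f"expected exactly one metadata file, found {names}")
    return candidates[0]
```

If a merge can leave two such files behind, the conflict has only moved from the file contents to the file names, and something would still have to combine the two histories.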

cheers -ben


Re: Contributing to Pharo

Nicolas Cellier

2016-02-03 6:00 GMT+01:00 Ben Coman <[hidden email]>:
On Wed, Feb 3, 2016 at 7:00 AM, Dale Henrichs
<[hidden email]> wrote:
>
>
> On 02/02/2016 12:54 PM, Eliot Miranda wrote:
>
> Hi Dale,
>
> On Tue, Feb 2, 2016 at 11:35 AM, Dale Henrichs
> <[hidden email]> wrote:
>>
>>
>>
>> On 01/29/2016 05:17 PM, Eliot Miranda wrote:
>>>
>>> Hi David,
>>>
>>>> On Jan 29, 2016, at 2:45 PM, David Allouche <[hidden email]> wrote:
>>>>
>>>> Thanks Dale for all the explanations.
>>>>
>>>> How Monticello and version control relate in the big picture is starting
>>>> to make sense for me.
>>>>
>>>> Now, I better understand why filetree ended up uses a file-per-method
>>>> format, even though that is relatively hostile to git user interfaces
>>>> optimised for other languages. There is really a need for a file-per-class
>>>> exchange format, because that would works a lot better with the existing VCS
>>>> ecosystem.
>>>
>>> I agree so strongly.  Class file outs which are eg sorted by selector
>>> make much more sense.  They won't hit the file name length limit.  They make
>>> it trivial to maintain method and class comment time stamps.  They're easier
>>> to construct into snapshots because it's easier to decode the file name.
>>>
>>> And then it's easy to add files for package load/unload scripts and for
>>> the history.  And then one is much more decoupled from the specific back
>>> end.  It could be mercurial just as easily as git.
>>>
>>>> I think more package-based user interfaces would indeed be a very good
>>>> idea, for browsing and for source code management.
>>>>
>>>> Stef, I have the impression you think that git is popular because it is
>>>> a new shiny toy. I disagree with this idea. Git is a typical worse-is-better
>>>> tool. It's good enough for most people, but it still has many shortcomings.
>>>> It is popular in spite of its shortcomings. It became popular as destination
>>>> for projects shifting from CVS and Subversion. So it is unlikely to be
>>>> displaced by a newer, incrementally shinier tools. Anything that will
>>>> displace it will have to provide an improvement of a similar magnitude as
>>>> the jump between centralised and distributed version control.
>>>
>>> This is a good analysis.  What's valuable to the Pharo community is not
>>> displacing an already functional dvcs (Monticello) with an ill-suited one
>>> (git), but in being able to function in ecosystems like github where people
>>> can display their identity and where infrastructure for bug reports etc
>>> exist.
>>>
>>>> Still, I think it's a good idea not to restrict high level models to
>>>> what git provides if that's a less than ideal fit to the image model.
>>>
>>> Absolutely.  Dale's talk of ditching Monticello metadata fills me with
>>> repulsion and makes me want to ask is he trying to sabotage or what?
>>
>> Eliot, you can't be serious - accusing me of sabotage? Ah, well.... how
>> about you assume that I'm doing "or what":)
>>
>> The Monticello metadata in a git repository is redundant and leads to
>> unnecessary commit conflicts -- end of story ....
>
>
> No it's /not/ the end of the story.  The essential part of the story is how
> Monticello remains compatible and interoperable between dialects.  I haven't
> seen you account for how you maintain that compatibility.  As far as I can
> tell, you propose replacing the Monticello metadata with that from git.  How
> do I, as a Squeak user with Monticello, ever get to look at your package
> again?  As I understand it, moving the metadata from Monticello commit time
> to git means that the metadata is in a format that git determines, not
> Monticello.
>
> Good question, FileTree has been supported on Squeak since the very
> beginning (I along with a small number of Squeak users have made sure of
> that).
>
> So TODAY, any Squeak user can "look at, load and commit" any package that
> has been written using FileTree (with or without Monticello meta data).
>
> [1] https://github.com/dalehenrich/filetree/tree/squeak4.3#squeak
>
>
> So I don't understand how on the one hand you can say "The Monticello
> metadata in a git repository is redundant and leads to unnecessary commit
> conflicts -- end of story ....", which implies you want to eliminate the
> Monticello metadata, and on the other hand you say you're keeping the
> Monticello metadata.  I'm hopelessly confused.  How does the Monticello
> metadata get reconstituted if it's been thrown away?
>
> Monticello meta data is not an integral part of the "package-ness" of a
> Monticello package ... it _is_ integral to the "repository-ness" of a
> Monticello package ...
>
> If the Monticello metadata is "thrown away" then the revision history for
> Monticello is lost, but for a package that is "born" in a git repository,
> the Monticello metadata is not needed. Git has it's own commit meta data and
> the Monticello metadata is redundant.
>
> If you want to see the revision history from a git-based FileTree repo,
> then:
>
>   1. one can include the Monticello meta data as part of the package - this
> is what FileTree
>       currently does
>   2. one can build a tool that reconstructs the Monticello version history
> from the git metadata
>       making it possible to use "old" Monitcello tools to look at the git
> repo - I believe that Thierry's
>       GitFileTree takes this approach for metadata-less repositories
>   3. One can build a new tool that presents the git metadata without
> reconstructing the
>       Monticello metadata at all. Note that by "embracing git", it is
> possible to present revision
>       history at the package level (current mcz techology) as well as the
> class and method level -
>       which is what I do in tODE
>
>
> What happens to the metadata in the following workflow?
>
> load package P from Monticello repository R into an image
> change P, commit via git to local git repository G
> load P from G into an image
> store P to R via Monticello
>
>
> The above workflow can be accomplished whether or not Monticello metadata is
> present, however, if one does not make an effort to preserve the revision
> history then at the end of your workflow the Monticello metadata is lost.
>
> If one takes the pains to preserve the Monticello metadata before committing
> to the git repository and the metadata is updated with each commit during
> the git lifetime, then the full metadata will be present at the end ...
>
> This I think is the crux of the discussion.
>
> There are a number of alternate schemes that can be used to preserve the
> metadata through this scenario:
>
>   1. (current FileTree implementation) store the Monticello meta data in git
> and update on
>       every git commit
>   2. duplicate the existing Monticello revision history by committing in
> order all of the
>       package ancestors and arrange for a way to reconstruct Monticello
> metadata from git
>       meta data
>   3. (variant of 1) store the original monticello meta data in a file, do
> not update on every commit
>       but arrange for a way to reconstruct Monticello metadata from git meta
> data and graft onto
>       original Monticello meta data for use in mcz repository --- on demand
>   4. ????
>
> Option 3 seems to be a good compromise solution and perhaps is the approach
> that should be adopted moving forward ... we get to preserver Monticello
> metadata while avoiding the messy commit conflicts for git while providing a
> (somewhat) seamless path for a package to migrate back into the mcz
> repository world ... if we somehow incorporate the SHA of the commit and the
> github/bitbucket url into the revision history, then it would be possible to
> perform a 3 way merge involving two mcz package versions and common ancestor
> that is only present in git ...
>
>
>
>> Despite the fact that the Monticello metadata is redundant, I have made
>> sure that the Monticello metadata was included in FileTree from the very
>> beginning for the very reason that I wanted developers to be able to try out
>> FileTree, git and github without having to burn any  Monticello bridges ....
>> if they didn't like FileTree, git and github, then they would be able to
>> back out of their use of git without losing data ...
>
>
> Then forgive me.  I couldn't see the wood for the trees.  When I read your
> talk of eliminating the conflicts from git commits due to the Monticello
> metadata I infer that you're eliminating the Monticello metadata.  I'm not
> sure I understand the implications of this, but it seems to me that a
> natural consequence is that the Montivcello metadata is lost and at that
> point merges and the like become problematic.  What am I missing?
>
> You've hit the nail on the head, but I think that option 3 above gives us a
> way to avoid losing the Monticello metadata without incurring a hit for a
> package's lifetime while in git ...
>
> A package that starts its life in git will have an empty Monticello metadata
> and a package that never makes its way into an mcz repository will not
> incur per commit penalties ...
>
>
>
>>
>>
>>> It seems entirely destructive.
>>
>> It is not destructive ...
>>>
>>> We have a functional package manager which currently supports interchange
>>> between Pharo, Squeak and Cuis,
>>
>> and GemStone?
>>
>> I assume that you are talking about Monticello packages and Monticello
>> repositories ... or what?
>
>
> Yes.
>
>>
>>
>> I am really not trying to do anything but "invent the future" --- I am not
>> trying to destroy, I am trying to improve ... If you are not able to see the
>> shortcomings of Monticello repositories  (Note that I am distinguishing
>> between Monticello packages and Monticello repositories  --- FileTree uses
>> Monticello packages and replaces Monticello repositories with git) and where
>> git has advantages over Monticello repositories, then you should continue to
>> use Monticello repositories ...
>
>
> But what does this imply to some package that starts off in a Monticello
> repository and then spends some time in gitland?  Can I merge again?  If I
> can I'm happy.  If I can't, I feel sabotaged.
>
> and sabotage was never my intent ...
>
>
>>
>> Personally I don't see Monticello repositories going away anytime soon and
>> expect to support Monticello repositories in GsDevKit_home, tODE, and
>> Metacello for the rest of my life:)
>
>
> Fine.  Except merging is, IIUC, about method time stamps and ancestry.  If
> that gets preserved then I'm happy.  But for the life of me I haven't read
> an explanation that reassures me that these are being preserved.  Do you see
> the roots of my fear?
>
> Haha, from the very beginning back in 2012, I understood that there would be
> fear and resistance to change and anger and joy and excitement but it was
> not clear when, if ever, we'd reach a point where a resolution for this
> "problem" was needed: either no one would be interested or everyone would be
> interested or ???
>
> I really think that option 3 is going to be the best compromise moving
> forward - there is some implementation work that will be required but I
> really think option 3 gives you (and frankly me) a way to preserve the
> Monticello revision history for packages that make their way back and forth
> between lifetimes in git and Monticello repositories.
>
> Eliot, I appreciate the fact that you demanded a better solution!
>
> Dale
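
Option 3 as quoted above (keep the original Monticello metadata frozen in a file, then rebuild the later ancestry from git on demand) could look roughly like the sketch below. Everything in it is an assumption for illustration: the McVersionInfo and GitCommit shapes, the graft function and the naming scheme are hypothetical, not Monticello's or FileTree's actual classes, and it only handles a linear git history.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class McVersionInfo:
    """Simplified, hypothetical stand-in for a Monticello version-info record."""
    name: str                     # e.g. "Foo-dkh.12"
    author: str
    message: str
    ancestors: List["McVersionInfo"] = field(default_factory=list)
    git_sha: Optional[str] = None  # set only for versions rebuilt from git
    git_url: Optional[str] = None  # repository the commit came from


@dataclass
class GitCommit:
    """Hypothetical minimal view of one git commit touching the package."""
    sha: str
    author: str
    message: str


def graft(package: str,
          frozen_head: Optional[McVersionInfo],
          commits: List[GitCommit],
          repo_url: str,
          start_number: int) -> Optional[McVersionInfo]:
    """Rebuild ancestry from git commits (oldest first) on top of the frozen head.

    frozen_head is the last version recorded in the stored Monticello
    metadata, or None for a package that started its life in git.
    """
    head = frozen_head
    number = start_number
    for commit in commits:
        number += 1
        head = McVersionInfo(
            name=f"{package}-{commit.author}.{number}",
            author=commit.author,
            message=commit.message,
            ancestors=[head] if head is not None else [],
            git_sha=commit.sha,     # lets a later merge locate the git ancestor
            git_url=repo_url,
        )
    return head
```

Nothing is written on each commit here; the chain is only computed when a package needs to go back to an mcz repository, which is the "on demand" part of option 3. Recording the SHA and URL in each rebuilt entry is what would make it possible to find a common ancestor that exists only in git.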

I'm curious how the merge driver is implemented.  I think it was
mentioned that git calls back to Pharo to do the processing. Is
it something like this...

* How to make Git preserve specific files while merging
   https://medium.com/@porteneuve/how-to-make-git-preserve-specific-files-while-merging-18c92343826b#.raovdvj9p

* A few of my Git tricks, tips and workflows
  section under gitattribute(5)
  http://nuclearsquid.com/writings/git-tricks-tips-workflows/


Or... rather than the metadata being a specific file (e.g.
'metadata'), the file could change for each commit (e.g.
'metadata.nnn') where nnn is the revision number, so each commit
deletes the old metadata file and writes the new metadata file.
Monticello could know to use whichever single 'metadata.*' file
exists.  Would that avoid the problem of metadata merge conflict?

cheers -ben



Re: Contributing to Pharo

Thierry Goubier
In reply to this post by Eliot Miranda-2
Hi Eliot,

On 02/02/2016 21:54, Eliot Miranda wrote:
  ....
>
> No it's /not/ the end of the story.  The essential part of the story is
> how Monticello remains compatible and interoperable between dialects.  I
> haven't seen you account for how you maintain that compatibility.  As
> far as I can tell, you propose replacing the Monticello metadata with
> that from git.  How do I, as a Squeak user with Monticello, ever get to
> look at your package again?  As I understand it, moving the metadata
> from Monticello commit time to git means that the metadata is in a
> format that git determines, not Monticello.

Yes. See below why.

> So I don't understand how on the one hand you can say "The Monticello
> metadata in a git repository is redundant and leads to unnecessary
> commit conflicts -- end of story ....", which implies you want to
> eliminate the Monticello metadata, and on the other hand you say you're
> keeping the Monticello metadata.  I'm hopelessly confused.  How does the
> Monticello metadata get reconstituted if it's been thrown away?
>
> What happens to the metadata in the following workflow?
>
> load package P from Monticello repository R into an image
> change P, commit via git to local git repository G
> load P from G into an image
> store P to R via Monticello

It's not a scenario I've specifically worked on, but all the tech is
implemented / implementable to do that perfectly.

The only thing that is problematic there is that the only safe history
is the one generated from git... there are so many MC packages with
broken history that, on mcz packages, you have to admit that it's not
safe to base things on their history.

> But what does this imply to some package that starts off in a Monticello
> repository and then spends some time in gitland?  Can I merge again?  If
> I can I'm happy.  If I can't, I feel sabotaged.

You could. Just express your needs and wait until one of us has enough
free time to solve it for you, that is.

> Fine.  Except merging is, IIUC, about method time stamps and ancestry.
> If that gets preserved then I'm happy.  But for the life of me I haven't
> read an explanation that reassures me that these are being preserved.
> Do you see the roots of my fear?

Of course. But that also projects a bit what you think of the people
working on it... which may make it a bit hard to answer.

Thierry


Re: Contributing to Pharo

Thierry Goubier
In reply to this post by Ben Coman
Hi Ben,

On 03/02/2016 06:00, Ben Coman wrote:
> I'm curious how the merge driver is implemented.  I think it was
> mentioned that git calls back to Pharo to do the processing. Is
> it something like this...
>
> * How to make Git preserve specific files while merging
>     https://medium.com/@porteneuve/how-to-make-git-preserve-specific-files-while-merging-18c92343826b#.raovdvj9p

Probably, haven't read it.

> * A few of my Git tricks, tips and workflows
>    section under gitattribute(5)
>    http://nuclearsquid.com/writings/git-tricks-tips-workflows/

Yes. You use git attributes to trigger the use of a pharo-implemented
command line tool which does the merge for git.
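
For readers who have not used this git feature, here is a minimal sketch of how a custom merge driver is wired up. It is not the Pharo tool mentioned above: the driver name mcVersion, the script name mc_merge_driver.py, the path pattern and the line-union merge are all assumptions made for illustration (the real version file is not line-oriented). What is standard git is the mechanism itself: a merge= attribute in .gitattributes, a [merge "name"] section with a driver command, and the %O %A %B placeholders for ancestor, current and other versions.

```python
# Assumed wiring (names and path pattern are hypothetical):
#
#   .gitattributes:
#       **/monticello.meta/version merge=mcVersion
#
#   .git/config:
#       [merge "mcVersion"]
#           name = hypothetical Monticello metadata merge
#           driver = python3 mc_merge_driver.py %O %A %B
import sys


def merge_lines(base, ours, theirs):
    """Keep our lines in order, then append lines that only theirs added."""
    seen = set(ours)
    base_set = set(base)
    merged = list(ours)
    for line in theirs:
        # Skip lines we already have and lines that were in the ancestor
        # (if they are missing from ours, we deleted them on purpose).
        if line not in seen and line not in base_set:
            merged.append(line)
            seen.add(line)
    return merged


def run_driver(base_path, ours_path, theirs_path):
    """Merge three temporary files the way git expects from a driver."""
    def read(path):
        with open(path, encoding="utf-8") as handle:
            return handle.read().splitlines()

    merged = merge_lines(read(base_path), read(ours_path), read(theirs_path))
    # git expects the result to be written over the %A (current) file.
    with open(ours_path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(merged) + "\n")
    return 0  # exit status 0 tells git the merge was resolved cleanly


if __name__ == "__main__" and len(sys.argv) == 4:
    sys.exit(run_driver(sys.argv[1], sys.argv[2], sys.argv[3]))
```

Because the driver always exits with 0, git never reports a conflict on the matching files; a real driver would return a non-zero status when it cannot resolve the merge.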


> Or... rather than the metadata being a specific file (e.g.
> 'metadata'), the file could change for each commit (e.g.
> 'metadata.nnn') where nnn is the revision number, so each commit
> deletes the old metadata file and writes the new metadata file.
> Monticello could know to use whichever single 'metadata.*' file
> exists.  Would that avoid the problem of metadata merge conflict?

I've thought of that, and it seems hard to get right (which file goes
first? how do you recognize a merge?). You have to remember that
packages have hundreds, even thousands of versions so this would mean
that number of files.

I went through all the different possible file formats, class-based,
package-based, method-based, log metadata and the like, and I concluded
that:

- the method based format is as good as any other. Even better since it
has a spec (cypress).
- method based format allows for method-history queries on the git/vcs
history (as well as class based / package based queries).
- the tree structure on github or bitbucket is quite convenient (and
browsable) to the point one could edit a package directly in it (I do
when I need to do a quick fix).
- anything that can compress a bit the metadata version is probably good
to consider. version files can be huge.
- merge drivers really work and relieve us from conflict resolution
- we need a merge tool written in Smalltalk/MC
- MC version numbering is a very bad idea
- MC almost never using UUIDs properly is a very bad behavior
- MC packages history can be considered broken in the general case
- it takes time to define, implement, test and really use a new format
and tooling
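
To make the method-history point in the list above concrete: with one file per method, the history of a single method is just the git log of one path. The sketch below is an assumption-laden illustration, not FileTree code. The directory layout (Package.package/Class.class/instance or class/selector.st) and the encoding of ':' as '.' in file names are my understanding of the FileTree convention and should be checked against a real repository; method_path and history_command are hypothetical helper names. Only the git log options used are standard git.

```python
def method_path(package, class_name, selector, class_side=False):
    """Build the assumed FileTree path of one method's source file."""
    side = "class" if class_side else "instance"
    # Assumption: keyword selectors such as at:put: are stored with
    # each ':' replaced by '.', giving at.put..st
    file_name = selector.replace(":", ".") + ".st"
    return f"{package}.package/{class_name}.class/{side}/{file_name}"


def history_command(path):
    """Return the git command listing every commit that touched the file."""
    return ["git", "log", "--follow", "--format=%h %an %ad %s", "--", path]


if __name__ == "__main__":
    # Hypothetical package, class and selector names.
    path = method_path("Foo-Core", "Bar", "at:put:")
    print(" ".join(history_command(path)))
```

The resulting command can be run inside a clone with subprocess.run; a class-based or package-based query is the same command with a directory path instead of a file path.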

Thierry



Re: Contributing to Pharo

Sven Van Caekenberghe-2
Thanks Thierry for pushing this subject and working so patiently with the community.

> On 03 Feb 2016, at 10:18, Thierry Goubier <[hidden email]> wrote:
>
> I went through all the different possible file formats, class-based, package-based, method-based, log metadata and the like, and I concluded that:
>
> - the method based format is as good as any other. Even better since it has a spec (cypress).
> - method based format allows for method-history queries on the git/vcs history (as well as class based / package based queries).
> - the tree structure on github or bitbucket is quite convenient (and browsable) to the point one could edit a package directly in it (I do when I need to do a quick fix).
> - anything that can compress a bit the metadata version is probably good to consider. version files can be huge.
> - merge drivers really work and relieve us from conflict resolution
> - we need a merge tool written in Smalltalk/MC
> - MC version numbering is a very bad idea
> - MC almost never using UUIDs properly is a very bad behavior
> - MC packages history can be considered broken in the general case
> - it takes time to define, implement, test and really use a new format and tooling

Excellent summary.

I also feel that the current structure (basic filetree) is more than OK. You can /almost/ browse through github.

I am sure we'll get there with the meta data and tooling.

Sven

Re: Contributing to Pharo

Dale Henrichs-3
In reply to this post by Thierry Goubier
Thierry,

Very good points and I agree with all of them ...

Regarding a Smalltalk-based merge tool, I've written a git mergetool for
tODE (consider it a prototype) that could be adapted/ported for Pharo ....

My thoughts regarding the MC issues ... you make good points about some of
the "internal bugs" in MC and I think that it is important to point out
that the "internal bugs" you are referring to only affect Monticello
with regards to its repository functionality and not its package
functionality.

My inclination is to move forward and create a new, simpler package
model (Cypress is a good name) that is used with disk-based repos ...
the new package model would be protocol compatible with MC so that
definition-based snapshot comparisons continue. The new package model
would only be used for loading and there could/should be facilities for
converting between MC and this new package model.

FileTree would continue to be MC based and Cypress would be able to read
FileTree and FileTree would be able to read Cypress.

The improvements and changes that we are discussing (many of which
require a fair amount of implementation work) would be done in Cypress ...

The rationale is that Monticello packages encompass both the repository
and the code/definitions and for disk-based scms, Cypress would only
need to worry about the code/definitions --- and would make it possible
to move away from the version number-based package names that aren't
necessary when the underlying scm is doing the versioning without
affecting the MC implementation ....

Of course, if a new Cypress package model were introduced, the tools
would need to change, but since we are already considering changing
tools, this would be the perfect time to improve the underlying model
for disk-based SCMs.

Dale

On 2/3/16 1:13 AM, Thierry Goubier wrote:

>
> I went through all the different possible file formats, class-based,
> package-based, method-based, log metadata and the like, and I
> concluded that:
>
> - the method based format is as good as any other.
> - method based format allows for method-history queries on the git/vcs
> history (as well as class based / package based queries).
> - the tree structure on github or bitbucket is quite convenient (and
> browsable) to the point one could edit a package directly in it (I do
> when I need to do a quick fix).
> - anything that can compress a bit the metadata version is probably
> good to consider. version files can be huge.
> - merge drivers really work and relieve us from conflict resolution
> - we need a merge tool written in Smalltalk/MC
> - MC version numbering is a very bad idea
> - MC almost never using UUIDs properly is a very bad behavior
> - MC packages history can be considered broken in the general case
> - it takes time to define, implement, test and really use a new format
> and tooling



Re: Contributing to Pharo

Eliot Miranda-2
In reply to this post by Thierry Goubier


On Wed, Feb 3, 2016 at 12:54 AM, Thierry Goubier <[hidden email]> wrote:
> Hi Eliot,
>
> On 02/02/2016 21:54, Eliot Miranda wrote:
>  ....
>
>> No it's /not/ the end of the story.  The essential part of the story is
>> how Monticello remains compatible and interoperable between dialects.  I
>> haven't seen you account for how you maintain that compatibility.  As
>> far as I can tell, you propose replacing the Monticello metadata with
>> that from git.  How do I, as a Squeak user with Monticello, ever get to
>> look at your package again?  As I understand it, moving the metadata
>> from Monticello commit time to git means that the metadata is in a
>> format that git determines, not Monticello.
>
> Yes. See below why.
>
>> So I don't understand how on the one hand you can say "The Monticello
>> metadata in a git repository is redundant and leads to unnecessary
>> commit conflicts -- end of story ....", which implies you want to
>> eliminate the Monticello metadata, and on the other hand you say you're
>> keeping the Monticello metadata.  I'm hopelessly confused.  How does the
>> Monticello metadata get reconstituted if it's been thrown away?
>>
>> What happens to the metadata in the following workflow?
>>
>> load package P from Monticello repository R into an image
>> change P, commit via git to local git repository G
>> load P from G into an image
>> store P to R via Monticello
>
> It's not a scenario I've specifically worked on, but all the tech is implemented / implementable to do that perfectly.
>
> The only thing that is problematic there is that the only safe history is the one generated from git... there are so many MC packages with broken history that, on mcz packages, you have to admit that it's not safe to base things on their history.

I'm sorry but I don't accept that.  In the Squeak trunk we have history in our mczs that is correct.  Certainly in VMMaker.oscog I have history that goes back a long time.  If bugs have broken history then efforts should be made to repair that history.  But you can't just write off Monticello history like that.

>> But what does this imply to some package that starts off in a Monticello
>> repository and then spends some time in gitland?  Can I merge again?  If
>> I can I'm happy.  If I can't, I feel sabotaged.
>
> You could. Just express your needs and wait until one of us has enough free time to solve it for you, that is.
>
>> Fine.  Except merging is, IIUC, about method time stamps and ancestry.
>> If that gets preserved then I'm happy.  But for the life of me I haven't
>> read an explanation that reassures me that these are being preserved.
>> Do you see the roots of my fear?
>
> Of course. But that also projects a bit what you think of the people working on it... which may make it a bit hard to answer.
>
> Thierry

_,,,^..^,,,_
best, Eliot