I noticed recently that I had "duplicate" method versions in Store, which had the same content but different timestamps. Today I looked up why this happens:

You publish a new definition (version 2) of an existing method (version 1) in an existing class. Then you have two versions of the package, two versions of the method and two instances of MethodInPackage, one relation for each package with the corresponding version of the method.

    package version 1 <=> method version 1
    package version 2 <=> method version 2

Now you take the older version 1 of the package, which still includes method version 1, and create a fork of that package: you load version 2 of the method into version 1 of the package and publish this as version 1.1. (The same also happens with a merge in the merge tool.)

I had assumed that this would result in one new package version (1.1) and one new MethodInPackage between version 1.1 of the package and version 2 of the method.

    package version 1   <=> method version 1
    package version 2   <=> method version 2
    package version 1.1 <=> method version 2

Instead I get a new version2dup of the method and a relation between package 1.1 and the version2dup of the method,

    package version 1   <=> method version 1
    package version 2   <=> method version 2
    package version 1.1 <=> method version2dup

and the additional version2dup in StoreMethod.

There are three points:

1) In this scenario, I lose the information that version 2 of the method is the same in package version 2 as in package version 1.1 and their descendants.
2) The linear traceability of method evolution is interrupted by an artificial spawn.
3) I get additional "versions" of methods that are just copies of merged ones.

The third one is disturbing because it suggests many changes that aren't real ones, especially if there are several versions of applications in development (as we have, with several deployed versions that still require changes).

Is this by design? If I remember correctly, this wasn't the case in earlier versions (before Store on top of Glorp?). Are my expectations wrong? Am I missing any disadvantages?

Thanks for any answer.

Thomas

_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
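The relations described in the post above can be illustrated with a small toy model. This is only a sketch in Python, not Store code: `MethodRecord`, `PackageVersion`, `new_record` and the selector and source strings are hypothetical names chosen for illustration, and the model assumes nothing about Store's real schema beyond "a package version links to method records".

```python
from dataclasses import dataclass, field
from itertools import count

_ids = count(1)  # toy primary-key generator


@dataclass(frozen=True)
class MethodRecord:
    """Hypothetical stand-in for one stored method version."""
    record_id: int
    selector: str
    source: str


@dataclass
class PackageVersion:
    """Hypothetical stand-in for a published package version.

    The dict plays the role of the package <=> method links.
    """
    name: str
    methods: dict = field(default_factory=dict)  # selector -> MethodRecord


def new_record(selector, source):
    """Create a fresh record with a new id, as a publish would."""
    return MethodRecord(next(_ids), selector, source)


def same_record(a, b, selector):
    return a.methods[selector].record_id == b.methods[selector].record_id


def same_source(a, b, selector):
    return a.methods[selector].source == b.methods[selector].source


method_v1 = new_record("someSelector", "someSelector ^1")
method_v2 = new_record("someSelector", "someSelector ^2")

package_1 = PackageVersion("1", {"someSelector": method_v1})
package_2 = PackageVersion("2", {"someSelector": method_v2})

# Expected outcome: version 1.1 links to the existing version 2 record.
expected_1_1 = PackageVersion("1.1", {"someSelector": method_v2})

# Observed outcome: version 1.1 links to a fresh copy ("version2dup").
version2dup = new_record("someSelector", method_v2.source)
observed_1_1 = PackageVersion("1.1", {"someSelector": version2dup})
```

In the observed case the two package versions agree on source but not on record identity, which is exactly the information the post says is lost.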
Store does not keep information for each individual method around in the image. All it keeps is the single pointer for each package to the database version of it. So when you load an individual method it does not know what the database version that it originally came from was. When publishing, it would need to look through the whole database, or at least through the other versions of that same package, to find any matches, which would be very expensive.

It would definitely have more capabilities along the lines you describe if it kept that information for all the sub-entities. But it's never worked that way, and there's non-trivial overhead of both space and bookkeeping to do that. I'd say that the reverse error (you made a change to a method which the system erroneously thought was the same as some other version, so it threw away your code and just saved a reference to the other) would be much worse.
On 6 December 2013 07:15, Thomas Brodt <[hidden email]> wrote:
> I noticed recently that I had "duplicate" method versions in Store, which had the same content, but different timestamps.
Hi Alan,

thank you for your explanatory answer. Nice to hear from you, and you still seem to know something about Smalltalk ;-).

On 10.12.2013 23:17, Alan Knight wrote:
Agreed. It would be expensive to look for any "matching" method for the current source when publishing, and guesses might also be wrong.

My thoughts (wishes?) were that the merge tool would handle two cases differently: when you choose an existing method edition as the resolution, and when you enter new source code manually. When I choose an already published edition, the merge tool could know that it only has to create another link to an existing method record instead of also creating a new method with the same content. Regular changes to a method via source code changes would of course need to cut the relation to the existing method in Store and create a new one.

In the meantime I dug a bit into the publishing code and saw that the XChangesets only keep information about #change, but not how the change was made, nor any link to Store records. So you have no means to detect such more sophisticated possibilities, and a #change unfortunately results in a completely new method.
There is more overhead in bookkeeping, that's true. That's why I thought only of the merging process, where you clearly specify what you want when you choose a resolution. The sad result is that simultaneously keeping several versions open for development leads to more editions of code than would be necessary. So you lose relations and oversight of "real" changes.

Thomas
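The distinction wished for above (link to an existing record when an already published edition is chosen, create a new record only for manually entered source) can be sketched as follows. This is a toy Python model under stated assumptions, not Store's publishing code: `ToyRepository`, `Resolution`, `publish_current` and `publish_wished` are hypothetical names, and `publish_current` only mimics the behaviour reported in this thread.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass(frozen=True)
class MethodRecord:
    """Hypothetical stand-in for one stored method version."""
    record_id: int
    source: str


@dataclass(frozen=True)
class Resolution:
    """What the user picked in a hypothetical merge dialog."""
    source: str
    # Set only when the user picked an already published edition.
    chosen_record: Optional[MethodRecord] = None


class ToyRepository:
    def __init__(self):
        self._ids = count(1)
        self.records = []

    def add(self, source):
        """Store a brand-new method record."""
        record = MethodRecord(next(self._ids), source)
        self.records.append(record)
        return record

    def publish_current(self, resolution):
        """Reported behaviour: every change becomes a new record."""
        return self.add(resolution.source)

    def publish_wished(self, resolution):
        """Wished-for behaviour: reuse the record of a chosen edition."""
        if resolution.chosen_record is not None:
            return resolution.chosen_record
        return self.add(resolution.source)


repo = ToyRepository()
published = repo.add("initialize self doSaneThing")

picked = Resolution(published.source, chosen_record=published)
typed = Resolution("initialize self doOtherThing")

reused = repo.publish_wished(picked)      # links to the existing record
fresh = repo.publish_wished(typed)        # manual source: new record
duplicate = repo.publish_current(picked)  # same source, new record
```

The sketch sidesteps the risk Alan describes, because reuse is driven by the user's explicit choice and never by comparing source text.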
Dear Thomas,
Alan is correct that the image does not store method version information. During a merge, the merge tool does have (the ability to deduce) this information, so could functionally do the kind of thing you describe. The reasons why it does not currently do so relate to (1) history, (2) the need to make a synchronous change to the replicator, (3) the need to assess performance impact, and (4) having more urgent tasks.

Functionally, if the user merges branch package version MyPackage(1.0 + importantFix) onto trunk version MyPackage(1.0) and a method clash is observed, e.g. in the trunk

    MyClass>>initialize
        self makeGhastlyError

versus, in the branch

    MyClass>>initialize
        self doSaneThing

then, at the point of merge, the merge tool knows which method versions clashed, and also knows which version the user chose to publish to merged MyPackage(1.1).

If the user chooses the trunk version

    MyClass>>initialize
        self makeGhastlyError

then no new clone is created: Store recognises nothing has changed and simply has MyPackage(1.1) and MyPackage(1.0) both point at the record of that method.

If, however, the user chooses the branch version

    MyClass>>initialize
        self doSaneThing

then Store notices only that the trunk has changed and so clones a fresh method for MyPackage(1.1) with the same source as the branch method in MyPackage(1.0 + importantFix).

As Alan points out, it could be unsafe in general for Store to assume that code-identical methods were in fact the same. However, when using the merge tool, the user supplies all the information in the act of merging. Store's schema is as well able to have MyPackage(1.1) and MyPackage(1.0 + importantFix) both point at the record of the method

    MyClass>>initialize
        self doSaneThing

as to have MyPackage(1.1) and MyPackage(1.0) point at the record of method

    MyClass>>initialize
        self makeGhastlyError

when the trunk version is chosen. When the user tells the merge tool to resolve the conflict in favour of branch method

    MyClass>>initialize
        self doSaneThing

it is as clear this is what is meant as when the user tells the merge tool to resolve the conflict in favour of

    MyClass>>initialize
        self makeGhastlyError

So why is this not done, I hear you ask.

1) History: the tools that drive the Store schema have never used the schema in this way. They are coded to confine sharing of method versions (and class definition versions and other objects pointed at by package versions) strictly within the package inheritance tree: the tools will share an object between a package version and its ancestors but not between a package version and its siblings; whenever presented with the latter, they clone. I have a decades-long familiarity with this class of problems: 2-dimensional containment and connection graphs, and the issue of whether and when to constrain them for ease of modelling. Were I in the place of the original Store tool creators in the early-mid-90s, I would not have made their choice, but I'm familiar with that choice being made.

2) Synchronous replicator change: the replicator is a more recent Store tool. Because of (1), it does not expect to encounter method versions shared between siblings. When first I did experiments on this, I promptly discovered that the base replicator simply recreates the clones in a target database, so needed change too. Thus anyone who maintains mirror Store databases (as we do) would need to upgrade their replicator as well as their merge tool before seeing benefit (and we might need to review whether a temporary intermediate state could risk any hiccoughs).

3) Performance: having sharing confined within a package's ancestor tree loses information supplied at merge time, but also permits simple coding of some queries.
Could the complexities of having a method's inheritance not be within its package's inheritance cause performance problems in the revised versions of these queries? My very slight and partial experiments a while back suggested not, but I must repeat "my very _slight_ and _partial_ experiments". This would need review.

4) Store has other and important tasks for 8.0. This is not on any plan, and should not be presumed to be intended to be so for any given future time.

In summary, I understand where you are coming from on this. I've had similar thoughts and done experiments. The above explains why I'm not suggesting this idea to the Store team save as a long term 'maybe think about'.

HTH
Niall Ross
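The sharing rule described in this reply (share a method record with an ancestor package version, clone it when it comes from a sibling) can be sketched as a toy model. This is Python for illustration only, not Store's merge tool: `PackageVersion`, `MethodRecord` and `publish_merged` are hypothetical names, and the sketch assumes the branch version is a child of the trunk version, so that the branch and the merged version 1.1 are siblings.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional

_ids = count(1)  # toy primary-key generator


@dataclass(frozen=True)
class MethodRecord:
    """Hypothetical stand-in for one stored method version."""
    record_id: int
    source: str


@dataclass
class PackageVersion:
    """Hypothetical package version with a single parent link."""
    name: str
    parent: Optional["PackageVersion"] = None
    methods: dict = field(default_factory=dict)  # selector -> MethodRecord

    def ancestors(self):
        """Yield the parent, grandparent, and so on."""
        node = self.parent
        while node is not None:
            yield node
            node = node.parent


def publish_merged(name, parent, selector, chosen):
    """Publish a merged version: share with ancestors, clone otherwise."""
    merged = PackageVersion(name, parent)
    in_ancestry = any(
        ancestor.methods.get(selector) is chosen
        for ancestor in merged.ancestors()
    )
    if in_ancestry:
        merged.methods[selector] = chosen  # shared record
    else:
        # Sibling record: clone it under a fresh id.
        merged.methods[selector] = MethodRecord(next(_ids), chosen.source)
    return merged


trunk = PackageVersion(
    "1.0",
    methods={"initialize": MethodRecord(next(_ids), "initialize self makeGhastlyError")},
)
branch = PackageVersion(
    "1.0 + importantFix",
    parent=trunk,
    methods={"initialize": MethodRecord(next(_ids), "initialize self doSaneThing")},
)

keep_trunk = publish_merged("1.1", trunk, "initialize", trunk.methods["initialize"])
take_branch = publish_merged("1.1", trunk, "initialize", branch.methods["initialize"])
```

Choosing the trunk method leaves one shared record, while choosing the branch method yields a second record with identical source, which is the duplicate the original question asks about.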