replication


replication

Ken G. Brown
If I replicate a parcel from repository 1 into repo 2, without the Replicate Recursively checkbox checked, is all the necessary code replicated from that particular version so that everything except history is retrievable in repo 2 into a fresh image?

I've been trying to do that, and I'm finding many unloadable definition errors showing up when loading from repo 2, apparently referencing a previous version to that which was replicated.

Thx for any insight,
Ken G. Brown
_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc

Re: replication

Niall Ross
Dear Ken,

>If I replicate a parcel
>
Trivial nomenclature point.

    - Replication is of packages or bundles (pundles is the generic
term) from one Store database to another.

    - A parcel is loaded from or saved to a binary .pcl file (its source
in .pst).

A parcel necessarily has a package or bundle equivalent (usually of the
same name, though the parcel's name and, separately, the name of the
parcel's file, can differ).  A package or bundle loaded from Store, not
from a parcel file, will not have a parcel equivalent (unless saved as a
parcel, which you would not usually do, unless creation of a parcel was
your purpose).

>from repository 1 into repo 2, without the Replicate Recursively checkbox checked, is all the necessary code replicated from that particular version so that everything except history is retrievable in repo 2 into a fresh image?
>  
>
Yes.  Replicate recursively means replicate with ancestors back to the
latest prior ancestor that is already in the target database.  The
replicator finds this latest ancestor (or that there is none, in which
case it will replicate all ancestors).  If you replicate unrecursively,
you get a free-standing copy with no ancestors.  This may be quicker but
if, later on, you replicate the pundle's parent, the replicator will not
fix-up the parent-child links post-hoc.  What was a parent and child in
your source database will be two independent free-standing versions in
your target database.

    - Items unchanged between parent and child (e.g. class definitions
or methods) will be cloned in the target database, not a single item
pointed at by both versions as in the source database

    - The merge tool will be less able to compute merges;  if, after
replicating several versions unrecursively, you want to merge between
two or more, the tool will leave more for your decision than if the
parent-child links were all in place, as in the source database.

So if you want to throw away the past and treat the version you are
replicating as a free-standing source, use unrecursive.  If you want to
preserve the history of your development, use recursive.  Remember, at
the moment you can't fix this up afterwards;  you'd have to GC all the
unrecursively-replicated versions in the target database and rerun a
recursive replication to get the parent-child links into the target.

Remark:  If you replicate a package version and then replicate its
immediate child, it does not matter whether you are unrecursive or not.  
Links are lost only when the parent is absent when the child is
replicated unrecursively.  However that could fail for a bundle which
had a subpundle whose parent was a child of the version in the bundle's
parent, whereas recursion takes care of all cases.

Carefully choosing to replicate a selected 'oldest version you need'
unrecursively, then replicating later versions recursively, is one
common strategy.
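
The link behaviour described above can be sketched as a toy Python model. This is not the Store replicator's code: `Version`, `ToyDatabase` and `replicate` are hypothetical names invented for illustration, and the model assumes each version has at most one parent and ignores bundles and subpundles.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Version:
    name: str
    parent: Optional[str] = None  # name of the parent version, if any


class ToyDatabase:
    """Hypothetical stand-in for a Store database: version name -> Version."""

    def __init__(self):
        self.versions: Dict[str, Version] = {}

    def add(self, name, parent=None):
        self.versions[name] = Version(name, parent)


def replicate(source, target, name, recursive):
    """Copy version `name` from source to target.

    recursive=True: also copy ancestors, stopping at the latest ancestor
    already present in the target, and keep the parent links.
    recursive=False: copy only this version; its parent link is kept only
    if the parent is already in the target.
    """
    if name in target.versions:
        return
    parent = source.versions[name].parent
    if recursive and parent is not None:
        replicate(source, target, parent, recursive=True)
    # The link survives only if the parent exists in the target right now;
    # it is never repaired later.
    linked_parent = parent if parent in target.versions else None
    target.add(name, linked_parent)


source = ToyDatabase()
source.add("1.0")
source.add("1.1", parent="1.0")
source.add("1.2", parent="1.1")

target = ToyDatabase()
replicate(source, target, "1.2", recursive=False)  # free-standing copy
replicate(source, target, "1.1", recursive=False)  # parent copied afterwards
print(target.versions["1.2"].parent)  # the link was not fixed up
```

Replicating "1.1" before "1.2" in this model keeps the link even without recursion, which matches the remark about replicating a version and then its immediate child.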

>I've been trying to do that, and I'm finding many unloadable definition errors showing up when loading from repo 2, apparently referencing a previous version to that which was replicated.
>  
>
At first glance, unloadable definitions usually suggest missing prereqs,
not missing versions, to me;  your pundle contains a method extending a
class that did not load because the prereq was not there, or a class
that subclasses the missing class.  Your target database may of course
be missing prereq pundles present in your source database.
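
As a rough illustration of how a missing prereq produces unloadable definitions, here is a minimal Python sketch. It is a toy model, not the Store loader: `load_pundle`, the dictionary layout, and the class names `Shape`, `TimedShape` and `Stopwatch` are all hypothetical.

```python
def load_pundle(image_classes, pundle):
    """Load definitions into a toy image; return the ones that cannot load.

    image_classes: set of class names already in the image (mutated).
    pundle: dict with 'classes' (list of (name, superclass) pairs) and
            'extensions' (list of (class_name, selector) pairs).
    """
    unloadable = []
    for name, superclass in pundle["classes"]:
        if superclass in image_classes:
            image_classes.add(name)
        else:
            unloadable.append(f"class {name} (superclass {superclass} missing)")
    for class_name, selector in pundle["extensions"]:
        if class_name not in image_classes:
            unloadable.append(f"{class_name}>>{selector} (class missing)")
    return unloadable


# Stopwatch would come from a prereq pundle that is absent from the target.
pundle = {
    "classes": [("Shape", "Object"), ("TimedShape", "Stopwatch")],
    "extensions": [("Stopwatch", "reportOn:")],
}
for problem in load_pundle({"Object"}, pundle):
    print(problem)
```

With `Stopwatch` present in the image beforehand, the same call returns an empty list, which is the situation once the prereq has been replicated and loaded.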

I've never experienced the replicator actually losing class or method
definitions contained in a replicated package.  If it did, I'd expect a
harder error than 'Unloadable definition' - a debugger popping up or
similar.  If a package references a method or class definition id, I
think the code assumes that must be there.

             HTH
                   Niall Ross


Re: replication

Ken G. Brown
See comments interspersed:

On 2013-10-22, at 8:59 AM, Niall Ross <[hidden email]> wrote:

> Dear Ken,
>
>> If I replicate a parcel
> Trivial nomenclature point.
>
>    - Replication is of packages or bundles (pundles is the generic term) from one Store database to another.
>
>   - A parcel is loaded from or saved to a binary .pcl file (its source in .pst).
>
> A parcel necessarily has a package or bundle equivalent (usually of the same name, though the parcel's name and, separately, the name of the parcel's file, can differ).  A package or bundle loaded from Store, not from a parcel file, will not have a parcel equivalent (unless saved as a parcel, which you would not usually do, unless creation of a parcel was your purpose).

I've been somewhat unclear in my understanding of the difference between package and parcel.
Thx for the explanation.

>> from repository 1 into repo 2, without the Replicate Recursively checkbox checked, is all the necessary code replicated from that particular version so that everything except history is retrievable in repo 2 into a fresh image?
>>
> Yes.  Replicate recursively means replicate with ancestors back to the latest prior ancestor that is already in the target database.  The replicator finds this latest ancestor (or that there is none, in which case it will replicate all ancestors).  If you replicate unrecursively, you get a free-standing copy with no ancestors.  This may be quicker but if, later on, you replicate the pundle's parent, the replicator will not fix-up the parent-child links post-hoc.  What was a parent and child in your source database will be two independent free-standing versions in your target database.
>
>   - Items unchanged between parent and child (e.g. class definitions or methods) will be cloned in the target database, not a single item pointed at by both versions as in the source database
>
>   - The merge tool will be less able to compute merges;  if, after replicating several versions unrecursively, you want to merge between two or more, the tool will leave more for your decision than if the parent-child links were all in place, as in the source database.
>
> So if you want to throw away the past and treat the version you are replicating as a free-standing source, use unrecursive.  If you want to preserve the history of your development, use recursive.  Remember, at the moment you can't fix this up afterwards;  you'd have to GC all the unrecursively-replicated versions in the target database and rerun a recursive replication to get the parent-child links into the target.
>
> Remark:  If you replicate a package version and then replicate its immediate child, it does not matter whether you are unrecursive or not.  Links are lost only when the parent is absent when the child is replicated unrecursively.  However that could fail for a bundle which had a subpundle whose parent was a child of the version in the bundle's parent, whereas recursion takes care of all cases.
>
> Carefully choosing to replicate a selected 'oldest version you need' unrecursively, then replicating later versions recursively, is one common strategy.


Seems like this is what we want. Basically we are wanting a minimal snapshot of our releases without all the cruft that has accumulated in our dev repository. Eventually being able to build a runtime package automatically from a fresh image is a goal.

>
>> I've been trying to do that, and I'm finding many unloadable definition errors showing up when loading from repo 2, apparently referencing a previous version to that which was replicated.
>>
> At first glance, unloadable definitions usually suggest missing prereqs, not missing versions, to me;  your pundle contains a method extending a class that did not load because the prereq was not there, or a class that subclasses the missing class.  Your target database may of course be missing prereq pundles present in your source database.

You were correct. The package I was replicating was Jun which apparently needed AT Benchmarks which was not there. For some reason Jun loads ok from Parcel manager into a fresh image but not once it is replicated, or at least that is what it seems like.

>
> I've never experienced the replicator actually losing class or method definitions contained in a replicated package.  If it did, I'd expect a harder error than 'Unloadable definition' - a debugger popping up or similar.  If a package references a method or class definition id, I think the code assumes that must be there.
>
>            HTH
>                  Niall Ross
>

Regarding deleting pundles in a repository, it seems like the Garbage Collector is in need of some serious attention as far as speed is concerned. Right now it is so slow that it is almost unworkable and maybe better to delete the repo completely and rebuild. A pundle (Jun) that takes a few minutes to replicate, took around 3 1/2 hrs to delete. And yesterday evening I selected a few things to delete, and it has been running for 11 hrs now without finishing. From Process Monitor it looks alive. Maybe only selecting one item at a time for deletion would help? Some activity reports to the Transcript would be extremely reassuring. It does however send a message when or if it finally completes.

Is the deletion any faster in 7.10?

Running VW 7.9.1 on Mac OS X 10.8.5.

Thx,
Ken G. Brown



Re: replication

Niall Ross
Dear Ken,

>See comments interspersed:
>  
>
My latest comments are interspersed with your comments.  (Essentially, I
think all is clear and this thread can close;  I make a few final
remarks in case they assist.)

>On 2013-10-22, at 8:59 AM, Niall Ross <[hidden email]> wrote:
>
>  
>
>>Dear Ken,
>>
>>    
>>
>>>If I replicate a parcel
>>>      
>>>
>>Trivial nomenclature point.
>>
>>   - Replication is of packages or bundles (pundles is the generic term) from one Store database to another.
>>
>>  - A parcel is loaded from or saved to a binary .pcl file (its source in .pst).
>>
>>A parcel necessarily has a package or bundle equivalent (usually of the same name, though the parcel's name and, separately, the name of the parcel's file, can differ).  A package or bundle loaded from Store, not from a parcel file, will not have a parcel equivalent (unless saved as a parcel, which you would not usually do, unless creation of a parcel was your purpose).
>>    
>>
>
>I've been somewhat unclear in my understanding of the difference between package and parcel.
>Thx for the explanation.
>  
>
A parcel is a binary .pcl file, optimised for loading bytecodes quickly,
plus the associated .pst file to provide the source.  When you load the
parcel, these bytecodes are swiftly built into a package in the image.
The package will remember it has an associated parcel.  The .pst source
file location is noted by the SourceFileManager.  When you select a
method, the SFM looks in that file to find the method's source text.

Store is a CM system for holding source code.  When you load from
Store, the source is compiled into bytecodes in a package, and the
source text is also added to the .cha file;  the SFM looks in that file
to find source for any method loaded from Store (including a method
originally loaded from a parcel but then changed by loading a different
version from Store).

Thus parcels are a deployment mechanism.  They load quickly and have no
CM aspect.  Store is the opposite.

Parcels also have partial loading:  quietly, don't load this method
extension of a class not in my image, but if that class appears in the
image later, load it then.  Store does not do this.  Again, this
reflects what you want when deploying - the customer doesn't use that
utility so don't load methods that let my parcel work with that utility
- as against developing - I want to change my pundle so I want to see
_all_ the code in my pundle that could be affected by my change.
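
The contrast between the two loaders can be sketched with a toy Python model. Nothing here is VisualWorks code: `ToyImage` and its methods are hypothetical, and an extension is reduced to a (class name, selector) pair.

```python
class ToyImage:
    """Hypothetical image holding class names and installed extensions."""

    def __init__(self, classes):
        self.classes = set(classes)
        self.methods = set()   # (class_name, selector) pairs installed
        self.deferred = []     # parcel extensions waiting for their class

    def load_parcel_extensions(self, extensions):
        """Parcel-style: quietly defer extensions whose class is absent."""
        for ext in extensions:
            if ext[0] in self.classes:
                self.methods.add(ext)
            else:
                self.deferred.append(ext)

    def load_store_extensions(self, extensions):
        """Store-style: nothing is deferred; report what cannot load."""
        unloadable = []
        for ext in extensions:
            if ext[0] in self.classes:
                self.methods.add(ext)
            else:
                unloadable.append(ext)
        return unloadable

    def add_class(self, name):
        """Add a class, then install any deferred extensions that target it."""
        self.classes.add(name)
        still_waiting = []
        for ext in self.deferred:
            if ext[0] == name:
                self.methods.add(ext)
            else:
                still_waiting.append(ext)
        self.deferred = still_waiting


extension = ("Stopwatch", "reportOn:")  # hypothetical class and selector

from_parcel = ToyImage({"Object"})
from_parcel.load_parcel_extensions([extension])  # no complaint, just deferred
from_parcel.add_class("Stopwatch")               # extension is installed now

from_store = ToyImage({"Object"})
print(from_store.load_store_extensions([extension]))  # reported as unloadable
```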

(Publishing binary to Store means writing the parcel to Store instead of
the source.  This is emphatically _not_ recommended in any normal
circumstance.)

>  
>
>>>from repository 1 into repo 2, without the Replicate Recursively checkbox checked, is all the necessary code replicated from that particular version so that everything except history is retrievable in repo 2 into a fresh image?
>>>
>>>      
>>>
>>Yes.  Replicate recursively means replicate with ancestors back to the latest prior ancestor that is already in the target database.  The replicator finds this latest ancestor (or that there is none, in which case it will replicate all ancestors).  If you replicate unrecursively, you get a free-standing copy with no ancestors.  This may be quicker but if, later on, you replicate the pundle's parent, the replicator will not fix-up the parent-child links post-hoc.  What was a parent and child in your source database will be two independent free-standing versions in your target database.
>>
>>  - Items unchanged between parent and child (e.g. class definitions or methods) will be cloned in the target database, not a single item pointed at by both versions as in the source database
>>
>>  - The merge tool will be less able to compute merges;  if, after replicating several versions unrecursively, you want to merge between two or more, the tool will leave more for your decision than if the parent-child links were all in place, as in the source database.
>>
>>So if you want to throw away the past and treat the version you are replicating as a free-standing source, use unrecursive.  If you want to preserve the history of your development, use recursive.  Remember, at the moment you can't fix this up afterwards;  you'd have to GC all the unrecursively-replicated versions in the target database and rerun a recursive replication to get the parent-child links into the target.
>>
>>Remark:  If you replicate a package version and then replicate its immediate child, it does not matter whether you are unrecursive or not.  Links are lost only when the parent is absent when the child is replicated unrecursively.  However that could fail for a bundle which had a subpundle whose parent was a child of the version in the bundle's parent, whereas recursion takes care of all cases.
>>
>>Carefully choosing to replicate a selected 'oldest version you need' unrecursively, then replicating later versions recursively, is one common strategy.
>>    
>>
>
>
>Seems like this is what we want. Basically we are wanting a minimal snapshot of our releases without all the cruft that has accumulated in our dev repository. Eventually being able to build a runtime package automatically from a fresh image is a goal.
>  
>
Replicating unrecursively the oldest trunk version whose yet-older prior
history does not interest you gets you a start point.  If you then
replicate recursively the latest trunk version, you'll get all its
ancestors back to that old trunk version.  You won't get any
side-branches:  if you want those as well, you'd need to replicate
recursively each leaf of any side-branch that interested you - and
avoid replicating any leaves of yet-older side-branches or you'll get
precisely those yet-older trunk ancestors you wanted to ignore.
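
To see why leaf selection matters, here is a small Python sketch that computes which versions a series of recursive replications would copy. It is an illustrative model only: `replicated_set` and the version names are hypothetical, and each version is assumed to have a single parent.

```python
def replicated_set(parents, already_in_target, leaves):
    """Return the versions copied when each leaf is replicated recursively.

    parents: dict mapping version name -> parent name (None for the root).
    already_in_target: versions present in the target beforehand.
    leaves: versions to replicate recursively, in order.
    """
    present = set(already_in_target)
    copied = []
    for leaf in leaves:
        chain = []
        v = leaf
        # Walk up until we reach a version already in the target (or the root).
        while v is not None and v not in present:
            chain.append(v)
            v = parents[v]
        for name in reversed(chain):  # oldest first
            present.add(name)
            copied.append(name)
    return copied


# Trunk 1 -> 2 -> 3 -> 4 -> 5, with side-branch 2a off 2 and 4a off 4.
parents = {"1": None, "2": "1", "3": "2", "4": "3", "5": "4",
           "2a": "2", "4a": "4"}
base = {"3"}  # version 3 was replicated unrecursively as the start point

print(replicated_set(parents, base, ["5", "4a"]))  # stays at or after 3
print(replicated_set(parents, base, ["2a"]))       # drags in 1 and 2 as well
```

In this model, the leaf of the older side-branch `2a` never meets version 3 on its way up, so the walk continues to the root and copies the old trunk versions.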

It is faster to copy what you want than to copy a pundle (i.e. all
versions) and then GC the unwanted ones - even if, for a complex pundle
with many wanted and many unwanted branches, that makes for some care in
leaf selection.

(One can of course dive under the UI and start writing highly specific
replication code but that is not recommended for the faint-hearted.)

>  
>
>>>I've been trying to do that, and I'm finding many unloadable definition errors showing up when loading from repo 2, apparently referencing a previous version to that which was replicated.
>>>
>>>      
>>>
>>At first glance, unloadable definitions usually suggest missing prereqs, not missing versions, to me;  your pundle contains a method extending a class that did not load because the prereq was not there, or a class that subclasses the missing class.  Your target database may of course be missing prereq pundles present in your source database.
>>    
>>
>
>You were correct. The package I was replicating was Jun which apparently needed AT Benchmarks which was not there. For some reason Jun loads ok from Parcel manager into a fresh image but not once it is replicated, or at least that is what it seems like.
>  
>
Loading from Store and loading from parcel can differ in what prereqs
are (a) sought and (b) accepted if found.

    - You can set a prereq to be loaded only when its requester is
loaded from Store or only when its requester is loaded from a parcel.

    - You can set a prereq to be sought only as a parcel or only as a
Store pundle.

    - Store prerequisite settings control whether the loader seeks a
prerequisite amongst the parcels first or in Store first.

This can relate to the point I made above about partial loading.  The
prereqs of Jun the parcel may be configured to accept the absence of
some other parcel whose classes it extends, using what I listed above.  
Jun the Store pundle cannot accept that:  if it extends a class, it must
prereq the item that loads that class or else hit a loading error.

>  
>
>>I've never experienced the replicator actually losing class or method definitions contained in a replicated package.  If it did, I'd expect a harder error than 'Unloadable definition' - a debugger popping up or similar.  If a package references a method or class definition id, I think the code assumes that must be there.
>>
>>           HTH
>>                 Niall Ross
>>
>>    
>>
>
>Regarding deleting pundles in a repository, it seems like the Garbage Collector is in need of some serious attention as far as speed is concerned. Right now it is so slow that it is almost unworkable and maybe better to delete the repo completely and rebuild. A pundle (Jun) that takes a few minutes to replicate, took around 3 1/2 hrs to delete. And yesterday evening I selected a few things to delete, and it has been running for 11 hrs now without finishing. From Process Monitor it looks alive. Maybe only selecting one item at a time for deletion would help? Some activity reports to the Transcript would be extremely reassuring. It does however send a message when or if it finally completes.
>  
>
The times you quote are slower than I see but do not astonish me
overly;  Store GC is _much_ slower than Store replication.

>Is the deletion any faster in 7.10?
>  
>
Not greatly in 7.10, but the point is being looked at for the current
cycle.  I had many deletions to do, so did work on the GC to speed
things up by ordering what was deleted first.  I am passing these to
Store's curator, who intends to review them, and to examine the area.

There are some areas of Store where speed is very important to users.  
By contrast, Store GC deletions tend either to relate to single very
recent (childless) versions, doable in acceptable time, or else to be
not urgent, so that those who, like me, have many to do can endure
batching them up and scheduling them for a succession of weekends when
we leave the GC running overnight.

It may be that most customers almost never use the Store GC.

                HTH
                   Niall Ross

>Running VW 7.9.1 on Mac OS X 10.8.5.
>
>Thx,
>Ken G. Brown
>
>
>
>  
>

