The .changes file should be bound to a single image


The .changes file should be bound to a single image

Max Leske
Hi,

Opening the same image twice works fine as long as no writes to the .changes file occur. When both images write to the .changes file, however, it will be broken for both because the offsets for the changes are wrong. This can lead to lost data and, predominantly, to invalid method source code, which is a pain with Monticello.

I suggest that we implement a kind of lock mechanism to ensure that only one image (the first one opened) can write to the .changes file.


I’ve opened an issue for Pharo here: https://pharo.fogbugz.com/f/cases/18635/The-changes-file-should-be-bound-to-a-single-image


Cheers,
Max
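
A minimal sketch of how stale offsets could produce this symptom, written in Python rather than Smalltalk so it can be run outside an image. The `Image` class, its cached `write_pos` and the chunk contents are hypothetical; nothing here shows that Squeak or Pharo actually cache the end-of-file position this way. It only shows what follows if two writers do.

```python
import os
import tempfile


class Image:
    """Toy stand-in for a running image that remembers where its sources live."""

    def __init__(self, path):
        self.path = path
        # Assumption: each image caches the end-of-file position seen at startup.
        self.write_pos = os.path.getsize(path)
        self.sources = {}  # selector -> (offset, length)

    def compile(self, selector, source):
        data = source.encode("utf-8")
        with open(self.path, "r+b") as f:
            f.seek(self.write_pos)  # stale if another image has written meanwhile
            f.write(data)
        self.sources[selector] = (self.write_pos, len(data))
        self.write_pos += len(data)

    def source_of(self, selector):
        offset, length = self.sources[selector]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(length).decode("utf-8", errors="replace")


def demo():
    fd, path = tempfile.mkstemp(suffix=".changes")
    os.write(fd, b"'previous chunks'!\n")
    os.close(fd)
    try:
        a, b = Image(path), Image(path)
        a.compile("foo", 'foo\n\t"a long comment"\n\t^ nil + 1')
        b.compile("foo", "foo\n\t^ nil - 1")  # lands on top of a's chunk
        return a.source_of("foo"), b.source_of("foo")
    finally:
        os.remove(path)


if __name__ == "__main__":
    for text in demo():
        print(repr(text))
```

The second image reads back its own source intact, while the first image reads the second image's text followed by the tail of its own longer chunk, including an unbalanced comment quote.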

Re: The .changes file should be bound to a single image

marcel.taeumel
Max Leske wrote
> Hi,
>
> Opening the same image twice works fine as long as no writes to the .changes file occur. When both images write to the .changes file however it will be broken for both because the offsets for the changes are wrong. This can lead to lost data and predominantly to invalid method source code, which is a pain with Monticello.
>
> I suggest that we implement a kind of lock mechanism to ensure that only one image (the first one opened) can write to the .changes file.
>
>
> I’ve opened an issue for Pharo here: https://pharo.fogbugz.com/f/cases/18635/The-changes-file-should-be-bound-to-a-single-image
>
>
> Cheers,
> Max

Hi Max,

If I have a running VM on Windows 10, I cannot open a second one on the same image and also get write access to it. A warning appears. So it never happens that two images write into the same changes file.

Best,
Marcel

Re: The .changes file should be bound to a single image

Tobias Pape

On 28.06.2016, at 14:23, marcel.taeumel <[hidden email]> wrote:

> Max Leske wrote
>> Hi,
>>
>> Opening the same image twice works fine as long as no writes to the
>> .changes file occur. When both images write to the .changes file however
>> it will be broken for both because the offsets for the changes are wrong.
>> This can lead to lost data and predominantly to invalid method source
>> code, which is a pain with Monticello.
>>
>> I suggest that we implement a kind of lock mechanism to ensure that only
>> one image (the first one opened) can write to the .changes file.
>>
>>
>> I’ve opened an issue for Pharo here:
>> https://pharo.fogbugz.com/f/cases/18635/The-changes-file-should-be-bound-to-a-single-image
>>
>>
>> Cheers,
>> Max
>
> Hi Max,
>
> if I have a running VM in Windows 10, I cannot open a second one on the same
> image and also get write access to it. A warning appears. So, it never
> happens that two images write into the same changes file.

On Mac and Linux this is a problem, however.

Best
        -Tobias

Re: The .changes file should be bound to a single image

timrowledge
Now don’t get me started... I’ve ranted about this off and on since ’96! Be happy that I have to leave now to take my Subaru in for service...



tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Useful random insult:- Out there where the buses don't run.




Re: The .changes file should be bound to a single image

Chris Muller-3
In reply to this post by Max Leske
I have several applications which launch multiple copies of the same
image for multicore processing.  The images do their work, commit it
to database, then exit themselves without saving.  It's a great
feature.

I know OSProcess, when combined with CommandShell, has a RemoteTask
which allows efficient forking of the image (via Linux copy-on-write
memory sharing) and so a solution like what happens in Windows is not
really good.

Instead of putting a pop-up in front of the user, perhaps one way to
solve the problem would be, upon image save, to simply go through
all the changes since the last save and re-flush them to the
.changes file.

That way, if someone does want to save the same image on top of
itself, at least whichever saved last "wins"....
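
A rough sketch of that "re-flush on save" idea, in Python rather than Smalltalk. The `Image` class and its `pending` journal are hypothetical. For brevity the sketch only writes at save time, whereas a real image also logs each change as it happens. It assumes that re-appending at the true end of file is enough to make the saved image's source pointers valid.

```python
import os
import tempfile


class Image:
    """Hypothetical image that journals source changes until the next save."""

    def __init__(self, path):
        self.path = path
        self.pending = {}   # selector -> source changed since the last save
        self.offsets = {}   # selector -> (offset, length) recorded at save time

    def compile(self, selector, source):
        self.pending[selector] = source

    def save(self):
        # Append mode always writes at the current end of file, so chunks
        # written by another image in the meantime are never overwritten.
        with open(self.path, "ab") as f:
            for selector, source in self.pending.items():
                data = source.encode("utf-8")
                offset = f.seek(0, os.SEEK_END)  # flushes, then reports real EOF
                f.write(data)
                self.offsets[selector] = (offset, len(data))
        self.pending.clear()

    def source_of(self, selector):
        offset, length = self.offsets[selector]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(length).decode("utf-8")


def demo():
    fd, path = tempfile.mkstemp(suffix=".changes")
    os.close(fd)
    try:
        a, b = Image(path), Image(path)
        a.compile("foo", "foo\n\t^ nil + 1")
        b.compile("foo", "foo\n\t^ nil - 1")
        b.save()
        a.save()  # saved last, so its chunks come last in the file
        return a.source_of("foo"), b.source_of("foo")
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Both images end up with readable source for their own version of the method. Which version "wins" is then decided by which image file was written last, not by the .changes file.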


On Tue, Jun 28, 2016 at 5:04 AM, Max Leske <[hidden email]> wrote:

> Hi,
>
> Opening the same image twice works fine as long as no writes to the .changes file occur. When both images write to the .changes file however it will be broken for both because the offsets for the changes are wrong. This can lead to lost data and predominantly to invalid method source code, which is a pain with Monticello.
>
> I suggest that we implement a kind of lock mechanism to ensure that only one image (the first one opened) can write to the .changes file.
>
>
> I’ve opened an issue for Pharo here: https://pharo.fogbugz.com/f/cases/18635/The-changes-file-should-be-bound-to-a-single-image
>
>
> Cheers,
> Max


Re: The .changes file should be bound to a single image

David T. Lewis
On Tue, Jun 28, 2016 at 04:47:00PM -0500, Chris Muller wrote:
>
> On Tue, Jun 28, 2016 at 5:04 AM, Max Leske <[hidden email]> wrote:
> > Hi,
> >
> > Opening the same image twice works fine as long as no writes to the .changes file occur. When both images write to the .changes file however it will be broken for both because the offsets for the changes are wrong. This can lead to lost data and predominantly to invalid method source code, which is a pain with Monticello.
> >
> > I suggest that we implement a kind of lock mechanism to ensure that only one image (the first one opened) can write to the .changes file.
> >

If the offsets are wrong in this scenario, it's a bug in the image. The
image is supposed to seek to the end of the changes file before writing
the next chunk. While this sounds horrible in theory, in practice it works
remarkably well, and I have been happily surprised at how reliable it
is after many years of using and abusing the feature. That is a very
good thing.

Adding a lock to prevent the scenario would be bad, because it would
surely break a number of other legitimate use cases.


> >
> > I’ve opened an issue for Pharo here: https://pharo.fogbugz.com/f/cases/18635/The-changes-file-should-be-bound-to-a-single-image
> >
>
>
> I have several applications which launch multiple copies of the same
> image for multicore processing.  The images do their work, commit it
> to database, then exit themselves without saving.  Its a great
> feature.

That is consistent with my experience. I remember expecting horrible
things to happen if I had two images sharing a changes file, but nothing
bad ever happened. It just works.

>
> I know OSProcess, when combined with CommandShell, has a RemoteTask
> which allows efficient forking of the image (via Linux copy-on-write
> memory sharing) and so a solution like what happens in Windows is not
> really good.

My assumption with RemoteTask was that someone doing complex or long-running
jobs would more or less know what they were doing, and would have the good
sense to stop writing to the changes file from a bunch of forked images.
But in actual practice, I have never seen a problem related to this.
It just works.

>
> Instead of putting a pop-up in front of the user, perhaps one way to
> solve the problem would be to, upon image save, simply goes through
> all the changes since the last save and re-flushes them to the
> .changes file.
>
> That way, if someone does want to save the same image on top of
> themself, at least it would be whichever saved last "wins"....
>

There must be a problem somewhere, otherwise Max would not be raising
the issue. So whatever combination of operating system and image is
having a problem, I would be inclined to fix that.

Windows cannot be a problem, because the operating system will not
permit you to open the changes file twice. The Unix/Linux systems that
I have used all work fine.

Max, which operating system/VM/image are you using? Is this on a Mac?

Dave
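
The "seek to the end before writing the next chunk" behaviour described above can be sketched as follows, in Python rather than Smalltalk. The helper name `append_chunk` and the chunk texts are made up for illustration. Two handles stay open on one file, the way two images would, and each asks the operating system for the real end of file before every write.

```python
import os
import tempfile


def append_chunk(handle, source):
    """Write one chunk at the real end of file and report where it landed."""
    data = source.encode("utf-8")
    offset = handle.seek(0, os.SEEK_END)  # ask the OS, do not trust a cached position
    handle.write(data)
    handle.flush()                        # make the bytes visible to other handles
    return offset, len(data)


def demo():
    fd, path = tempfile.mkstemp(suffix=".changes")
    os.close(fd)
    try:
        # Two handles on the same file, as with two images sharing a .changes file.
        with open(path, "r+b") as a, open(path, "r+b") as b:
            spans = [
                append_chunk(a, "foo\n\t^ nil + 1!\n"),
                append_chunk(b, "foo\n\t^ nil - 1!\n"),
                append_chunk(a, "bar\n\t^ 42!\n"),
            ]
        with open(path, "rb") as f:
            content = f.read()
        return spans, content
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

The writes in this sketch are strictly sequential. With truly concurrent writers there is still a window between the seek and the write; on POSIX systems, opening the file with `O_APPEND` asks the kernel to position each write at the end of file with no intervening modification.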




Re: [Pharo-dev] [squeak-dev] The .changes file should be bound to a single image

Max Leske

> On 29 Jun 2016, at 02:06, David T. Lewis <[hidden email]> wrote:
>
> On Tue, Jun 28, 2016 at 04:47:00PM -0500, Chris Muller wrote:
>>
>> On Tue, Jun 28, 2016 at 5:04 AM, Max Leske <[hidden email]> wrote:
>>> Hi,
>>>
>>> Opening the same image twice works fine as long as no writes to the .changes file occur. When both images write to the .changes file however it will be broken for both because the offsets for the changes are wrong. This can lead to lost data and predominantly to invalid method source code, which is a pain with Monticello.
>>>
>>> I suggest that we implement a kind of lock mechanism to ensure that only one image (the first one opened) can write to the .changes file.
>>>
>
> If the offsets are wrong in this scenario, it's a bug in the image. The
> image is supposed to seek to the end of the changes file before writing
> the next chunk. While this sounds horrible in theory, in practice it works
> remarkably well, and I have been happily surprised at how reliable it
> is after many years of using and abusing the feature. That is a very
> good thing.
>
> Adding a lock to prevent the scenario would be bad, because it would
> surely break a number of other legitimate use cases.
>
>
>>>
>>> I’ve opened an issue for Pharo here: https://pharo.fogbugz.com/f/cases/18635/The-changes-file-should-be-bound-to-a-single-image
>>>
>>
>>
>> I have several applications which launch multiple copies of the same
>> image for multicore processing.  The images do their work, commit it
>> to database, then exit themselves without saving.  Its a great
>> feature.

Doing work is not the problem. Modifying source code is the problem.

>
> That is consistent with my experience. I remember expecting horrible
> things to happen if I had two images sharing a changes file, but nothing
> bad ever happened. It just works.
>
>>
>> I know OSProcess, when combined with CommandShell, has a RemoteTask
>> which allows efficient forking of the image (via Linux copy-on-write
>> memory sharing) and so a solution like what happens in Windows is not
>> really good.
>
> My assumption with RemoteTask was that someone doing complex or long-running
> jobs would more or less know what they were doing, and would have the good
> sense to stop writing to the changes file from a bunch of forked images.
> But in actual practice, I have never seen a problem related to this.
> It just works.
>
>>
>> Instead of putting a pop-up in front of the user, perhaps one way to
>> solve the problem would be to, upon image save, simply goes through
>> all the changes since the last save and re-flushes them to the
>> .changes file.
>>
>> That way, if someone does want to save the same image on top of
>> themself, at least it would be whichever saved last "wins"....
>>
>
> There must be a problem somewhere, otherwise Max would not be raising
> the issue. So whatever combination of operating system and image is
> having a problem, I would be inclined fix that.

:) Thanks Dave!

>
> Windows cannot be a problem, because the operating system will not
> permit you to open the changes file twice. The Unix/Linux systems that
> I have used all work fine.
>
> Max, which operating system/VM/image are you using? Is this on a Mac?

Mac OS X 10.11.5,
Pharo 6 (60086)

>
> Dave
>
>

I actually didn’t open the issue for myself but because of a student who ran into this. I’ve been in the same situation before, but I’m an experienced user, while students at the research group sometimes spend just a couple of weeks with Pharo, and then such things are a real problem.

Interestingly, the issue is, as you already suggested, pretty hard to reproduce, i.e., modifying arbitrary methods in both images did not show the symptoms I was looking for.

Here’s a reproducible case (at least on my machine):

1. create a new method in both images:

foo
        ^ nil

2. Modify it in one image:

foo
"Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."
        ^ nil + 1

3. Modify it in the other image:

foo
        ^ nil - 1 isEmpty ifTrue: [ "blah" nil ]

In my case saving in step three produces a syntax error when the source is loaded from file again. I don’t really have a clue as to what the underlying issue is, but I suspect it may have to do with comments and a particular situation in which the position is not being correctly updated before or after writing.


I agree with Chris that locks may be problematic, it just seemed like the simplest obvious solution (although of course it gets complicated when an image crashes and doesn’t clean up the lock…).

Cheers,
Max
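
To make the crash complication concrete, here is a small sketch of a lock file that records its owner's process ID so that a stale lock can be detected. It is written in Python, it is POSIX-only, and `try_lock`, `pid_alive` and the lock file name are hypothetical rather than anything that exists in Squeak or Pharo. It also ignores PID reuse and the short window between creating the lock file and writing the PID into it.

```python
import os
import subprocess
import sys
import tempfile


def pid_alive(pid):
    """POSIX only: signal 0 checks that a process exists without signalling it."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def try_lock(lock_path):
    """Return True if we now own the lock, False if a live process holds it."""
    for _ in range(2):  # the second pass runs only after clearing a stale lock
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            try:
                with open(lock_path) as f:
                    owner = int(f.read().strip() or 0)
            except (FileNotFoundError, ValueError):
                owner = 0
            if owner and pid_alive(owner):
                return False
            try:
                os.remove(lock_path)  # owner is gone: treat the lock as stale
            except FileNotFoundError:
                pass
            continue
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))
        return True
    return False


def demo():
    with tempfile.TemporaryDirectory() as directory:
        lock = os.path.join(directory, "Example.changes.lock")
        first = try_lock(lock)    # first image gets the lock
        second = try_lock(lock)   # second attempt is refused: the owner is alive
        # Simulate a crash: the lock now names a process that has already exited.
        child = subprocess.Popen([sys.executable, "-c", "pass"])
        child.wait()
        with open(lock, "w") as f:
            f.write(str(child.pid))
        after_crash = try_lock(lock)  # the stale lock is cleared and re-acquired
        return first, second, after_crash


if __name__ == "__main__":
    print(demo())
```

Even with stale-lock detection, a lock like this would still get in the way of the deliberate multi-image setups mentioned earlier in the thread, which is the main objection raised against locking.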

>



Re: [Pharo-dev] [squeak-dev] The .changes file should be bound to a single image

David T. Lewis
Max,

Confirming on Linux and Squeak. See below.

On Wed, Jun 29, 2016 at 08:53:12AM +0200, Max Leske wrote:

> Here’s a reproducible case (at least on my machine):
>
> 1. create a new method in both images:
>
> foo
> ^ nil
>
> 2. Modify it in one image:
>
> foo
> "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."
> ^ nil + 1
>
> 3. Modify it in the other image:
>
> foo
> ^ nil - 1 isEmpty ifTrue: [ "blah" nil ]
>

Confirmed on Linux + Squeak.

I did your test above using #forkSqueak so that I had two identical images
sharing the same changes file. In each image, I saved the #foo method. At
that point, the changes file contained exactly what I would expect.

I then did a save and exit from the child image, followed by a save and exit
from the original image. I can see that the changes from the child image are
now overwriting the changes from the original parent image. Since the
parent image is the one that was saved last, its #foo method now has
corrupted source.

This is not a scenario that I have ever encountered, but I can see how
it might happen in a classroom setting.

I can't look into this further right now, but it seems possible that the
problem happens only when saving the image, in which case we could force
the changes file to seek to end of file before doing the save. But we'll
need to do some more testing to make sure that this is the only scenario
in which it happens.

Dave



> In my case saving in step three produces a syntax error when the source is loaded from file again. I don’t really have a clue as to what the underlying issue is, but I suspect it may have to do with comments and a particular situation in which the position is not being correctly updated before or after writing.
>
>
> I agree with Chris that locks may be problematic, it just seemed like the simplest obvious solution (although of course it gets complicated when an image crashes and doesn’t clean up the lock…).
>
> Cheers,
> Max
>
> >
>


Re: [Pharo-dev] [squeak-dev] The .changes file should be bound to a single image

Max Leske

On 29 Jun 2016, at 14:45, David T. Lewis <[hidden email]> wrote:

> Confirmed on Linux + Squeak.
>
> I did your test above using #forkSqueak so that I had two identical images
> sharing the same changes file. In each image, I saved the #foo method. At
> that point, the changes file contained exactly what I would expect.
>
> I then did a save and exit from the child image, followed by a save and exit
> from the original image. I can see that the changes from the child image are
> now overwriting the changes from the original parent image. Since the
> parent image is the one that was saved last, its #foo method now has
> corrupted source.
>
> This is not a scenario that I have ever encountered, but I can see how
> it might happen in a classroom setting.
>
> I can't look into this further right now, but it seems possible that the
> problem happens only when saving the image, in which case we could force
> the changes file to seek to end of file before doing the save. But we'll
> need to do some more testing to make sure that this is the only scenario
> in which it happens.
>
> Dave

Great, thanks.

In the scenario I described, I did not save either image. Of course, without saving, the problem will no longer exist once you start the image anew (the old pointers are still valid and new content will be written to the end). The problem does exhibit itself without saving, though.

Since this is not anything critical, don’t put too much effort into it. I’ll have time in a couple of weeks to look at it in detail and then, once we understand the problem, we can discuss possible solutions.

Cheers,
Max








Re: [Pharo-dev] [squeak-dev] The .changes file should be bound to a single image

Levente Uzonyi
This seems to be a missing #flush after changes are written to the file.
Without #flush both processes (unix) will maintain their own version of
the file in memory.

Levente
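
The buffering effect described here can be seen in miniature with any buffered stream. A minimal sketch in Python (not the Squeak or Pharo file primitives, whose buffering may differ): bytes written to a buffered handle stay in that process's memory, invisible to anyone else looking at the file, until the handle is flushed.

```python
import os
import tempfile


def demo():
    fd, path = tempfile.mkstemp(suffix=".changes")
    os.close(fd)
    try:
        with open(path, "wb") as writer:       # buffered by default
            writer.write(b"foo\n\t^ nil + 1!\n")
            size_before_flush = os.path.getsize(path)  # what another process would see
            writer.flush()
            size_after_flush = os.path.getsize(path)
        return size_before_flush, size_after_flush
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Until the flush, a second process that seeks to the end of the file would compute an end position that ignores the pending chunk, which fits the overwriting seen in the reproduction.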

On Wed, 29 Jun 2016, Max Leske wrote:

>
>       On 29 Jun 2016, at 14:45, David T. Lewis <[hidden email]> wrote:
>
> Max,
>
> Confirming on Linux and Squeak. See below.
>
> On Wed, Jun 29, 2016 at 08:53:12AM +0200, Max Leske wrote:
>
>             On 29 Jun 2016, at 02:06, David T. Lewis <[hidden email]> wrote:
>
>             On Tue, Jun 28, 2016 at 04:47:00PM -0500, Chris Muller wrote:
>
>                   On Tue, Jun 28, 2016 at 5:04 AM, Max Leske <[hidden email]> wrote:
>                         Hi,
>
>                         Opening the same image twice works fine as long as no writes to the .changes file occur.
>                         When both images write to the .changes file however it will be broken for both because the
>                         offsets for the changes are wrong. This can lead to lost data and predominantly to invalid
>                         method source code, which is a pain with Monticello.
>
>                         I suggest that we implement a kind of lock mechanism to ensure that only one image (the
>                         first one opened) can write to the .changes file.
>
>
>             If the offsets are wrong in this scenario, it's a bug in the image. The
>             image is supposed to seek to the end of the changes file before writing
>             the next chunk. While this sounds horrible in theory, in practice it works
>             remarkably well, and I have been happily surprised at how reliable it
>             is after many years of using and abusing the feature. That is a very
>             good thing.
>
>             Adding a lock to prevent the scenario would be bad, because it would
>             surely break a number of other legitimate use cases.
>
>
>
>                         I???ve opened an issue for Pharo here:
>                         https://pharo.fogbugz.com/f/cases/18635/The-changes-file-should-be-bound-to-a-single-image
>
>
>
>                   I have several applications which launch multiple copies of the same
>                   image for multicore processing.  The images do their work, commit it
>                   to database, then exit themselves without saving.  Its a great
>                   feature.
>
>
>       Doing work is not the problem. Modifying source code is the problem.
>
>
>             That is consistent with my experience. I remember expecting horrible
>             things to happen if I had two images sharing a changes file, but nothing
>             bad ever happened. It just works.
>
>
>                   I know OSProcess, when combined with CommandShell, has a RemoteTask
>                   which allows efficient forking of the image (via Linux copy-on-write
>                   memory sharing) and so a solution like what happens in Windows is not
>                   really good.
>
>
>             My assumption with RemoteTask was that someone doing complex or long-running
>             jobs would more or less know what they were doing, and would have the good
>             sense to stop writing to the changes file from a bunch of forked images.
>             But in actual practice, I have never seen a problem related to this.
>             It just works.
>
>
>                   Instead of putting a pop-up in front of the user, perhaps one way to
>                   solve the problem would be to, upon image save, simply goes through
>                   all the changes since the last save and re-flushes them to the
>                   .changes file.
>
>                   That way, if someone does want to save the same image on top of
>                   themself, at least it would be whichever saved last "wins"....
>
>
>             There must be a problem somewhere, otherwise Max would not be raising
>             the issue. So whatever combination of operating system and image is
>             having a problem, I would be inclined fix that.
>
>
>       :) Thanks Dave!
>
>
>             Windows cannot be a problem, because the operating system will not
>             permit you to open the changes file twice. The Unix/Linux systems that
>             I have used all work fine.
>
>             Max, which operating system/VM/image are you using? Is this on a Mac?
>
>
>       Mac OS X 10.11.5,
>       Pharo 6 (60086)
>
>
>             Dave
>
>
>
>       I actually didn???t open the issue for myself but because of a student who ran into this. I???ve been in the same situation
>       before but I???m an experienced user while students at the research group sometimes just spend a couple of weeks with Pharo and
>       then such things are a real problem.
>
>       Interestingly the issue is, as you already suggested, pretty hard to reproduce i.e., modifying arbitrary methods in both images
>       did not show the symptoms I was looking for.
>
>       Here???s a reproducible case (at least on my machine):
>
>       1. create a new method in both images:
>
>       foo
>       ^ nil
>
>       2. Modify it in one image:
>
>       foo
>       "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
>       enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor
>       in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident,
>       sunt in culpa qui officia deserunt mollit anim id est laborum."
>       ^ nil + 1
>
>       3. Modify it in the other image:
>
>       foo
>       ^ nil - 1 isEmpty ifTrue: [ "blah" nil ]
>
>
> Confirmed on Linux + Squeak.
>
> I did your test above using #forkSqueak so that I had two identical images
> sharing the same changes file. In each image, I saved the #foo method. At
> that point, the changes file contained exactly what I would expect.
>
> I then did a save and exit from the child image, followed by a save and exit
> from the original image. I can see that the changes from the child image are
> now overwriting the changes from the original parent image. Since the
> parent image is the one that was saved last, its #foo method now has
> corrupted source.
>
> This is not a scenario that I have ever encountered, but I can see how
> it might happen in a classroom setting.
>
> I can't look into this further right now, but it seems possible that the
> problem happens only when saving the image, in which case we could force
> the changes file to seek to end of file before doing the save. But we'll
> need to do some more testing to make sure that this is the only scenario
> in which it happens.
>
> Dave
>
>
> Great, thanks.
>
> In the scenario I described I did not save either image. Of course, without saving the problem will not exist as soon as you start the image anew
> (the old pointers are still valid and new content will be written to the end). The problem does exhibit itself without saving though.
>
> Since this is not anything critical, don’t put too much effort into it. I’ll have time in a couple of weeks to look at it in detail and then, once
> we understand the problem, we can discuss possible solutions.
>
> Cheers,
> Max
>
>
>
>
>             In my case saving in step three produces a syntax error when the source is loaded from file again. I don’t really have a
>             clue as to what the underlying issue is, but I suspect it may have to do with comments and a particular situation in which
>             the position is not being correctly updated before or after writing.
>
>
>             I agree with Chris that locks may be problematic, it just seemed like the simplest obvious solution (although of course it
>             gets complicated when an image crashes and doesn’t clean up the lock...).
>
>             Cheers,
>             Max
>
>
>
>


[Pharo-dev] The .changes file should be bound to a single image

Louis LaBrunda
Hi Levente,

Without having looked into this at all I think you are on to something with the missing #flush
and maybe even a #close is needed because jumping to the end of a file unclosed in another
process may not (probably does not) go to the end.

Lou

On Wed, 29 Jun 2016 16:08:50 +0200 (CEST), Levente Uzonyi <[hidden email]> wrote:

>This seems to be a missing #flush after changes are written to the file.
>Without #flush both processes (unix) will maintain their own version of
>the file in memory.
>
>Levente
--
Louis LaBrunda
Keystone Software Corp.
SkypeMe callto://PhotonDemon



Re: [Pharo-dev] [squeak-dev] The .changes file should be bound to a single image

timrowledge
In reply to this post by Levente Uzonyi

> On 29-06-2016, at 7:08 AM, Levente Uzonyi <[hidden email]> wrote:
>
> This seems to be a missing #flush after changes are written to the file.
> Without #flush both processes (unix) will maintain their own version of the file in memory.

Pretty much exactly what I was about to type. We just had part of this discussion wrt Scratch project files on the Pi - adding flush/sync etc.

In many cases letting an OS buffer the buffering of the buffer’s buffer buffer is tolerable - though insane, and wasteful, and a symptom of the lack of careful analysis that seems to pervade the world of software these days - because nothing goes horribly wrong in most cases. Everything eventually gets pushed to actual hardware, the system doesn’t crash, evaporate, get zapped by Zargon DeathRay(™) emissions, the power doesn’t get ripped out etc. Evidently, on a Pi in a classroom we can’t assign quite such a low probability to the Zargon problem.

However, the changes file is supposed to be a transaction log and as such I claim the data ought to hit hardware as soon as possible and in a way that as near as dammit guarantees correct results. So the mega-layer buffering is An Issue so far as I’m concerned.
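
To make the buffering point concrete, here is a minimal sketch in Python (not Squeak/Pharo code; the file suffix and chunk contents are made up for illustration). It assumes a POSIX-like system and shows that a small buffered write is invisible on disk until it is flushed, which is the window in which a second process would compute a stale end-of-file position.

```python
import os
import tempfile

# Hypothetical change record; the real chunk format is not modelled here.
CHUNK = b"foo\n\t^ nil\n"

def sizes_around_flush(chunk):
    """Return the on-disk size of a log file before and after flushing a write."""
    fd, path = tempfile.mkstemp(suffix=".changes")
    os.close(fd)
    try:
        with open(path, "ab") as log:
            log.write(chunk)                # stays in this process's buffer
            before = os.path.getsize(path)  # what another process would see now
            log.flush()                     # hand the bytes over to the OS
            os.fsync(log.fileno())          # ask the OS to push them to the device
            after = os.path.getsize(path)
        return before, after
    finally:
        os.remove(path)

before, after = sizes_around_flush(CHUNK)
print(before, after)
```

Flushing only narrows the window; it does not by itself stop two writers from choosing the same offset.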

We also, still, and for decades now, have the behaviour I consider stupid beyond all reason whereby a file write is done by
a) tell the file pointer to move to a certain location
b) think about it
c) oh, finally write some text to the file.
With the obvious possibility that the file pointer can be changed in b)
Then if you can open-for-write a file multiple times, how much confusion can that actually cause? What about a forked process with nominally the same file objects? Are we at all sure any OS properly deals with it? Are we sure that what is purportedly ‘proper’ makes any sense for our requirements?

The most obvious place where this is an issue is where two images are using the same changes file and think they’re appending. Image A seeks to the end of the file, ‘writes’ stuff. Image B near-simultaneously does the same. Eventually each process gets around to pushing data to hardware. Oops! And let’s not dwell too much on the problems possible if either process causes a truncation of the file. Oh, wait, I think we actually had a problem with that some years ago.
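
The seek-then-write hazard can be reproduced inside a single process with two file descriptors standing in for two images. This is only a sketch in Python, assuming a POSIX system; the function names and chunk contents are invented for the illustration. For contrast it also shows `O_APPEND`, a POSIX open flag that the thread itself does not propose, which makes the kernel position each write at the current end of file.

```python
import os
import tempfile

def seek_then_write(path, chunks):
    """Each writer seeks to the end first and only writes later (steps a and c)."""
    fds = [os.open(path, os.O_WRONLY) for _ in chunks]
    try:
        for fd in fds:                      # a) every writer finds "the end"
            os.lseek(fd, 0, os.SEEK_END)
        for fd, chunk in zip(fds, chunks):  # c) ...and writes where it stood
            os.write(fd, chunk)
    finally:
        for fd in fds:
            os.close(fd)

def append_mode_write(path, chunks):
    """Each writer opens with O_APPEND, so the kernel picks the offset per write."""
    fds = [os.open(path, os.O_WRONLY | os.O_APPEND) for _ in chunks]
    try:
        for fd, chunk in zip(fds, chunks):
            os.write(fd, chunk)
    finally:
        for fd in fds:
            os.close(fd)

def run(writer, chunks):
    """Run one strategy against a fresh temporary file and return its contents."""
    fd, path = tempfile.mkstemp(suffix=".changes")
    os.close(fd)
    try:
        writer(path, chunks)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)

CHUNKS = [b"foo ^ nil + 1!\n", b"foo ^ 2!\n"]
clobbered = run(seek_then_write, CHUNKS)    # second chunk overwrites the first
appended = run(append_mode_write, CHUNKS)   # both chunks survive intact
print(clobbered)
print(appended)
```

Two caveats: an image would still need to learn the offset at which its chunk actually landed in order to keep its source pointers valid, and the Linux open(2) manual page warns that `O_APPEND` may lead to corrupted files on NFS when several processes append at once, which matters for the shared-folder classroom case.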


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
"Daddy, what does FORMATTING DRIVE C mean?"




Re: [Pharo-dev] [squeak-dev] The .changes file should be bound to a single image

Eliot Miranda-2
Hi Tim,

On Wed, Jun 29, 2016 at 10:24 AM, tim Rowledge <[hidden email]> wrote:

> On 29-06-2016, at 7:08 AM, Levente Uzonyi <[hidden email]> wrote:
>
> This seems to be a missing #flush after changes are written to the file.
> Without #flush both processes (unix) will maintain their own version of the file in memory.

> The most obvious place where this is an issue is where two images are using the same changes file and think they’re appending. Image A seeks to the end of the file, ‘writes’ stuff. Image B near-simultaneously does the same. Eventually each process gets around to pushing data to hardware. Oops! And let’s not dwell too much on the problems possible if either process causes a truncation of the file. Oh, wait, I think we actually had a problem with that some years ago.

The thing is that this problem bites even if we have a unitary primitive that both positions and writes if that primitive is written above a substrate that, as unix and stdio streams do, separates positioning from writing.  The primitive is neat but it simply drives the problem further underground.

A more robust solution might be to position, write, reposition, read, and compare, shortening on corruption, and retrying, using exponential back-off like ethernet packet transmission.  Most of the time this adds only the overhead of reading what's written.  
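
A rough sketch of that position / write / re-read / compare loop, written in Python rather than as a VM primitive. The function name `append_verified`, the retry limit and the delay values are assumptions made for the illustration, and `os.pread` requires a POSIX system.

```python
import os
import random
import tempfile
import time

def append_verified(path, chunk, max_retries=5, base_delay=0.001):
    """Append chunk, read it back, and retry with back-off if it does not match.

    Returns the offset at which the chunk was verified. The retry limit is an
    assumed safeguard so that a persistent failure cannot loop forever.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        for attempt in range(max_retries):
            offset = os.lseek(fd, 0, os.SEEK_END)          # position
            os.write(fd, chunk)                            # write
            os.fsync(fd)                                   # push past the OS cache
            if os.pread(fd, len(chunk), offset) == chunk:  # reposition, read, compare
                return offset
            os.ftruncate(fd, offset)                       # shorten on corruption
            # Randomised exponential back-off before trying again.
            time.sleep(base_delay * (2 ** attempt) * random.random())
        raise OSError("chunk could not be verified after %d attempts" % max_retries)
    finally:
        os.close(fd)

tmp_fd, log_path = tempfile.mkstemp(suffix=".changes")
os.close(tmp_fd)
first_offset = append_verified(log_path, b"first!\n")
second_offset = append_verified(log_path, b"second!\n")
with open(log_path, "rb") as f:
    log_contents = f.read()
os.remove(log_path)
print(first_offset, second_offset, log_contents)
```

The check only proves the chunk was intact at the moment it was read back; a competing writer can still overwrite it afterwards, and truncating may discard that writer's data, so this reduces the problem rather than eliminating it.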

_,,,^..^,,,_
best, Eliot



Re: [Pharo-dev] [squeak-dev] The .changes file should be bound to a single image

timrowledge

> On 29-06-2016, at 10:35 AM, Eliot Miranda <[hidden email]> wrote:
>
{snip much rant}

> The most obvious place where this is an issue is where two images are using the same changes file and think they’re appending. Image A seeks to the end of the file, ‘writes’ stuff. Image B near-simultaneously does the same. Eventually each process gets around to pushing data to hardware. Oops! And let’s not dwell too much on the problems possible if either process causes a truncation of the file. Oh, wait, I think we actually had a problem with that some years ago.
>
> The thing is that this problem bites even if we have a unitary primitive that both positions and writes if that primitive is written above a substrate that, as unix and stdio streams do, separates positioning from writing.  The primitive is neat but it simply drives the problem further underground.


Oh absolutely - we only have real control over a small part of it. It would probably be worth making use of that where we can.

>
> A more robust solution might be to position, write, reposition, read, and compare, shortening on corruption, and retrying, using exponential back-off like ethernet packet transmission.  Most of the time this adds only the overhead of reading what's written.  

Yes, for anything we want reliable that’s probably a good way. A limit on the number of retries would probably be smart to stop infinite recursion. Imagine the fun of an error causing infinite retries of writing an error log about an infinite recursion. On an infinitely large Beowulf cluster!

It’s all yet another example of where software meeting reality leads to nightmares.


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
If it was easy, the hardware people would take care of it.




Re: [Pharo-dev] [squeak-dev] The .changes file should be bound to a single image

David T. Lewis
Let's not solve the wrong problem folks. I only looked at this for 10
minutes this morning, and I think (but I am not sure) that the issue
affects the case of saving the image, and that the normal writing of
changes is fine.

Max was running on Pharo, which may or may not be handling changes the
same way. I think he may be seeing a different problem from the one I
confirmed.

So a bit more testing and verification would be in order. I can't look at
it now though.

Dave





Re: [Pharo-dev] [squeak-dev] The .changes file should be bound to a single image

David T. Lewis
On Wed, Jun 29, 2016 at 02:00:19PM -0400, David T. Lewis wrote:
> Let's not solve the wrong problem folks. I only looked at this for 10
> minutes this morning, and I think (but I am not sure) that the issue
> affects the case of saving the image, and that the normal writing of
> changes is fine.

I am wrong.

I spent some more time with this, and it is clear that two images saving
chunks to the same changes file will result in corrupted change records
in the changes file. It is not just an issue related to the image save
as I suggested above.

In practice, this is not an issue that either Chris or I have noticed,
probably because we are not doing software development (saving method
changes) at the same time that we are running RemoteTask and similar.
But I can certainly see how it might be a problem if, for example, I
had a bunch of students running the same image from a network shared
folder.

Dave




Re: The .changes file should be bound to a single image

Ben Coman
In reply to this post by Max Leske
On Tue, Jun 28, 2016 at 6:04 PM, Max Leske <[hidden email]> wrote:

> Hi,
>
> Opening the same image twice works fine as long as no writes to the .changes file occur. When both images write to the .changes file however it will be broken for both because the offsets for the changes are wrong. This can lead to lost data and predominantly to invalid method source code, which is a pain with Monticello.
>
> I suggest that we implement a kind of lock mechanism to ensure that only one image (the first one opened) can write to the .changes file.
>
>
> I’ve opened an issue for Pharo here: https://pharo.fogbugz.com/f/cases/18635/The-changes-file-should-be-bound-to-a-single-image
>
>
> Cheers,
> Max

I just learnt something quite surprising that is probably important to
be aware of... "Locks given by fcntl are not associated with the
file-descriptor or open-file table entries. Instead, they are bound to
the process itself. For example, a process has multiple open file
descriptors for a particular file and gets a read/write lock using any
one of these descriptors. Now closing any of these file descriptors
will release the lock, the process holds on the file. The descriptor
that was used to acquire the lock in the first place might still be
open, but the process will loose its lock.  So, it does not require an
explicit unlock or a close ONLY on the descriptor that was used to
acquire the lock in fcntl call. Doing unlock or close on any of the
open file descriptors will release the lock owned by the process on
the particular file."

https://loonytek.com/2015/01/15/advisory-file-locking-differences-between-posix-and-bsd-locks/
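
The quoted behaviour can be observed with a small Python experiment. This is a sketch that assumes a Unix system where `os.fork` and POSIX record locks are available; the helper name `another_process_can_lock` is made up. Python's `fcntl.lockf` is a wrapper around the `fcntl()` locking calls, so it takes the kind of lock the quote describes.

```python
import fcntl
import os
import tempfile

def another_process_can_lock(path):
    """Fork a child and report whether it can take an exclusive lock on path."""
    pid = os.fork()
    if pid == 0:
        # Child: record locks are per process, so it competes with the parent.
        status = 1
        try:
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            status = 0
        except OSError:
            status = 1
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 0

tmp_fd, path = tempfile.mkstemp(suffix=".changes")
os.close(tmp_fd)

lock_fd = os.open(path, os.O_RDWR)
fcntl.lockf(lock_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)    # parent takes the lock
locked_out_before = not another_process_can_lock(path)

other_fd = os.open(path, os.O_RDONLY)                  # unrelated second descriptor
os.close(other_fd)                                     # closing it drops the lock
locked_out_after = not another_process_can_lock(path)

os.close(lock_fd)
os.remove(path)
print(locked_out_before, locked_out_after)
```

For a lock on the .changes file this means any code in the image that opens and closes the same file, for example to read a method's source, could silently release the lock.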

cheers -ben


Re: [Pharo-dev] [squeak-dev] The .changes file should be bound to a single image

Ben Coman
In reply to this post by David T. Lewis
On Thu, Jun 30, 2016 at 7:07 AM, David T. Lewis <[hidden email]> wrote:

> On Wed, Jun 29, 2016 at 02:00:19PM -0400, David T. Lewis wrote:
>> Let's not solve the wrong problem folks. I only looked at this for 10
>> minutes this morning, and I think (but I am not sure) that the issue
>> affects the case of saving the image, and that the normal writing of
>> changes is fine.
>
> I am wrong.
>
> I spent some more time with this, and it is clear that two images saving
> chunks to the same changes file will result in corrupted change records
> in the changes file. It is not just an issue related to the image save
> as I suggested above.
>
> In practice, this is not an issue that either Chris or I have noticed,
> probably because we are not doing software development (saving method
> changes) at the same time that we are running RemoteTask and similar.
> But I can certainly see how it might be a problem if, for example, I
> had a bunch of students running the same image from a network shared
> folder.

Maybe it's time to consider a fundamental change in how method sources
are referred to.
Taking inspiration from git... A content-addressable key-value file
store might solve concurrent access.  Each CompiledMethod's source gets
written to a file named for the hash of its contents, which is the only
reference the Image gets to a method's source.  Each such file would
*only* need to be written once and thereafter could be read
simultaneously by multiple Images.  Anyone on the network wanting to
store the same source would see the file already exists and have
nothing to do.
Having many individual files might imply abysmal performance, though.

Alternatively, something similar to Mercurial's revlog format [1] could be
used, with one file per class.
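
As a minimal sketch of the one-file-per-hash idea, in Python rather than Smalltalk: the class name `SourceStore`, the choice of SHA-256 and the directory layout are all assumptions for illustration, not an existing Squeak or Pharo API.

```python
import hashlib
import os
import tempfile

class SourceStore:
    """Write-once store: each source string lives in a file named by its hash."""

    def __init__(self, root):
        self.root = root

    def put(self, source):
        data = source.encode("utf-8")
        key = hashlib.sha256(data).hexdigest()   # assumed hash function
        path = os.path.join(self.root, key)
        if os.path.exists(path):
            return key                           # someone already stored it
        # Write to a private temporary file, then rename it into place so that
        # readers never observe a half-written entry.
        fd, tmp_path = tempfile.mkstemp(dir=self.root)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp_path, path)
        return key

    def get(self, key):
        with open(os.path.join(self.root, key), "rb") as f:
            return f.read().decode("utf-8")

root = tempfile.mkdtemp()
image_a = SourceStore(root)   # two "images" sharing one directory
image_b = SourceStore(root)
key_a = image_a.put("foo\n\t^ nil")
key_b = image_b.put("foo\n\t^ nil")
stored_files = os.listdir(root)
fetched = image_b.get(key_a)
print(key_a == key_b, len(stored_files))
```

Because a file's name is derived from its content, two writers racing to store the same source produce identical files, so it does not matter which rename wins.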

The thing about the Image *only* referring to a method's source by its
content hash is that it would seem to give great flexibility in backends
to locate/store that source.  Possibly...
* stored as individual files as above
* bundled in a zip file in random order
* a school could configure a database server in the Image provided to students
* hashes could be thrown at a service on the Internet
* cached locally with a key-value database like LMDB [2]
* remote replication to multiple internet backup locations
* in an emergency you could throw a bundle of hashes as a query to the
mail list and get an ad hoc response of individual files.
* Inter-Smalltalk image communication

Pharo has a stated goal to get rid of the changes file.  Changing to
content-hash-addressable method-source seems a logical step along
that road. Even if the Squeak community doesn't want to go so far as
eliminating the .changes file, can they see value in changing method
source references to be content-hashes rather than indexes into a
particular file?

[1] http://blog.prasoonshukla.com/mercurial-vs-git-scaling
[2] https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Database


Just having a poke at this, it seems a new form of
CompiledMethodTrailer may need to be defined, being invoked from
CompiledMethod>>sourceCode.  CompiledMethodTrailer>>sourceCode would
find the source code based on a content-hash held by the
CompiledMethod.  If found, the call to #getSourceFromFile that
accesses the .changes file will be bypassed, and could remain as a
backup.

cheers -ben

>
> Dave
>
>
>>
>> Max was running on Pharo, which may or may not be handling changes the
>> same way. I think he may be seeing a different problem from the one I
>> confirmed.
>>
>> So a bit more testing and verification would be in order. I can't look at
>> it now though.
>>
>> Dave
>>
>> >
>> >> On 29-06-2016, at 10:35 AM, Eliot Miranda <[hidden email]>
>> >> wrote:
>> >>
>> > {snip much rant}
>> >
>> >> The most obvious place where this is an issue is where two images are
>> >> using the same changes file and think they???re appending. Image A seeks
>> >> to the end of the file, ???writes??? stuff. Image B near-simultaneously
>> >> does the same. Eventually each process gets around to pushing data to
>> >> hardware. Oops! And let???s not dwell too much on the problems possible
>> >> if either process causes a truncation of the file. Oh, wait, I think we
>> >> actually had a problem with that some years ago.
>> >>
>> >> The thing is that this problem bites even if we have a unitary primitive
>> >> that both positions and writes if that primitive is written above a
>> >> substrate that, as unix and stdio streams do, separates positioning from
>> >> writing.  The primitive is neat but it simply drives the problem further
>> >> underground.
>> >
>> >
>> > Oh absolutely - we only have real control over a small part of it. It
>> > would probably be worth making use of that where we can.
>> >
>> >>
>> >> A more robust solution might be to position, write, reposition, read,
>> >> and compare, shortening on corruption, and retrying, using exponential
>> >> back-off like ethernet packet transmission.  Most of the time this adds
>> >> only the overhead of reading what's written.
>> >
>> > Yes, for anything we want reliable that???s probably a good way. A limit
>> > on the number of retries would probably be smart to stop infinite
>> > recursion. Imagine the fun of an error causing infinite retries of writing
>> > an error log about an infinite recursion. On an infinitely large Beowulf
>> > cluster!
>> >
>> > It???s all yet another example of where software meeting reality leads to
>> > nightmares.
>> >
>> >
>> > tim
>> > --
>> > tim Rowledge; [hidden email]; http://www.rowledge.org/tim
>> > If it was easy, the hardware people would take care of it.
>> >
>> >
>> >
>>
>
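
As an illustration of the position, write, reposition, read and compare loop proposed in the quoted text, with the retry limit tim asks for, here is a minimal Python sketch. The name `append_verified` and the parameters `max_retries` and `base_delay` are invented for this example and are not part of any Squeak or Pharo API; the random jitter on the delay is an added assumption. It does not make the append atomic: it only detects a clobbered write after the fact.

```python
import os
import random
import time


def append_verified(path, payload, max_retries=5, base_delay=0.01):
    """Append payload to an existing file and verify it by reading it back.

    Returns the offset at which payload was written and verified.
    Raises OSError once max_retries attempts have failed.
    """
    for attempt in range(max_retries):
        with open(path, "r+b") as f:
            offset = f.seek(0, os.SEEK_END)      # position at the end
            f.write(payload)                     # write
            f.flush()
            os.fsync(f.fileno())                 # push the data towards the disk
            f.seek(offset)                       # reposition
            if f.read(len(payload)) == payload:  # read and compare
                return offset
            f.truncate(offset)                   # shorten on corruption
        # Exponential back-off with jitter (hypothetical tuning values).
        time.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))
    raise OSError(f"could not verify append to {path!r} after {max_retries} attempts")
```

The bounded `for` loop is the retry limit: when it is exhausted the function raises once instead of retrying forever.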


Re: [Pharo-dev] [squeak-dev] The .changes file should be bound to a single image

Max Leske
In reply to this post by Ben Coman

> On 30 Jun 2016, at 05:09, Ben Coman <[hidden email]> wrote:
>
> On Tue, Jun 28, 2016 at 6:04 PM, Max Leske <[hidden email]> wrote:
>> Hi,
>>
>> Opening the same image twice works fine as long as no writes to the .changes file occur. When both images write to the .changes file however it will be broken for both because the offsets for the changes are wrong. This can lead to lost data and predominantly to invalid method source code, which is a pain with Monticello.
>>
>> I suggest that we implement a kind of lock mechanism to ensure that only one image (the first one opened) can write to the .changes file.
>>
>>
>> I’ve opened an issue for Pharo here: https://pharo.fogbugz.com/f/cases/18635/The-changes-file-should-be-bound-to-a-single-image
>>
>>
>> Cheers,
>> Max
>
> I just learnt something quite surprising that is probably important to
> be aware of... "Locks given by fcntl are not associated with the
> file-descriptor or open-file table entries. Instead, they are bound to
> the process itself. For example, a process has multiple open file
> descriptors for a particular file and gets a read/write lock using any
> one of these descriptors. Now closing any of these file descriptors
> will release the lock, the process holds on the file. The descriptor
> that was used to acquire the lock in the first place might still be
> open, but the process will lose its lock.  So, it does not require an
> explicit unlock or a close ONLY on the descriptor that was used to
> acquire the lock in fcntl call. Doing unlock or close on any of the
> open file descriptors will release the lock owned by the process on
> the particular file."
>
> https://loonytek.com/2015/01/15/advisory-file-locking-differences-between-posix-and-bsd-locks/
>
> cheers -ben
>

Which would solve the problem of a crashed image not cleaning up its lock. Thanks for sharing Ben.
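
A minimal Python sketch of the behaviour described in the quoted passage, assuming a POSIX system (Linux or OS X) where Python's `fcntl.lockf` is built on `fcntl` record locks. The helper names are made up for illustration. A forked child plays the role of the "other image", since fcntl locks are not inherited across `fork`.

```python
import fcntl
import os


def other_process_can_lock(path):
    """Fork a child and report whether it can take an exclusive lock on path."""
    pid = os.fork()
    if pid == 0:  # child: starts without the parent's fcntl locks
        status = 1
        try:
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            status = 0  # lock acquired, so nobody else was holding it
        except OSError:
            status = 1  # lock is held by another process
        finally:
            os._exit(status)  # exiting also releases the child's lock
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 0


def closing_other_descriptor_drops_lock(path):
    """Return (held_before, held_after) around closing an unrelated descriptor."""
    fd1 = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    fcntl.lockf(fd1, fcntl.LOCK_EX | fcntl.LOCK_NB)  # lock taken via fd1
    held_before = not other_process_can_lock(path)
    fd2 = os.open(path, os.O_RDONLY)  # second descriptor on the same file
    os.close(fd2)                     # fd1 is still open at this point
    held_after = not other_process_can_lock(path)
    os.close(fd1)
    return held_before, held_after
```

With POSIX semantics one would expect the lock to be held before the second descriptor is closed and gone afterwards, even though `fd1` was never closed or unlocked. The same process-bound behaviour is why a lock held by a crashed image disappears when its process exits.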

Re: [Pharo-dev] [squeak-dev] The .changes file should be bound to a single image

David T. Lewis
On Thu, Jun 30, 2016 at 09:59:37AM +0200, Max Leske wrote:

>
> > On 30 Jun 2016, at 05:09, Ben Coman <[hidden email]> wrote:
> >
> > On Tue, Jun 28, 2016 at 6:04 PM, Max Leske <[hidden email]> wrote:
> >> Hi,
> >>
> >> Opening the same image twice works fine as long as no writes to the .changes file occur. When both images write to the .changes file however it will be broken for both because the offsets for the changes are wrong. This can lead to lost data and predominantly to invalid method source code, which is a pain with Monticello.
> >>
> >> I suggest that we implement a kind of lock mechanism to ensure that only one image (the first one opened) can write to the .changes file.
> >>
> >>
> >> I’ve opened an issue for Pharo here: https://pharo.fogbugz.com/f/cases/18635/The-changes-file-should-be-bound-to-a-single-image
> >>
> >>
> >> Cheers,
> >> Max
> >
> > I just learnt something quite surprising that is probably important to
> > be aware of... "Locks given by fcntl are not associated with the
> > file-descriptor or open-file table entries. Instead, they are bound to
> > the process itself. For example, a process has multiple open file
> > descriptors for a particular file and gets a read/write lock using any
> > one of these descriptors. Now closing any of these file descriptors
> > will release the lock, the process holds on the file. The descriptor
> > that was used to acquire the lock in the first place might still be
> > > open, but the process will lose its lock.  So, it does not require an
> > explicit unlock or a close ONLY on the descriptor that was used to
> > acquire the lock in fcntl call. Doing unlock or close on any of the
> > open file descriptors will release the lock owned by the process on
> > the particular file."
> >
> > https://loonytek.com/2015/01/15/advisory-file-locking-differences-between-posix-and-bsd-locks/
> >
> > cheers -ben
> >
>
> Which would solve the problem of a crashed image not cleaning up its lock. Thanks for sharing Ben.

FYI, file locking for Unix/Linux/OS X is supported in OSProcess, see UnixProcessFileLockTestCase
and the 'file locking' tests in UnixProcessAccessorTestCase.
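
For readers who want to see the "first image opened gets write access" idea in isolation, here is a rough sketch in Python rather than in terms of the OSProcess API, whose selectors are not shown in this thread. The function `open_changes` is hypothetical, and the sketch assumes a POSIX system with advisory `fcntl` locks, so it only protects against processes that also ask for the lock.

```python
import fcntl
import os


def open_changes(path):
    """Open path read-write if the write lock is free, else read-only.

    Returns (file_object, writable).
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Non-blocking exclusive lock: fails at once if another process holds it.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return open(path, "rb"), False  # someone else owns the file: read-only
    # Keep this descriptor open for as long as write access is needed;
    # closing any descriptor on this file in this process drops the lock.
    return os.fdopen(fd, "r+b"), True
```

A second process calling `open_changes` on the same path while the first still holds its file open would get `writable == False` and could warn the user instead of appending.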

Dave

