Hi,
Opening the same image twice works fine as long as no writes to the .changes file occur. When both images write to the .changes file, however, it will be broken for both because the offsets for the changes are wrong. This can lead to lost data and, predominantly, to invalid method source code, which is a pain with Monticello.

I suggest that we implement a kind of lock mechanism to ensure that only one image (the first one opened) can write to the .changes file.

I’ve opened an issue for Pharo here: https://pharo.fogbugz.com/f/cases/18635/The-changes-file-should-be-bound-to-a-single-image

Cheers,
Max |
Hi Max, if I have a running VM in Windows 10, I cannot open a second one on the same image and also get write access to it. A warning appears. So, it never happens that two images write into the same changes file. Best, Marcel |
On 28.06.2016, at 14:23, marcel.taeumel <[hidden email]> wrote:

> Hi Max,
>
> if I have a running VM in Windows 10, I cannot open a second one on the same
> image and also get write access to it. A warning appears. So, it never
> happens that two images write into the same changes file.

On Mac and Linux this is a problem, however.

Best
-Tobias |
Now don’t get me started.. I’ve ranted about this off and on since ’96! Be happy that I have to leave now to take my subaru in for service...
tim -- tim Rowledge; [hidden email]; http://www.rowledge.org/tim Useful random insult:- Out there where the buses don't run. |
In reply to this post by Max Leske
I have several applications which launch multiple copies of the same image for multicore processing. The images do their work, commit it to the database, then exit themselves without saving. It's a great feature.

I know OSProcess, when combined with CommandShell, has a RemoteTask which allows efficient forking of the image (via Linux copy-on-write memory sharing), so a solution like what happens in Windows is not really good.

Instead of putting a pop-up in front of the user, perhaps one way to solve the problem would be, upon image save, to simply go through all the changes since the last save and re-flush them to the .changes file.

That way, if someone does want to save the same image on top of itself, at least whichever saved last "wins"....

On Tue, Jun 28, 2016 at 5:04 AM, Max Leske <[hidden email]> wrote:
> Hi,
>
> Opening the same image twice works fine as long as no writes to the .changes file occur. When both images write to the .changes file however it will be broken for both because the offsets for the changes are wrong. This can lead to lost data and predominantly to invalid method source code, which is a pain with Monticello.
>
> I suggest that we implement a kind of lock mechanism to ensure that only one image (the first one opened) can write to the .changes file.
>
> I’ve opened an issue for Pharo here: https://pharo.fogbugz.com/f/cases/18635/The-changes-file-should-be-bound-to-a-single-image
>
> Cheers,
> Max |
On Tue, Jun 28, 2016 at 04:47:00PM -0500, Chris Muller wrote:
>
> On Tue, Jun 28, 2016 at 5:04 AM, Max Leske <[hidden email]> wrote:
> > Hi,
> >
> > Opening the same image twice works fine as long as no writes to the .changes file occur. When both images write to the .changes file however it will be broken for both because the offsets for the changes are wrong. This can lead to lost data and predominantly to invalid method source code, which is a pain with Monticello.
> >
> > I suggest that we implement a kind of lock mechanism to ensure that only one image (the first one opened) can write to the .changes file.
> >

If the offsets are wrong in this scenario, it's a bug in the image. The image is supposed to seek to the end of the changes file before writing the next chunk. While this sounds horrible in theory, in practice it works remarkably well, and I have been happily surprised at how reliable it is after many years of using and abusing the feature. That is a very good thing.

Adding a lock to prevent the scenario would be bad, because it would surely break a number of other legitimate use cases.

> > I’ve opened an issue for Pharo here: https://pharo.fogbugz.com/f/cases/18635/The-changes-file-should-be-bound-to-a-single-image
> >
>
> I have several applications which launch multiple copies of the same
> image for multicore processing. The images do their work, commit it
> to database, then exit themselves without saving. Its a great
> feature.

That is consistent with my experience. I remember expecting horrible things to happen if I had two images sharing a changes file, but nothing bad ever happened. It just works.

>
> I know OSProcess, when combined with CommandShell, has a RemoteTask
> which allows efficient forking of the image (via Linux copy-on-write
> memory sharing) and so a solution like what happens in Windows is not
> really good.

My assumption with RemoteTask was that someone doing complex or long-running jobs would more or less know what they were doing, and would have the good sense to stop writing to the changes file from a bunch of forked images. But in actual practice, I have never seen a problem related to this. It just works.

>
> Instead of putting a pop-up in front of the user, perhaps one way to
> solve the problem would be to, upon image save, simply goes through
> all the changes since the last save and re-flushes them to the
> .changes file.
>
> That way, if someone does want to save the same image on top of
> themself, at least it would be whichever saved last "wins"....
>

There must be a problem somewhere, otherwise Max would not be raising the issue. So whatever combination of operating system and image is having a problem, I would be inclined to fix that.

Windows cannot be a problem, because the operating system will not permit you to open the changes file twice. The Unix/Linux systems that I have used all work fine.

Max, which operating system/VM/image are you using? Is this on a Mac?

Dave |
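The difference between "seek to the end before writing" and writing at an offset remembered earlier can be sketched outside Smalltalk. What follows is a minimal Python sketch, not the image's actual code: two file descriptors stand in for two images, and the function names, the `demo.changes` file name and the chunk contents are all made up for illustration. It assumes a POSIX system, since it uses `os.pwrite`. When both writers use the same remembered end-of-file offset, the shorter chunk lands on top of the longer one and leaves its tail behind; when each writer seeks to the current end first, both chunks survive.

```python
import os
import tempfile


def write_at_remembered_offset(fd, offset, chunk):
    """Write chunk at an offset remembered earlier (a hypothetical stale pointer)."""
    os.pwrite(fd, chunk, offset)


def append_after_seek(fd, chunk):
    """Seek to the current end of the file, then write the chunk there."""
    os.lseek(fd, 0, os.SEEK_END)
    os.write(fd, chunk)


def run_two_writers(write_a, write_b):
    """Create a fresh file, let two descriptors write to it, return its contents."""
    path = os.path.join(tempfile.mkdtemp(), "demo.changes")
    with open(path, "wb") as f:
        f.write(b"base!")
    fd_a = os.open(path, os.O_RDWR)
    fd_b = os.open(path, os.O_RDWR)
    try:
        write_a(fd_a)
        write_b(fd_b)
    finally:
        os.close(fd_a)
        os.close(fd_b)
    with open(path, "rb") as f:
        return f.read()


def demo():
    end = len(b"base!")  # both "images" remember the same end of file
    stale = run_two_writers(
        lambda fd: write_at_remembered_offset(fd, end, b"AAAAAAAA!"),
        lambda fd: write_at_remembered_offset(fd, end, b"BB!"),
    )
    fresh = run_two_writers(
        lambda fd: append_after_seek(fd, b"AAAAAAAA!"),
        lambda fd: append_after_seek(fd, b"BB!"),
    )
    return stale, fresh


if __name__ == "__main__":
    stale, fresh = demo()
    print(stale)  # b'base!BB!AAAAA!'
    print(fresh)  # b'base!AAAAAAAA!BB!'
```

Seeking first only narrows the window: another process can still write between the seek and the write. Whether stale offsets are what actually happens in the image is not established by this sketch.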
> On 29 Jun 2016, at 02:06, David T. Lewis <[hidden email]> wrote:
>
> On Tue, Jun 28, 2016 at 04:47:00PM -0500, Chris Muller wrote:
>> I have several applications which launch multiple copies of the same
>> image for multicore processing. The images do their work, commit it
>> to database, then exit themselves without saving. Its a great
>> feature.

Doing work is not the problem. Modifying source code is the problem.

> There must be a problem somewhere, otherwise Max would not be raising
> the issue. So whatever combination of operating system and image is
> having a problem, I would be inclined fix that.

:) Thanks Dave!

> Windows cannot be a problem, because the operating system will not
> permit you to open the changes file twice. The Unix/Linux systems that
> I have used all work fine.
>
> Max, which operating system/VM/image are you using? Is this on a Mac?

Mac OS X 10.11.5,
Pharo 6 (60086)

I actually didn’t open the issue for myself but because of a student who ran into this. I’ve been in the same situation before, but I’m an experienced user, while students at the research group sometimes just spend a couple of weeks with Pharo, and then such things are a real problem.

Interestingly, the issue is, as you already suggested, pretty hard to reproduce, i.e., modifying arbitrary methods in both images did not show the symptoms I was looking for.

Here’s a reproducible case (at least on my machine):

1. Create a new method in both images:

    foo
        ^ nil

2. Modify it in one image:

    foo
        "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."
        ^ nil + 1

3. Modify it in the other image:

    foo
        ^ nil - 1 isEmpty ifTrue: [ "blah" nil ]

In my case saving in step three produces a syntax error when the source is loaded from file again. I don’t really have a clue as to what the underlying issue is, but I suspect it may have to do with comments and a particular situation in which the position is not being correctly updated before or after writing.

I agree with Chris that locks may be problematic; it just seemed like the simplest obvious solution (although of course it gets complicated when an image crashes and doesn’t clean up the lock…).

Cheers,
Max |
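On the worry about a crashed image leaving a lock behind: a lock file that has to be deleted can indeed go stale, but an advisory lock held on an open file descriptor is dropped by the kernel when the descriptor is closed, which includes process exit. The following is a minimal Python sketch of that idea on a POSIX system using `fcntl.flock`; it is not VM or image code, `try_lock_for_writing` and `demo.changes` are hypothetical names, and the thread does not establish whether the VM could take such a lock.

```python
import fcntl
import os
import tempfile


def try_lock_for_writing(path):
    """Return a descriptor holding an exclusive advisory lock, or None if it is taken."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes the call fail immediately instead of waiting for the holder.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def demo():
    path = os.path.join(tempfile.mkdtemp(), "demo.changes")
    first = try_lock_for_writing(path)    # first opener gets the lock
    second = try_lock_for_writing(path)   # second opener is refused
    os.close(first)                       # closing the descriptor releases the lock
    third = try_lock_for_writing(path)    # the lock is free again
    got_third = third is not None
    if third is not None:
        os.close(third)
    return first is not None, second is None, got_third
```

One caveat for the forking use case mentioned earlier in the thread: a forked child inherits the descriptor and shares the lock with its parent, so a lock of this kind would not by itself keep forked images from writing.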
Max,
Confirming on Linux and Squeak. See below.

On Wed, Jun 29, 2016 at 08:53:12AM +0200, Max Leske wrote:
>
> Here’s a reproducible case (at least on my machine):
>
> 1. create a new method in both images:
>
> foo
>     ^ nil
>
> 2. Modify it in one image:
>
> foo
>     "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."
>     ^ nil + 1
>
> 3. Modify it in the other image:
>
> foo
>     ^ nil - 1 isEmpty ifTrue: [ "blah" nil ]
>

Confirmed on Linux + Squeak.

I did your test above using #forkSqueak so that I had two identical images sharing the same changes file. In each image, I saved the #foo method. At that point, the changes file contained exactly what I would expect.

I then did a save and exit from the child image, followed by a save and exit from the original image. I can see that the changes from the child image are now overwriting the changes from the original parent image. Since the parent image is the one that was saved last, its #foo method now has corrupted source.

This is not a scenario that I have ever encountered, but I can see how it might happen in a classroom setting.

I can't look into this further right now, but it seems possible that the problem happens only when saving the image, in which case we could force the changes file to seek to end of file before doing the save. But we'll need to do some more testing to make sure that this is the only scenario in which it happens.

Dave |
Great, thanks.

In the scenario I described I did not save either image. Of course, without saving the problem will not exist as soon as you start the image anew (the old pointers are still valid and new content will be written to the end). The problem does exhibit itself without saving though.

Since this is not anything critical, don’t put too much effort into it. I’ll have time in a couple of weeks to look at it in detail and then, once we understand the problem, we can discuss possible solutions.

Cheers,
Max
|
This seems to be a missing #flush after changes are written to the file.
Without #flush both processes (unix) will maintain their own version of the file in memory.

Levente

On Wed, 29 Jun 2016, Max Leske wrote:
> In the scenario I described I did not save either image. Of course, without saving the problem will not exist as soon as you start the image anew
> (the old pointers are still valid and new content will be written to the end). The problem does exhibit itself without saving though. |
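The effect of a missing flush can be shown with any buffered stream. Below is a minimal Python sketch in which Python's buffered file object stands in for the image's file stream. That the image's stream buffers in a comparable way is an assumption taken from the remark above, and the function and file names are made up. It writes a small chunk and checks the file size that any other process would see, before and after flushing.

```python
import os
import tempfile


def size_seen_by_others_before_and_after_flush():
    path = os.path.join(tempfile.mkdtemp(), "demo.changes")
    open(path, "wb").close()              # start with an empty file
    writer = open(path, "ab")             # buffered append-mode stream
    writer.write(b"foo ^ nil!")           # 10 bytes, held in the stream's own buffer
    before = os.path.getsize(path)        # size visible to everyone else right now
    writer.flush()                        # hand the buffered bytes to the operating system
    after = os.path.getsize(path)
    writer.close()
    return before, after


if __name__ == "__main__":
    print(size_seen_by_others_before_and_after_flush())  # (0, 10)
```

Until the flush, a second process that seeks to the end of the file finds the old end. Note that a flush only hands the data to the operating system; forcing it onto the disk is a separate step (`os.fsync` in Python).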
Hi Levente,
Without having looked into this at all, I think you are on to something with the missing #flush, and maybe even a #close is needed, because jumping to the end of a file that is unclosed in another process may not (probably does not) go to the end.

Lou

On Wed, 29 Jun 2016 16:08:50 +0200 (CEST), Levente Uzonyi <[hidden email]> wrote:
>This seems to be a missing #flush after changes are written to the file.
>Without #flush both processes (unix) will maintain their own version of
>the file in memory.
>
>Levente

Louis LaBrunda
Keystone Software Corp.
SkypeMe callto://PhotonDemon |
In reply to this post by Levente Uzonyi
> On 29-06-2016, at 7:08 AM, Levente Uzonyi <[hidden email]> wrote:
>
> This seems to be a missing #flush after changes are written to the file.
> Without #flush both processes (unix) will maintain their own version of the file in memory.

Pretty much exactly what I was about to type. We just had part of this discussion wrt Scratch project files on the Pi - adding flush/sync etc.

In many cases letting an OS buffer the buffering of the buffer’s buffer buffer is tolerable - though insane, and wasteful, and a symptom of the lack of careful analysis that seems to pervade the world of software these days - because nothing goes horribly wrong in most cases. Everything eventually gets pushed to actual hardware, the system doesn’t crash, evaporate, get zapped by Zargon DeathRay(™) emissions, the power doesn’t get ripped out etc. Evidently, on a Pi in a classroom we can’t assign quite such a low probability to the Zargon problem.

However, the changes file is supposed to be a transaction log and as such I claim the data ought to hit hardware as soon as possible and in a way that as near as dammit guarantees correct results. So the mega-layer buffering is An Issue so far as I’m concerned.

We also, still, and for decades now, have the behaviour I consider stupid beyond all reason whereby a file write is done by a) tell the file pointer to move to a certain location b) think about it c) oh, finally write some text to the file. With the obvious possibility that the file pointer can be changed in b).

Then if you can open-for-write a file multiple times, how much confusion can that actually cause? What about a forked process with nominally the same file objects? Are we at all sure any OS properly deals with it? Are we sure that what is purportedly ‘proper’ makes any sense for our requirements?

The most obvious place where this is an issue is where two images are using the same changes file and think they’re appending. Image A seeks to the end of the file, ‘writes’ stuff. Image B near-simultaneously does the same. Eventually each process gets around to pushing data to hardware. Oops! And let’s not dwell too much on the problems possible if either process causes a truncation of the file. Oh, wait, I think we actually had a problem with that some years ago.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
"Daddy, what does FORMATTING DRIVE C mean?" |
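The seek-then-write race described above can be reproduced deterministically in a few lines. This is a minimal Python sketch, not Squeak or Pharo code: the two file objects stand in for two images sharing one changes file, and the function names and chunk contents are made up for illustration. It contrasts writers that each seek to the end and then write with writers that open the file in append mode (O_APPEND on POSIX), where the OS positions every write at the current end of file.

```python
import os
import tempfile


def write_with_stale_offsets(path):
    """Two writers both seek to the end first, then write one after the other."""
    a = open(path, "r+b", buffering=0)  # stands in for image A
    b = open(path, "r+b", buffering=0)  # stands in for image B
    a.seek(0, os.SEEK_END)
    b.seek(0, os.SEEK_END)  # B now holds the same offset as A
    a.write(b"chunk from image A!\n")
    b.write(b"chunk B\n")  # written at B's stale offset, on top of A's chunk
    a.close()
    b.close()
    with open(path, "rb") as f:
        return f.read()


def write_in_append_mode(path):
    """Same two writers, but each write is moved to the current end of file."""
    a = open(path, "ab", buffering=0)
    b = open(path, "ab", buffering=0)
    a.write(b"chunk from image A!\n")
    b.write(b"chunk B\n")
    a.close()
    b.close()
    with open(path, "rb") as f:
        return f.read()


def make_changes_file():
    """Create a small temporary file that plays the role of the .changes file."""
    fd, path = tempfile.mkstemp(suffix=".changes")
    with os.fdopen(fd, "wb") as f:
        f.write(b"header\n")
    return path


if __name__ == "__main__":
    for writer in (write_with_stale_offsets, write_in_append_mode):
        path = make_changes_file()
        print(writer.__name__, writer(path))
        os.remove(path)
```

In the first variant A's chunk is partly overwritten; in the second both chunks survive. Append mode only fixes where the bytes land: an image that records the offset it expected to write at would presumably still hold a wrong source pointer, and O_APPEND is documented as unreliable on NFS, so this alone would not cover the shared-folder case.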
Hi Tim,
On Wed, Jun 29, 2016 at 10:24 AM, tim Rowledge <[hidden email]> wrote:
The thing is that this problem bites even if we have a unitary primitive that both positions and writes if that primitive is written above a substrate that, as unix and stdio streams do, separates positioning from writing. The primitive is neat but it simply drives the problem further underground. A more robust solution might be to position, write, reposition, read, and compare, shortening on corruption, and retrying, using exponential back-off like ethernet packet transmission. Most of the time this adds only the overhead of reading what's written. _,,,^..^,,,_ best, Eliot |
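The position, write, reposition, read, compare loop suggested above might look roughly like this. It is a Python sketch under stated assumptions rather than anything from the Squeak or Pharo code base: `verified_append`, `max_retries` and `base_delay` are hypothetical names, and the retry cap is added so the loop cannot spin forever.

```python
import os
import tempfile
import time


def verified_append(path, chunk, max_retries=5, base_delay=0.01):
    """Append chunk to path, read it back, and retry with back-off on mismatch.

    Returns the offset at which the chunk was verified.
    """
    for attempt in range(max_retries):
        with open(path, "r+b", buffering=0) as f:
            offset = f.seek(0, os.SEEK_END)  # position
            f.write(chunk)                   # write
            os.fsync(f.fileno())             # ask the OS to push data to disk
            f.seek(offset)                   # reposition
            if f.read(len(chunk)) == chunk:  # read and compare
                return offset
            # "Shortening on corruption": drop what was written at this offset.
            # Caution: this can also discard bytes another writer put there.
            f.truncate(offset)
        time.sleep(base_delay * (2 ** attempt))  # exponential back-off
    raise OSError(f"could not verify write to {path} after {max_retries} attempts")


if __name__ == "__main__":
    fd, demo_path = tempfile.mkstemp(suffix=".changes")
    os.close(fd)
    print(verified_append(demo_path, b"foo ^ nil!\n"))
    os.remove(demo_path)
```

A successful read-back only shows that the chunk was intact at that moment; a second writer holding a stale offset can still overwrite it afterwards, so this narrows the window rather than closing it.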
> On 29-06-2016, at 10:35 AM, Eliot Miranda <[hidden email]> wrote: > {snip much rant} > The most obvious place where this is an issue is where two images are using the same changes file and think they’re appending. Image A seeks to the end of the file, ‘writes’ stuff. Image B near-simultaneously does the same. Eventually each process gets around to pushing data to hardware. Oops! And let’s not dwell too much on the problems possible if either process causes a truncation of the file. Oh, wait, I think we actually had a problem with that some years ago. > > The thing is that this problem bites even if we have a unitary primitive that both positions and writes if that primitive is written above a substrate that, as unix and stdio streams do, separates positioning from writing. The primitive is neat but it simply drives the problem further underground. Oh absolutely - we only have real control over a small part of it. It would probably be worth making use of that where we can. > > A more robust solution might be to position, write, reposition, read, and compare, shortening on corruption, and retrying, using exponential back-off like ethernet packet transmission. Most of the time this adds only the overhead of reading what's written. Yes, for anything we want reliable that’s probably a good way. A limit on the number of retries would probably be smart to stop infinite recursion. Imagine the fun of an error causing infinite retries of writing an error log about an infinite recursion. On an infinitely large Beowulf cluster! It’s all yet another example of where software meeting reality leads to nightmares. tim -- tim Rowledge; [hidden email]; http://www.rowledge.org/tim If it was easy, the hardware people would take care of it. |
Let's not solve the wrong problem folks. I only looked at this for 10 minutes this morning, and I think (but I am not sure) that the issue affects the case of saving the image, and that the normal writing of changes is fine.

Max was running on Pharo, which may or may not be handling changes the same way. I think he may be seeing a different problem from the one I confirmed.

So a bit more testing and verification would be in order. I can't look at it now though.

Dave |
On Wed, Jun 29, 2016 at 02:00:19PM -0400, David T. Lewis wrote:
> Let's not solve the wrong problem folks. I only looked at this for 10
> minutes this morning, and I think (but I am not sure) that the issue
> affects the case of saving the image, and that the normal writing of
> changes is fine.

I am wrong.

I spent some more time with this, and it is clear that two images saving chunks to the same changes file will result in corrupted change records in the changes file. It is not just an issue related to the image save as I suggested above.

In practice, this is not an issue that either Chris or I have noticed, probably because we are not doing software development (saving method changes) at the same time that we are running RemoteTask and similar. But I can certainly see how it might be a problem if, for example, I had a bunch of students running the same image from a network shared folder.

Dave |
In reply to this post by Max Leske
On Tue, Jun 28, 2016 at 6:04 PM, Max Leske <[hidden email]> wrote:
> Hi, > > Opening the same image twice works fine as long as no writes to the .changes file occur. When both images write to the .changes file however it will be broken for both because the offsets for the changes are wrong. This can lead to lost data and predominantly to invalid method source code, which is a pain with Monticello. > > I suggest that we implement a kind of lock mechanism to ensure that only one image (the first one opened) can write to the .changes file. > > > I’ve opened an issue for Pharo here: https://pharo.fogbugz.com/f/cases/18635/The-changes-file-should-be-bound-to-a-single-image > > > Cheers, > Max I just learnt something quite surprising that is probably important to be aware of... "Locks given by fcntl are not associated with the file-descriptor or open-file table entries. Instead, they are bound to the process itself. For example, a process has multiple open file descriptors for a particular file and gets a read/write lock using any one of these descriptors. Now closing any of these file descriptors will release the lock, the process holds on the file. The descriptor that was used to acquire the lock in the first place might still be open, but the process will loose its lock. So, it does not require an explicit unlock or a close ONLY on the descriptor that was used to acquire the lock in fcntl call. Doing unlock or close on any of the open file descriptors will release the lock owned by the process on the particular file." https://loonytek.com/2015/01/15/advisory-file-locking-differences-between-posix-and-bsd-locks/ cheers -ben |
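The behaviour quoted above can be observed with a short experiment. This is a Python sketch for POSIX systems (Linux, macOS) only; it relies on `fcntl.lockf`, which wraps the fcntl() record-locking calls, and on `os.fork`. The helper names are made up for illustration.

```python
import fcntl
import os
import tempfile


def other_process_can_lock(path):
    """Fork a child and report whether it can take an exclusive lock on path."""
    pid = os.fork()
    if pid == 0:  # child: fcntl locks are not inherited across fork
        status = 1
        try:
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            status = 0
        except OSError:
            pass  # lock is held by another process
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 0


def demo_lock_lost_on_unrelated_close(path):
    """Return (held_before, held_after) around closing a second descriptor."""
    first = os.open(path, os.O_RDWR)
    fcntl.lockf(first, fcntl.LOCK_EX)  # lock taken through `first`
    held_before = not other_process_can_lock(path)

    second = os.open(path, os.O_RDWR)  # unrelated descriptor, same file
    os.close(second)                   # closing it drops the process's lock
    held_after = not other_process_can_lock(path)

    os.close(first)
    return held_before, held_after


if __name__ == "__main__":
    fd, demo_path = tempfile.mkstemp(suffix=".changes")
    os.close(fd)
    print(demo_lock_lost_on_unrelated_close(demo_path))
    os.remove(demo_path)
```

The practical consequence for an image holding such a lock on its changes file would be that any code path that opens and closes the same file again (a source browser reading it through a second stream, say) silently gives the lock away, even though the descriptor that took the lock is still open.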
In reply to this post by David T. Lewis
On Thu, Jun 30, 2016 at 7:07 AM, David T. Lewis <[hidden email]> wrote:
> On Wed, Jun 29, 2016 at 02:00:19PM -0400, David T. Lewis wrote:
>> Let's not solve the wrong problem folks. I only looked at this for 10
>> minutes this morning, and I think (but I am not sure) that the issue
>> affects the case of saving the image, and that the normal writing of
>> changes is fine.
>
> I am wrong.
>
> I spent some more time with this, and it is clear that two images saving
> chunks to the same changes file will result in corrupted change records
> in the changes file. It is not just an issue related to the image save
> as I suggested above.
>
> In practice, this is not an issue that either Chris or I have noticed,
> probably because we are not doing software development (saving method
> changes) at the same time that we are running RemoteTask and similar.
> But I can certainly see how it might be a problem if, for example, I
> had a bunch of students running the same image from a network shared
> folder.

Maybe it's time to consider a fundamental change in how method-sources are referred to. Taking inspiration from git... A content addressable key-value file store might solve concurrent access. Each CompiledMethod gets written to a file named for the hash of its contents, which is the only reference the Image gets to a method's source. Each such file would *only* need be written once and thereafter could be read simultaneously by multiple Images. Anyone on the network wanting to store the same source would see the file already exists and have nothing to do.

Perhaps having many individual files implies abysmal performance. Or maybe something similar to Mercurial's revlog format [1] could be used, one file per class.

The thing about the Image *only* referring to a method's source by its content hash would seem to give great flexibility in backends to locate/store that source. Possibly...
* stored as individual files as above
* bundled in a zip file in random order
* a school could configure a database server in the Image provided to students
* hashes could be thrown at a service on the Internet
* cached locally with a key-value database like LMDB [2]
* remote replication to multiple internet backup locations
* in an emergency you could throw a bundle of hashes as a query to the mail list and get an ad hoc response of individual files.
* Inter-Smalltalk image communication

Pharo has a stated goal to get rid of the changes file. Changing to content-hash-addressable method-source seems a logical step along that road. Even if the Squeak community doesn't want to go so far as eliminating the .changes file, can they see value in changing method source references to be content-hashes rather than indexes into a particular file?

[1] http://blog.prasoonshukla.com/mercurial-vs-git-scaling
[2] https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Database

Just having a poke at this, it seems a new form of CompiledMethodTrailer may need to be defined, being invoked from CompiledMethod>>sourceCode. CompiledMethodTrailer>>sourceCode would find the source code based on a content-hash held by the CompiledMethod. If found, the call to #getSourceFromFile that accesses the .changes file will be bypassed, and could remain as a backup.

cheers -ben |
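To make the content-addressable idea concrete, here is a minimal Python sketch of the "one file per source, named by its hash" variant. Everything in it is an assumption made for illustration: the names `store_source` and `load_source`, the choice of SHA-256, and the flat directory layout. It says nothing about how a CompiledMethodTrailer would actually hold the key.

```python
import hashlib
import os
import tempfile


def store_source(store_dir, source):
    """Write source under its content hash and return the hash (the key)."""
    data = source.encode("utf-8")
    key = hashlib.sha256(data).hexdigest()
    final_path = os.path.join(store_dir, key)
    if not os.path.exists(final_path):
        # Write to a temporary file first, then rename into place, so a
        # concurrent reader never sees a half-written entry. If two writers
        # race, both rename identical content, which is harmless.
        fd, tmp_path = tempfile.mkstemp(dir=store_dir)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp_path, final_path)
    return key


def load_source(store_dir, key):
    """Read the source for key and check that it still matches its hash."""
    with open(os.path.join(store_dir, key), "rb") as f:
        data = f.read()
    if hashlib.sha256(data).hexdigest() != key:
        raise ValueError(f"stored source for {key} is corrupted")
    return data.decode("utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as store:
        key = store_source(store, "foo\n\t^ nil")
        print(key, repr(load_source(store, key)))
```

Because an entry is never modified after it is written, no writer needs to know any file offset, which is the property the shared changes file lacks. The read-side hash check also turns silent corruption into a detectable error.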
In reply to this post by Ben Coman
> On 30 Jun 2016, at 05:09, Ben Coman <[hidden email]> wrote: > > On Tue, Jun 28, 2016 at 6:04 PM, Max Leske <[hidden email]> wrote: >> Hi, >> >> Opening the same image twice works fine as long as no writes to the .changes file occur. When both images write to the .changes file however it will be broken for both because the offsets for the changes are wrong. This can lead to lost data and predominantly to invalid method source code, which is a pain with Monticello. >> >> I suggest that we implement a kind of lock mechanism to ensure that only one image (the first one opened) can write to the .changes file. >> >> >> I’ve opened an issue for Pharo here: https://pharo.fogbugz.com/f/cases/18635/The-changes-file-should-be-bound-to-a-single-image >> >> >> Cheers, >> Max > > I just learnt something quite surprising that is probably important to > be aware of... "Locks given by fcntl are not associated with the > file-descriptor or open-file table entries. Instead, they are bound to > the process itself. For example, a process has multiple open file > descriptors for a particular file and gets a read/write lock using any > one of these descriptors. Now closing any of these file descriptors > will release the lock, the process holds on the file. The descriptor > that was used to acquire the lock in the first place might still be > open, but the process will loose its lock. So, it does not require an > explicit unlock or a close ONLY on the descriptor that was used to > acquire the lock in fcntl call. Doing unlock or close on any of the > open file descriptors will release the lock owned by the process on > the particular file." > > https://loonytek.com/2015/01/15/advisory-file-locking-differences-between-posix-and-bsd-locks/ > > cheers -ben > Which would solve the problem of a crashed image not cleaning up its lock. Thanks for sharing Ben. |
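The crash-cleanup property mentioned here can be checked with a small experiment. This is a Python sketch for POSIX systems only, with made-up names: a forked child plays the image that takes the lock and then dies without unlocking, and the parent plays a second image trying to take over.

```python
import fcntl
import os
import signal
import tempfile
import time


def try_lock(fd):
    """Try to take an exclusive, non-blocking fcntl lock; report success."""
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:
        return False


def lock_released_after_crash(path):
    """Return (blocked_while_alive, acquired_after_crash)."""
    ready_r, ready_w = os.pipe()
    pid = os.fork()
    if pid == 0:  # child: the image that will "crash" while holding the lock
        try:
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX)
            os.write(ready_w, b"1")  # tell the parent the lock is held
            while True:
                time.sleep(60)       # hang until killed
        finally:
            os._exit(1)
    os.close(ready_w)
    os.read(ready_r, 1)              # wait until the child holds the lock
    os.close(ready_r)

    fd = os.open(path, os.O_RDWR)
    blocked_while_alive = not try_lock(fd)

    os.kill(pid, signal.SIGKILL)     # simulate a hard crash, no cleanup code runs
    os.waitpid(pid, 0)
    acquired_after_crash = try_lock(fd)

    os.close(fd)
    return blocked_while_alive, acquired_after_crash


if __name__ == "__main__":
    fd, demo_path = tempfile.mkstemp(suffix=".changes")
    os.close(fd)
    print(lock_released_after_crash(demo_path))
    os.remove(demo_path)
```

The lock disappears with the process, so nothing stale is left behind, unlike a separate lock file that a crashed image would fail to delete. The same process-bound behaviour is also what makes the lock easy to lose by accident, as the quoted passage warns.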
On Thu, Jun 30, 2016 at 09:59:37AM +0200, Max Leske wrote:
>
> > On 30 Jun 2016, at 05:09, Ben Coman <[hidden email]> wrote:
> >
> > On Tue, Jun 28, 2016 at 6:04 PM, Max Leske <[hidden email]> wrote:
> >> Hi,
> >>
> >> Opening the same image twice works fine as long as no writes to the .changes file occur. When both images write to the .changes file however it will be broken for both because the offsets for the changes are wrong. This can lead to lost data and predominantly to invalid method source code, which is a pain with Monticello.
> >>
> >> I suggest that we implement a kind of lock mechanism to ensure that only one image (the first one opened) can write to the .changes file.
> >>
> >> I’ve opened an issue for Pharo here: https://pharo.fogbugz.com/f/cases/18635/The-changes-file-should-be-bound-to-a-single-image
> >>
> >> Cheers,
> >> Max
> >
> > I just learnt something quite surprising that is probably important to
> > be aware of... "Locks given by fcntl are not associated with the
> > file-descriptor or open-file table entries. Instead, they are bound to
> > the process itself. For example, a process has multiple open file
> > descriptors for a particular file and gets a read/write lock using any
> > one of these descriptors. Now closing any of these file descriptors
> > will release the lock, the process holds on the file. The descriptor
> > that was used to acquire the lock in the first place might still be
> > open, but the process will loose its lock. So, it does not require an
> > explicit unlock or a close ONLY on the descriptor that was used to
> > acquire the lock in fcntl call. Doing unlock or close on any of the
> > open file descriptors will release the lock owned by the process on
> > the particular file."
> >
> > https://loonytek.com/2015/01/15/advisory-file-locking-differences-between-posix-and-bsd-locks/
> >
> > cheers -ben
> >
>
> Which would solve the problem of a crashed image not cleaning up its lock. Thanks for sharing Ben.
FYI, file locking for Unix/Linux/OS X is supported in OSProcess, see UnixProcessFileLockTestCase and the 'file locking' tests in UnixProcessAccessorTestCase. Dave |