Image damaged due to IO error while saving


Image damaged due to IO error while saving

Christoph Thiede

Hi all,


Some months ago, I corrupted my image by accidentally shutting down the host system while the image file was being saved (many of my images are > 500 MB, so saving can take a few seconds even on an SSD). The same can happen due to various other I/O or connection issues, so here's an idea:
Couldn't we always use overwrite-by-rename when saving the image file? I.e., first write the image into a new temporary file and, after saving has completed, replace the original file with that temp file (via mv)? This would ensure the image file's integrity.
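
To make the overwrite-by-rename idea concrete, here is a minimal sketch in Python (used purely for illustration; it is not Squeak or VM code). The function name `save_atomically` and the temp-file naming are assumptions, and the sketch assumes the temp file sits on the same filesystem as the target so that the final rename is atomic.

```python
import os
import tempfile


def save_atomically(path: str, data: bytes) -> None:
    """Write data to a temp file next to path, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # Assumption: the temp file lives in the same directory (and therefore
    # on the same filesystem) as the target, so the rename can be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".image")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        # os.replace overwrites an existing target; on POSIX this is a rename().
        os.replace(tmp_path, path)
    except BaseException:
        # Saving failed: remove the partial temp file. The original file
        # at `path` has not been touched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the machine goes down while the temp file is being written, the old file is still intact; the only leftover is a stray temp file.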


A possible disadvantage, though, would be that some filesystems, such as NTFS, associate meta-information with the file identity, which changes when using the overwrite-by-rename approach. Also, technologies such as FileSystemWatcher would be confused for the same reason. However, afaik overwrite-by-rename is quite a common approach, primarily for big and sensitive files.


So, what are your opinions about this topic? :-)


Best,

Christoph



Carpe Squeak!

Re: Image damaged due to IO error while saving

Tony Garnock-Jones-5
That sounds like a great idea.

On configurations where overwrite-by-rename is a problem, perhaps an
alternative such as "copy the existing image to a *.bak file" would work?

Perhaps the image save primitive could respond to a VM command-line
switch (or in-image VM parameter?) selecting among three behaviours:

 1. The current overwrite-in-place, risk-of-corruption behaviour
 2. Overwrite-by-rename if possible
 3. Make backup copy before overwrite-in-place

Regards,
  Tony




Re: Image damaged due to IO error while saving

timrowledge
It's certainly do-able; we had a system like this at Interval. Basically write an image to a suitably chosen name, check if it was ok (which could involve a lot of work if you want to be paranoid) and if so rename (and check the rename worked!) and quit.
Craig might possibly have the code around? I certainly don't.
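
The write, check, rename, check-again sequence can be sketched as follows. This is only an illustration in Python, not the Interval code: the `.new` suffix, the function names, and the use of a SHA-256 comparison as the "was it ok?" check are all assumptions.

```python
import hashlib
import os


def _sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def save_checked(path: str, data: bytes) -> None:
    tmp_path = path + ".new"  # hypothetical "suitably chosen name"
    expected = hashlib.sha256(data).hexdigest()

    # 1. Write the new image under the temporary name.
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())

    # 2. Check that what landed on disk is what we meant to write.
    if _sha256_of_file(tmp_path) != expected:
        os.remove(tmp_path)
        raise IOError("temporary image failed verification")

    # 3. Rename over the original (raises OSError if the rename fails).
    os.replace(tmp_path, path)

    # 4. Check that the rename really took effect.
    if os.path.exists(tmp_path) or os.path.getsize(path) != len(data):
        raise IOError("rename did not complete as expected")
```

A more paranoid check would try to load the written image rather than only compare bytes.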



tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Oxymorons: Living dead




Re: Image damaged due to IO error while saving

Tony Garnock-Jones-5
Oh, I see, so it's probably something that can be arranged entirely
image-side, no VM support needed. Right?

... it'd be a Preference, I suppose?

Tony





Re: Image damaged due to IO error while saving

timrowledge


> On 2020-01-29, at 12:24 PM, Tony Garnock-Jones <[hidden email]> wrote:
>
> Oh, I see, so it's probably something that can be arranged entirely
> image-side, no VM support needed. Right?

Pretty sure it could be done without VM support, yes. One might even use the OSProcess forking trick to do it, I think.

> ... it'd be a Preference, I suppose?
Yet another ...


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
"How many Kdatlyno does it take to change a lightbulb?"
"None. It sounds perfectly OK to them."





Re: Image damaged due to IO error while saving

Christoph Thiede

Hi all!


> On configurations where overwrite-by-rename is a problem, perhaps an alternate of "copy the existing image to a *.bak file" would work?

Compared to overwrite-by-rename, this proposal would double the storage required. Provided that I understand you correctly, -1 :-)

What would you like to do with this backup file? Keep it permanently? Since we are talking about file sizes of hundreds of megabytes, I think this could become quite storage-intensive ... Also, it clutters your image folder. We already have two files for each image: .image and .changes. No need for even more files, imho. But there may always be some special application areas, of course :)

+1 for making a preference for it :-)
However, my personal preference would be to control this behavior via the Squeak.ini file (not sure what the equivalent is on other host platforms), so that the setting is stored independently of the image.

VM support: What would be the pros and cons of implementing this in the VM?
First, I don't know whether we already support a way to read the Squeak.ini file from within the image (see above).
Second, I *could* imagine (though this is purely hypothetical) that certain host systems provide convenient ways of implementing overwrite-by-rename. See my initial mail for my worries about a naive implementation. Again, wouldn't this be an argument for implementing this on the VM side?

Regarding validation: Sounds interesting! How much effort would that be? Could you do this from within the VM (it's also a question of performance, I guess)? Wouldn't this double the save time? Maybe it would be a good idea to have a second (VM) preference for toggling validation.

Best,
Christoph



Carpe Squeak!

Re: Image damaged due to IO error while saving

David T. Lewis
My suggestion is to just try some ideas in your own image and see
if it's something you want to live with. The intentions are good but
I have a feeling that this is the kind of thing where the unintended
side effects are worse than the problem to be solved.

In my own experience, I have encountered an IO error while saving
the image several times over the years. In every case, the cause has
been a file system full condition. A solution that uses more disc
space would not have been helpful.

Tim mentions using OSProcess, so you can try something based on
"UnixProcess saveImageInBackground" if you want.

Dave




Re: Image damaged due to IO error while saving

Eliot Miranda-2
In reply to this post by Tony Garnock-Jones-5
Hi Christoph, Hi Tony,

> On Jan 29, 2020, at 12:10 PM, Tony Garnock-Jones <[hidden email]> wrote:
>
> That sounds like a great idea.

+1

> On configurations where overwrite-by-rename is a problem, perhaps an
> alternate of "copy the existing image to a *.bak file" would work?

+1. This is IMO a safer and easier implementation path.  I would use "rename to a backup" though.  So the snapshot file operation is:

- if saving to an existing file, then
   - rename existing file to some backup, eg foo.imagebak
   - write new image file foo.image
   - delete foo.imagebak

- if not saving to an existing file, then
   - write new image file foo.image
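
The steps above can be sketched as follows, again in Python purely for illustration. The function name is hypothetical, and the rollback on a failed write (renaming the backup back into place) is an added assumption; the list above does not say what should happen in that case.

```python
import os


def save_with_temporary_backup(path: str, data: bytes) -> None:
    backup = path + "bak"  # foo.image -> foo.imagebak, as in the list above
    had_original = os.path.exists(path)

    if had_original:
        # Rename the existing file to the backup name.
        os.replace(path, backup)
    try:
        # Write the new image file.
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
    except BaseException:
        # Assumption, not part of the list above: on failure, put the
        # old image back so the user still has a loadable file.
        if had_original:
            os.replace(backup, path)
        raise
    if had_original:
        # The new image is on disk; the backup is no longer needed.
        os.remove(backup)
```

Note that between the first rename and the end of the write there is no complete file under the original name, so a crash in that window leaves only foo.imagebak as the good copy.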

> Perhaps the image save primitive could respond to a VM command-line
> switch (or in-image VM parameter?) selecting among three behaviours:
>
> 1. The current overwrite-in-place, risk-of-corruption behaviour
> 2. Overwrite-by-rename if possible
> 3. Make backup copy before overwrite-in-place

Why would the rename be possible and the save not?  Ah, if the file is writable but the directory is not the rename would fail but the write would not, right?  But then both copy and rename would fail.  So I think we only need to support rename and the snapshot primitive should fail if the directory is not writable.

P.S. volunteers welcomed to do the work...



Re: Image damaged due to IO error while saving

Eliot Miranda-2
In reply to this post by Christoph Thiede
Hi Christoph,

On Jan 30, 2020, at 4:13 AM, Thiede, Christoph <[hidden email]> wrote:



> Hi all!
>
>> On configurations where overwrite-by-rename is a problem, perhaps an alternate of "copy the existing image to a *.bak file" would work?
>
> Compared to overwrite-by-rename, this proposal would double the storage effort. Provided that I understand you correctly, -1 :-)

If you want a backup, even temporarily, then you can’t avoid needing twice the file storage while the new snapshot is being written.  So be careful what you wish for.  All implementations have this as a consequence, by definition.

> What would you like to do with this backup file? Keep them permanently? As we speak about hundreds-of-megabytes file sizes, I think this could be quite storage extensive ... Also, it messes up your image folder. We already have two files for each image: .image and .changes. No need for even more files, imho. But there may always be some special application areas, of course :)

Well, if it stays around then it gets replaced on every save.  So one only has one copy per image.  One presumably would never rename the backup to save it when creating the next backup.  So in fact the operation is

- if saving to an existing file
    - delete the backup foo.imagebak if it exists
    - rename foo.image to foo.imagebak
    - save the image
    - optionally validate the new image
    - optionally delete the backup
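
A sketch of this refined sequence, in Python for illustration only. The `validate` callback and the `delete_backup` flag are hypothetical stand-ins for the two optional steps; what validation actually does is left open here.

```python
import os
from typing import Callable, Optional


def save_keeping_backup(
    path: str,
    data: bytes,
    validate: Optional[Callable[[str], bool]] = None,
    delete_backup: bool = False,
) -> None:
    backup = path + "bak"  # foo.image -> foo.imagebak

    if os.path.exists(path):
        if os.path.exists(backup):
            os.remove(backup)        # delete the old backup if it exists
        os.rename(path, backup)      # rename foo.image to foo.imagebak

    with open(path, "wb") as f:      # save the image
        f.write(data)
        f.flush()
        os.fsync(f.fileno())

    # Optionally validate the new image; on failure the backup stays put.
    if validate is not None and not validate(path):
        raise ValueError("new image failed validation; backup kept")

    # Optionally delete the backup.
    if delete_backup and os.path.exists(backup):
        os.remove(backup)
```

With `delete_backup=False` there is at most one backup per image, replaced on every save.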

> +1 for making a preference for it :-)
> However, my personal flavor would be to rule this behavior via the Squeak.ini file (not sure what's the equivalent on other host platforms), so I would prefer to store this preference image-invariant.
>
> VM support: What would be the pros and cons of implementing this in the VM?
> First, I don't know whether we already support a way to read the Squeak.ini file from within the image (see above)?
> Second, I *could* imagine (though this is spoken hypothetically) that certain host systems might provide convenient ways for implementing overwrite-by-rename. See my initial mail for my worries about a naive implementation. Again, wouldn't this be an argument for implementing this rather at VM side?

Good questions.  I think implementing this image-side is better.  The snapshot primitive is separate from the quit primitive, so if the snapshot primitive succeeds there is time for the image to e.g. run validation and/or delete the backup before quitting.

This seems to me related to the other snapshot bug, which is that we GC in the snapshot primitive. This is completely wrong because it elides finalization actions.  Instead we should do a full GC in the image *before* doing the snapshot, allow any finalization actions to complete and then do the snapshot.  VW does this correctly.

> Ad validation: Sounds interesting! How high would be the effort for that? Could you do this from within the VM (it's also a question of performance, I guess)? Wouldn't this double the store time? Maybe it would be a good idea to have a second (VM) preference for toggling validation.

Validation could (and IMO /should/) be done via the new image leak checker.  This is a cut-down VM that only loads an image and applies the leak checker before quitting.  It would need to be made runnable from the image, e.g. via OSProcess.  That makes this an optional project because OSProcess is not in the base image.




Re: Image damaged due to IO error while saving

Eliot Miranda-2
In reply to this post by David T. Lewis
Hi Dave,


> On Jan 30, 2020, at 5:18 AM, David T. Lewis <[hidden email]> wrote:
>
> My suggestion is to just try some ideas in your own image and see
> if it's something you want to live with. The intentions are good but
> I have a feeling that this is the kind of thing where the unintended
> side effects are worse than the problem to be solved.

+1

>
> In my own experience, I have encountered an IO error while saving
> the image several times over the years. In every case, the cause has
> been a file system full condition. A solution that uses more disc
> space would not have been helpful.

Good point.  The snapshot primitive *could* make a conservative estimate of the file size needed (easy; it knows how big the heap is), create a file, write that many zeros (only way to actually commit the disc space), and then overwrite with the real data, but that’s twice the disc traffic.
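
The reserve-then-overwrite idea could look roughly like this. It is a Python sketch for illustration, not VM code: the function name and chunk size are assumptions, and `len(data)` stands in for the conservative size estimate derived from the heap size.

```python
import os


def save_with_reserved_space(path: str, data: bytes, chunk_size: int = 1 << 20) -> None:
    zeros = bytes(chunk_size)
    with open(path, "wb") as f:
        # Pass 1: write zeros for the full estimated size. If the disc
        # is full, this fails before any real data has been written.
        remaining = len(data)
        while remaining > 0:
            n = min(remaining, chunk_size)
            f.write(zeros[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())

        # Pass 2: overwrite the reserved region with the real data.
        f.seek(0)
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
```

As noted, this costs twice the disc traffic. On filesystems that compress or copy on write, the zero-filled blocks may not guarantee that the second pass succeeds.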



Re: Image damaged due to IO error while saving

David T. Lewis
On Thu, Jan 30, 2020 at 06:20:01AM -0800, Eliot Miranda wrote:

> Hi Dave,
>
>
> > On Jan 30, 2020, at 5:18 AM, David T. Lewis <[hidden email]> wrote:
> >
> > My suggestion is to just try some ideas in your own image and see
> > if it's something you want to live with. The intentions are good but
> > I have a feeling that this is the kind of thing where the unintended
> > side effects are worse than the problem to be solved.
>
> +1
>
> >
> > In my own experience, I have encountered an IO error while saving
> > the image several times over the years. In every case, the cause has
> > been a file system full condition. A solution that uses more disc
> > space would not have been helpful.
>
> Good point.  The snapshot primitive *could* make a conservative estimate
> of the file size needed (easy; it knows how big the heap is), create
> a file, write that many zeros (only way to actually commit the disc
> space), and then overwrite with the real data, but that's twice
> the disc traffic.
>

That's a good idea, and on a unix platform we can use statvfs()
to check space availability without adding any disc traffic.
To prove out the idea, I implemented it as a primitive in the
unix OSProcess plugin so that you can test it like this:

primSpaceFor: byteSize InDirectoryPath: dirPath
        <primitive: 'primitiveSpaceForByteSizeInDirectoryPath' module: 'UnixOSProcessPlugin'>
        ^ self primitiveFailed

If you want to give it a try, the primitive is now in the latest
UnixOSProcessPlugin in www.squeaksource.com/OSProcessPlugin in
VMConstruction-Plugins-OSProcessPlugin-dtl.47, and I merged it
into VMConstruction-Plugins-OSProcessPlugin.oscog-dtl.67 for the
Cog/Spur VMs.

I also added access to it from OSProcess in OSProcess-dtl.114.

I have not really looked into how best to put this into the VM
proper, but we could consider either adding a primitive similar
to the one in OSPP, or maybe add a check directly into the image
write function (which is currently a macro that we could override).

I also have not looked into how to implement this on Windows. I'm
sure there is a way; statvfs() or something like it may well be
available there, but I have not checked.

In any case, treat this as a proof of concept to illustrate a
way to handle the file system full scenario.
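
For readers without the plugin at hand, the same kind of statvfs()-based check can be sketched in Python. This is not the plugin's code; the function name `has_space_for` is hypothetical, and it assumes a Unix platform where `os.statvfs` is available.

```python
import os


def has_space_for(byte_size: int, dir_path: str) -> bool:
    """Return True if dir_path's filesystem reports enough free space."""
    stats = os.statvfs(dir_path)
    # f_bavail: free blocks available to unprivileged users
    # f_frsize: fragment size, the unit in which f_bavail is counted
    available = stats.f_bavail * stats.f_frsize
    return available >= byte_size
```

The check costs no disc traffic, but it is only advisory: another process can consume the space between the check and the actual write.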

Dave



Re: Image damaged due to IO error while saving

timrowledge


> On 2020-02-01, at 11:48 AM, David T. Lewis <[hidden email]> wrote:
>
> That's a good idea, and on a unix platform we can use statvfs()
> to check space availability without adding any disc traffic.

The way we used to do this on RISC OS was (is, for the remaining users!) to allocate a file of the required size and it would be filled with 0. That way if you got a success return code you knew for certain (barring fire, flood, bomb or bear attack) that the file could be over-written with your real content. Do any other filing systems actually really definitely allocate space when you ask for it? No idea.
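
On the closing question: POSIX specifies posix_fallocate(), which is meant to guarantee that later writes to the allocated range do not fail for lack of space. A minimal Python sketch follows, assuming a platform where `os.posix_fallocate` exists (for example Linux); the function name `preallocate` is hypothetical.

```python
import os


def preallocate(path: str, byte_size: int) -> None:
    """Create path (if needed) and reserve byte_size bytes of storage for it."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Raises OSError (e.g. ENOSPC) if the space cannot be allocated.
        # byte_size must be greater than zero.
        os.posix_fallocate(fd, 0, byte_size)
    finally:
        os.close(fd)
```

Whether the reservation is honoured in every case still depends on the underlying filesystem.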


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: FR: Flip Record




Re: Image damaged due to IO error while saving

Christoph Thiede
In reply to this post by Eliot Miranda-2

Hi Eliot,


> If you want a backup, even temporarily, then you can’t avoid needing twice the file storage while the new snapshot is being written.  So careful what you wish for.  All implementations have this as a consequence, by definition.


Actually, I did not want to have a backup :) All I asked for was overwrite-by-rename to ensure the atomicity of the snapshot operation. I think there are enough tools out there that provide clever backup mechanisms; we do not need to reinvent the wheel here. (Personally, I'm fine with OneDrive, which keeps old versions of all my images around every 15 minutes.)

> - if saving to an existing file, then
>    - rename existing file to some backup, eg foo.imagebak
>    - write new image file foo.image
>    - delete foo.imagebak
> - if not saving to an existing file, then
>    - write new image file foo.image

+1, sounds perfect.
You could also consider the following instead:
- if saving to an existing file, then
   - write new image file ~foo.image
   - rename ~foo.image to foo.image
Then foo.image will never be corrupted. Afaik this is the way Chromium or MS Office go, for example.

Best,
Christoph


Von: Squeak-dev <[hidden email]> im Auftrag von Eliot Miranda <[hidden email]>
Gesendet: Donnerstag, 30. Januar 2020 15:14:30
An: The general-purpose Squeak developers list
Betreff: Re: [squeak-dev] Image damaged due to IO error while saving
 
Hi Christoph,

On Jan 30, 2020, at 4:13 AM, Thiede, Christoph <[hidden email]> wrote:



Hi all!


On configurations where overwrite-by-rename is a problem, perhaps an alternate of "copy the existing image to a *.bak file" would work?


Compared to overwrite-by-rename, this proposal would double the storage effort. Provided that I understand you correctly, -1 :-)

On configurations where overwrite-by-rename is a problem, perhaps an alternate of "copy the existing image to a *.bak file" would work?

If you want a backup, even temporarily, then you can’t avoid needing twice the file storage while the new snapshot is being written.  So careful what you wish for.  All implementations have this as a consequence, by definition.

What would you like to do with this backup file? Keep them permanently? As we speak about hundreds-of-megabytes file sizes, I think this could be quite storage extensive ... Also, it messes up your image folder. We already have two files for each image: .image and .changes. No need for even more files, imho. But there may always be some special application areas, of course :)

Well, if it stays around then it gets replaced on every save.  So one only has one copy per image.  One presumably would never rename the backup to save it when creating the next backup.  So in fact the operation is

- if saving to an existing file
    - delete the backup foo.imagebak if it exists
    - rename foo.image to foo.imagebak
    - save the image
    - optionally validate the new image
    - optionally delete the backup
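
The steps in this list can be sketched roughly as follows, in Python rather than Smalltalk. Everything named here is an assumption made for illustration: `save_with_backup`, the `write_image` callback that stands in for the snapshot, and the optional `validate` callback. What should happen when validation fails is not specified in the list; this sketch simply raises and leaves the backup in place.

```python
import os


def save_with_backup(path, write_image, validate=None, keep_backup=True):
    """Rotate path to a single backup, then save a fresh image.

    Hypothetical helper: write_image(path) stands in for the snapshot,
    validate(path) for an optional integrity check.
    """
    backup = path + "bak"  # foo.image -> foo.imagebak
    if os.path.exists(path):
        # Delete the previous backup if it exists ...
        if os.path.exists(backup):
            os.remove(backup)
        # ... and turn the current image into the new backup.
        os.rename(path, backup)
    # Save the image under its usual name.
    write_image(path)
    # Optionally validate the new image; keep the backup on failure.
    if validate is not None and not validate(path):
        raise ValueError("new image failed validation; backup kept: " + backup)
    # Optionally delete the backup once the new image is trusted.
    if not keep_backup and os.path.exists(backup):
        os.remove(backup)
```

With `keep_backup=True` there is never more than one backup per image, since each save replaces the previous one.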

+1 for making a preference for it :-)
However, my personal preference would be to control this behavior via the Squeak.ini file (not sure what the equivalent is on other host platforms), so that the setting is stored independently of the image.

VM support: What would be the pros and cons of implementing this in the VM?
First, I don't know whether we already support a way to read the Squeak.ini file from within the image (see above)?
Second, I *could* imagine (though this is hypothetical) that certain host systems provide convenient ways of implementing overwrite-by-rename. See my initial mail for my worries about a naive implementation. Again, wouldn't this be an argument for implementing this on the VM side instead?

Good questions.  I think implementing image side is better.  The snapshot primitive is separate from the quit primitive, so if the snapshot primitive succeeds there is time for the image to eg run validation and/or delete the backup before quitting.

This seems to me related to the other snapshot bug, which is that we GC in the snapshot primitive. This is completely wrong because it elides finalization actions.  Instead we should do a full GC in the image *before* doing the snapshot, allow any finalization actions to complete and then do the snapshot.  VW does this correctly.

Ad validation: Sounds interesting! How much effort would that be? Could you do this from within the VM (it's also a question of performance, I guess)? Wouldn't this double the save time? Maybe it would be a good idea to have a second (VM) preference for toggling validation.

Validation could (and IMO /should/) be done via the new image leak checker.  This is a cut-down VM that only loads an image and applies the leak checker before quitting.  It would need to be made runnable from the image, e.g. via OSProcess.  That makes this an optional project because OSProcess is not in the base image.


Best,
Christoph

From: Squeak-dev <[hidden email]> on behalf of tim Rowledge <[hidden email]>
Sent: Wednesday, January 29, 2020 22:05:07
To: The general-purpose Squeak developers list
Subject: Re: [squeak-dev] Image damaged due to IO error while saving
 


> On 2020-01-29, at 12:24 PM, Tony Garnock-Jones <[hidden email]> wrote:
>
> Oh, I see, so it's probably something that can be arranged entirely
> image-side, no VM support needed. Right?

Pretty sure it could be done without VM support, yes. One might even use the OSProcess forking trick to do it, I think.

> ... it'd be a Preference, I suppose?
Yet another ...


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
"How many Kdatlyno does it take to change a lightbulb?"
"None. It sounds perfectly OK to them."







Carpe Squeak!

Re: Image damaged due to IO error while saving

K K Subbu
In reply to this post by timrowledge
On 02/02/20 7:57 AM, tim Rowledge wrote:
> The way we used to do this on RISC OS was (is, for the remaining
> users!) to allocate a file of the required size and it would be
> filled with 0. That way if you got a success return code you knew for
> certain (barring fire, flood, bomb or bear attack) that the file
> could be over-written with your real content. Do any other filing
> systems actually really definitely allocate space when you ask for
> it? No idea.

Writing twice into the same file will increase wear and tear in SSDs
unnecessarily. An image is just an array of bytes, so one of the
following techniques could be adopted:

* create dummy files in fixed size units (say 128MB) and write the image
as usual. If the image write returns a disk full error, then delete one
or more of these dummy files to complete the operation.

* create two file partitions of max size (say A and B). Alternate
writing to these partitions and mark the latest successful write as the
real McCoy.

* create a file with two segments each of max size. Alternate writing
into these two as in the case above. The header will need a flag to
identify which one is latest.
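
A rough Python sketch of the alternating idea, under stated assumptions: two ordinary files (`<base>.A` and `<base>.B`) stand in for the two partitions or segments, and a tiny `<base>.current` marker file plays the role of the header flag. The names `save_alternating` and `latest` are made up for this illustration.

```python
import os


def save_alternating(base, data):
    """Write data to whichever of base.A / base.B is not current, then flip the marker."""
    marker = base + ".current"
    current = None
    if os.path.exists(marker):
        with open(marker) as f:
            current = f.read().strip()
    # Always write to the slot that does NOT hold the last good copy.
    target = "B" if current == "A" else "A"
    with open(base + "." + target, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    # Only after the write has succeeded, mark the new slot as the latest.
    tmp = marker + ".tmp"
    with open(tmp, "w") as f:
        f.write(target)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, marker)
    return target


def latest(base):
    """Return the path of the most recent successfully written slot."""
    with open(base + ".current") as f:
        return base + "." + f.read().strip()
```

A crash while slot B is being written leaves the marker pointing at A, so the previous image remains the one that gets loaded.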

Regards .. Subbu

Re: Image damaged due to IO error while saving

timrowledge


> On 2020-02-02, at 6:03 AM, K K Subbu <[hidden email]> wrote:
>
> On 02/02/20 7:57 AM, tim Rowledge wrote:
>> The way we used to do this on RISC OS was (is, for the remaining
>> users!) to allocate a file of the required size and it would be
>> filled with 0. That way if you got a success return code you knew for
>> certain (barring fire, flood, bomb or bear attack) that the file
>> could be over-written with your real content. Do any other filing
>> systems actually really definitely allocate space when you ask for
>> it? No idea.
>
> Writing twice into the same file will increase wear and tear in SSDs unnecessarily.

True, but in defence of RISC OS it does predate even the fantasy of SSDs by perhaps a couple of decades :-)

Obviously the key point is actually, definitely, allocating the required space, as opposed to optimistically claiming to have some room and then getting all upset when your application tries to use it.
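
As a hedged illustration of "actually allocating" space up front: on POSIX systems, Python exposes `os.posix_fallocate`, which asks the filesystem to reserve the requested range and fails with an error if it cannot. The helper name `preallocate` is an assumption, and the fallback branch (writing zeros, much like the RISC OS behaviour described above) is only there for platforms without that call.

```python
import os


def preallocate(path, size):
    """Reserve size bytes for path; raises OSError if the space is unavailable.

    Hypothetical helper for illustration. size must be greater than zero.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        if hasattr(os, "posix_fallocate"):
            # Ask the filesystem to really reserve the blocks.
            os.posix_fallocate(fd, 0, size)
        else:
            # Fallback: fill the file with zeros so the space is truly used.
            chunk = bytes(1024 * 1024)
            remaining = size
            while remaining > 0:
                written = os.write(fd, chunk[:min(len(chunk), remaining)])
                remaining -= written
            os.fsync(fd)
    finally:
        os.close(fd)
```

If `preallocate` returns without raising, the later write of the real content into that file should not fail for lack of space.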


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
"How many Carlos Wus does it take to change a lightbulb?"
"With an unlimited breeding licence, who needs lightbulbs?"