FilePlugin currently splits files in to two groups: 1) Stdio streams and 2) everything else. To test for the end of file, FilePlugin>>primitiveFileAtEnd: 1) Uses feof() for stdio streams. 2) Compares the current position to the file size for everything else. This returns the expected results for regular files, but fails for non-regular files, e.g.: (FileSystem / 'dev' / 'urandom') binaryReadStream next: 8. returns an empty array. On Unix, the proper way to check is to read past the end of the file and then call feof() to confirm that it is in fact at the end. Pharo (and I assume, Squeak) has plenty of code that assumes that a file position >= the file size means that the file is at the end, and as stated above this generally works for regular files. This patch modifies FilePlugin>>primitiveFileAtEnd to: a) Keep the current behaviour of using the file position test for regular files. b) Keep the current behaviour of using feof() for stdio streams. c) Use feof() for non-regular files, e.g. /dev/urandom. This allows existing code to continue to function, and allows non-regular files to be tested correctly. After applying the patch, the example code above answers the expected result, e.g.: (FileSystem / 'dev' / 'urandom') binaryReadStream next: 8. " #[179 136 227 226 28 147 197 125]" On Windows, as far as I can tell, all files are regular, and the position test is used regularly, so no change is requried. https://github.com/OpenSmalltalk/opensmalltalk-vm/pull/232 contains the changes. The CI builds failed, but for different reasons to do with sista. If I don't hear any dissenting opinion in a few days I'll integrate this in. Cheers, Alistair |
> > FilePlugin currently splits files in to two groups: 1) Stdio streams and > 2) everything else. > > To test for the end of file, FilePlugin>>primitiveFileAtEnd: > > 1) Uses feof() for stdio streams. > 2) Compares the current position to the file size for everything else. > > This returns the expected results for regular files, but fails for > non-regular files, e.g.: > > (FileSystem / 'dev' / 'urandom') binaryReadStream next: 8. > > returns an empty array. The FileStream hierachy in Squeak has no issue with the current implementation, because the code will call the primitive to read from the file even if #atEnd returns true (which conforms to what you described as expected usage). E.g. StandardFileStream readOnlyFileNamed: '/dev/urandom' do: [ :file | file binary; next: 8 ] will return #[192 229 133 67 128 216 51 114]. > > On Unix, the proper way to check is to read past the end of the file and > then call feof() to confirm that it is in fact at the end. > > Pharo (and I assume, Squeak) has plenty of code that assumes that a file > position >= the file size means that the file is at the end, and as > stated above this generally works for regular files. It seems that the n+1th stream implementation that was incorporated into Pharo is not compatible with the rest of the stream libraries (the original Stream hierarchy - as shown above, Nile, Xtreams, and whatever else is out there I'm not aware of). If it were, then this issue would have arisen long before. So, if I were you, I would first check if the new implementation in Pharo is correct. Levente > > This patch modifies FilePlugin>>primitiveFileAtEnd to: > > a) Keep the current behaviour of using the file position test for > regular files. > b) Keep the current behaviour of using feof() for stdio streams. > c) Use feof() for non-regular files, e.g. /dev/urandom. > > This allows existing code to continue to function, and allows > non-regular files to be tested correctly. > > After applying the patch, the example code above answers the expected > result, e.g.: > > (FileSystem / 'dev' / 'urandom') binaryReadStream next: 8. > " #[179 136 227 226 28 147 197 125]" > > On Windows, as far as I can tell, all files are regular, and the > position test is used regularly, so no change is requried. > > https://github.com/OpenSmalltalk/opensmalltalk-vm/pull/232 > contains the changes. The CI builds failed, but for different reasons > to do with sista. > > If I don't hear any dissenting opinion in a few days I'll integrate this > in. > > > Cheers, > Alistair |
Hi Levente, Right, my nice detailed description is all about the symptoms and not the actual issue. Sorry about that. You're also correct that a new Pharo stream (ZnBufferedReadStream) had a bug in it, which is in the process of being fixed. The actual issue is the value returned by #atEnd, which is returning true as you suggested: | outputs | outputs := OrderedCollection new. StandardFileStream readOnlyFileNamed: '/dev/urandom' do: [ :file | outputs add: (file binary; next: 8). outputs add: file atEnd. outputs add: (file binary; next: 8). outputs add: file atEnd. ]. outputs " an OrderedCollection( #[80 31 245 49 20 97 163 76] true #[207 210 106 185 26 86 15 49] true)" The proposed modification means that #atEnd returns the correct value for non-regular files like the example above (false), while still maintaining the existing behaviour for regular files and stdio streams. Hopefully that is clearer, Alistair On 26 March 2018 at 18:16, Levente Uzonyi <[hidden email]> wrote: > >> >> FilePlugin currently splits files in to two groups: 1) Stdio streams and >> 2) everything else. >> >> To test for the end of file, FilePlugin>>primitiveFileAtEnd: >> >> 1) Uses feof() for stdio streams. >> 2) Compares the current position to the file size for everything else. >> >> This returns the expected results for regular files, but fails for >> non-regular files, e.g.: >> >> (FileSystem / 'dev' / 'urandom') binaryReadStream next: 8. >> >> returns an empty array. > > > The FileStream hierachy in Squeak has no issue with the current > implementation, because the code will call the primitive to read from the > file even if #atEnd returns true (which conforms to what you described as > expected usage). E.g. > > StandardFileStream readOnlyFileNamed: '/dev/urandom' do: [ :file | > file binary; next: 8 ] > > will return #[192 229 133 67 128 216 51 114]. > >> >> On Unix, the proper way to check is to read past the end of the file and >> then call feof() to confirm that it is in fact at the end. >> >> Pharo (and I assume, Squeak) has plenty of code that assumes that a file >> position >= the file size means that the file is at the end, and as >> stated above this generally works for regular files. > > > It seems that the n+1th stream implementation that was incorporated into > Pharo is not compatible with the rest of the stream libraries (the original > Stream hierarchy - as shown above, Nile, Xtreams, and whatever else is out > there I'm not aware of). If it were, then this issue would have arisen long > before. > So, if I were you, I would first check if the new implementation in Pharo is > correct. > > Levente > > >> >> This patch modifies FilePlugin>>primitiveFileAtEnd to: >> >> a) Keep the current behaviour of using the file position test for >> regular files. >> b) Keep the current behaviour of using feof() for stdio streams. >> c) Use feof() for non-regular files, e.g. /dev/urandom. >> >> This allows existing code to continue to function, and allows >> non-regular files to be tested correctly. >> >> After applying the patch, the example code above answers the expected >> result, e.g.: >> >> (FileSystem / 'dev' / 'urandom') binaryReadStream next: 8. >> " #[179 136 227 226 28 147 197 125]" >> >> On Windows, as far as I can tell, all files are regular, and the >> position test is used regularly, so no change is requried. >> >> https://github.com/OpenSmalltalk/opensmalltalk-vm/pull/232 >> contains the changes. The CI builds failed, but for different reasons >> to do with sista. >> >> If I don't hear any dissenting opinion in a few days I'll integrate this >> in. >> >> >> Cheers, >> Alistair |
In reply to this post by alistairgrant
FYI, OSProcess provides this: OSProcessAccessor>>isAtEndOfFile: anIOHandle "Answer whether the file represented by anIOHandle is at end of file, as determined by a call to feof(). This is different from StandardFileStream>>primAtEnd: which answers true if the file pointer is at the end of the file, but which does not call feof() to determine that an end of file condition has occurred. The difference is significant if aSqFileStruct represents a pipe or a device file, which may not be positionable in the sense of a conventional disk file." And the primitive for Unix is: UnixOSProcessPlugin>>primitiveTestEndOfFileFlag "Take a struct SQFile from the stack, and call feof(3) to determine if the file has reached end of file. The flag is set only by a previous read operation, so end of file is not detected until an actual EOF condition has been detected by a read attempt." Typical usage is a file stream on an OS pipe stream, in which case OSProcess makes a distiction between reading to end of file (current available data in a stream) versus reading all available data until the remote end of the pipe is closed (hence all of the data that is or ever will be available on that stream). In my opinion, the distinction between "stdio streams" and "everything else" is a mis-feature, and it presumably entered the FilePlugin to accommodate Windows. The console streams on Windows are quite fundamentally different from other streams, and have pre-assigned handle numbers to identify them. This is not the case for Unix-based systems, which attempt to treat IO streams as similarly as possible, and which are free to dup() the handles at will. The general idea is that stdin, stdout, and stderr should behave as reasonably as possible regardless of whether they are attached to a positional stream on a disk file, or to a pipe connected to some other process, or to some sort of special file. Or to say it another way, The stdin/stdout/stderr streams might be connected to things that are positionable or non-positionable, and you should not try to guess which behavior is appropriate based on whether that stream happens to be labeled as a "stdio stream". Dave On Mon, Mar 26, 2018 at 05:32:15PM +0200, Alistair Grant wrote: > > FilePlugin currently splits files in to two groups: 1) Stdio streams and > 2) everything else. > > To test for the end of file, FilePlugin>>primitiveFileAtEnd: > > 1) Uses feof() for stdio streams. > 2) Compares the current position to the file size for everything else. > > This returns the expected results for regular files, but fails for > non-regular files, e.g.: > > (FileSystem / 'dev' / 'urandom') binaryReadStream next: 8. > > returns an empty array. > > On Unix, the proper way to check is to read past the end of the file and > then call feof() to confirm that it is in fact at the end. > > Pharo (and I assume, Squeak) has plenty of code that assumes that a file > position >= the file size means that the file is at the end, and as > stated above this generally works for regular files. > > This patch modifies FilePlugin>>primitiveFileAtEnd to: > > a) Keep the current behaviour of using the file position test for > regular files. > b) Keep the current behaviour of using feof() for stdio streams. > c) Use feof() for non-regular files, e.g. /dev/urandom. > > This allows existing code to continue to function, and allows > non-regular files to be tested correctly. > > After applying the patch, the example code above answers the expected > result, e.g.: > > (FileSystem / 'dev' / 'urandom') binaryReadStream next: 8. > " #[179 136 227 226 28 147 197 125]" > > On Windows, as far as I can tell, all files are regular, and the > position test is used regularly, so no change is requried. > > https://github.com/OpenSmalltalk/opensmalltalk-vm/pull/232 > contains the changes. The CI builds failed, but for different reasons > to do with sista. > > If I don't hear any dissenting opinion in a few days I'll integrate this > in. > > > Cheers, > Alistair |
On 26 March 2018 at 23:51, David T. Lewis <[hidden email]> wrote: > > FYI, OSProcess provides this: > > OSProcessAccessor>>isAtEndOfFile: anIOHandle > "Answer whether the file represented by anIOHandle is at end of file, as determined > by a call to feof(). This is different from StandardFileStream>>primAtEnd: which answers > true if the file pointer is at the end of the file, but which does not call feof() to > determine that an end of file condition has occurred. The difference is significant > if aSqFileStruct represents a pipe or a device file, which may not be positionable > in the sense of a conventional disk file." > > And the primitive for Unix is: > > UnixOSProcessPlugin>>primitiveTestEndOfFileFlag > "Take a struct SQFile from the stack, and call feof(3) to determine if the file has > reached end of file. The flag is set only by a previous read operation, so end of > file is not detected until an actual EOF condition has been detected by a read attempt." > > Typical usage is a file stream on an OS pipe stream, in which case OSProcess makes > a distiction between reading to end of file (current available data in a stream) > versus reading all available data until the remote end of the pipe is closed (hence > all of the data that is or ever will be available on that stream). > > In my opinion, the distinction between "stdio streams" and "everything else" is a > mis-feature, +1 > and it presumably entered the FilePlugin to accommodate Windows. The > console streams on Windows are quite fundamentally different from other streams, > and have pre-assigned handle numbers to identify them. This is not the case for > Unix-based systems, which attempt to treat IO streams as similarly as possible, > and which are free to dup() the handles at will. > > The general idea is that stdin, stdout, and stderr should behave as reasonably > as possible regardless of whether they are attached to a positional stream on a > disk file, or to a pipe connected to some other process, or to some sort of special > file. Or to say it another way, The stdin/stdout/stderr streams might be connected > to things that are positionable or non-positionable, and you should not try to guess > which behavior is appropriate based on whether that stream happens to be labeled > as a "stdio stream". Right. Given that the primitive for opening/connecting to a file by file descriptor is now in to the VM, combined with this proposed patch it's possible that on Unix we could get rid of the special stdio stream handling altogether (I haven't gone through all the code to check). On Windows it should be possible to move the special handling out of the VM and in to the image, but again, that is obviously more work. Cheers, Alistair > On Mon, Mar 26, 2018 at 05:32:15PM +0200, Alistair Grant wrote: >> >> FilePlugin currently splits files in to two groups: 1) Stdio streams and >> 2) everything else. >> >> To test for the end of file, FilePlugin>>primitiveFileAtEnd: >> >> 1) Uses feof() for stdio streams. >> 2) Compares the current position to the file size for everything else. >> >> This returns the expected results for regular files, but fails for >> non-regular files, e.g.: >> >> (FileSystem / 'dev' / 'urandom') binaryReadStream next: 8. >> >> returns an empty array. >> >> On Unix, the proper way to check is to read past the end of the file and >> then call feof() to confirm that it is in fact at the end. >> >> Pharo (and I assume, Squeak) has plenty of code that assumes that a file >> position >= the file size means that the file is at the end, and as >> stated above this generally works for regular files. >> >> This patch modifies FilePlugin>>primitiveFileAtEnd to: >> >> a) Keep the current behaviour of using the file position test for >> regular files. >> b) Keep the current behaviour of using feof() for stdio streams. >> c) Use feof() for non-regular files, e.g. /dev/urandom. >> >> This allows existing code to continue to function, and allows >> non-regular files to be tested correctly. >> >> After applying the patch, the example code above answers the expected >> result, e.g.: >> >> (FileSystem / 'dev' / 'urandom') binaryReadStream next: 8. >> " #[179 136 227 226 28 147 197 125]" >> >> On Windows, as far as I can tell, all files are regular, and the >> position test is used regularly, so no change is requried. >> >> https://github.com/OpenSmalltalk/opensmalltalk-vm/pull/232 >> contains the changes. The CI builds failed, but for different reasons >> to do with sista. >> >> If I don't hear any dissenting opinion in a few days I'll integrate this >> in. >> >> >> Cheers, >> Alistair |
Free forum by Nabble | Edit this page |