Caching MultiByteFileStream bug?


Caching MultiByteFileStream bug?

Andreas.Raab
Hi Levente -

Could you check the failure of FileStreamTest>>testNextChunkOutOfBounds?
This looks like it might be a problem with the caching file stream behavior.
If it's not (i.e., if the infinite recursion is the 'expected' behavior),
can we add something that addresses the issue?

Thanks,
   - Andreas


Re: Caching MultiByteFileStream bug?

Levente Uzonyi-2
On Mon, 29 Mar 2010, Andreas Raab wrote:

> Hi Levente -
>
> Could you check the failure of FileStreamTest>>testNextChunkOutOfBounds? This
> looks like it might be a problem with caching file stream behavior. If it's
> not (i.e., the infinite recursion is the 'expected' behavior) can we add
> something that addresses the issue?

Seems like #primAtEnd: doesn't answer true if the position is out of
bounds and there are no more bytes to read:

FileStream forceNewFileNamed: 'testFileStreamAtEnd' do: [ :file |
  file position: 1000.
  self assert: file next isNil.
  self assert: file atEnd ].

Read buffering doesn't affect this behavior:

FileStream forceNewFileNamed: 'testFileStreamAtEnd' do: [ :file |
  file disableReadBuffering.
  file position: 1000.
  self assert: file next isNil.
  self assert: file atEnd ].

Really:

FileStream forceNewFileNamed: 'testFileStreamAtEnd' do: [ :file |
  | fileID buffer1 count |
  file disableReadBuffering.
  file position: 1000.
  fileID := file instVarNamed: #fileID.
  buffer1 := String new: 1.
  count := file primRead: fileID into: buffer1 startingAt: 1 count: 1.
  self assert: count = 0.
  self assert: (file primAtEnd: fileID) ].

We can work around this issue with StandardFileStream >> #upTo: and
friends (by replacing the old code, which uses recursion and is pretty
inefficient, btw), but I think #atEnd should answer true in this case.


Levente



Re: Caching MultiByteFileStream bug?

Levente Uzonyi-2
I uploaded two packages to the Inbox which contain the workaround for the
#atEnd issue.

(Installer repository: 'http://source.squeak.org/inbox')
  install: 'Files-ul.80.mcz';
  install: 'Multilingual-ul.118.mcz'

The tests are green, but there may be untested edge cases (probably not,
but who knows). Please review the changes.


Levente



Re: Caching MultiByteFileStream bug?

Andreas.Raab
On 3/30/2010 7:40 PM, Levente Uzonyi wrote:
> I uploaded two packages to the Inbox which contain the workaround for
> the #atEnd issue.
>
> (Installer repository: 'http://source.squeak.org/inbox')
> install: 'Files-ul.80.mcz';
> install: 'Multilingual-ul.118.mcz'
>
> The tests are green, but there may be untested edge cases (probably not,
> but who knows). Please review it.

You're the expert. I kinda would like to see this issue fixed, but since it
doesn't appear to be a regression I would be equally okay with living with
the issue in 4.1 and postponing the fix to 4.2, with lots more time to
test the change. It's a question of risk and how certain you feel about
it. Your call.

Cheers,
   - Andreas



Re: Caching MultiByteFileStream bug?

Levente Uzonyi-2
On Tue, 30 Mar 2010, Andreas Raab wrote:

> On 3/30/2010 7:40 PM, Levente Uzonyi wrote:
>> I uploaded two packages to the Inbox which contain the workaround for
>> the #atEnd issue.
>>
>> (Installer repository: 'http://source.squeak.org/inbox')
>> install: 'Files-ul.80.mcz';
>> install: 'Multilingual-ul.118.mcz'
>>
>> The tests are green, but there may be untested edge cases (probably not,
>> but who knows). Please review it.
>
> You're the expert. I kinda would to see this issue fixed but since it doesn't
> appear to be a regression I would be equally okay to live with the issue in
> 4.1 and postpone fixing it to 4.2 with lots more time to test the change.
> It's a question of risk and how certain you feel about it. Your call.

I'm pretty sure they work as expected, but I'll write a few unit tests
before pushing to the Trunk.
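
A sketch of the kind of test meant here (the selector and assertion are
hypothetical, not the tests that were actually pushed):

testUpToAfterEndOfFile
  "A stream positioned past the end of the file should answer an empty
  collection from #upTo: instead of recursing endlessly."
  FileStream forceNewFileNamed: 'testUpToAfterEndOfFile' do: [ :file |
    file position: 1000.
    self assert: (file upTo: $x) isEmpty ]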


Levente



#primAtEnd: bug? (was: Re: [squeak-dev] Caching MultiByteFileStream bug?)

Levente Uzonyi-2
In reply to this post by Levente Uzonyi-2
On Tue, 30 Mar 2010, Levente Uzonyi wrote:

> On Mon, 29 Mar 2010, Andreas Raab wrote:
>
>> Hi Levente -
>>
>> Could you check the failure of FileStreamTest>>testNextChunkOutOfBounds?
>> This looks like it might be a problem with caching file stream behavior. If
>> it's not (i.e., the infinite recursion is the 'expected' behavior) can we
>> add something that addresses the issue?
>
> Seems like #primAtEnd: doesn't answer true if position is out of bounds and
> there are no more bytes to read:

I realized that even if I push the changes for #basicUpTo: and friends,
there may be arbitrary code that assumes that #atEnd will return true if
there's nothing more to read. How should this issue be fixed?

One solution is to change StandardFileStream >> #atEnd to something like
this:

atEnd
  "Answer whether the receiver is at its end."

  collection ifNotNil: [
    position < readLimit ifTrue: [ ^false ] ].
  ^(self primSizeNoError: fileID)
    ifNil: [ true ]
    ifNotNil: [ :size | size <= self position ]

But this means that #primAtEnd: is not used at all, so it's just a
workaround.

The other option is to change the VM code. This means changing == to >=
like this:

For unix and mac (cross):

sqInt sqFileAtEnd(SQFile *f) {
  /* Return true if the file's read/write head is at the end of the file. */

  if (!sqFileValid(f)) return interpreterProxy->success(false);
  return ftell(getFile(f)) >= getSize(f);
}

For windows:

sqInt sqFileAtEnd(SQFile *f) {
   win32FileOffset ofs;
   /* Return true if the file's read/write head is at the end of the file. */
   if (!sqFileValid(f)) FAIL();
   ofs.offset = 0;
   ofs.dwLow = SetFilePointer(FILE_HANDLE(f), 0, &ofs.dwHigh, FILE_CURRENT);
   return ofs.offset >= sqFileSize(f);
}

What do you think?


Levente



Re: #primAtEnd: bug?

Andreas.Raab
On 4/1/2010 7:59 AM, Levente Uzonyi wrote:
> I realized that even if I push the changes for #basicUpTo: and friends
> there may be arbitrary code that assumes that #atEnd will return true if
> there's nothing more to read. How should this issue be fixed?

In the primitives (I just committed the changes to SVN). Plus we should
add a test for this specifically. Thanks for the help!

Cheers,
   - Andreas
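
A sketch of what such a test could look like (hypothetical selector; it
simply wraps the snippet from earlier in the thread in a FileStreamTest
method):

testAtEndAfterPositioningPastEnd
  "Positioning past the end of an empty file leaves the stream at its end:
  #next answers nil and #atEnd answers true."
  FileStream forceNewFileNamed: 'testAtEndAfterPositioningPastEnd' do: [ :file |
    file position: 1000.
    self assert: file next isNil.
    self assert: file atEnd ]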
