Changed #atEnd primitive - #atEnd vs #next returning nil

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
60 messages Options
123
Reply | Threaded
Open this post in threaded view
|

Changed #atEnd primitive - #atEnd vs #next returning nil

Sven Van Caekenberghe-2
Somehow, somewhere there was a change to the implementation of the primitive called by some streams' #atEnd.

IIRC, someone said it is implemented as 'remaining size being zero' and some virtual unix files like /dev/random are zero sized.

Now, all kinds of changes are being done image size to work around this.

I am a strong believer in simple, real (i.e. infinite) streams, but I am not sure we are doing the right thing here.

Point is, I am not sure #next returning nil is official and universal.

Consider the comments:

Stream>>#next
  "Answer the next object accessible by the receiver."

ReadStream>>#next
  "Primitive. Answer the next object in the Stream represented by the
  receiver. Fail if the collection of this stream is not an Array or a String.
  Fail if the stream is positioned at its end, or if the position is out of
  bounds in the collection. Optional. See Object documentation
  whatIsAPrimitive."

Note how there is no talk about returning nil !

I think we should discuss about this first.

Was the low level change really correct and the right thing to do ?

Note also that a Guille introduced something new, #closed which is related to the difference between having no more elements (maybe right now, like an open network stream) and never ever being able to produce more data.

Sven



Reply | Threaded
Open this post in threaded view
|

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Nicolas Cellier

2018-04-04 11:32 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
Somehow, somewhere there was a change to the implementation of the primitive called by some streams' #atEnd.

IIRC, someone said it is implemented as 'remaining size being zero' and some virtual unix files like /dev/random are zero sized.

Now, all kinds of changes are being done image size to work around this.

I am a strong believer in simple, real (i.e. infinite) streams, but I am not sure we are doing the right thing here.

Point is, I am not sure #next returning nil is official and universal.

Consider the comments:

Stream>>#next
  "Answer the next object accessible by the receiver."

ReadStream>>#next
  "Primitive. Answer the next object in the Stream represented by the
  receiver. Fail if the collection of this stream is not an Array or a String.
  Fail if the stream is positioned at its end, or if the position is out of
  bounds in the collection. Optional. See Object documentation
  whatIsAPrimitive."

Note how there is no talk about returning nil !

I think we should discuss about this first.

Was the low level change really correct and the right thing to do ?

Note also that a Guille introduced something new, #closed which is related to the difference between having no more elements (maybe right now, like an open network stream) and never ever being able to produce more data.

Sven




Reply | Threaded
Open this post in threaded view
|

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Sven Van Caekenberghe-2


> On 4 Apr 2018, at 11:38, Nicolas Cellier <[hidden email]> wrote:
>
> Hi Sven,
> See also discussion at https://github.com/OpenSmalltalk/opensmalltalk-vm/pull/232

Thanks Nicolas, I had already seen parts of it.

Now, I still want image level changes to be based on clear semantic definitions of the abstract stream API, we're not just talking about file or unix streams, the general Smalltalk concept.

For reading, they are IMHO,

#next
#readInto:startingAt:count:
#peek
#atEnd
#upToEnd (can be derived but still the semantics are important in relation to #atEnd)

For writing, we have

#nextPut:
#next:putAll:startingAt:
#flush

For both, we have

#atEnd
#close
#closed (new)

So, I know, #next returning nil exists, but is it universally/officially defined as such ? Where is that documented ?

Positioning, sizing are not universal, IMHO, but should be clearly defined as well.

> 2018-04-04 11:32 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
> Somehow, somewhere there was a change to the implementation of the primitive called by some streams' #atEnd.
>
> IIRC, someone said it is implemented as 'remaining size being zero' and some virtual unix files like /dev/random are zero sized.
>
> Now, all kinds of changes are being done image size to work around this.
>
> I am a strong believer in simple, real (i.e. infinite) streams, but I am not sure we are doing the right thing here.
>
> Point is, I am not sure #next returning nil is official and universal.
>
> Consider the comments:
>
> Stream>>#next
>   "Answer the next object accessible by the receiver."
>
> ReadStream>>#next
>   "Primitive. Answer the next object in the Stream represented by the
>   receiver. Fail if the collection of this stream is not an Array or a String.
>   Fail if the stream is positioned at its end, or if the position is out of
>   bounds in the collection. Optional. See Object documentation
>   whatIsAPrimitive."
>
> Note how there is no talk about returning nil !
>
> I think we should discuss about this first.
>
> Was the low level change really correct and the right thing to do ?
>
> Note also that a Guille introduced something new, #closed which is related to the difference between having no more elements (maybe right now, like an open network stream) and never ever being able to produce more data.
>
> Sven
>
>
>
>


Reply | Threaded
Open this post in threaded view
|

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Denis Kudriashov
Hi.

#next returning nil is definitely looks bad because we will always have case when nil value will not means the end:

#(1 nil 2) readSteam next; next; atEnd

But I understand that in many cases analysing #next for nil is much more suitable than #atEnd. I am sure that it can be always avoided but it can be not easy to do.
Alternatively we can reify real EndOfStream object which will be used as result of reading operation when stream is at end.
It can provide nice API like #ifEndOfStream:. which will reflect intention in much cleaner way than ifNil: checks. 


2018-04-04 12:00 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:


> On 4 Apr 2018, at 11:38, Nicolas Cellier <[hidden email]> wrote:
>
> Hi Sven,
> See also discussion at https://github.com/OpenSmalltalk/opensmalltalk-vm/pull/232

Thanks Nicolas, I had already seen parts of it.

Now, I still want image level changes to be based on clear semantic definitions of the abstract stream API, we're not just talking about file or unix streams, the general Smalltalk concept.

For reading, they are IMHO,

#next
#readInto:startingAt:count:
#peek
#atEnd
#upToEnd (can be derived but still the semantics are important in relation to #atEnd)

For writing, we have

#nextPut:
#next:putAll:startingAt:
#flush

For both, we have

#atEnd
#close
#closed (new)

So, I know, #next returning nil exists, but is it universally/officially defined as such ? Where is that documented ?

Positioning, sizing are not universal, IMHO, but should be clearly defined as well.

> 2018-04-04 11:32 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
> Somehow, somewhere there was a change to the implementation of the primitive called by some streams' #atEnd.
>
> IIRC, someone said it is implemented as 'remaining size being zero' and some virtual unix files like /dev/random are zero sized.
>
> Now, all kinds of changes are being done image size to work around this.
>
> I am a strong believer in simple, real (i.e. infinite) streams, but I am not sure we are doing the right thing here.
>
> Point is, I am not sure #next returning nil is official and universal.
>
> Consider the comments:
>
> Stream>>#next
>   "Answer the next object accessible by the receiver."
>
> ReadStream>>#next
>   "Primitive. Answer the next object in the Stream represented by the
>   receiver. Fail if the collection of this stream is not an Array or a String.
>   Fail if the stream is positioned at its end, or if the position is out of
>   bounds in the collection. Optional. See Object documentation
>   whatIsAPrimitive."
>
> Note how there is no talk about returning nil !
>
> I think we should discuss about this first.
>
> Was the low level change really correct and the right thing to do ?
>
> Note also that a Guille introduced something new, #closed which is related to the difference between having no more elements (maybe right now, like an open network stream) and never ever being able to produce more data.
>
> Sven
>
>
>
>



Reply | Threaded
Open this post in threaded view
|

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

alistairgrant
In reply to this post by Sven Van Caekenberghe-2
Hi Sven,

On Wed, Apr 04, 2018 at 11:32:02AM +0200, Sven Van Caekenberghe wrote:
> Somehow, somewhere there was a change to the implementation of the
> primitive called by some streams' #atEnd.

That's a proposed change by me, but it hasn't been integrated yet.  So
the discussion below should apply to the current stable vm (from August
last year).


> IIRC, someone said it is implemented as 'remaining size being zero'
> and some virtual unix files like /dev/random are zero sized.

Currently, for files other than sdio (stdout, stderr, stdin) it is
effectively defined as:

atEnd := stream position >= stream size


And, as you say, plenty of virtual unix files report size 0.



> Now, all kinds of changes are being done image size to work around this.

I would phrase this slightly differently :-)

Some code does the right thing, while other code doesn't.  E.g.:

MultiByteFileStream>>upToEnd is good, while
FileStream>>contents is incorrect


> I am a strong believer in simple, real (i.e. infinite) streams, but I
> am not sure we are doing the right thing here.
>
> Point is, I am not sure #next returning nil is official and universal.
>
> Consider the comments:
>
> Stream>>#next
>   "Answer the next object accessible by the receiver."
>
> ReadStream>>#next
>   "Primitive. Answer the next object in the Stream represented by the
>   receiver. Fail if the collection of this stream is not an Array or a String.
>   Fail if the stream is positioned at its end, or if the position is out of
>   bounds in the collection. Optional. See Object documentation
>   whatIsAPrimitive."
>
> Note how there is no talk about returning nil !
>
> I think we should discuss about this first.
>
> Was the low level change really correct and the right thing to do ?

The primitive change proposed doesn't affect this discussion.  It will
mean that #atEnd returns false (correctly) sometimes, while currently it
returns true (incorrectly).  The end result is still incorrect, e.g.
#contents returns an empty string for /proc/cpuinfo.

You're correct about no mention of nil, but we have:

FileStream>>next

        (position >= readLimit and: [self atEnd])
                ifTrue: [^nil]
                ifFalse: [^collection at: (position := position + 1)]


which has been around for a long time (I suspect, before Pharo existed).

Having said that, I think that raising an exception is a better
solution, but it is a much, much bigger change than the one I proposed
in https://github.com/pharo-project/pharo/pull/1180.


Cheers,
Alistair



> Note also that a Guille introduced something new, #closed which is related to the difference between having no more elements (maybe right now, like an open network stream) and never ever being able to produce more data.
>
> Sven
>
>
>

Reply | Threaded
Open this post in threaded view
|

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Nicolas Cellier


2018-04-04 12:18 GMT+02:00 Alistair Grant <[hidden email]>:
Hi Sven,

On Wed, Apr 04, 2018 at 11:32:02AM +0200, Sven Van Caekenberghe wrote:
> Somehow, somewhere there was a change to the implementation of the
> primitive called by some streams' #atEnd.

That's a proposed change by me, but it hasn't been integrated yet.  So
the discussion below should apply to the current stable vm (from August
last year).


> IIRC, someone said it is implemented as 'remaining size being zero'
> and some virtual unix files like /dev/random are zero sized.

Currently, for files other than sdio (stdout, stderr, stdin) it is
effectively defined as:

atEnd := stream position >= stream size


And, as you say, plenty of virtual unix files report size 0.



> Now, all kinds of changes are being done image size to work around this.

I would phrase this slightly differently :-)

Some code does the right thing, while other code doesn't.  E.g.:

MultiByteFileStream>>upToEnd is good, while
FileStream>>contents is incorrect


> I am a strong believer in simple, real (i.e. infinite) streams, but I
> am not sure we are doing the right thing here.
>
> Point is, I am not sure #next returning nil is official and universal.
>
> Consider the comments:
>
> Stream>>#next
>   "Answer the next object accessible by the receiver."
>
> ReadStream>>#next
>   "Primitive. Answer the next object in the Stream represented by the
>   receiver. Fail if the collection of this stream is not an Array or a String.
>   Fail if the stream is positioned at its end, or if the position is out of
>   bounds in the collection. Optional. See Object documentation
>   whatIsAPrimitive."
>
> Note how there is no talk about returning nil !
>
> I think we should discuss about this first.
>
> Was the low level change really correct and the right thing to do ?

The primitive change proposed doesn't affect this discussion.  It will
mean that #atEnd returns false (correctly) sometimes, while currently it
returns true (incorrectly).  The end result is still incorrect, e.g.
#contents returns an empty string for /proc/cpuinfo.

You're correct about no mention of nil, but we have:

FileStream>>next

        (position >= readLimit and: [self atEnd])
                ifTrue: [^nil]
                ifFalse: [^collection at: (position := position + 1)]


which has been around for a long time (I suspect, before Pharo existed).

Having said that, I think that raising an exception is a better
solution, but it is a much, much bigger change than the one I proposed
in https://github.com/pharo-project/pharo/pull/1180.


Cheers,
Alistair


Hi,
yes, if you are after universal behavior englobing Unix streams, the Exception might be the best way.
Because on special stream you can't allways say in advance, you have to try.
That's the solution adopted by authors of Xtreams.
But there is a runtime penalty associated to it.

The penalty once was so high that my proposal to generalize EndOfStream usage was rejected a few years ago by AndreaRaab.
http://forum.world.st/EndOfStream-unused-td68806.html

I have regularly benched Xtreams, but stopped a few years ago.
Maybe i can excavate and pass on newer VM.

In the mean time, i had experimented a programmable end of stream behavior (via a block, or any other valuable)
http://www.squeaksource.com/XTream.htm
so as to reconcile performance and universality, but it was a source of complexification at implementation side.

Nicolas



> Note also that a Guille introduced something new, #closed which is related to the difference between having no more elements (maybe right now, like an open network stream) and never ever being able to produce more data.
>
> Sven
>
>
>


Reply | Threaded
Open this post in threaded view
|

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

alistairgrant
Hi Nicolas,

On 4 April 2018 at 12:36, Nicolas Cellier
<[hidden email]> wrote:

>
>
> 2018-04-04 12:18 GMT+02:00 Alistair Grant <[hidden email]>:
>>
>> Hi Sven,
>>
>> On Wed, Apr 04, 2018 at 11:32:02AM +0200, Sven Van Caekenberghe wrote:
>> > Somehow, somewhere there was a change to the implementation of the
>> > primitive called by some streams' #atEnd.
>>
>> That's a proposed change by me, but it hasn't been integrated yet.  So
>> the discussion below should apply to the current stable vm (from August
>> last year).
>>
>>
>> > IIRC, someone said it is implemented as 'remaining size being zero'
>> > and some virtual unix files like /dev/random are zero sized.
>>
>> Currently, for files other than sdio (stdout, stderr, stdin) it is
>> effectively defined as:
>>
>> atEnd := stream position >= stream size
>>
>>
>> And, as you say, plenty of virtual unix files report size 0.
>>
>>
>>
>> > Now, all kinds of changes are being done image size to work around this.
>>
>> I would phrase this slightly differently :-)
>>
>> Some code does the right thing, while other code doesn't.  E.g.:
>>
>> MultiByteFileStream>>upToEnd is good, while
>> FileStream>>contents is incorrect
>>
>>
>> > I am a strong believer in simple, real (i.e. infinite) streams, but I
>> > am not sure we are doing the right thing here.
>> >
>> > Point is, I am not sure #next returning nil is official and universal.
>> >
>> > Consider the comments:
>> >
>> > Stream>>#next
>> >   "Answer the next object accessible by the receiver."
>> >
>> > ReadStream>>#next
>> >   "Primitive. Answer the next object in the Stream represented by the
>> >   receiver. Fail if the collection of this stream is not an Array or a
>> > String.
>> >   Fail if the stream is positioned at its end, or if the position is out
>> > of
>> >   bounds in the collection. Optional. See Object documentation
>> >   whatIsAPrimitive."
>> >
>> > Note how there is no talk about returning nil !
>> >
>> > I think we should discuss about this first.
>> >
>> > Was the low level change really correct and the right thing to do ?
>>
>> The primitive change proposed doesn't affect this discussion.  It will
>> mean that #atEnd returns false (correctly) sometimes, while currently it
>> returns true (incorrectly).  The end result is still incorrect, e.g.
>> #contents returns an empty string for /proc/cpuinfo.
>>
>> You're correct about no mention of nil, but we have:
>>
>> FileStream>>next
>>
>>         (position >= readLimit and: [self atEnd])
>>                 ifTrue: [^nil]
>>                 ifFalse: [^collection at: (position := position + 1)]
>>
>>
>> which has been around for a long time (I suspect, before Pharo existed).
>>
>> Having said that, I think that raising an exception is a better
>> solution, but it is a much, much bigger change than the one I proposed
>> in https://github.com/pharo-project/pharo/pull/1180.
>>
>>
>> Cheers,
>> Alistair
>>
>
> Hi,
> yes, if you are after universal behavior englobing Unix streams, the
> Exception might be the best way.
> Because on special stream you can't allways say in advance, you have to try.
> That's the solution adopted by authors of Xtreams.
> But there is a runtime penalty associated to it.
>
> The penalty once was so high that my proposal to generalize EndOfStream
> usage was rejected a few years ago by AndreaRaab.
> http://forum.world.st/EndOfStream-unused-td68806.html

Thanks for this, I'll definitely take a look.

Do you have a sense of how Denis' suggestion of using an EndOfStream
object would compare?

It would keep the same coding style, but avoid the problems with nil.

Thanks,
Alistair



> I have regularly benched Xtreams, but stopped a few years ago.
> Maybe i can excavate and pass on newer VM.
>
> In the mean time, i had experimented a programmable end of stream behavior
> (via a block, or any other valuable)
> http://www.squeaksource.com/XTream.htm
> so as to reconcile performance and universality, but it was a source of
> complexification at implementation side.
>
> Nicolas
>
>>
>>
>> > Note also that a Guille introduced something new, #closed which is
>> > related to the difference between having no more elements (maybe right now,
>> > like an open network stream) and never ever being able to produce more data.
>> >
>> > Sven
>> >
>> >
>> >
>>
>

Reply | Threaded
Open this post in threaded view
|

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Sven Van Caekenberghe-2
Playing a bit devil's advocate, the idea is that, in general,

[ stream atEnd] whileFalse: [ stream next. "..." ].

is no longer allowed ? And you want to replace it with

[ stream next ifNil: [ false ] ifNotNil: [ :x | "..." true ] whileTrue.

That is a pretty big change, no ?

I think/feel like a proper EOF exception would be better, more correct.

[ [ stream next. "..." true ] on: EOF do: [ false ] ] whileTrue.

Will we throw away #atEnd then ? Do we need it if we cannot use it ?

> On 4 Apr 2018, at 12:41, Alistair Grant <[hidden email]> wrote:
>
> Hi Nicolas,
>
> On 4 April 2018 at 12:36, Nicolas Cellier
> <[hidden email]> wrote:
>>
>>
>> 2018-04-04 12:18 GMT+02:00 Alistair Grant <[hidden email]>:
>>>
>>> Hi Sven,
>>>
>>> On Wed, Apr 04, 2018 at 11:32:02AM +0200, Sven Van Caekenberghe wrote:
>>>> Somehow, somewhere there was a change to the implementation of the
>>>> primitive called by some streams' #atEnd.
>>>
>>> That's a proposed change by me, but it hasn't been integrated yet.  So
>>> the discussion below should apply to the current stable vm (from August
>>> last year).
>>>
>>>
>>>> IIRC, someone said it is implemented as 'remaining size being zero'
>>>> and some virtual unix files like /dev/random are zero sized.
>>>
>>> Currently, for files other than sdio (stdout, stderr, stdin) it is
>>> effectively defined as:
>>>
>>> atEnd := stream position >= stream size
>>>
>>>
>>> And, as you say, plenty of virtual unix files report size 0.
>>>
>>>
>>>
>>>> Now, all kinds of changes are being done image size to work around this.
>>>
>>> I would phrase this slightly differently :-)
>>>
>>> Some code does the right thing, while other code doesn't.  E.g.:
>>>
>>> MultiByteFileStream>>upToEnd is good, while
>>> FileStream>>contents is incorrect
>>>
>>>
>>>> I am a strong believer in simple, real (i.e. infinite) streams, but I
>>>> am not sure we are doing the right thing here.
>>>>
>>>> Point is, I am not sure #next returning nil is official and universal.
>>>>
>>>> Consider the comments:
>>>>
>>>> Stream>>#next
>>>>  "Answer the next object accessible by the receiver."
>>>>
>>>> ReadStream>>#next
>>>>  "Primitive. Answer the next object in the Stream represented by the
>>>>  receiver. Fail if the collection of this stream is not an Array or a
>>>> String.
>>>>  Fail if the stream is positioned at its end, or if the position is out
>>>> of
>>>>  bounds in the collection. Optional. See Object documentation
>>>>  whatIsAPrimitive."
>>>>
>>>> Note how there is no talk about returning nil !
>>>>
>>>> I think we should discuss about this first.
>>>>
>>>> Was the low level change really correct and the right thing to do ?
>>>
>>> The primitive change proposed doesn't affect this discussion.  It will
>>> mean that #atEnd returns false (correctly) sometimes, while currently it
>>> returns true (incorrectly).  The end result is still incorrect, e.g.
>>> #contents returns an empty string for /proc/cpuinfo.
>>>
>>> You're correct about no mention of nil, but we have:
>>>
>>> FileStream>>next
>>>
>>>        (position >= readLimit and: [self atEnd])
>>>                ifTrue: [^nil]
>>>                ifFalse: [^collection at: (position := position + 1)]
>>>
>>>
>>> which has been around for a long time (I suspect, before Pharo existed).
>>>
>>> Having said that, I think that raising an exception is a better
>>> solution, but it is a much, much bigger change than the one I proposed
>>> in https://github.com/pharo-project/pharo/pull/1180.
>>>
>>>
>>> Cheers,
>>> Alistair
>>>
>>
>> Hi,
>> yes, if you are after universal behavior englobing Unix streams, the
>> Exception might be the best way.
>> Because on special stream you can't allways say in advance, you have to try.
>> That's the solution adopted by authors of Xtreams.
>> But there is a runtime penalty associated to it.
>>
>> The penalty once was so high that my proposal to generalize EndOfStream
>> usage was rejected a few years ago by AndreaRaab.
>> http://forum.world.st/EndOfStream-unused-td68806.html
>
> Thanks for this, I'll definitely take a look.
>
> Do you have a sense of how Denis' suggestion of using an EndOfStream
> object would compare?
>
> It would keep the same coding style, but avoid the problems with nil.
>
> Thanks,
> Alistair
>
>
>
>> I have regularly benched Xtreams, but stopped a few years ago.
>> Maybe i can excavate and pass on newer VM.
>>
>> In the mean time, i had experimented a programmable end of stream behavior
>> (via a block, or any other valuable)
>> http://www.squeaksource.com/XTream.htm
>> so as to reconcile performance and universality, but it was a source of
>> complexification at implementation side.
>>
>> Nicolas
>>
>>>
>>>
>>>> Note also that a Guille introduced something new, #closed which is
>>>> related to the difference between having no more elements (maybe right now,
>>>> like an open network stream) and never ever being able to produce more data.
>>>>
>>>> Sven


Reply | Threaded
Open this post in threaded view
|

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

alistairgrant
On 4 April 2018 at 12:56, Sven Van Caekenberghe <[hidden email]> wrote:
> Playing a bit devil's advocate, the idea is that, in general,
>
> [ stream atEnd] whileFalse: [ stream next. "..." ].
>
> is no longer allowed ?

It hasn't been allowed "forever" [1].  It's just been misused for
almost as long.

[1] Time began when stdio stream support was introduced. :-)


> And you want to replace it with
>
> [ stream next ifNil: [ false ] ifNotNil: [ :x | "..." true ] whileTrue.
>
> That is a pretty big change, no ?

That's the way quite a bit of code already operates.

As Denis pointed out, it's obviously problematic in the general sense,
since nil can be embedded in non-byte oriented streams.  I suspect
that in practice not many people write code that reads streams from
both byte oriented and non-byte oriented streams.


> I think/feel like a proper EOF exception would be better, more correct.
>
> [ [ stream next. "..." true ] on: EOF do: [ false ] ] whileTrue.

I agree, but the email thread Nicolas pointed to raises some
performance questions about this approach.  It should be
straightforward to do a basic performance comparison which I'll get
around to if other objections aren't raised.


> Will we throw away #atEnd then ? Do we need it if we cannot use it ?

Unix file i/o returns EOF if the end of file has been reach OR if an
error occurs.  You should still check #atEnd after reading past the
end of the file to make sure no error occurred.  Another part of the
primitive change I'm proposing is to return additional information
about what went wrong in the event of an error.

We could modify the read primitive so that it fails if an error has
occurred, and then #atEnd wouldn't be required.

Cheers,
Alistair



>> On 4 Apr 2018, at 12:41, Alistair Grant <[hidden email]> wrote:
>>
>> Hi Nicolas,
>>
>> On 4 April 2018 at 12:36, Nicolas Cellier
>> <[hidden email]> wrote:
>>>
>>>
>>> 2018-04-04 12:18 GMT+02:00 Alistair Grant <[hidden email]>:
>>>>
>>>> Hi Sven,
>>>>
>>>> On Wed, Apr 04, 2018 at 11:32:02AM +0200, Sven Van Caekenberghe wrote:
>>>>> Somehow, somewhere there was a change to the implementation of the
>>>>> primitive called by some streams' #atEnd.
>>>>
>>>> That's a proposed change by me, but it hasn't been integrated yet.  So
>>>> the discussion below should apply to the current stable vm (from August
>>>> last year).
>>>>
>>>>
>>>>> IIRC, someone said it is implemented as 'remaining size being zero'
>>>>> and some virtual unix files like /dev/random are zero sized.
>>>>
>>>> Currently, for files other than sdio (stdout, stderr, stdin) it is
>>>> effectively defined as:
>>>>
>>>> atEnd := stream position >= stream size
>>>>
>>>>
>>>> And, as you say, plenty of virtual unix files report size 0.
>>>>
>>>>
>>>>
>>>>> Now, all kinds of changes are being done image size to work around this.
>>>>
>>>> I would phrase this slightly differently :-)
>>>>
>>>> Some code does the right thing, while other code doesn't.  E.g.:
>>>>
>>>> MultiByteFileStream>>upToEnd is good, while
>>>> FileStream>>contents is incorrect
>>>>
>>>>
>>>>> I am a strong believer in simple, real (i.e. infinite) streams, but I
>>>>> am not sure we are doing the right thing here.
>>>>>
>>>>> Point is, I am not sure #next returning nil is official and universal.
>>>>>
>>>>> Consider the comments:
>>>>>
>>>>> Stream>>#next
>>>>>  "Answer the next object accessible by the receiver."
>>>>>
>>>>> ReadStream>>#next
>>>>>  "Primitive. Answer the next object in the Stream represented by the
>>>>>  receiver. Fail if the collection of this stream is not an Array or a
>>>>> String.
>>>>>  Fail if the stream is positioned at its end, or if the position is out
>>>>> of
>>>>>  bounds in the collection. Optional. See Object documentation
>>>>>  whatIsAPrimitive."
>>>>>
>>>>> Note how there is no talk about returning nil !
>>>>>
>>>>> I think we should discuss about this first.
>>>>>
>>>>> Was the low level change really correct and the right thing to do ?
>>>>
>>>> The primitive change proposed doesn't affect this discussion.  It will
>>>> mean that #atEnd returns false (correctly) sometimes, while currently it
>>>> returns true (incorrectly).  The end result is still incorrect, e.g.
>>>> #contents returns an empty string for /proc/cpuinfo.
>>>>
>>>> You're correct about no mention of nil, but we have:
>>>>
>>>> FileStream>>next
>>>>
>>>>        (position >= readLimit and: [self atEnd])
>>>>                ifTrue: [^nil]
>>>>                ifFalse: [^collection at: (position := position + 1)]
>>>>
>>>>
>>>> which has been around for a long time (I suspect, before Pharo existed).
>>>>
>>>> Having said that, I think that raising an exception is a better
>>>> solution, but it is a much, much bigger change than the one I proposed
>>>> in https://github.com/pharo-project/pharo/pull/1180.
>>>>
>>>>
>>>> Cheers,
>>>> Alistair
>>>>
>>>
>>> Hi,
>>> yes, if you are after universal behavior englobing Unix streams, the
>>> Exception might be the best way.
>>> Because on special stream you can't allways say in advance, you have to try.
>>> That's the solution adopted by authors of Xtreams.
>>> But there is a runtime penalty associated to it.
>>>
>>> The penalty once was so high that my proposal to generalize EndOfStream
>>> usage was rejected a few years ago by AndreaRaab.
>>> http://forum.world.st/EndOfStream-unused-td68806.html
>>
>> Thanks for this, I'll definitely take a look.
>>
>> Do you have a sense of how Denis' suggestion of using an EndOfStream
>> object would compare?
>>
>> It would keep the same coding style, but avoid the problems with nil.
>>
>> Thanks,
>> Alistair
>>
>>
>>
>>> I have regularly benched Xtreams, but stopped a few years ago.
>>> Maybe i can excavate and pass on newer VM.
>>>
>>> In the mean time, i had experimented a programmable end of stream behavior
>>> (via a block, or any other valuable)
>>> http://www.squeaksource.com/XTream.htm
>>> so as to reconcile performance and universality, but it was a source of
>>> complexification at implementation side.
>>>
>>> Nicolas
>>>
>>>>
>>>>
>>>>> Note also that a Guille introduced something new, #closed which is
>>>>> related to the difference between having no more elements (maybe right now,
>>>>> like an open network stream) and never ever being able to produce more data.
>>>>>
>>>>> Sven
>
>

Reply | Threaded
Open this post in threaded view
|

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Sven Van Caekenberghe-2
Alistair,

First off, thanks for the discussions and your contributions, I really appreciate them.

But I want to have a discussion at the high level of the definition and semantics of the stream API in Pharo.

> On 4 Apr 2018, at 13:20, Alistair Grant <[hidden email]> wrote:
>
> On 4 April 2018 at 12:56, Sven Van Caekenberghe <[hidden email]> wrote:
>> Playing a bit devil's advocate, the idea is that, in general,
>>
>> [ stream atEnd] whileFalse: [ stream next. "..." ].
>>
>> is no longer allowed ?
>
> It hasn't been allowed "forever" [1].  It's just been misused for
> almost as long.
>
> [1] Time began when stdio stream support was introduced. :-)

I am still not convinced. Another way to put it would be that the old #atEnd or #upToEnd do not make sense for these streams and some new loop is needed, based on a new test (it exists for socket streams already).

[ stream isDataAvailable ] whileTrue: [ stream next ]

>> And you want to replace it with
>>
>> [ stream next ifNil: [ false ] ifNotNil: [ :x | "..." true ] whileTrue.
>>
>> That is a pretty big change, no ?
>
> That's the way quite a bit of code already operates.
>
> As Denis pointed out, it's obviously problematic in the general sense,
> since nil can be embedded in non-byte oriented streams.  I suspect
> that in practice not many people write code that reads streams from
> both byte oriented and non-byte oriented streams.

Maybe yes, maybe no. As Denis' example shows there is a clear definition problem.

And I do use streams of byte arrays or strings all the time, this is really important. I want my parsers to work on all kinds of streams.

>> I think/feel like a proper EOF exception would be better, more correct.
>>
>> [ [ stream next. "..." true ] on: EOF do: [ false ] ] whileTrue.
>
> I agree, but the email thread Nicolas pointed to raises some
> performance questions about this approach.  It should be
> straightforward to do a basic performance comparison which I'll get
> around to if other objections aren't raised.

Reading in bigger blocks, using #readInto:startingAt:count: (which is basically Unix's (2) Read sys call), would solve performance problems, I think.

>> Will we throw away #atEnd then ? Do we need it if we cannot use it ?
>
> Unix file i/o returns EOF if the end of file has been reach OR if an
> error occurs.  You should still check #atEnd after reading past the
> end of the file to make sure no error occurred.  Another part of the
> primitive change I'm proposing is to return additional information
> about what went wrong in the event of an error.

I am sorry, but this kind of semantics (the OR) is way too complex at the general image level, it is too specific and based on certain underlying implementation details.

Sven

> We could modify the read primitive so that it fails if an error has
> occurred, and then #atEnd wouldn't be required.
>
> Cheers,
> Alistair
>
>
>
>>> On 4 Apr 2018, at 12:41, Alistair Grant <[hidden email]> wrote:
>>>
>>> Hi Nicolas,
>>>
>>> On 4 April 2018 at 12:36, Nicolas Cellier
>>> <[hidden email]> wrote:
>>>>
>>>>
>>>> 2018-04-04 12:18 GMT+02:00 Alistair Grant <[hidden email]>:
>>>>>
>>>>> Hi Sven,
>>>>>
>>>>> On Wed, Apr 04, 2018 at 11:32:02AM +0200, Sven Van Caekenberghe wrote:
>>>>>> Somehow, somewhere there was a change to the implementation of the
>>>>>> primitive called by some streams' #atEnd.
>>>>>
>>>>> That's a proposed change by me, but it hasn't been integrated yet.  So
>>>>> the discussion below should apply to the current stable vm (from August
>>>>> last year).
>>>>>
>>>>>
>>>>>> IIRC, someone said it is implemented as 'remaining size being zero'
>>>>>> and some virtual unix files like /dev/random are zero sized.
>>>>>
>>>>> Currently, for files other than sdio (stdout, stderr, stdin) it is
>>>>> effectively defined as:
>>>>>
>>>>> atEnd := stream position >= stream size
>>>>>
>>>>>
>>>>> And, as you say, plenty of virtual unix files report size 0.
>>>>>
>>>>>
>>>>>
>>>>>> Now, all kinds of changes are being done image size to work around this.
>>>>>
>>>>> I would phrase this slightly differently :-)
>>>>>
>>>>> Some code does the right thing, while other code doesn't.  E.g.:
>>>>>
>>>>> MultiByteFileStream>>upToEnd is good, while
>>>>> FileStream>>contents is incorrect
>>>>>
>>>>>
>>>>>> I am a strong believer in simple, real (i.e. infinite) streams, but I
>>>>>> am not sure we are doing the right thing here.
>>>>>>
>>>>>> Point is, I am not sure #next returning nil is official and universal.
>>>>>>
>>>>>> Consider the comments:
>>>>>>
>>>>>> Stream>>#next
>>>>>> "Answer the next object accessible by the receiver."
>>>>>>
>>>>>> ReadStream>>#next
>>>>>> "Primitive. Answer the next object in the Stream represented by the
>>>>>> receiver. Fail if the collection of this stream is not an Array or a
>>>>>> String.
>>>>>> Fail if the stream is positioned at its end, or if the position is out
>>>>>> of
>>>>>> bounds in the collection. Optional. See Object documentation
>>>>>> whatIsAPrimitive."
>>>>>>
>>>>>> Note how there is no talk about returning nil !
>>>>>>
>>>>>> I think we should discuss about this first.
>>>>>>
>>>>>> Was the low level change really correct and the right thing to do ?
>>>>>
>>>>> The primitive change proposed doesn't affect this discussion.  It will
>>>>> mean that #atEnd returns false (correctly) sometimes, while currently it
>>>>> returns true (incorrectly).  The end result is still incorrect, e.g.
>>>>> #contents returns an empty string for /proc/cpuinfo.
>>>>>
>>>>> You're correct about no mention of nil, but we have:
>>>>>
>>>>> FileStream>>next
>>>>>
>>>>>       (position >= readLimit and: [self atEnd])
>>>>>               ifTrue: [^nil]
>>>>>               ifFalse: [^collection at: (position := position + 1)]
>>>>>
>>>>>
>>>>> which has been around for a long time (I suspect, before Pharo existed).
>>>>>
>>>>> Having said that, I think that raising an exception is a better
>>>>> solution, but it is a much, much bigger change than the one I proposed
>>>>> in https://github.com/pharo-project/pharo/pull/1180.
>>>>>
>>>>>
>>>>> Cheers,
>>>>> Alistair
>>>>>
>>>>
>>>> Hi,
>>>> yes, if you are after universal behavior englobing Unix streams, the
>>>> Exception might be the best way.
>>>> Because on special stream you can't allways say in advance, you have to try.
>>>> That's the solution adopted by authors of Xtreams.
>>>> But there is a runtime penalty associated to it.
>>>>
>>>> The penalty once was so high that my proposal to generalize EndOfStream
>>>> usage was rejected a few years ago by AndreaRaab.
>>>> http://forum.world.st/EndOfStream-unused-td68806.html
>>>
>>> Thanks for this, I'll definitely take a look.
>>>
>>> Do you have a sense of how Denis' suggestion of using an EndOfStream
>>> object would compare?
>>>
>>> It would keep the same coding style, but avoid the problems with nil.
>>>
>>> Thanks,
>>> Alistair
>>>
>>>
>>>
>>>> I have regularly benched Xtreams, but stopped a few years ago.
>>>> Maybe i can excavate and pass on newer VM.
>>>>
>>>> In the mean time, i had experimented a programmable end of stream behavior
>>>> (via a block, or any other valuable)
>>>> http://www.squeaksource.com/XTream.htm
>>>> so as to reconcile performance and universality, but it was a source of
>>>> complexification at implementation side.
>>>>
>>>> Nicolas
>>>>
>>>>>
>>>>>
>>>>>> Note also that a Guille introduced something new, #closed which is
>>>>>> related to the difference between having no more elements (maybe right now,
>>>>>> like an open network stream) and never ever being able to produce more data.
>>>>>>
>>>>>> Sven
>>
>>
>


Reply | Threaded
Open this post in threaded view
|

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

K K Subbu
In reply to this post by Nicolas Cellier
On Wednesday 04 April 2018 04:06 PM, Nicolas Cellier wrote:
>> IIRC, someone said it is implemented as 'remaining size being zero'
>> and some virtual unix files like /dev/random are zero sized.
>
> Currently, for files other than sdio (stdout, stderr, stdin) it is
> effectively defined as:
>
> atEnd := stream position >= stream size
I see a confusion between Stream and its underlying collection. Stream
is an iterator and just does next, nextPut, peek, reset etc. But methods
like size or atEnd depend on its collection and there is no guarantee
that this collection has a known and finite size.

Essentially, a collection's size may be known and finite, unknown but
finite size or infinite. This has nothing do with file descriptor being
std{in,out,err}. If std* is a regular file or special file like
/dev/mem, /dev/null, it's size is known at open time. With console
streams or sysfs files, size is unknown until EOT (^D) or NUL is
received. Lastly, special files like /dev/zero, /dev/random or
/proc/cpuinfo don't have a finite size but report it as zero (!).

[ stream atEnd ] whileFalse: [ stream next. .. ]

will only terminate if its collection size is finite. It won't terminate
for infinite collections.

Regards .. Subbu

Reply | Threaded
Open this post in threaded view
|

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Sven Van Caekenberghe-2


> On 4 Apr 2018, at 17:32, K K Subbu <[hidden email]> wrote:
>
> On Wednesday 04 April 2018 04:06 PM, Nicolas Cellier wrote:
>>> IIRC, someone said it is implemented as 'remaining size being zero'
>>> and some virtual unix files like /dev/random are zero sized.
>> Currently, for files other than sdio (stdout, stderr, stdin) it is
>> effectively defined as:
>> atEnd := stream position >= stream size
> I see a confusion between Stream and its underlying collection. Stream is an iterator and just does next, nextPut, peek, reset etc. But methods like size or atEnd depend on its collection and there is no guarantee that this collection has a known and finite size.
>
> Essentially, a collection's size may be known and finite, unknown but finite size or infinite. This has nothing do with file descriptor being std{in,out,err}. If std* is a regular file or special file like /dev/mem, /dev/null, it's size is known at open time. With console streams or sysfs files, size is unknown until EOT (^D) or NUL is received. Lastly, special files like /dev/zero, /dev/random or /proc/cpuinfo don't have a finite size but report it as zero (!).
>
> [ stream atEnd ] whileFalse: [ stream next. .. ]
>
> will only terminate if its collection size is finite. It won't terminate for infinite collections.
>
> Regards .. Subbu

Good summary, I agree.

Still, what are the semantics of #next - does the caller always have to check for nil ? Do we think this is ugly (as the return value is outside the domain) ? Do we then still need #atEnd ?

Sven



Reply | Threaded
Open this post in threaded view
|

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Eliot Miranda-2


On Wed, Apr 4, 2018 at 8:37 AM, Sven Van Caekenberghe <[hidden email]> wrote:


> On 4 Apr 2018, at 17:32, K K Subbu <[hidden email]> wrote:
>
> On Wednesday 04 April 2018 04:06 PM, Nicolas Cellier wrote:
>>> IIRC, someone said it is implemented as 'remaining size being zero'
>>> and some virtual unix files like /dev/random are zero sized.
>> Currently, for files other than sdio (stdout, stderr, stdin) it is
>> effectively defined as:
>> atEnd := stream position >= stream size
> I see a confusion between Stream and its underlying collection. Stream is an iterator and just does next, nextPut, peek, reset etc. But methods like size or atEnd depend on its collection and there is no guarantee that this collection has a known and finite size.
>
> Essentially, a collection's size may be known and finite, unknown but finite size or infinite. This has nothing do with file descriptor being std{in,out,err}. If std* is a regular file or special file like /dev/mem, /dev/null, it's size is known at open time. With console streams or sysfs files, size is unknown until EOT (^D) or NUL is received. Lastly, special files like /dev/zero, /dev/random or /proc/cpuinfo don't have a finite size but report it as zero (!).
>
> [ stream atEnd ] whileFalse: [ stream next. .. ]
>
> will only terminate if its collection size is finite. It won't terminate for infinite collections.
>
> Regards .. Subbu

Good summary, I agree.

Still, what are the semantics of #next - does the caller always have to check for nil ? Do we think this is ugly (as the return value is outside the domain) ?

The problem is not that the value is outside the domain.  The problem is that it may be within the domain.  Answering an element within the domain when there are no more elements is insane.  Not even C does that.
 
Do we then still need #atEnd ?

For backward compatibility, yes, even if deprecated.  There is so much existing code that uses it, it'll be useful to have.
 
Sven

_,,,^..^,,,_
best, Eliot
Reply | Threaded
Open this post in threaded view
|

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

K K Subbu
In reply to this post by Sven Van Caekenberghe-2
On Wednesday 04 April 2018 09:07 PM, Sven Van Caekenberghe wrote:
> Good summary, I agree.
>
> Still, what are the semantics of #next - does the caller always have
> to check for nil ? Do we think this is ugly (as the return value is
> outside the domain) ? Do we then still need #atEnd ?

Senders of #next should be prepared to handle errors and EoS is just one
of them.

On exhausting all elements in the collection, #next could block waiting
for next object to become available or check for errors. Reading past
End is just one of the error conditions. One way to recover from EoS
error is to return an sentinel. The exact sentinel object could depend
on the collection. Eliot suggested a good one earlier in this thread.

Regards .. Subbu

Reply | Threaded
Open this post in threaded view
|

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Stephane Ducasse-3
In reply to this post by Sven Van Caekenberghe-2
Thanks for this discussion.

On Wed, Apr 4, 2018 at 1:37 PM, Sven Van Caekenberghe <[hidden email]> wrote:

> Alistair,
>
> First off, thanks for the discussions and your contributions, I really appreciate them.
>
> But I want to have a discussion at the high level of the definition and semantics of the stream API in Pharo.
>
>> On 4 Apr 2018, at 13:20, Alistair Grant <[hidden email]> wrote:
>>
>> On 4 April 2018 at 12:56, Sven Van Caekenberghe <[hidden email]> wrote:
>>> Playing a bit devil's advocate, the idea is that, in general,
>>>
>>> [ stream atEnd] whileFalse: [ stream next. "..." ].
>>>
>>> is no longer allowed ?
>>
>> It hasn't been allowed "forever" [1].  It's just been misused for
>> almost as long.
>>
>> [1] Time began when stdio stream support was introduced. :-)
>
> I am still not convinced. Another way to put it would be that the old #atEnd or #upToEnd do not make sense for these streams and some new loop is needed, based on a new test (it exists for socket streams already).
>
> [ stream isDataAvailable ] whileTrue: [ stream next ]
>
>>> And you want to replace it with
>>>
>>> [ stream next ifNil: [ false ] ifNotNil: [ :x | "..." true ] whileTrue.
>>>
>>> That is a pretty big change, no ?
>>
>> That's the way quite a bit of code already operates.
>>
>> As Denis pointed out, it's obviously problematic in the general sense,
>> since nil can be embedded in non-byte oriented streams.  I suspect
>> that in practice not many people write code that reads streams from
>> both byte oriented and non-byte oriented streams.
>
> Maybe yes, maybe no. As Denis' example shows there is a clear definition problem.
>
> And I do use streams of byte arrays or strings all the time, this is really important. I want my parsers to work on all kinds of streams.
>
>>> I think/feel like a proper EOF exception would be better, more correct.
>>>
>>> [ [ stream next. "..." true ] on: EOF do: [ false ] ] whileTrue.
>>
>> I agree, but the email thread Nicolas pointed to raises some
>> performance questions about this approach.  It should be
>> straightforward to do a basic performance comparison which I'll get
>> around to if other objections aren't raised.
>
> Reading in bigger blocks, using #readInto:startingAt:count: (which is basically Unix's (2) Read sys call), would solve performance problems, I think.
>
>>> Will we throw away #atEnd then ? Do we need it if we cannot use it ?
>>
>> Unix file i/o returns EOF if the end of file has been reach OR if an
>> error occurs.  You should still check #atEnd after reading past the
>> end of the file to make sure no error occurred.  Another part of the
>> primitive change I'm proposing is to return additional information
>> about what went wrong in the event of an error.
>
> I am sorry, but this kind of semantics (the OR) is way too complex at the general image level, it is too specific and based on certain underlying implementation details.
>
> Sven
>
>> We could modify the read primitive so that it fails if an error has
>> occurred, and then #atEnd wouldn't be required.
>>
>> Cheers,
>> Alistair
>>
>>
>>
>>>> On 4 Apr 2018, at 12:41, Alistair Grant <[hidden email]> wrote:
>>>>
>>>> Hi Nicolas,
>>>>
>>>> On 4 April 2018 at 12:36, Nicolas Cellier
>>>> <[hidden email]> wrote:
>>>>>
>>>>>
>>>>> 2018-04-04 12:18 GMT+02:00 Alistair Grant <[hidden email]>:
>>>>>>
>>>>>> Hi Sven,
>>>>>>
>>>>>> On Wed, Apr 04, 2018 at 11:32:02AM +0200, Sven Van Caekenberghe wrote:
>>>>>>> Somehow, somewhere there was a change to the implementation of the
>>>>>>> primitive called by some streams' #atEnd.
>>>>>>
>>>>>> That's a proposed change by me, but it hasn't been integrated yet.  So
>>>>>> the discussion below should apply to the current stable vm (from August
>>>>>> last year).
>>>>>>
>>>>>>
>>>>>>> IIRC, someone said it is implemented as 'remaining size being zero'
>>>>>>> and some virtual unix files like /dev/random are zero sized.
>>>>>>
>>>>>> Currently, for files other than sdio (stdout, stderr, stdin) it is
>>>>>> effectively defined as:
>>>>>>
>>>>>> atEnd := stream position >= stream size
>>>>>>
>>>>>>
>>>>>> And, as you say, plenty of virtual unix files report size 0.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Now, all kinds of changes are being done image size to work around this.
>>>>>>
>>>>>> I would phrase this slightly differently :-)
>>>>>>
>>>>>> Some code does the right thing, while other code doesn't.  E.g.:
>>>>>>
>>>>>> MultiByteFileStream>>upToEnd is good, while
>>>>>> FileStream>>contents is incorrect
>>>>>>
>>>>>>
>>>>>>> I am a strong believer in simple, real (i.e. infinite) streams, but I
>>>>>>> am not sure we are doing the right thing here.
>>>>>>>
>>>>>>> Point is, I am not sure #next returning nil is official and universal.
>>>>>>>
>>>>>>> Consider the comments:
>>>>>>>
>>>>>>> Stream>>#next
>>>>>>> "Answer the next object accessible by the receiver."
>>>>>>>
>>>>>>> ReadStream>>#next
>>>>>>> "Primitive. Answer the next object in the Stream represented by the
>>>>>>> receiver. Fail if the collection of this stream is not an Array or a
>>>>>>> String.
>>>>>>> Fail if the stream is positioned at its end, or if the position is out
>>>>>>> of
>>>>>>> bounds in the collection. Optional. See Object documentation
>>>>>>> whatIsAPrimitive."
>>>>>>>
>>>>>>> Note how there is no talk about returning nil !
>>>>>>>
>>>>>>> I think we should discuss about this first.
>>>>>>>
>>>>>>> Was the low level change really correct and the right thing to do ?
>>>>>>
>>>>>> The primitive change proposed doesn't affect this discussion.  It will
>>>>>> mean that #atEnd returns false (correctly) sometimes, while currently it
>>>>>> returns true (incorrectly).  The end result is still incorrect, e.g.
>>>>>> #contents returns an empty string for /proc/cpuinfo.
>>>>>>
>>>>>> You're correct about no mention of nil, but we have:
>>>>>>
>>>>>> FileStream>>next
>>>>>>
>>>>>>       (position >= readLimit and: [self atEnd])
>>>>>>               ifTrue: [^nil]
>>>>>>               ifFalse: [^collection at: (position := position + 1)]
>>>>>>
>>>>>>
>>>>>> which has been around for a long time (I suspect, before Pharo existed).
>>>>>>
>>>>>> Having said that, I think that raising an exception is a better
>>>>>> solution, but it is a much, much bigger change than the one I proposed
>>>>>> in https://github.com/pharo-project/pharo/pull/1180.
>>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>> Alistair
>>>>>>
>>>>>
>>>>> Hi,
>>>>> yes, if you are after universal behavior englobing Unix streams, the
>>>>> Exception might be the best way.
>>>>> Because on special stream you can't allways say in advance, you have to try.
>>>>> That's the solution adopted by authors of Xtreams.
>>>>> But there is a runtime penalty associated to it.
>>>>>
>>>>> The penalty once was so high that my proposal to generalize EndOfStream
>>>>> usage was rejected a few years ago by AndreaRaab.
>>>>> http://forum.world.st/EndOfStream-unused-td68806.html
>>>>
>>>> Thanks for this, I'll definitely take a look.
>>>>
>>>> Do you have a sense of how Denis' suggestion of using an EndOfStream
>>>> object would compare?
>>>>
>>>> It would keep the same coding style, but avoid the problems with nil.
>>>>
>>>> Thanks,
>>>> Alistair
>>>>
>>>>
>>>>
>>>>> I have regularly benched Xtreams, but stopped a few years ago.
>>>>> Maybe i can excavate and pass on newer VM.
>>>>>
>>>>> In the mean time, i had experimented a programmable end of stream behavior
>>>>> (via a block, or any other valuable)
>>>>> http://www.squeaksource.com/XTream.htm
>>>>> so as to reconcile performance and universality, but it was a source of
>>>>> complexification at implementation side.
>>>>>
>>>>> Nicolas
>>>>>
>>>>>>
>>>>>>
>>>>>>> Note also that a Guille introduced something new, #closed which is
>>>>>>> related to the difference between having no more elements (maybe right now,
>>>>>>> like an open network stream) and never ever being able to produce more data.
>>>>>>>
>>>>>>> Sven
>>>
>>>
>>
>
>

Reply | Threaded
Open this post in threaded view
|

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

alistairgrant
First a quick update:

After doing some work on primitiveFileAtEnd, #atEnd now answers
correctly for files that don't report their size correctly, e.g.
/dev/urandom and /proc/cpuinfo, whether the files are opened directly or
redirected through stdin.

However determining whether stdin from a terminal has reached the end of
file can't be done without making #atEnd blocking since we have to wait
for the user to flag the end of file, e.g. by typing Ctrl-D.  And #atEnd
is assumed to be non-blocking.

So currently using ZnCharacterReadStream with stdin from a terminal will
result in a stack dump similar to:

MessageNotUnderstood: receiver of "<" is nil
UndefinedObject(Object)>>doesNotUnderstand: #<
ZnUTF8Encoder>>nextCodePointFromStream:
ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
ZnCharacterReadStream>>nextElement
ZnCharacterReadStream(ZnEncodedReadStream)>>next
UndefinedObject>>DoIt


Going back through the various suggestions that have been made regarding
using a sentinel object vs. raising a notification / exception, my
(still to be polished) suggestion is to:

1. Add an endOfStream instance variable
2. When the end of the stream is reached answer the value of the
   instance variable (i.e. the result of sending #value to the variable).
3. The initial default value would be a block that raises a Deprecation
   warning and then returns nil.  This would allow existing code to
   function for a changeover period.
4. At the end of the deprecation period the default value would be
   changed to a unique sentinel object which would answer itself as its
   #value.

At any time users of the stream can set their own sentinel, including a
block that raises an exception.


Cheers,
Alistair


On 4 April 2018 at 19:24, Stephane Ducasse <[hidden email]> wrote:

> Thanks for this discussion.
>
> On Wed, Apr 4, 2018 at 1:37 PM, Sven Van Caekenberghe <[hidden email]> wrote:
>> Alistair,
>>
>> First off, thanks for the discussions and your contributions, I really appreciate them.
>>
>> But I want to have a discussion at the high level of the definition and semantics of the stream API in Pharo.
>>
>>> On 4 Apr 2018, at 13:20, Alistair Grant <[hidden email]> wrote:
>>>
>>> On 4 April 2018 at 12:56, Sven Van Caekenberghe <[hidden email]> wrote:
>>>> Playing a bit devil's advocate, the idea is that, in general,
>>>>
>>>> [ stream atEnd] whileFalse: [ stream next. "..." ].
>>>>
>>>> is no longer allowed ?
>>>
>>> It hasn't been allowed "forever" [1].  It's just been misused for
>>> almost as long.
>>>
>>> [1] Time began when stdio stream support was introduced. :-)
>>
>> I am still not convinced. Another way to put it would be that the old #atEnd or #upToEnd do not make sense for these streams and some new loop is needed, based on a new test (it exists for socket streams already).
>>
>> [ stream isDataAvailable ] whileTrue: [ stream next ]
>>
>>>> And you want to replace it with
>>>>
>>>> [ stream next ifNil: [ false ] ifNotNil: [ :x | "..." true ] whileTrue.
>>>>
>>>> That is a pretty big change, no ?
>>>
>>> That's the way quite a bit of code already operates.
>>>
>>> As Denis pointed out, it's obviously problematic in the general sense,
>>> since nil can be embedded in non-byte oriented streams.  I suspect
>>> that in practice not many people write code that reads streams from
>>> both byte oriented and non-byte oriented streams.
>>
>> Maybe yes, maybe no. As Denis' example shows there is a clear definition problem.
>>
>> And I do use streams of byte arrays or strings all the time, this is really important. I want my parsers to work on all kinds of streams.
>>
>>>> I think/feel like a proper EOF exception would be better, more correct.
>>>>
>>>> [ [ stream next. "..." true ] on: EOF do: [ false ] ] whileTrue.
>>>
>>> I agree, but the email thread Nicolas pointed to raises some
>>> performance questions about this approach.  It should be
>>> straightforward to do a basic performance comparison which I'll get
>>> around to if other objections aren't raised.
>>
>> Reading in bigger blocks, using #readInto:startingAt:count: (which is basically Unix's (2) Read sys call), would solve performance problems, I think.
>>
>>>> Will we throw away #atEnd then ? Do we need it if we cannot use it ?
>>>
>>> Unix file i/o returns EOF if the end of file has been reach OR if an
>>> error occurs.  You should still check #atEnd after reading past the
>>> end of the file to make sure no error occurred.  Another part of the
>>> primitive change I'm proposing is to return additional information
>>> about what went wrong in the event of an error.
>>
>> I am sorry, but this kind of semantics (the OR) is way too complex at the general image level, it is too specific and based on certain underlying implementation details.
>>
>> Sven
>>
>>> We could modify the read primitive so that it fails if an error has
>>> occurred, and then #atEnd wouldn't be required.
>>>
>>> Cheers,
>>> Alistair
>>>
>>>
>>>
>>>>> On 4 Apr 2018, at 12:41, Alistair Grant <[hidden email]> wrote:
>>>>>
>>>>> Hi Nicolas,
>>>>>
>>>>> On 4 April 2018 at 12:36, Nicolas Cellier
>>>>> <[hidden email]> wrote:
>>>>>>
>>>>>>
>>>>>> 2018-04-04 12:18 GMT+02:00 Alistair Grant <[hidden email]>:
>>>>>>>
>>>>>>> Hi Sven,
>>>>>>>
>>>>>>> On Wed, Apr 04, 2018 at 11:32:02AM +0200, Sven Van Caekenberghe wrote:
>>>>>>>> Somehow, somewhere there was a change to the implementation of the
>>>>>>>> primitive called by some streams' #atEnd.
>>>>>>>
>>>>>>> That's a proposed change by me, but it hasn't been integrated yet.  So
>>>>>>> the discussion below should apply to the current stable vm (from August
>>>>>>> last year).
>>>>>>>
>>>>>>>
>>>>>>>> IIRC, someone said it is implemented as 'remaining size being zero'
>>>>>>>> and some virtual unix files like /dev/random are zero sized.
>>>>>>>
>>>>>>> Currently, for files other than sdio (stdout, stderr, stdin) it is
>>>>>>> effectively defined as:
>>>>>>>
>>>>>>> atEnd := stream position >= stream size
>>>>>>>
>>>>>>>
>>>>>>> And, as you say, plenty of virtual unix files report size 0.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Now, all kinds of changes are being done image size to work around this.
>>>>>>>
>>>>>>> I would phrase this slightly differently :-)
>>>>>>>
>>>>>>> Some code does the right thing, while other code doesn't.  E.g.:
>>>>>>>
>>>>>>> MultiByteFileStream>>upToEnd is good, while
>>>>>>> FileStream>>contents is incorrect
>>>>>>>
>>>>>>>
>>>>>>>> I am a strong believer in simple, real (i.e. infinite) streams, but I
>>>>>>>> am not sure we are doing the right thing here.
>>>>>>>>
>>>>>>>> Point is, I am not sure #next returning nil is official and universal.
>>>>>>>>
>>>>>>>> Consider the comments:
>>>>>>>>
>>>>>>>> Stream>>#next
>>>>>>>> "Answer the next object accessible by the receiver."
>>>>>>>>
>>>>>>>> ReadStream>>#next
>>>>>>>> "Primitive. Answer the next object in the Stream represented by the
>>>>>>>> receiver. Fail if the collection of this stream is not an Array or a
>>>>>>>> String.
>>>>>>>> Fail if the stream is positioned at its end, or if the position is out
>>>>>>>> of
>>>>>>>> bounds in the collection. Optional. See Object documentation
>>>>>>>> whatIsAPrimitive."
>>>>>>>>
>>>>>>>> Note how there is no talk about returning nil !
>>>>>>>>
>>>>>>>> I think we should discuss about this first.
>>>>>>>>
>>>>>>>> Was the low level change really correct and the right thing to do ?
>>>>>>>
>>>>>>> The primitive change proposed doesn't affect this discussion.  It will
>>>>>>> mean that #atEnd returns false (correctly) sometimes, while currently it
>>>>>>> returns true (incorrectly).  The end result is still incorrect, e.g.
>>>>>>> #contents returns an empty string for /proc/cpuinfo.
>>>>>>>
>>>>>>> You're correct about no mention of nil, but we have:
>>>>>>>
>>>>>>> FileStream>>next
>>>>>>>
>>>>>>>       (position >= readLimit and: [self atEnd])
>>>>>>>               ifTrue: [^nil]
>>>>>>>               ifFalse: [^collection at: (position := position + 1)]
>>>>>>>
>>>>>>>
>>>>>>> which has been around for a long time (I suspect, before Pharo existed).
>>>>>>>
>>>>>>> Having said that, I think that raising an exception is a better
>>>>>>> solution, but it is a much, much bigger change than the one I proposed
>>>>>>> in https://github.com/pharo-project/pharo/pull/1180.
>>>>>>>
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Alistair
>>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>> yes, if you are after universal behavior englobing Unix streams, the
>>>>>> Exception might be the best way.
>>>>>> Because on special stream you can't allways say in advance, you have to try.
>>>>>> That's the solution adopted by authors of Xtreams.
>>>>>> But there is a runtime penalty associated to it.
>>>>>>
>>>>>> The penalty once was so high that my proposal to generalize EndOfStream
>>>>>> usage was rejected a few years ago by AndreaRaab.
>>>>>> http://forum.world.st/EndOfStream-unused-td68806.html
>>>>>
>>>>> Thanks for this, I'll definitely take a look.
>>>>>
>>>>> Do you have a sense of how Denis' suggestion of using an EndOfStream
>>>>> object would compare?
>>>>>
>>>>> It would keep the same coding style, but avoid the problems with nil.
>>>>>
>>>>> Thanks,
>>>>> Alistair
>>>>>
>>>>>
>>>>>
>>>>>> I have regularly benched Xtreams, but stopped a few years ago.
>>>>>> Maybe i can excavate and pass on newer VM.
>>>>>>
>>>>>> In the mean time, i had experimented a programmable end of stream behavior
>>>>>> (via a block, or any other valuable)
>>>>>> http://www.squeaksource.com/XTream.htm
>>>>>> so as to reconcile performance and universality, but it was a source of
>>>>>> complexification at implementation side.
>>>>>>
>>>>>> Nicolas
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Note also that a Guille introduced something new, #closed which is
>>>>>>>> related to the difference between having no more elements (maybe right now,
>>>>>>>> like an open network stream) and never ever being able to produce more data.
>>>>>>>>
>>>>>>>> Sven
>>>>
>>>>
>>>
>>
>>
>

Reply | Threaded
Open this post in threaded view
|

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Sven Van Caekenberghe-2
I have trouble understanding your problem analysis, and how your proposed solution, would solve it.

Where is it being said that #next and/or #atEnd should be blocking or non-blocking ?
How is this related to how EOF is signalled ?

It seems to me that there are quite a few classes of streams that are 'special' in the sense that #next could be blocking and/or #atEnd could be unclear - socket/network streams, serial streams, maybe stdio (interactive or not). Without a message like #isDataAvailable you cannot handle those without blocking.

Reading from stdin seems like a very rare case for a Smalltalk system (not that it should not be possible).

I have a feeling that too much functionality is being pushed into too small an API.

> On 10 Apr 2018, at 18:30, Alistair Grant <[hidden email]> wrote:
>
> First a quick update:
>
> After doing some work on primitiveFileAtEnd, #atEnd now answers
> correctly for files that don't report their size correctly, e.g.
> /dev/urandom and /proc/cpuinfo, whether the files are opened directly or
> redirected through stdin.
>
> However determining whether stdin from a terminal has reached the end of
> file can't be done without making #atEnd blocking since we have to wait
> for the user to flag the end of file, e.g. by typing Ctrl-D.  And #atEnd
> is assumed to be non-blocking.
>
> So currently using ZnCharacterReadStream with stdin from a terminal will
> result in a stack dump similar to:
>
> MessageNotUnderstood: receiver of "<" is nil
> UndefinedObject(Object)>>doesNotUnderstand: #<
> ZnUTF8Encoder>>nextCodePointFromStream:
> ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
> ZnCharacterReadStream>>nextElement
> ZnCharacterReadStream(ZnEncodedReadStream)>>next
> UndefinedObject>>DoIt
>
>
> Going back through the various suggestions that have been made regarding
> using a sentinel object vs. raising a notification / exception, my
> (still to be polished) suggestion is to:
>
> 1. Add an endOfStream instance variable
> 2. When the end of the stream is reached answer the value of the
>   instance variable (i.e. the result of sending #value to the variable).
> 3. The initial default value would be a block that raises a Deprecation
>   warning and then returns nil.  This would allow existing code to
>   function for a changeover period.
> 4. At the end of the deprecation period the default value would be
>   changed to a unique sentinel object which would answer itself as its
>   #value.
>
> At any time users of the stream can set their own sentinel, including a
> block that raises an exception.
>
>
> Cheers,
> Alistair
>
>
> On 4 April 2018 at 19:24, Stephane Ducasse <[hidden email]> wrote:
>> Thanks for this discussion.
>>
>> On Wed, Apr 4, 2018 at 1:37 PM, Sven Van Caekenberghe <[hidden email]> wrote:
>>> Alistair,
>>>
>>> First off, thanks for the discussions and your contributions, I really appreciate them.
>>>
>>> But I want to have a discussion at the high level of the definition and semantics of the stream API in Pharo.
>>>
>>>> On 4 Apr 2018, at 13:20, Alistair Grant <[hidden email]> wrote:
>>>>
>>>> On 4 April 2018 at 12:56, Sven Van Caekenberghe <[hidden email]> wrote:
>>>>> Playing a bit devil's advocate, the idea is that, in general,
>>>>>
>>>>> [ stream atEnd] whileFalse: [ stream next. "..." ].
>>>>>
>>>>> is no longer allowed ?
>>>>
>>>> It hasn't been allowed "forever" [1].  It's just been misused for
>>>> almost as long.
>>>>
>>>> [1] Time began when stdio stream support was introduced. :-)
>>>
>>> I am still not convinced. Another way to put it would be that the old #atEnd or #upToEnd do not make sense for these streams and some new loop is needed, based on a new test (it exists for socket streams already).
>>>
>>> [ stream isDataAvailable ] whileTrue: [ stream next ]
>>>
>>>>> And you want to replace it with
>>>>>
>>>>> [ stream next ifNil: [ false ] ifNotNil: [ :x | "..." true ] whileTrue.
>>>>>
>>>>> That is a pretty big change, no ?
>>>>
>>>> That's the way quite a bit of code already operates.
>>>>
>>>> As Denis pointed out, it's obviously problematic in the general sense,
>>>> since nil can be embedded in non-byte oriented streams.  I suspect
>>>> that in practice not many people write code that reads streams from
>>>> both byte oriented and non-byte oriented streams.
>>>
>>> Maybe yes, maybe no. As Denis' example shows there is a clear definition problem.
>>>
>>> And I do use streams of byte arrays or strings all the time, this is really important. I want my parsers to work on all kinds of streams.
>>>
>>>>> I think/feel like a proper EOF exception would be better, more correct.
>>>>>
>>>>> [ [ stream next. "..." true ] on: EOF do: [ false ] ] whileTrue.
>>>>
>>>> I agree, but the email thread Nicolas pointed to raises some
>>>> performance questions about this approach.  It should be
>>>> straightforward to do a basic performance comparison which I'll get
>>>> around to if other objections aren't raised.
>>>
>>> Reading in bigger blocks, using #readInto:startingAt:count: (which is basically Unix's (2) Read sys call), would solve performance problems, I think.
>>>
>>>>> Will we throw away #atEnd then ? Do we need it if we cannot use it ?
>>>>
>>>> Unix file i/o returns EOF if the end of file has been reach OR if an
>>>> error occurs.  You should still check #atEnd after reading past the
>>>> end of the file to make sure no error occurred.  Another part of the
>>>> primitive change I'm proposing is to return additional information
>>>> about what went wrong in the event of an error.
>>>
>>> I am sorry, but this kind of semantics (the OR) is way too complex at the general image level, it is too specific and based on certain underlying implementation details.
>>>
>>> Sven
>>>
>>>> We could modify the read primitive so that it fails if an error has
>>>> occurred, and then #atEnd wouldn't be required.
>>>>
>>>> Cheers,
>>>> Alistair
>>>>
>>>>
>>>>
>>>>>> On 4 Apr 2018, at 12:41, Alistair Grant <[hidden email]> wrote:
>>>>>>
>>>>>> Hi Nicolas,
>>>>>>
>>>>>> On 4 April 2018 at 12:36, Nicolas Cellier
>>>>>> <[hidden email]> wrote:
>>>>>>>
>>>>>>>
>>>>>>> 2018-04-04 12:18 GMT+02:00 Alistair Grant <[hidden email]>:
>>>>>>>>
>>>>>>>> Hi Sven,
>>>>>>>>
>>>>>>>> On Wed, Apr 04, 2018 at 11:32:02AM +0200, Sven Van Caekenberghe wrote:
>>>>>>>>> Somehow, somewhere there was a change to the implementation of the
>>>>>>>>> primitive called by some streams' #atEnd.
>>>>>>>>
>>>>>>>> That's a proposed change by me, but it hasn't been integrated yet.  So
>>>>>>>> the discussion below should apply to the current stable vm (from August
>>>>>>>> last year).
>>>>>>>>
>>>>>>>>
>>>>>>>>> IIRC, someone said it is implemented as 'remaining size being zero'
>>>>>>>>> and some virtual unix files like /dev/random are zero sized.
>>>>>>>>
>>>>>>>> Currently, for files other than sdio (stdout, stderr, stdin) it is
>>>>>>>> effectively defined as:
>>>>>>>>
>>>>>>>> atEnd := stream position >= stream size
>>>>>>>>
>>>>>>>>
>>>>>>>> And, as you say, plenty of virtual unix files report size 0.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Now, all kinds of changes are being done image size to work around this.
>>>>>>>>
>>>>>>>> I would phrase this slightly differently :-)
>>>>>>>>
>>>>>>>> Some code does the right thing, while other code doesn't.  E.g.:
>>>>>>>>
>>>>>>>> MultiByteFileStream>>upToEnd is good, while
>>>>>>>> FileStream>>contents is incorrect
>>>>>>>>
>>>>>>>>
>>>>>>>>> I am a strong believer in simple, real (i.e. infinite) streams, but I
>>>>>>>>> am not sure we are doing the right thing here.
>>>>>>>>>
>>>>>>>>> Point is, I am not sure #next returning nil is official and universal.
>>>>>>>>>
>>>>>>>>> Consider the comments:
>>>>>>>>>
>>>>>>>>> Stream>>#next
>>>>>>>>> "Answer the next object accessible by the receiver."
>>>>>>>>>
>>>>>>>>> ReadStream>>#next
>>>>>>>>> "Primitive. Answer the next object in the Stream represented by the
>>>>>>>>> receiver. Fail if the collection of this stream is not an Array or a
>>>>>>>>> String.
>>>>>>>>> Fail if the stream is positioned at its end, or if the position is out
>>>>>>>>> of
>>>>>>>>> bounds in the collection. Optional. See Object documentation
>>>>>>>>> whatIsAPrimitive."
>>>>>>>>>
>>>>>>>>> Note how there is no talk about returning nil !
>>>>>>>>>
>>>>>>>>> I think we should discuss about this first.
>>>>>>>>>
>>>>>>>>> Was the low level change really correct and the right thing to do ?
>>>>>>>>
>>>>>>>> The primitive change proposed doesn't affect this discussion.  It will
>>>>>>>> mean that #atEnd returns false (correctly) sometimes, while currently it
>>>>>>>> returns true (incorrectly).  The end result is still incorrect, e.g.
>>>>>>>> #contents returns an empty string for /proc/cpuinfo.
>>>>>>>>
>>>>>>>> You're correct about no mention of nil, but we have:
>>>>>>>>
>>>>>>>> FileStream>>next
>>>>>>>>
>>>>>>>>      (position >= readLimit and: [self atEnd])
>>>>>>>>              ifTrue: [^nil]
>>>>>>>>              ifFalse: [^collection at: (position := position + 1)]
>>>>>>>>
>>>>>>>>
>>>>>>>> which has been around for a long time (I suspect, before Pharo existed).
>>>>>>>>
>>>>>>>> Having said that, I think that raising an exception is a better
>>>>>>>> solution, but it is a much, much bigger change than the one I proposed
>>>>>>>> in https://github.com/pharo-project/pharo/pull/1180.
>>>>>>>>
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Alistair
>>>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>> yes, if you are after universal behavior englobing Unix streams, the
>>>>>>> Exception might be the best way.
>>>>>>> Because on special stream you can't allways say in advance, you have to try.
>>>>>>> That's the solution adopted by authors of Xtreams.
>>>>>>> But there is a runtime penalty associated to it.
>>>>>>>
>>>>>>> The penalty once was so high that my proposal to generalize EndOfStream
>>>>>>> usage was rejected a few years ago by AndreaRaab.
>>>>>>> http://forum.world.st/EndOfStream-unused-td68806.html
>>>>>>
>>>>>> Thanks for this, I'll definitely take a look.
>>>>>>
>>>>>> Do you have a sense of how Denis' suggestion of using an EndOfStream
>>>>>> object would compare?
>>>>>>
>>>>>> It would keep the same coding style, but avoid the problems with nil.
>>>>>>
>>>>>> Thanks,
>>>>>> Alistair
>>>>>>
>>>>>>
>>>>>>
>>>>>>> I have regularly benched Xtreams, but stopped a few years ago.
>>>>>>> Maybe i can excavate and pass on newer VM.
>>>>>>>
>>>>>>> In the mean time, i had experimented a programmable end of stream behavior
>>>>>>> (via a block, or any other valuable)
>>>>>>> http://www.squeaksource.com/XTream.htm
>>>>>>> so as to reconcile performance and universality, but it was a source of
>>>>>>> complexification at implementation side.
>>>>>>>
>>>>>>> Nicolas
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Note also that a Guille introduced something new, #closed which is
>>>>>>>>> related to the difference between having no more elements (maybe right now,
>>>>>>>>> like an open network stream) and never ever being able to produce more data.
>>>>>>>>>
>>>>>>>>> Sven
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>


Reply | Threaded
Open this post in threaded view
|

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

alistairgrant
Hi Sven,

On 10 April 2018 at 19:36, Sven Van Caekenberghe <[hidden email]> wrote:
> I have trouble understanding your problem analysis, and how your proposed solution, would solve it.

The discussion started with my proposal to modify the Zinc streams to
check the return value of #next for nil rather than using #atEnd to
iterate over the underlying stream.  You correctly pointed out that
the intention behind #atEnd is that it can be used to control
iterating over streams.


> Where is it being said that #next and/or #atEnd should be blocking or non-blocking ?

There is existing code that assumes that #atEnd is non-blocking and
that #next is allowed block.  I believe that we should keep those
conditions.


> How is this related to how EOF is signalled ?

Because, combined with terminal EOF not being known until the user
explicitly flags it (with Ctrl-D) it means that #atEnd can't be used
for iterating over input from stdin connected to a terminal.



> It seems to me that there are quite a few classes of streams that are 'special' in the sense that #next could be blocking and/or #atEnd could be unclear - socket/network streams, serial streams, maybe stdio (interactive or not). Without a message like #isDataAvailable you cannot handle those without blocking.

Right.  I think this is a distraction (I was trying to explain some
details, but it's causing more confusion instead of helping).

The important point is that #atEnd doesn't work for iterating over
streams with terminal input



> Reading from stdin seems like a very rare case for a Smalltalk system (not that it should not be possible).

There's been quite a bit of discussion and several projects recently
related to using pharo for scripting, so it may become more common.
E.g.

https://www.quora.com/Can-Smalltalk-be-a-batch-file-scripting-language/answer/Philippe-Back-1?share=c19bfc95
https://github.com/rajula96reddy/pharo-cli



> I have a feeling that too much functionality is being pushed into too small an API.


This is just about how should Zinc streams be iterating over the
underlying streams.  You didn't like checking the result of #next for
nil since it isn't general, correctly pointing out that nil is a valid
value for non-byte oriented streams.  But #atEnd doesn't work for
stdin from a terminal.


At this point I think there are three options:

1. Modify Zinc to check the return value of #next instead of using #atEnd.

This is what all existing character / byte oriented streams in Squeak
and Pharo do.  At that point the Zinc streams can be used on all file
/ stdio input and output.


2. Modify all streams to signal EOF in some other way, i.e. a sentinel
or notification / exception.

This is what we were discussing below.  But it is a decent chunk of
work with significant impact on the existing code base.


3. Require anyone who wants to read from stdin to code around Zinc's
inability to handle terminal input.

I'd prefer to avoid this option if possible.


Does that clarify the situation?

Thanks,
Alistair



>> On 10 Apr 2018, at 18:30, Alistair Grant <[hidden email]> wrote:
>>
>> First a quick update:
>>
>> After doing some work on primitiveFileAtEnd, #atEnd now answers
>> correctly for files that don't report their size correctly, e.g.
>> /dev/urandom and /proc/cpuinfo, whether the files are opened directly or
>> redirected through stdin.
>>
>> However determining whether stdin from a terminal has reached the end of
>> file can't be done without making #atEnd blocking since we have to wait
>> for the user to flag the end of file, e.g. by typing Ctrl-D.  And #atEnd
>> is assumed to be non-blocking.
>>
>> So currently using ZnCharacterReadStream with stdin from a terminal will
>> result in a stack dump similar to:
>>
>> MessageNotUnderstood: receiver of "<" is nil
>> UndefinedObject(Object)>>doesNotUnderstand: #<
>> ZnUTF8Encoder>>nextCodePointFromStream:
>> ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
>> ZnCharacterReadStream>>nextElement
>> ZnCharacterReadStream(ZnEncodedReadStream)>>next
>> UndefinedObject>>DoIt
>>
>>
>> Going back through the various suggestions that have been made regarding
>> using a sentinel object vs. raising a notification / exception, my
>> (still to be polished) suggestion is to:
>>
>> 1. Add an endOfStream instance variable
>> 2. When the end of the stream is reached answer the value of the
>>   instance variable (i.e. the result of sending #value to the variable).
>> 3. The initial default value would be a block that raises a Deprecation
>>   warning and then returns nil.  This would allow existing code to
>>   function for a changeover period.
>> 4. At the end of the deprecation period the default value would be
>>   changed to a unique sentinel object which would answer itself as its
>>   #value.
>>
>> At any time users of the stream can set their own sentinel, including a
>> block that raises an exception.
>>
>>
>> Cheers,
>> Alistair
>>
>>
>> On 4 April 2018 at 19:24, Stephane Ducasse <[hidden email]> wrote:
>>> Thanks for this discussion.
>>>
>>> On Wed, Apr 4, 2018 at 1:37 PM, Sven Van Caekenberghe <[hidden email]> wrote:
>>>> Alistair,
>>>>
>>>> First off, thanks for the discussions and your contributions, I really appreciate them.
>>>>
>>>> But I want to have a discussion at the high level of the definition and semantics of the stream API in Pharo.
>>>>
>>>>> On 4 Apr 2018, at 13:20, Alistair Grant <[hidden email]> wrote:
>>>>>
>>>>> On 4 April 2018 at 12:56, Sven Van Caekenberghe <[hidden email]> wrote:
>>>>>> Playing a bit devil's advocate, the idea is that, in general,
>>>>>>
>>>>>> [ stream atEnd] whileFalse: [ stream next. "..." ].
>>>>>>
>>>>>> is no longer allowed ?
>>>>>
>>>>> It hasn't been allowed "forever" [1].  It's just been misused for
>>>>> almost as long.
>>>>>
>>>>> [1] Time began when stdio stream support was introduced. :-)
>>>>
>>>> I am still not convinced. Another way to put it would be that the old #atEnd or #upToEnd do not make sense for these streams and some new loop is needed, based on a new test (it exists for socket streams already).
>>>>
>>>> [ stream isDataAvailable ] whileTrue: [ stream next ]
>>>>
>>>>>> And you want to replace it with
>>>>>>
>>>>>> [ stream next ifNil: [ false ] ifNotNil: [ :x | "..." true ] whileTrue.
>>>>>>
>>>>>> That is a pretty big change, no ?
>>>>>
>>>>> That's the way quite a bit of code already operates.
>>>>>
>>>>> As Denis pointed out, it's obviously problematic in the general sense,
>>>>> since nil can be embedded in non-byte oriented streams.  I suspect
>>>>> that in practice not many people write code that reads streams from
>>>>> both byte oriented and non-byte oriented streams.
>>>>
>>>> Maybe yes, maybe no. As Denis' example shows there is a clear definition problem.
>>>>
>>>> And I do use streams of byte arrays or strings all the time, this is really important. I want my parsers to work on all kinds of streams.
>>>>
>>>>>> I think/feel like a proper EOF exception would be better, more correct.
>>>>>>
>>>>>> [ [ stream next. "..." true ] on: EOF do: [ false ] ] whileTrue.
>>>>>
>>>>> I agree, but the email thread Nicolas pointed to raises some
>>>>> performance questions about this approach.  It should be
>>>>> straightforward to do a basic performance comparison which I'll get
>>>>> around to if other objections aren't raised.
>>>>
>>>> Reading in bigger blocks, using #readInto:startingAt:count: (which is basically Unix's (2) Read sys call), would solve performance problems, I think.
>>>>
>>>>>> Will we throw away #atEnd then ? Do we need it if we cannot use it ?
>>>>>
>>>>> Unix file i/o returns EOF if the end of file has been reach OR if an
>>>>> error occurs.  You should still check #atEnd after reading past the
>>>>> end of the file to make sure no error occurred.  Another part of the
>>>>> primitive change I'm proposing is to return additional information
>>>>> about what went wrong in the event of an error.
>>>>
>>>> I am sorry, but this kind of semantics (the OR) is way too complex at the general image level, it is too specific and based on certain underlying implementation details.
>>>>
>>>> Sven
>>>>
>>>>> We could modify the read primitive so that it fails if an error has
>>>>> occurred, and then #atEnd wouldn't be required.
>>>>>
>>>>> Cheers,
>>>>> Alistair
>>>>>
>>>>>
>>>>>
>>>>>>> On 4 Apr 2018, at 12:41, Alistair Grant <[hidden email]> wrote:
>>>>>>>
>>>>>>> Hi Nicolas,
>>>>>>>
>>>>>>> On 4 April 2018 at 12:36, Nicolas Cellier
>>>>>>> <[hidden email]> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> 2018-04-04 12:18 GMT+02:00 Alistair Grant <[hidden email]>:
>>>>>>>>>
>>>>>>>>> Hi Sven,
>>>>>>>>>
>>>>>>>>> On Wed, Apr 04, 2018 at 11:32:02AM +0200, Sven Van Caekenberghe wrote:
>>>>>>>>>> Somehow, somewhere there was a change to the implementation of the
>>>>>>>>>> primitive called by some streams' #atEnd.
>>>>>>>>>
>>>>>>>>> That's a proposed change by me, but it hasn't been integrated yet.  So
>>>>>>>>> the discussion below should apply to the current stable vm (from August
>>>>>>>>> last year).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> IIRC, someone said it is implemented as 'remaining size being zero'
>>>>>>>>>> and some virtual unix files like /dev/random are zero sized.
>>>>>>>>>
>>>>>>>>> Currently, for files other than sdio (stdout, stderr, stdin) it is
>>>>>>>>> effectively defined as:
>>>>>>>>>
>>>>>>>>> atEnd := stream position >= stream size
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> And, as you say, plenty of virtual unix files report size 0.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Now, all kinds of changes are being done image size to work around this.
>>>>>>>>>
>>>>>>>>> I would phrase this slightly differently :-)
>>>>>>>>>
>>>>>>>>> Some code does the right thing, while other code doesn't.  E.g.:
>>>>>>>>>
>>>>>>>>> MultiByteFileStream>>upToEnd is good, while
>>>>>>>>> FileStream>>contents is incorrect
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> I am a strong believer in simple, real (i.e. infinite) streams, but I
>>>>>>>>>> am not sure we are doing the right thing here.
>>>>>>>>>>
>>>>>>>>>> Point is, I am not sure #next returning nil is official and universal.
>>>>>>>>>>
>>>>>>>>>> Consider the comments:
>>>>>>>>>>
>>>>>>>>>> Stream>>#next
>>>>>>>>>> "Answer the next object accessible by the receiver."
>>>>>>>>>>
>>>>>>>>>> ReadStream>>#next
>>>>>>>>>> "Primitive. Answer the next object in the Stream represented by the
>>>>>>>>>> receiver. Fail if the collection of this stream is not an Array or a
>>>>>>>>>> String.
>>>>>>>>>> Fail if the stream is positioned at its end, or if the position is out
>>>>>>>>>> of
>>>>>>>>>> bounds in the collection. Optional. See Object documentation
>>>>>>>>>> whatIsAPrimitive."
>>>>>>>>>>
>>>>>>>>>> Note how there is no talk about returning nil !
>>>>>>>>>>
>>>>>>>>>> I think we should discuss about this first.
>>>>>>>>>>
>>>>>>>>>> Was the low level change really correct and the right thing to do ?
>>>>>>>>>
>>>>>>>>> The primitive change proposed doesn't affect this discussion.  It will
>>>>>>>>> mean that #atEnd returns false (correctly) sometimes, while currently it
>>>>>>>>> returns true (incorrectly).  The end result is still incorrect, e.g.
>>>>>>>>> #contents returns an empty string for /proc/cpuinfo.
>>>>>>>>>
>>>>>>>>> You're correct about no mention of nil, but we have:
>>>>>>>>>
>>>>>>>>> FileStream>>next
>>>>>>>>>
>>>>>>>>>      (position >= readLimit and: [self atEnd])
>>>>>>>>>              ifTrue: [^nil]
>>>>>>>>>              ifFalse: [^collection at: (position := position + 1)]
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> which has been around for a long time (I suspect, before Pharo existed).
>>>>>>>>>
>>>>>>>>> Having said that, I think that raising an exception is a better
>>>>>>>>> solution, but it is a much, much bigger change than the one I proposed
>>>>>>>>> in https://github.com/pharo-project/pharo/pull/1180.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Alistair
>>>>>>>>>
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>> yes, if you are after universal behavior englobing Unix streams, the
>>>>>>>> Exception might be the best way.
>>>>>>>> Because on special stream you can't allways say in advance, you have to try.
>>>>>>>> That's the solution adopted by authors of Xtreams.
>>>>>>>> But there is a runtime penalty associated to it.
>>>>>>>>
>>>>>>>> The penalty once was so high that my proposal to generalize EndOfStream
>>>>>>>> usage was rejected a few years ago by AndreaRaab.
>>>>>>>> http://forum.world.st/EndOfStream-unused-td68806.html
>>>>>>>
>>>>>>> Thanks for this, I'll definitely take a look.
>>>>>>>
>>>>>>> Do you have a sense of how Denis' suggestion of using an EndOfStream
>>>>>>> object would compare?
>>>>>>>
>>>>>>> It would keep the same coding style, but avoid the problems with nil.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Alistair
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> I have regularly benched Xtreams, but stopped a few years ago.
>>>>>>>> Maybe i can excavate and pass on newer VM.
>>>>>>>>
>>>>>>>> In the mean time, i had experimented a programmable end of stream behavior
>>>>>>>> (via a block, or any other valuable)
>>>>>>>> http://www.squeaksource.com/XTream.htm
>>>>>>>> so as to reconcile performance and universality, but it was a source of
>>>>>>>> complexification at implementation side.
>>>>>>>>
>>>>>>>> Nicolas
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Note also that a Guille introduced something new, #closed which is
>>>>>>>>>> related to the difference between having no more elements (maybe right now,
>>>>>>>>>> like an open network stream) and never ever being able to produce more data.
>>>>>>>>>>
>>>>>>>>>> Sven
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
>

stdio.cs (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Sven Van Caekenberghe-2
Alistair,

I am replying in-between, please don't take my remarks as being negative, I just like to straighten things out as clear as possible.

> On 10 Apr 2018, at 22:14, Alistair Grant <[hidden email]> wrote:
>
> Hi Sven,
>
> On 10 April 2018 at 19:36, Sven Van Caekenberghe <[hidden email]> wrote:
>> I have trouble understanding your problem analysis, and how your proposed solution, would solve it.
>
> The discussion started with my proposal to modify the Zinc streams to
> check the return value of #next for nil rather than using #atEnd to
> iterate over the underlying stream.  You correctly pointed out that
> the intention behind #atEnd is that it can be used to control
> iterating over streams.

OK

>> Where is it being said that #next and/or #atEnd should be blocking or non-blocking ?
>
> There is existing code that assumes that #atEnd is non-blocking and
> that #next is allowed block.  I believe that we should keep those
> conditions.

I fail to see where that is written down, either way. Can you point me to comments stating that, I would really like to know ?

>> How is this related to how EOF is signalled ?
>
> Because, combined with terminal EOF not being known until the user
> explicitly flags it (with Ctrl-D) it means that #atEnd can't be used
> for iterating over input from stdin connected to a terminal.

This seems to me like an exception that only holds for one particular stream in one particular scenario (interactive stdin). I might be wrong.

>> It seems to me that there are quite a few classes of streams that are 'special' in the sense that #next could be blocking and/or #atEnd could be unclear - socket/network streams, serial streams, maybe stdio (interactive or not). Without a message like #isDataAvailable you cannot handle those without blocking.
>
> Right.  I think this is a distraction (I was trying to explain some
> details, but it's causing more confusion instead of helping).
>
> The important point is that #atEnd doesn't work for iterating over
> streams with terminal input

Maybe you should also point to the actual code that fails. I mean you showed a partial stack trace, but not how you got there, precisely. How does the application reading from an interactive stdin do to get into trouble ?

>> Reading from stdin seems like a very rare case for a Smalltalk system (not that it should not be possible).
>
> There's been quite a bit of discussion and several projects recently
> related to using pharo for scripting, so it may become more common.
> E.g.
>
> https://www.quora.com/Can-Smalltalk-be-a-batch-file-scripting-language/answer/Philippe-Back-1?share=c19bfc95
> https://github.com/rajula96reddy/pharo-cli

Still, it is not common at all.

>> I have a feeling that too much functionality is being pushed into too small an API.
>
> This is just about how should Zinc streams be iterating over the
> underlying streams.  You didn't like checking the result of #next for
> nil since it isn't general, correctly pointing out that nil is a valid
> value for non-byte oriented streams.  But #atEnd doesn't work for
> stdin from a terminal.
>
>
> At this point I think there are three options:
>
> 1. Modify Zinc to check the return value of #next instead of using #atEnd.
>
> This is what all existing character / byte oriented streams in Squeak
> and Pharo do.  At that point the Zinc streams can be used on all file
> / stdio input and output.

I agree that such code exists in many places, but there is lots of stream reading that does not check for nils.

> 2. Modify all streams to signal EOF in some other way, i.e. a sentinel
> or notification / exception.
>
> This is what we were discussing below.  But it is a decent chunk of
> work with significant impact on the existing code base.

Agreed. This would be a future extension.

> 3. Require anyone who wants to read from stdin to code around Zinc's
> inability to handle terminal input.
>
> I'd prefer to avoid this option if possible.

See higher for a more concrete usage example request.

> Does that clarify the situation?

Yes, it helps. Thanks. But questions remain.

> Thanks,
> Alistair
>
>
>
>>> On 10 Apr 2018, at 18:30, Alistair Grant <[hidden email]> wrote:
>>>
>>> First a quick update:
>>>
>>> After doing some work on primitiveFileAtEnd, #atEnd now answers
>>> correctly for files that don't report their size correctly, e.g.
>>> /dev/urandom and /proc/cpuinfo, whether the files are opened directly or
>>> redirected through stdin.
>>>
>>> However determining whether stdin from a terminal has reached the end of
>>> file can't be done without making #atEnd blocking since we have to wait
>>> for the user to flag the end of file, e.g. by typing Ctrl-D.  And #atEnd
>>> is assumed to be non-blocking.
>>>
>>> So currently using ZnCharacterReadStream with stdin from a terminal will
>>> result in a stack dump similar to:
>>>
>>> MessageNotUnderstood: receiver of "<" is nil
>>> UndefinedObject(Object)>>doesNotUnderstand: #<
>>> ZnUTF8Encoder>>nextCodePointFromStream:
>>> ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
>>> ZnCharacterReadStream>>nextElement
>>> ZnCharacterReadStream(ZnEncodedReadStream)>>next
>>> UndefinedObject>>DoIt
>>>
>>>
>>> Going back through the various suggestions that have been made regarding
>>> using a sentinel object vs. raising a notification / exception, my
>>> (still to be polished) suggestion is to:
>>>
>>> 1. Add an endOfStream instance variable
>>> 2. When the end of the stream is reached answer the value of the
>>>  instance variable (i.e. the result of sending #value to the variable).
>>> 3. The initial default value would be a block that raises a Deprecation
>>>  warning and then returns nil.  This would allow existing code to
>>>  function for a changeover period.
>>> 4. At the end of the deprecation period the default value would be
>>>  changed to a unique sentinel object which would answer itself as its
>>>  #value.
>>>
>>> At any time users of the stream can set their own sentinel, including a
>>> block that raises an exception.
>>>
>>>
>>> Cheers,
>>> Alistair
>>>
>>>
>>> On 4 April 2018 at 19:24, Stephane Ducasse <[hidden email]> wrote:
>>>> Thanks for this discussion.
>>>>
>>>> On Wed, Apr 4, 2018 at 1:37 PM, Sven Van Caekenberghe <[hidden email]> wrote:
>>>>> Alistair,
>>>>>
>>>>> First off, thanks for the discussions and your contributions, I really appreciate them.
>>>>>
>>>>> But I want to have a discussion at the high level of the definition and semantics of the stream API in Pharo.
>>>>>
>>>>>> On 4 Apr 2018, at 13:20, Alistair Grant <[hidden email]> wrote:
>>>>>>
>>>>>> On 4 April 2018 at 12:56, Sven Van Caekenberghe <[hidden email]> wrote:
>>>>>>> Playing a bit devil's advocate, the idea is that, in general,
>>>>>>>
>>>>>>> [ stream atEnd] whileFalse: [ stream next. "..." ].
>>>>>>>
>>>>>>> is no longer allowed ?
>>>>>>
>>>>>> It hasn't been allowed "forever" [1].  It's just been misused for
>>>>>> almost as long.
>>>>>>
>>>>>> [1] Time began when stdio stream support was introduced. :-)
>>>>>
>>>>> I am still not convinced. Another way to put it would be that the old #atEnd or #upToEnd do not make sense for these streams and some new loop is needed, based on a new test (it exists for socket streams already).
>>>>>
>>>>> [ stream isDataAvailable ] whileTrue: [ stream next ]
>>>>>
>>>>>>> And you want to replace it with
>>>>>>>
>>>>>>> [ stream next ifNil: [ false ] ifNotNil: [ :x | "..." true ] whileTrue.
>>>>>>>
>>>>>>> That is a pretty big change, no ?
>>>>>>
>>>>>> That's the way quite a bit of code already operates.
>>>>>>
>>>>>> As Denis pointed out, it's obviously problematic in the general sense,
>>>>>> since nil can be embedded in non-byte oriented streams.  I suspect
>>>>>> that in practice not many people write code that reads streams from
>>>>>> both byte oriented and non-byte oriented streams.
>>>>>
>>>>> Maybe yes, maybe no. As Denis' example shows there is a clear definition problem.
>>>>>
>>>>> And I do use streams of byte arrays or strings all the time, this is really important. I want my parsers to work on all kinds of streams.
>>>>>
>>>>>>> I think/feel like a proper EOF exception would be better, more correct.
>>>>>>>
>>>>>>> [ [ stream next. "..." true ] on: EOF do: [ false ] ] whileTrue.
>>>>>>
>>>>>> I agree, but the email thread Nicolas pointed to raises some
>>>>>> performance questions about this approach.  It should be
>>>>>> straightforward to do a basic performance comparison which I'll get
>>>>>> around to if other objections aren't raised.
>>>>>
>>>>> Reading in bigger blocks, using #readInto:startingAt:count: (which is basically Unix's (2) Read sys call), would solve performance problems, I think.
>>>>>
>>>>>>> Will we throw away #atEnd then ? Do we need it if we cannot use it ?
>>>>>>
>>>>>> Unix file i/o returns EOF if the end of file has been reach OR if an
>>>>>> error occurs.  You should still check #atEnd after reading past the
>>>>>> end of the file to make sure no error occurred.  Another part of the
>>>>>> primitive change I'm proposing is to return additional information
>>>>>> about what went wrong in the event of an error.
>>>>>
>>>>> I am sorry, but this kind of semantics (the OR) is way too complex at the general image level, it is too specific and based on certain underlying implementation details.
>>>>>
>>>>> Sven
>>>>>
>>>>>> We could modify the read primitive so that it fails if an error has
>>>>>> occurred, and then #atEnd wouldn't be required.
>>>>>>
>>>>>> Cheers,
>>>>>> Alistair
>>>>>>
>>>>>>
>>>>>>
>>>>>>>> On 4 Apr 2018, at 12:41, Alistair Grant <[hidden email]> wrote:
>>>>>>>>
>>>>>>>> Hi Nicolas,
>>>>>>>>
>>>>>>>> On 4 April 2018 at 12:36, Nicolas Cellier
>>>>>>>> <[hidden email]> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2018-04-04 12:18 GMT+02:00 Alistair Grant <[hidden email]>:
>>>>>>>>>>
>>>>>>>>>> Hi Sven,
>>>>>>>>>>
>>>>>>>>>> On Wed, Apr 04, 2018 at 11:32:02AM +0200, Sven Van Caekenberghe wrote:
>>>>>>>>>>> Somehow, somewhere there was a change to the implementation of the
>>>>>>>>>>> primitive called by some streams' #atEnd.
>>>>>>>>>>
>>>>>>>>>> That's a proposed change by me, but it hasn't been integrated yet.  So
>>>>>>>>>> the discussion below should apply to the current stable vm (from August
>>>>>>>>>> last year).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> IIRC, someone said it is implemented as 'remaining size being zero'
>>>>>>>>>>> and some virtual unix files like /dev/random are zero sized.
>>>>>>>>>>
>>>>>>>>>> Currently, for files other than sdio (stdout, stderr, stdin) it is
>>>>>>>>>> effectively defined as:
>>>>>>>>>>
>>>>>>>>>> atEnd := stream position >= stream size
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> And, as you say, plenty of virtual unix files report size 0.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Now, all kinds of changes are being done image size to work around this.
>>>>>>>>>>
>>>>>>>>>> I would phrase this slightly differently :-)
>>>>>>>>>>
>>>>>>>>>> Some code does the right thing, while other code doesn't.  E.g.:
>>>>>>>>>>
>>>>>>>>>> MultiByteFileStream>>upToEnd is good, while
>>>>>>>>>> FileStream>>contents is incorrect
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> I am a strong believer in simple, real (i.e. infinite) streams, but I
>>>>>>>>>>> am not sure we are doing the right thing here.
>>>>>>>>>>>
>>>>>>>>>>> Point is, I am not sure #next returning nil is official and universal.
>>>>>>>>>>>
>>>>>>>>>>> Consider the comments:
>>>>>>>>>>>
>>>>>>>>>>> Stream>>#next
>>>>>>>>>>> "Answer the next object accessible by the receiver."
>>>>>>>>>>>
>>>>>>>>>>> ReadStream>>#next
>>>>>>>>>>> "Primitive. Answer the next object in the Stream represented by the
>>>>>>>>>>> receiver. Fail if the collection of this stream is not an Array or a
>>>>>>>>>>> String.
>>>>>>>>>>> Fail if the stream is positioned at its end, or if the position is out
>>>>>>>>>>> of
>>>>>>>>>>> bounds in the collection. Optional. See Object documentation
>>>>>>>>>>> whatIsAPrimitive."
>>>>>>>>>>>
>>>>>>>>>>> Note how there is no talk about returning nil !
>>>>>>>>>>>
>>>>>>>>>>> I think we should discuss about this first.
>>>>>>>>>>>
>>>>>>>>>>> Was the low level change really correct and the right thing to do ?
>>>>>>>>>>
>>>>>>>>>> The primitive change proposed doesn't affect this discussion.  It will
>>>>>>>>>> mean that #atEnd returns false (correctly) sometimes, while currently it
>>>>>>>>>> returns true (incorrectly).  The end result is still incorrect, e.g.
>>>>>>>>>> #contents returns an empty string for /proc/cpuinfo.
>>>>>>>>>>
>>>>>>>>>> You're correct about no mention of nil, but we have:
>>>>>>>>>>
>>>>>>>>>> FileStream>>next
>>>>>>>>>>
>>>>>>>>>>     (position >= readLimit and: [self atEnd])
>>>>>>>>>>             ifTrue: [^nil]
>>>>>>>>>>             ifFalse: [^collection at: (position := position + 1)]
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> which has been around for a long time (I suspect, before Pharo existed).
>>>>>>>>>>
>>>>>>>>>> Having said that, I think that raising an exception is a better
>>>>>>>>>> solution, but it is a much, much bigger change than the one I proposed
>>>>>>>>>> in https://github.com/pharo-project/pharo/pull/1180.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Alistair
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>> yes, if you are after universal behavior englobing Unix streams, the
>>>>>>>>> Exception might be the best way.
>>>>>>>>> Because on special stream you can't allways say in advance, you have to try.
>>>>>>>>> That's the solution adopted by authors of Xtreams.
>>>>>>>>> But there is a runtime penalty associated to it.
>>>>>>>>>
>>>>>>>>> The penalty once was so high that my proposal to generalize EndOfStream
>>>>>>>>> usage was rejected a few years ago by AndreaRaab.
>>>>>>>>> http://forum.world.st/EndOfStream-unused-td68806.html
>>>>>>>>
>>>>>>>> Thanks for this, I'll definitely take a look.
>>>>>>>>
>>>>>>>> Do you have a sense of how Denis' suggestion of using an EndOfStream
>>>>>>>> object would compare?
>>>>>>>>
>>>>>>>> It would keep the same coding style, but avoid the problems with nil.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Alistair
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> I have regularly benched Xtreams, but stopped a few years ago.
>>>>>>>>> Maybe i can excavate and pass on newer VM.
>>>>>>>>>
>>>>>>>>> In the mean time, i had experimented a programmable end of stream behavior
>>>>>>>>> (via a block, or any other valuable)
>>>>>>>>> http://www.squeaksource.com/XTream.htm
>>>>>>>>> so as to reconcile performance and universality, but it was a source of
>>>>>>>>> complexification at implementation side.
>>>>>>>>>
>>>>>>>>> Nicolas
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Note also that a Guille introduced something new, #closed which is
>>>>>>>>>>> related to the difference between having no more elements (maybe right now,
>>>>>>>>>>> like an open network stream) and never ever being able to produce more data.
>>>>>>>>>>>
>>>>>>>>>>> Sven
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
> <stdio.cs>


Reply | Threaded
Open this post in threaded view
|

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

alistairgrant
Hi Sven,

On 10 April 2018 at 23:54, Sven Van Caekenberghe <[hidden email]> wrote:
> Alistair,
>
> I am replying in-between, please don't take my remarks as being negative, I just like to straighten things out as clear as possible.

No problem.


>> On 10 Apr 2018, at 22:14, Alistair Grant <[hidden email]> wrote:
>>
>> Hi Sven,
>>
>> On 10 April 2018 at 19:36, Sven Van Caekenberghe <[hidden email]> wrote:
>>> I have trouble understanding your problem analysis, and how your proposed solution, would solve it.
>>
>> The discussion started with my proposal to modify the Zinc streams to
>> check the return value of #next for nil rather than using #atEnd to
>> iterate over the underlying stream.  You correctly pointed out that
>> the intention behind #atEnd is that it can be used to control
>> iterating over streams.
>
> OK
>
>>> Where is it being said that #next and/or #atEnd should be blocking or non-blocking ?
>>
>> There is existing code that assumes that #atEnd is non-blocking and
>> that #next is allowed block.  I believe that we should keep those
>> conditions.
>
> I fail to see where that is written down, either way. Can you point me to comments stating that, I would really like to know ?
I'm not aware of it being written down, just that ever existing
implementation I'm aware of behaves this way.

On the other hand, making #atEnd blocking breaks Eliot's REPL sample
(in Squeak).



>>> How is this related to how EOF is signalled ?
>>
>> Because, combined with terminal EOF not being known until the user
>> explicitly flags it (with Ctrl-D) it means that #atEnd can't be used
>> for iterating over input from stdin connected to a terminal.
>
> This seems to me like an exception that only holds for one particular stream in one particular scenario (interactive stdin). I might be wrong.
>
>>> It seems to me that there are quite a few classes of streams that are 'special' in the sense that #next could be blocking and/or #atEnd could be unclear - socket/network streams, serial streams, maybe stdio (interactive or not). Without a message like #isDataAvailable you cannot handle those without blocking.
>>
>> Right.  I think this is a distraction (I was trying to explain some
>> details, but it's causing more confusion instead of helping).
>>
>> The important point is that #atEnd doesn't work for iterating over
>> streams with terminal input
>
> Maybe you should also point to the actual code that fails. I mean you showed a partial stack trace, but not how you got there, precisely. How does the application reading from an interactive stdin do to get into trouble ?
Included below.


>>> Reading from stdin seems like a very rare case for a Smalltalk system (not that it should not be possible).
>>
>> There's been quite a bit of discussion and several projects recently
>> related to using pharo for scripting, so it may become more common.
>> E.g.
>>
>> https://www.quora.com/Can-Smalltalk-be-a-batch-file-scripting-language/answer/Philippe-Back-1?share=c19bfc95
>> https://github.com/rajula96reddy/pharo-cli
>
> Still, it is not common at all.
>
>>> I have a feeling that too much functionality is being pushed into too small an API.
>>
>> This is just about how should Zinc streams be iterating over the
>> underlying streams.  You didn't like checking the result of #next for
>> nil since it isn't general, correctly pointing out that nil is a valid
>> value for non-byte oriented streams.  But #atEnd doesn't work for
>> stdin from a terminal.
>>
>>
>> At this point I think there are three options:
>>
>> 1. Modify Zinc to check the return value of #next instead of using #atEnd.
>>
>> This is what all existing character / byte oriented streams in Squeak
>> and Pharo do.  At that point the Zinc streams can be used on all file
>> / stdio input and output.
>
> I agree that such code exists in many places, but there is lots of stream reading that does not check for nils.
Right.  Streams can be categorised in many ways, but for this
discussion I think streams are broken in to two types:

1) Byte / Character oriented
2) All others

For historical reasons, byte / character oriented streams need to
check for EOF by using "stream next == nil" and all other streams
should use #atEnd.

This avoids the "nil being part of the domain" issue that was
discussed earlier in the thread.


>> 2. Modify all streams to signal EOF in some other way, i.e. a sentinel
>> or notification / exception.
>>
>> This is what we were discussing below.  But it is a decent chunk of
>> work with significant impact on the existing code base.
>
> Agreed. This would be a future extension.
>
>> 3. Require anyone who wants to read from stdin to code around Zinc's
>> inability to handle terminal input.
>>
>> I'd prefer to avoid this option if possible.
>
> See higher for a more concrete usage example request.

testAtEnd.st
--
| ch stream string stdin |

'stdio.cs' asFileReference fileIn.
"stdin := FileStream stdin."
stdin := ZnCharacterReadStream on:
    (ZnBufferedReadStream on:
        Stdio stdin).
stream := (String new: 100) writeStream.
ch := stdin next.
[ ch == nil ] whileFalse: [
    stream nextPut: ch.
    ch := stdin next. ].
string := stream contents.
FileStream stdout
    nextPutAll: string; lf;
    nextPutAll: 'Characters read: ';
    nextPutAll: string size asString;
    lf.
Smalltalk snapshot: false andQuit: true.
--

Execute with:

./pharo --headless Pharo7.0-64bit-e76f1a2.image testAtEnd.st

and type Ctrl-D gives:


'Errors in script loaded from testAtEnd.st'
MessageNotUnderstood: receiver of "<" is nil
UndefinedObject(Object)>>doesNotUnderstand: #<
ZnUTF8Encoder>>nextCodePointFromStream:
ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
ZnCharacterReadStream>>nextElement
ZnCharacterReadStream(ZnEncodedReadStream)>>next
UndefinedObject>>DoIt
OpalCompiler>>evaluate


Using #atEnd to control the loop instead of "stdin next == nil"
produces the same result.

Replacing stdin with FileStream stdin makes the script work.

stdio.cs fixes a bug in StdioStream which really isn't part of this
discussion (PR to be submitted).

Cheers,
Alistair




>> Does that clarify the situation?
>
> Yes, it helps. Thanks. But questions remain.
>
>> Thanks,
>> Alistair
>>
>>
>>
>>>> On 10 Apr 2018, at 18:30, Alistair Grant <[hidden email]> wrote:
>>>>
>>>> First a quick update:
>>>>
>>>> After doing some work on primitiveFileAtEnd, #atEnd now answers
>>>> correctly for files that don't report their size correctly, e.g.
>>>> /dev/urandom and /proc/cpuinfo, whether the files are opened directly or
>>>> redirected through stdin.
>>>>
>>>> However determining whether stdin from a terminal has reached the end of
>>>> file can't be done without making #atEnd blocking since we have to wait
>>>> for the user to flag the end of file, e.g. by typing Ctrl-D.  And #atEnd
>>>> is assumed to be non-blocking.
>>>>
>>>> So currently using ZnCharacterReadStream with stdin from a terminal will
>>>> result in a stack dump similar to:
>>>>
>>>> MessageNotUnderstood: receiver of "<" is nil
>>>> UndefinedObject(Object)>>doesNotUnderstand: #<
>>>> ZnUTF8Encoder>>nextCodePointFromStream:
>>>> ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
>>>> ZnCharacterReadStream>>nextElement
>>>> ZnCharacterReadStream(ZnEncodedReadStream)>>next
>>>> UndefinedObject>>DoIt
>>>>
>>>>
>>>> Going back through the various suggestions that have been made regarding
>>>> using a sentinel object vs. raising a notification / exception, my
>>>> (still to be polished) suggestion is to:
>>>>
>>>> 1. Add an endOfStream instance variable
>>>> 2. When the end of the stream is reached answer the value of the
>>>>  instance variable (i.e. the result of sending #value to the variable).
>>>> 3. The initial default value would be a block that raises a Deprecation
>>>>  warning and then returns nil.  This would allow existing code to
>>>>  function for a changeover period.
>>>> 4. At the end of the deprecation period the default value would be
>>>>  changed to a unique sentinel object which would answer itself as its
>>>>  #value.
>>>>
>>>> At any time users of the stream can set their own sentinel, including a
>>>> block that raises an exception.
>>>>
>>>>
>>>> Cheers,
>>>> Alistair
>>>>
>>>>
>>>> On 4 April 2018 at 19:24, Stephane Ducasse <[hidden email]> wrote:
>>>>> Thanks for this discussion.
>>>>>
>>>>> On Wed, Apr 4, 2018 at 1:37 PM, Sven Van Caekenberghe <[hidden email]> wrote:
>>>>>> Alistair,
>>>>>>
>>>>>> First off, thanks for the discussions and your contributions, I really appreciate them.
>>>>>>
>>>>>> But I want to have a discussion at the high level of the definition and semantics of the stream API in Pharo.
>>>>>>
>>>>>>> On 4 Apr 2018, at 13:20, Alistair Grant <[hidden email]> wrote:
>>>>>>>
>>>>>>> On 4 April 2018 at 12:56, Sven Van Caekenberghe <[hidden email]> wrote:
>>>>>>>> Playing a bit devil's advocate, the idea is that, in general,
>>>>>>>>
>>>>>>>> [ stream atEnd] whileFalse: [ stream next. "..." ].
>>>>>>>>
>>>>>>>> is no longer allowed ?
>>>>>>>
>>>>>>> It hasn't been allowed "forever" [1].  It's just been misused for
>>>>>>> almost as long.
>>>>>>>
>>>>>>> [1] Time began when stdio stream support was introduced. :-)
>>>>>>
>>>>>> I am still not convinced. Another way to put it would be that the old #atEnd or #upToEnd do not make sense for these streams and some new loop is needed, based on a new test (it exists for socket streams already).
>>>>>>
>>>>>> [ stream isDataAvailable ] whileTrue: [ stream next ]
>>>>>>
>>>>>>>> And you want to replace it with
>>>>>>>>
>>>>>>>> [ stream next ifNil: [ false ] ifNotNil: [ :x | "..." true ] whileTrue.
>>>>>>>>
>>>>>>>> That is a pretty big change, no ?
>>>>>>>
>>>>>>> That's the way quite a bit of code already operates.
>>>>>>>
>>>>>>> As Denis pointed out, it's obviously problematic in the general sense,
>>>>>>> since nil can be embedded in non-byte oriented streams.  I suspect
>>>>>>> that in practice not many people write code that reads streams from
>>>>>>> both byte oriented and non-byte oriented streams.
>>>>>>
>>>>>> Maybe yes, maybe no. As Denis' example shows there is a clear definition problem.
>>>>>>
>>>>>> And I do use streams of byte arrays or strings all the time, this is really important. I want my parsers to work on all kinds of streams.
>>>>>>
>>>>>>>> I think/feel like a proper EOF exception would be better, more correct.
>>>>>>>>
>>>>>>>> [ [ stream next. "..." true ] on: EOF do: [ false ] ] whileTrue.
>>>>>>>
>>>>>>> I agree, but the email thread Nicolas pointed to raises some
>>>>>>> performance questions about this approach.  It should be
>>>>>>> straightforward to do a basic performance comparison which I'll get
>>>>>>> around to if other objections aren't raised.
>>>>>>
>>>>>> Reading in bigger blocks, using #readInto:startingAt:count: (which is basically Unix's (2) Read sys call), would solve performance problems, I think.
>>>>>>
>>>>>>>> Will we throw away #atEnd then ? Do we need it if we cannot use it ?
>>>>>>>
>>>>>>> Unix file i/o returns EOF if the end of file has been reach OR if an
>>>>>>> error occurs.  You should still check #atEnd after reading past the
>>>>>>> end of the file to make sure no error occurred.  Another part of the
>>>>>>> primitive change I'm proposing is to return additional information
>>>>>>> about what went wrong in the event of an error.
>>>>>>
>>>>>> I am sorry, but this kind of semantics (the OR) is way too complex at the general image level, it is too specific and based on certain underlying implementation details.
>>>>>>
>>>>>> Sven
>>>>>>
>>>>>>> We could modify the read primitive so that it fails if an error has
>>>>>>> occurred, and then #atEnd wouldn't be required.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Alistair
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>> On 4 Apr 2018, at 12:41, Alistair Grant <[hidden email]> wrote:
>>>>>>>>>
>>>>>>>>> Hi Nicolas,
>>>>>>>>>
>>>>>>>>> On 4 April 2018 at 12:36, Nicolas Cellier
>>>>>>>>> <[hidden email]> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2018-04-04 12:18 GMT+02:00 Alistair Grant <[hidden email]>:
>>>>>>>>>>>
>>>>>>>>>>> Hi Sven,
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Apr 04, 2018 at 11:32:02AM +0200, Sven Van Caekenberghe wrote:
>>>>>>>>>>>> Somehow, somewhere there was a change to the implementation of the
>>>>>>>>>>>> primitive called by some streams' #atEnd.
>>>>>>>>>>>
>>>>>>>>>>> That's a proposed change by me, but it hasn't been integrated yet.  So
>>>>>>>>>>> the discussion below should apply to the current stable vm (from August
>>>>>>>>>>> last year).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> IIRC, someone said it is implemented as 'remaining size being zero'
>>>>>>>>>>>> and some virtual unix files like /dev/random are zero sized.
>>>>>>>>>>>
>>>>>>>>>>> Currently, for files other than sdio (stdout, stderr, stdin) it is
>>>>>>>>>>> effectively defined as:
>>>>>>>>>>>
>>>>>>>>>>> atEnd := stream position >= stream size
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> And, as you say, plenty of virtual unix files report size 0.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Now, all kinds of changes are being done image size to work around this.
>>>>>>>>>>>
>>>>>>>>>>> I would phrase this slightly differently :-)
>>>>>>>>>>>
>>>>>>>>>>> Some code does the right thing, while other code doesn't.  E.g.:
>>>>>>>>>>>
>>>>>>>>>>> MultiByteFileStream>>upToEnd is good, while
>>>>>>>>>>> FileStream>>contents is incorrect
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> I am a strong believer in simple, real (i.e. infinite) streams, but I
>>>>>>>>>>>> am not sure we are doing the right thing here.
>>>>>>>>>>>>
>>>>>>>>>>>> Point is, I am not sure #next returning nil is official and universal.
>>>>>>>>>>>>
>>>>>>>>>>>> Consider the comments:
>>>>>>>>>>>>
>>>>>>>>>>>> Stream>>#next
>>>>>>>>>>>> "Answer the next object accessible by the receiver."
>>>>>>>>>>>>
>>>>>>>>>>>> ReadStream>>#next
>>>>>>>>>>>> "Primitive. Answer the next object in the Stream represented by the
>>>>>>>>>>>> receiver. Fail if the collection of this stream is not an Array or a
>>>>>>>>>>>> String.
>>>>>>>>>>>> Fail if the stream is positioned at its end, or if the position is out
>>>>>>>>>>>> of
>>>>>>>>>>>> bounds in the collection. Optional. See Object documentation
>>>>>>>>>>>> whatIsAPrimitive."
>>>>>>>>>>>>
>>>>>>>>>>>> Note how there is no talk about returning nil !
>>>>>>>>>>>>
>>>>>>>>>>>> I think we should discuss about this first.
>>>>>>>>>>>>
>>>>>>>>>>>> Was the low level change really correct and the right thing to do ?
>>>>>>>>>>>
>>>>>>>>>>> The primitive change proposed doesn't affect this discussion.  It will
>>>>>>>>>>> mean that #atEnd returns false (correctly) sometimes, while currently it
>>>>>>>>>>> returns true (incorrectly).  The end result is still incorrect, e.g.
>>>>>>>>>>> #contents returns an empty string for /proc/cpuinfo.
>>>>>>>>>>>
>>>>>>>>>>> You're correct about no mention of nil, but we have:
>>>>>>>>>>>
>>>>>>>>>>> FileStream>>next
>>>>>>>>>>>
>>>>>>>>>>>     (position >= readLimit and: [self atEnd])
>>>>>>>>>>>             ifTrue: [^nil]
>>>>>>>>>>>             ifFalse: [^collection at: (position := position + 1)]
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> which has been around for a long time (I suspect, before Pharo existed).
>>>>>>>>>>>
>>>>>>>>>>> Having said that, I think that raising an exception is a better
>>>>>>>>>>> solution, but it is a much, much bigger change than the one I proposed
>>>>>>>>>>> in https://github.com/pharo-project/pharo/pull/1180.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Alistair
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>> yes, if you are after universal behavior englobing Unix streams, the
>>>>>>>>>> Exception might be the best way.
>>>>>>>>>> Because on special stream you can't allways say in advance, you have to try.
>>>>>>>>>> That's the solution adopted by authors of Xtreams.
>>>>>>>>>> But there is a runtime penalty associated to it.
>>>>>>>>>>
>>>>>>>>>> The penalty once was so high that my proposal to generalize EndOfStream
>>>>>>>>>> usage was rejected a few years ago by AndreaRaab.
>>>>>>>>>> http://forum.world.st/EndOfStream-unused-td68806.html
>>>>>>>>>
>>>>>>>>> Thanks for this, I'll definitely take a look.
>>>>>>>>>
>>>>>>>>> Do you have a sense of how Denis' suggestion of using an EndOfStream
>>>>>>>>> object would compare?
>>>>>>>>>
>>>>>>>>> It would keep the same coding style, but avoid the problems with nil.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Alistair
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> I have regularly benched Xtreams, but stopped a few years ago.
>>>>>>>>>> Maybe i can excavate and pass on newer VM.
>>>>>>>>>>
>>>>>>>>>> In the mean time, i had experimented a programmable end of stream behavior
>>>>>>>>>> (via a block, or any other valuable)
>>>>>>>>>> http://www.squeaksource.com/XTream.htm
>>>>>>>>>> so as to reconcile performance and universality, but it was a source of
>>>>>>>>>> complexification at implementation side.
>>>>>>>>>>
>>>>>>>>>> Nicolas
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Note also that a Guille introduced something new, #closed which is
>>>>>>>>>>>> related to the difference between having no more elements (maybe right now,
>>>>>>>>>>>> like an open network stream) and never ever being able to produce more data.
>>>>>>>>>>>>
>>>>>>>>>>>> Sven
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>> <stdio.cs>
>
>

stdio.cs (1K) Download Attachment
123