Changed #atEnd primitive - #atEnd vs #next returning nil

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Denis Kudriashov

2018-04-11 8:32 GMT+02:00 Alistair Grant <[hidden email]>:
>>> Where is it being said that #next and/or #atEnd should be blocking or non-blocking ?
>>
>> There is existing code that assumes that #atEnd is non-blocking and
>> that #next is allowed to block.  I believe that we should keep those
>> conditions.
>
> I fail to see where that is written down, either way. Can you point me to comments stating that, I would really like to know ?

I'm not aware of it being written down, just that every existing
implementation I'm aware of behaves this way.

On the other hand, making #atEnd blocking breaks Eliot's REPL sample
(in Squeak).

Could you post that example here, please?
 



>>> How is this related to how EOF is signalled ?
>>
>> Because, combined with terminal EOF not being known until the user
>> explicitly flags it (with Ctrl-D) it means that #atEnd can't be used
>> for iterating over input from stdin connected to a terminal.
>
> This seems to me like an exception that only holds for one particular stream in one particular scenario (interactive stdin). I might be wrong.
>
>>> It seems to me that there are quite a few classes of streams that are 'special' in the sense that #next could be blocking and/or #atEnd could be unclear - socket/network streams, serial streams, maybe stdio (interactive or not). Without a message like #isDataAvailable you cannot handle those without blocking.
>>
>> Right.  I think this is a distraction (I was trying to explain some
>> details, but it's causing more confusion instead of helping).
>>
>> The important point is that #atEnd doesn't work for iterating over
>> streams with terminal input
>
> Maybe you should also point to the actual code that fails. I mean you showed a partial stack trace, but not how you got there, precisely. What does the application reading from an interactive stdin do to get into trouble ?

Included below.


>>> Reading from stdin seems like a very rare case for a Smalltalk system (not that it should not be possible).
>>
>> There's been quite a bit of discussion and several projects recently
>> related to using pharo for scripting, so it may become more common.
>> E.g.
>>
>> https://www.quora.com/Can-Smalltalk-be-a-batch-file-scripting-language/answer/Philippe-Back-1?share=c19bfc95
>> https://github.com/rajula96reddy/pharo-cli
>
> Still, it is not common at all.
>
>>> I have a feeling that too much functionality is being pushed into too small an API.
>>
>> This is just about how Zinc streams should be iterating over the
>> underlying streams.  You didn't like checking the result of #next for
>> nil since it isn't general, correctly pointing out that nil is a valid
>> value for non-byte oriented streams.  But #atEnd doesn't work for
>> stdin from a terminal.
>>
>>
>> At this point I think there are three options:
>>
>> 1. Modify Zinc to check the return value of #next instead of using #atEnd.
>>
>> This is what all existing character / byte oriented streams in Squeak
>> and Pharo do.  At that point the Zinc streams can be used on all file
>> / stdio input and output.
>
> I agree that such code exists in many places, but there is lots of stream reading that does not check for nils.

Right.  Streams can be categorised in many ways, but for this
discussion I think streams fall into two types:

1) Byte / Character oriented
2) All others

For historical reasons, byte / character oriented streams need to
check for EOF by using "stream next == nil" and all other streams
should use #atEnd.

This avoids the "nil being part of the domain" issue that was
discussed earlier in the thread.
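
A minimal sketch of the two conventions, using in-memory streams rather than files.  It assumes that an in-memory ReadStream answers nil when read past its end; the variable names are only for illustration.

```smalltalk
| bytes objects byteCount seenWithNilTest seenWithAtEnd |

"Byte oriented: nil is never an element, so it can mark the end."
bytes := #[104 105 10] readStream.
byteCount := 0.
[ bytes next == nil ] whileFalse: [ byteCount := byteCount + 1 ].

"General objects: nil is a legal element, so the nil test stops early."
objects := #(1 nil 3) readStream.
seenWithNilTest := 0.
[ objects next == nil ] whileFalse: [ seenWithNilTest := seenWithNilTest + 1 ].

"The same stream iterated with #atEnd visits every element."
objects reset.
seenWithAtEnd := 0.
[ objects atEnd ] whileFalse: [
    objects next.
    seenWithAtEnd := seenWithAtEnd + 1 ].
```

Under that assumption the nil test sees all three bytes but only the first element of the object stream, while the #atEnd loop sees all three objects.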


>> 2. Modify all streams to signal EOF in some other way, i.e. a sentinel
>> or notification / exception.
>>
>> This is what we were discussing below.  But it is a decent chunk of
>> work with significant impact on the existing code base.
>
> Agreed. This would be a future extension.
>
>> 3. Require anyone who wants to read from stdin to code around Zinc's
>> inability to handle terminal input.
>>
>> I'd prefer to avoid this option if possible.
>
> See higher for a more concrete usage example request.


testAtEnd.st
--
| ch stream string stdin |

'stdio.cs' asFileReference fileIn.
"stdin := FileStream stdin."
stdin := ZnCharacterReadStream on:
    (ZnBufferedReadStream on:
        Stdio stdin).
stream := (String new: 100) writeStream.
ch := stdin next.
[ ch == nil ] whileFalse: [
    stream nextPut: ch.
    ch := stdin next. ].
string := stream contents.
FileStream stdout
    nextPutAll: string; lf;
    nextPutAll: 'Characters read: ';
    nextPutAll: string size asString;
    lf.
Smalltalk snapshot: false andQuit: true.
--

Execute with:

./pharo --headless Pharo7.0-64bit-e76f1a2.image testAtEnd.st

and typing Ctrl-D gives:


'Errors in script loaded from testAtEnd.st'
MessageNotUnderstood: receiver of "<" is nil
UndefinedObject(Object)>>doesNotUnderstand: #<
ZnUTF8Encoder>>nextCodePointFromStream:
ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
ZnCharacterReadStream>>nextElement
ZnCharacterReadStream(ZnEncodedReadStream)>>next
UndefinedObject>>DoIt
OpalCompiler>>evaluate


Using #atEnd to control the loop instead of "stdin next == nil"
produces the same result.
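
For reference, a sketch of what the #atEnd variant of the read loop looks like.  It reuses only the names from testAtEnd.st above and is meant as an illustration of the loop shape, not as a fix.

```smalltalk
"Variant of the read loop in testAtEnd.st, driven by #atEnd
 instead of testing the result of #next for nil."
| stdin stream |
stdin := ZnCharacterReadStream on:
    (ZnBufferedReadStream on:
        Stdio stdin).
stream := (String new: 100) writeStream.
[ stdin atEnd ] whileFalse: [
    stream nextPut: stdin next ].
```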

Replacing stdin with FileStream stdin makes the script work.

stdio.cs fixes a bug in StdioStream which really isn't part of this
discussion (PR to be submitted).

Cheers,
Alistair




>> Does that clarify the situation?
>
> Yes, it helps. Thanks. But questions remain.
>
>> Thanks,
>> Alistair
>>
>>
>>
>>>> On 10 Apr 2018, at 18:30, Alistair Grant <[hidden email]> wrote:
>>>>
>>>> First a quick update:
>>>>
>>>> After doing some work on primitiveFileAtEnd, #atEnd now answers
>>>> correctly for files that don't report their size correctly, e.g.
>>>> /dev/urandom and /proc/cpuinfo, whether the files are opened directly or
>>>> redirected through stdin.
>>>>
>>>> However determining whether stdin from a terminal has reached the end of
>>>> file can't be done without making #atEnd blocking since we have to wait
>>>> for the user to flag the end of file, e.g. by typing Ctrl-D.  And #atEnd
>>>> is assumed to be non-blocking.
>>>>
>>>> So currently using ZnCharacterReadStream with stdin from a terminal will
>>>> result in a stack dump similar to:
>>>>
>>>> MessageNotUnderstood: receiver of "<" is nil
>>>> UndefinedObject(Object)>>doesNotUnderstand: #<
>>>> ZnUTF8Encoder>>nextCodePointFromStream:
>>>> ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
>>>> ZnCharacterReadStream>>nextElement
>>>> ZnCharacterReadStream(ZnEncodedReadStream)>>next
>>>> UndefinedObject>>DoIt
>>>>
>>>>
>>>> Going back through the various suggestions that have been made regarding
>>>> using a sentinel object vs. raising a notification / exception, my
>>>> (still to be polished) suggestion is to:
>>>>
>>>> 1. Add an endOfStream instance variable
>>>> 2. When the end of the stream is reached answer the value of the
>>>>  instance variable (i.e. the result of sending #value to the variable).
>>>> 3. The initial default value would be a block that raises a Deprecation
>>>>  warning and then returns nil.  This would allow existing code to
>>>>  function for a changeover period.
>>>> 4. At the end of the deprecation period the default value would be
>>>>  changed to a unique sentinel object which would answer itself as its
>>>>  #value.
>>>>
>>>> At any time users of the stream can set their own sentinel, including a
>>>> block that raises an exception.
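
A rough sketch of how steps 2 and 4 of that proposal could behave, written with a block standing in for #next so it can be run in a playground.  The names readNext, sentinel and endOfStream are hypothetical, and the sketch assumes Object>>value answers the receiver, as it does in Pharo.

```smalltalk
| source position endOfStream readNext sentinel results item |
source := #(1 nil 3).
position := 0.

"Step 4: a unique sentinel that answers itself to #value."
sentinel := Object new.
endOfStream := sentinel.

"Stand-in for #next: past the end, answer 'endOfStream value' (step 2)."
readNext := [
    position < source size
        ifTrue: [ position := position + 1. source at: position ]
        ifFalse: [ endOfStream value ] ].

results := OrderedCollection new.
[ (item := readNext value) == sentinel ] whileFalse: [ results add: item ].
"results now holds 1, nil and 3: the embedded nil no longer ends the loop."

"A client that prefers an exception installs a block instead."
endOfStream := [ Error signal: 'end of stream' ].
```

Because both a plain object and a block respond to #value, the read path does not need to know which kind of end-of-stream behaviour the client chose.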
>>>>
>>>>
>>>> Cheers,
>>>> Alistair
>>>>
>>>>
>>>> On 4 April 2018 at 19:24, Stephane Ducasse <[hidden email]> wrote:
>>>>> Thanks for this discussion.
>>>>>
>>>>> On Wed, Apr 4, 2018 at 1:37 PM, Sven Van Caekenberghe <[hidden email]> wrote:
>>>>>> Alistair,
>>>>>>
>>>>>> First off, thanks for the discussions and your contributions, I really appreciate them.
>>>>>>
>>>>>> But I want to have a discussion at the high level of the definition and semantics of the stream API in Pharo.
>>>>>>
>>>>>>> On 4 Apr 2018, at 13:20, Alistair Grant <[hidden email]> wrote:
>>>>>>>
>>>>>>> On 4 April 2018 at 12:56, Sven Van Caekenberghe <[hidden email]> wrote:
>>>>>>>> Playing a bit devil's advocate, the idea is that, in general,
>>>>>>>>
>>>>>>>> [ stream atEnd] whileFalse: [ stream next. "..." ].
>>>>>>>>
>>>>>>>> is no longer allowed ?
>>>>>>>
>>>>>>> It hasn't been allowed "forever" [1].  It's just been misused for
>>>>>>> almost as long.
>>>>>>>
>>>>>>> [1] Time began when stdio stream support was introduced. :-)
>>>>>>
>>>>>> I am still not convinced. Another way to put it would be that the old #atEnd or #upToEnd do not make sense for these streams and some new loop is needed, based on a new test (it exists for socket streams already).
>>>>>>
>>>>>> [ stream isDataAvailable ] whileTrue: [ stream next ]
>>>>>>
>>>>>>>> And you want to replace it with
>>>>>>>>
>>>>>>>> [ stream next ifNil: [ false ] ifNotNil: [ :x | "..." true ] ] whileTrue.
>>>>>>>>
>>>>>>>> That is a pretty big change, no ?
>>>>>>>
>>>>>>> That's the way quite a bit of code already operates.
>>>>>>>
>>>>>>> As Denis pointed out, it's obviously problematic in the general sense,
>>>>>>> since nil can be embedded in non-byte oriented streams.  I suspect
>>>>>>> that in practice not many people write code that reads streams from
>>>>>>> both byte oriented and non-byte oriented streams.
>>>>>>
>>>>>> Maybe yes, maybe no. As Denis' example shows there is a clear definition problem.
>>>>>>
>>>>>> And I do use streams of byte arrays or strings all the time, this is really important. I want my parsers to work on all kinds of streams.
>>>>>>
>>>>>>>> I think/feel like a proper EOF exception would be better, more correct.
>>>>>>>>
>>>>>>>> [ [ stream next. "..." true ] on: EOF do: [ false ] ] whileTrue.
>>>>>>>
>>>>>>> I agree, but the email thread Nicolas pointed to raises some
>>>>>>> performance questions about this approach.  It should be
>>>>>>> straightforward to do a basic performance comparison which I'll get
>>>>>>> around to if other objections aren't raised.
>>>>>>
>>>>>> Reading in bigger blocks, using #readInto:startingAt:count: (which is basically Unix's read(2) system call), would solve performance problems, I think.
>>>>>>
>>>>>>>> Will we throw away #atEnd then ? Do we need it if we cannot use it ?
>>>>>>>
>>>>>>> Unix file i/o returns EOF if the end of file has been reached OR if an
>>>>>>> error occurs.  You should still check #atEnd after reading past the
>>>>>>> end of the file to make sure no error occurred.  Another part of the
>>>>>>> primitive change I'm proposing is to return additional information
>>>>>>> about what went wrong in the event of an error.
>>>>>>
>>>>>> I am sorry, but this kind of semantics (the OR) is way too complex at the general image level, it is too specific and based on certain underlying implementation details.
>>>>>>
>>>>>> Sven
>>>>>>
>>>>>>> We could modify the read primitive so that it fails if an error has
>>>>>>> occurred, and then #atEnd wouldn't be required.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Alistair
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>> On 4 Apr 2018, at 12:41, Alistair Grant <[hidden email]> wrote:
>>>>>>>>>
>>>>>>>>> Hi Nicolas,
>>>>>>>>>
>>>>>>>>> On 4 April 2018 at 12:36, Nicolas Cellier
>>>>>>>>> <[hidden email]> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2018-04-04 12:18 GMT+02:00 Alistair Grant <[hidden email]>:
>>>>>>>>>>>
>>>>>>>>>>> Hi Sven,
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Apr 04, 2018 at 11:32:02AM +0200, Sven Van Caekenberghe wrote:
>>>>>>>>>>>> Somehow, somewhere there was a change to the implementation of the
>>>>>>>>>>>> primitive called by some streams' #atEnd.
>>>>>>>>>>>
>>>>>>>>>>> That's a proposed change by me, but it hasn't been integrated yet.  So
>>>>>>>>>>> the discussion below should apply to the current stable vm (from August
>>>>>>>>>>> last year).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> IIRC, someone said it is implemented as 'remaining size being zero'
>>>>>>>>>>>> and some virtual unix files like /dev/random are zero sized.
>>>>>>>>>>>
>>>>>>>>>>> Currently, for files other than stdio (stdout, stderr, stdin) it is
>>>>>>>>>>> effectively defined as:
>>>>>>>>>>>
>>>>>>>>>>> atEnd := stream position >= stream size
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> And, as you say, plenty of virtual unix files report size 0.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Now, all kinds of changes are being done image side to work around this.
>>>>>>>>>>>
>>>>>>>>>>> I would phrase this slightly differently :-)
>>>>>>>>>>>
>>>>>>>>>>> Some code does the right thing, while other code doesn't.  E.g.:
>>>>>>>>>>>
>>>>>>>>>>> MultiByteFileStream>>upToEnd is good, while
>>>>>>>>>>> FileStream>>contents is incorrect
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> I am a strong believer in simple, real (i.e. infinite) streams, but I
>>>>>>>>>>>> am not sure we are doing the right thing here.
>>>>>>>>>>>>
>>>>>>>>>>>> Point is, I am not sure #next returning nil is official and universal.
>>>>>>>>>>>>
>>>>>>>>>>>> Consider the comments:
>>>>>>>>>>>>
>>>>>>>>>>>> Stream>>#next
>>>>>>>>>>>> "Answer the next object accessible by the receiver."
>>>>>>>>>>>>
>>>>>>>>>>>> ReadStream>>#next
>>>>>>>>>>>> "Primitive. Answer the next object in the Stream represented by the
>>>>>>>>>>>> receiver. Fail if the collection of this stream is not an Array or a
>>>>>>>>>>>> String.
>>>>>>>>>>>> Fail if the stream is positioned at its end, or if the position is out
>>>>>>>>>>>> of
>>>>>>>>>>>> bounds in the collection. Optional. See Object documentation
>>>>>>>>>>>> whatIsAPrimitive."
>>>>>>>>>>>>
>>>>>>>>>>>> Note how there is no talk about returning nil !
>>>>>>>>>>>>
>>>>>>>>>>>> I think we should discuss about this first.
>>>>>>>>>>>>
>>>>>>>>>>>> Was the low level change really correct and the right thing to do ?
>>>>>>>>>>>
>>>>>>>>>>> The primitive change proposed doesn't affect this discussion.  It will
>>>>>>>>>>> mean that #atEnd returns false (correctly) sometimes, while currently it
>>>>>>>>>>> returns true (incorrectly).  The end result is still incorrect, e.g.
>>>>>>>>>>> #contents returns an empty string for /proc/cpuinfo.
>>>>>>>>>>>
>>>>>>>>>>> You're correct about no mention of nil, but we have:
>>>>>>>>>>>
>>>>>>>>>>> FileStream>>next
>>>>>>>>>>>
>>>>>>>>>>>     (position >= readLimit and: [self atEnd])
>>>>>>>>>>>             ifTrue: [^nil]
>>>>>>>>>>>             ifFalse: [^collection at: (position := position + 1)]
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> which has been around for a long time (I suspect, before Pharo existed).
>>>>>>>>>>>
>>>>>>>>>>> Having said that, I think that raising an exception is a better
>>>>>>>>>>> solution, but it is a much, much bigger change than the one I proposed
>>>>>>>>>>> in https://github.com/pharo-project/pharo/pull/1180.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Alistair
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>> yes, if you are after universal behavior encompassing Unix streams, the
>>>>>>>>>> Exception might be the best way.
>>>>>>>>>> Because on special streams you can't always say in advance, you have to try.
>>>>>>>>>> That's the solution adopted by the authors of Xtreams.
>>>>>>>>>> But there is a runtime penalty associated with it.
>>>>>>>>>>
>>>>>>>>>> The penalty once was so high that my proposal to generalize EndOfStream
>>>>>>>>>> usage was rejected a few years ago by Andreas Raab.
>>>>>>>>>> http://forum.world.st/EndOfStream-unused-td68806.html
>>>>>>>>>
>>>>>>>>> Thanks for this, I'll definitely take a look.
>>>>>>>>>
>>>>>>>>> Do you have a sense of how Denis' suggestion of using an EndOfStream
>>>>>>>>> object would compare?
>>>>>>>>>
>>>>>>>>> It would keep the same coding style, but avoid the problems with nil.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Alistair
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> I have regularly benched Xtreams, but stopped a few years ago.
>>>>>>>>>> Maybe I can dig the benchmarks out and run them on a newer VM.
>>>>>>>>>>
>>>>>>>>>> In the meantime, I had experimented with a programmable end of stream behavior
>>>>>>>>>> (via a block, or any other valuable)
>>>>>>>>>> http://www.squeaksource.com/XTream.htm
>>>>>>>>>> so as to reconcile performance and universality, but it added
>>>>>>>>>> complexity on the implementation side.
>>>>>>>>>>
>>>>>>>>>> Nicolas
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Note also that Guille introduced something new, #closed, which is
>>>>>>>>>>>> related to the difference between having no more elements (maybe right now,
>>>>>>>>>>>> like an open network stream) and never ever being able to produce more data.
>>>>>>>>>>>>
>>>>>>>>>>>> Sven
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>> <stdio.cs>
>
>

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

alistairgrant
Hi Denis,

On 11 April 2018 at 10:02, Denis Kudriashov <[hidden email]> wrote:

>
> 2018-04-11 8:32 GMT+02:00 Alistair Grant <[hidden email]>:
>>
>> >>> Where is it being said that #next and/or #atEnd should be blocking or
>> >>> non-blocking ?
>> >>
>> >> There is existing code that assumes that #atEnd is non-blocking and
>> >> that #next is allowed to block.  I believe that we should keep those
>> >> conditions.
>> >
>> > I fail to see where that is written down, either way. Can you point me
>> > to comments stating that, I would really like to know ?
>>
>> I'm not aware of it being written down, just that every existing
>> implementation I'm aware of behaves this way.
>>
>> On the other hand, making #atEnd blocking breaks Eliot's REPL sample
>> (in Squeak).
>
>
> Could you post that example here, please?

The code is loaded in Squeak using:

https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/buildspurtrunkreaderimage.sh

for 32 bit images.  It loads:

https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/LoadReader.st

which loads package CogTools-Listener in http://source.squeak.org/VMMaker

An image that automatically runs the code and nothing else is created in:

https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/StartReader.st


If you want to run it interactively you can load CogTools-Listener and
do something like:

StdioListener new
    quitOnEof: false;
    run

If you modify #atEnd to block it will result in the "squeak>" input
prompt being printed in the terminal after the input has been entered.

The code can be loaded into Pharo and basically works, but the output
tends to be hidden behind the next input prompt because it uses #cr
instead of #lf.  You can easily modify StdioListener>>initialize to
set the line end convention in stdout.

NOTE: It is not intended to be a release quality implementation of an
evaluation loop.  The whole purpose, as I understand it, is for it to be
as simple as possible to assist in tracking down issues using the VM
simulator.  It runs minimal code to get to the point of waiting for
user input and then allows an expression that causes problems to be
entered and traced using the simulator.

Cheers,
Alistair



>> >>> How is this related to how EOF is signalled ?
>> >>
>> >> Because, combined with terminal EOF not being known until the user
>> >> explicitly flags it (with Ctrl-D) it means that #atEnd can't be used
>> >> for iterating over input from stdin connected to a terminal.
>> >
>> > This seems to me like an exception that only holds for one particular
>> > stream in one particular scenario (interactive stdin). I might be wrong.
>> >
>> >>> It seems to me that there are quite a few classes of streams that are
>> >>> 'special' in the sense that #next could be blocking and/or #atEnd could be
>> >>> unclear - socket/network streams, serial streams, maybe stdio (interactive
>> >>> or not). Without a message like #isDataAvailable you cannot handle those
>> >>> without blocking.
>> >>
>> >> Right.  I think this is a distraction (I was trying to explain some
>> >> details, but it's causing more confusion instead of helping).
>> >>
>> >> The important point is that #atEnd doesn't work for iterating over
>> >> streams with terminal input
>> >
>> > Maybe you should also point to the actual code that fails. I mean you
>> > showed a partial stack trace, but not how you got there, precisely. What does
>> > the application reading from an interactive stdin do to get into trouble ?
>>
>> Included below.
>>
>>
>> >>> Reading from stdin seems like a very rare case for a Smalltalk system
>> >>> (not that it should not be possible).
>> >>
>> >> There's been quite a bit of discussion and several projects recently
>> >> related to using pharo for scripting, so it may become more common.
>> >> E.g.
>> >>
>> >>
>> >> https://www.quora.com/Can-Smalltalk-be-a-batch-file-scripting-language/answer/Philippe-Back-1?share=c19bfc95
>> >> https://github.com/rajula96reddy/pharo-cli
>> >
>> > Still, it is not common at all.
>> >
>> >>> I have a feeling that too much functionality is being pushed into too
>> >>> small an API.
>> >>
>> >> This is just about how Zinc streams should be iterating over the
>> >> underlying streams.  You didn't like checking the result of #next for
>> >> nil since it isn't general, correctly pointing out that nil is a valid
>> >> value for non-byte oriented streams.  But #atEnd doesn't work for
>> >> stdin from a terminal.
>> >>
>> >>
>> >> At this point I think there are three options:
>> >>
>> >> 1. Modify Zinc to check the return value of #next instead of using
>> >> #atEnd.
>> >>
>> >> This is what all existing character / byte oriented streams in Squeak
>> >> and Pharo do.  At that point the Zinc streams can be used on all file
>> >> / stdio input and output.
>> >
>> > I agree that such code exists in many places, but there is lots of
>> > stream reading that does not check for nils.
>>
>> Right.  Streams can be categorised in many ways, but for this
>> discussion I think streams fall into two types:
>>
>> 1) Byte / Character oriented
>> 2) All others
>>
>> For historical reasons, byte / character oriented streams need to
>> check for EOF by using "stream next == nil" and all other streams
>> should use #atEnd.
>>
>> This avoids the "nil being part of the domain" issue that was
>> discussed earlier in the thread.
>>
>>
>> >> 2. Modify all streams to signal EOF in some other way, i.e. a sentinel
>> >> or notification / exception.
>> >>
>> >> This is what we were discussing below.  But it is a decent chunk of
>> >> work with significant impact on the existing code base.
>> >
>> > Agreed. This would be a future extension.
>> >
>> >> 3. Require anyone who wants to read from stdin to code around Zinc's
>> >> inability to handle terminal input.
>> >>
>> >> I'd prefer to avoid this option if possible.
>> >
>> > See higher for a more concrete usage example request.
>>
>>
>> testAtEnd.st
>> --
>> | ch stream string stdin |
>>
>> 'stdio.cs' asFileReference fileIn.
>> "stdin := FileStream stdin."
>> stdin := ZnCharacterReadStream on:
>>     (ZnBufferedReadStream on:
>>         Stdio stdin).
>> stream := (String new: 100) writeStream.
>> ch := stdin next.
>> [ ch == nil ] whileFalse: [
>>     stream nextPut: ch.
>>     ch := stdin next. ].
>> string := stream contents.
>> FileStream stdout
>>     nextPutAll: string; lf;
>>     nextPutAll: 'Characters read: ';
>>     nextPutAll: string size asString;
>>     lf.
>> Smalltalk snapshot: false andQuit: true.
>> --
>>
>> Execute with:
>>
>> ./pharo --headless Pharo7.0-64bit-e76f1a2.image testAtEnd.st
>>
>> and typing Ctrl-D gives:
>>
>>
>> 'Errors in script loaded from testAtEnd.st'
>> MessageNotUnderstood: receiver of "<" is nil
>> UndefinedObject(Object)>>doesNotUnderstand: #<
>> ZnUTF8Encoder>>nextCodePointFromStream:
>> ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
>> ZnCharacterReadStream>>nextElement
>> ZnCharacterReadStream(ZnEncodedReadStream)>>next
>> UndefinedObject>>DoIt
>> OpalCompiler>>evaluate
>>
>>
>> Using #atEnd to control the loop instead of "stdin next == nil"
>> produces the same result.
>>
>> Replacing stdin with FileStream stdin makes the script work.
>>
>> stdio.cs fixes a bug in StdioStream which really isn't part of this
>> discussion (PR to be submitted).
>>
>> Cheers,
>> Alistair
>>
>>
>>
>>
>> >> Does that clarify the situation?
>> >
>> > Yes, it helps. Thanks. But questions remain.
>> >
>> >> Thanks,
>> >> Alistair
>> >>
>> >>
>> >>
>> >>>> On 10 Apr 2018, at 18:30, Alistair Grant <[hidden email]>
>> >>>> wrote:
>> >>>>
>> >>>> First a quick update:
>> >>>>
>> >>>> After doing some work on primitiveFileAtEnd, #atEnd now answers
>> >>>> correctly for files that don't report their size correctly, e.g.
>> >>>> /dev/urandom and /proc/cpuinfo, whether the files are opened directly
>> >>>> or
>> >>>> redirected through stdin.
>> >>>>
>> >>>> However determining whether stdin from a terminal has reached the end
>> >>>> of
>> >>>> file can't be done without making #atEnd blocking since we have to
>> >>>> wait
>> >>>> for the user to flag the end of file, e.g. by typing Ctrl-D.  And
>> >>>> #atEnd
>> >>>> is assumed to be non-blocking.
>> >>>>
>> >>>> So currently using ZnCharacterReadStream with stdin from a terminal
>> >>>> will
>> >>>> result in a stack dump similar to:
>> >>>>
>> >>>> MessageNotUnderstood: receiver of "<" is nil
>> >>>> UndefinedObject(Object)>>doesNotUnderstand: #<
>> >>>> ZnUTF8Encoder>>nextCodePointFromStream:
>> >>>> ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
>> >>>> ZnCharacterReadStream>>nextElement
>> >>>> ZnCharacterReadStream(ZnEncodedReadStream)>>next
>> >>>> UndefinedObject>>DoIt
>> >>>>
>> >>>>
>> >>>> Going back through the various suggestions that have been made
>> >>>> regarding
>> >>>> using a sentinel object vs. raising a notification / exception, my
>> >>>> (still to be polished) suggestion is to:
>> >>>>
>> >>>> 1. Add an endOfStream instance variable
>> >>>> 2. When the end of the stream is reached answer the value of the
>> >>>>  instance variable (i.e. the result of sending #value to the
>> >>>> variable).
>> >>>> 3. The initial default value would be a block that raises a
>> >>>> Deprecation
>> >>>>  warning and then returns nil.  This would allow existing code to
>> >>>>  function for a changeover period.
>> >>>> 4. At the end of the deprecation period the default value would be
>> >>>>  changed to a unique sentinel object which would answer itself as its
>> >>>>  #value.
>> >>>>
>> >>>> At any time users of the stream can set their own sentinel, including
>> >>>> a
>> >>>> block that raises an exception.
>> >>>>
>> >>>>
>> >>>> Cheers,
>> >>>> Alistair
>> >>>>
>> >>>>
>> >>>> On 4 April 2018 at 19:24, Stephane Ducasse <[hidden email]>
>> >>>> wrote:
>> >>>>> Thanks for this discussion.
>> >>>>>
>> >>>>> On Wed, Apr 4, 2018 at 1:37 PM, Sven Van Caekenberghe <[hidden email]>
>> >>>>> wrote:
>> >>>>>> Alistair,
>> >>>>>>
>> >>>>>> First off, thanks for the discussions and your contributions, I
>> >>>>>> really appreciate them.
>> >>>>>>
>> >>>>>> But I want to have a discussion at the high level of the definition
>> >>>>>> and semantics of the stream API in Pharo.
>> >>>>>>
>> >>>>>>> On 4 Apr 2018, at 13:20, Alistair Grant <[hidden email]>
>> >>>>>>> wrote:
>> >>>>>>>
>> >>>>>>> On 4 April 2018 at 12:56, Sven Van Caekenberghe <[hidden email]>
>> >>>>>>> wrote:
>> >>>>>>>> Playing a bit devil's advocate, the idea is that, in general,
>> >>>>>>>>
>> >>>>>>>> [ stream atEnd] whileFalse: [ stream next. "..." ].
>> >>>>>>>>
>> >>>>>>>> is no longer allowed ?
>> >>>>>>>
>> >>>>>>> It hasn't been allowed "forever" [1].  It's just been misused for
>> >>>>>>> almost as long.
>> >>>>>>>
>> >>>>>>> [1] Time began when stdio stream support was introduced. :-)
>> >>>>>>
>> >>>>>> I am still not convinced. Another way to put it would be that the
>> >>>>>> old #atEnd or #upToEnd do not make sense for these streams and some new loop
>> >>>>>> is needed, based on a new test (it exists for socket streams already).
>> >>>>>>
>> >>>>>> [ stream isDataAvailable ] whileTrue: [ stream next ]
>> >>>>>>
>> >>>>>>>> And you want to replace it with
>> >>>>>>>>
>> >>>>>>>> [ stream next ifNil: [ false ] ifNotNil: [ :x | "..." true ] ]
>> >>>>>>>> whileTrue.
>> >>>>>>>>
>> >>>>>>>> That is a pretty big change, no ?
>> >>>>>>>
>> >>>>>>> That's the way quite a bit of code already operates.
>> >>>>>>>
>> >>>>>>> As Denis pointed out, it's obviously problematic in the general
>> >>>>>>> sense,
>> >>>>>>> since nil can be embedded in non-byte oriented streams.  I suspect
>> >>>>>>> that in practice not many people write code that reads streams
>> >>>>>>> from
>> >>>>>>> both byte oriented and non-byte oriented streams.
>> >>>>>>
>> >>>>>> Maybe yes, maybe no. As Denis' example shows there is a clear
>> >>>>>> definition problem.
>> >>>>>>
>> >>>>>> And I do use streams of byte arrays or strings all the time, this
>> >>>>>> is really important. I want my parsers to work on all kinds of streams.
>> >>>>>>
>> >>>>>>>> I think/feel like a proper EOF exception would be better, more
>> >>>>>>>> correct.
>> >>>>>>>>
>> >>>>>>>> [ [ stream next. "..." true ] on: EOF do: [ false ] ] whileTrue.
>> >>>>>>>
>> >>>>>>> I agree, but the email thread Nicolas pointed to raises some
>> >>>>>>> performance questions about this approach.  It should be
>> >>>>>>> straightforward to do a basic performance comparison which I'll
>> >>>>>>> get
>> >>>>>>> around to if other objections aren't raised.
>> >>>>>>
>> >>>>>> Reading in bigger blocks, using #readInto:startingAt:count: (which
>> >>>>>> is basically Unix's read(2) system call), would solve performance problems, I
>> >>>>>> think.
>> >>>>>>
>> >>>>>>>> Will we throw away #atEnd then ? Do we need it if we cannot use
>> >>>>>>>> it ?
>> >>>>>>>
>> >>>>>>> Unix file i/o returns EOF if the end of file has been reached OR if
>> >>>>>>> an
>> >>>>>>> error occurs.  You should still check #atEnd after reading past
>> >>>>>>> the
>> >>>>>>> end of the file to make sure no error occurred.  Another part of
>> >>>>>>> the
>> >>>>>>> primitive change I'm proposing is to return additional information
>> >>>>>>> about what went wrong in the event of an error.
>> >>>>>>
>> >>>>>> I am sorry, but this kind of semantics (the OR) is way too complex
>> >>>>>> at the general image level, it is too specific and based on certain
>> >>>>>> underlying implementation details.
>> >>>>>>
>> >>>>>> Sven
>> >>>>>>
>> >>>>>>> We could modify the read primitive so that it fails if an error
>> >>>>>>> has
>> >>>>>>> occurred, and then #atEnd wouldn't be required.
>> >>>>>>>
>> >>>>>>> Cheers,
>> >>>>>>> Alistair
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>>> On 4 Apr 2018, at 12:41, Alistair Grant <[hidden email]>
>> >>>>>>>>> wrote:
>> >>>>>>>>>
>> >>>>>>>>> Hi Nicolas,
>> >>>>>>>>>
>> >>>>>>>>> On 4 April 2018 at 12:36, Nicolas Cellier
>> >>>>>>>>> <[hidden email]> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> 2018-04-04 12:18 GMT+02:00 Alistair Grant
>> >>>>>>>>>> <[hidden email]>:
>> >>>>>>>>>>>
>> >>>>>>>>>>> Hi Sven,
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Wed, Apr 04, 2018 at 11:32:02AM +0200, Sven Van
>> >>>>>>>>>>> Caekenberghe wrote:
>> >>>>>>>>>>>> Somehow, somewhere there was a change to the implementation
>> >>>>>>>>>>>> of the
>> >>>>>>>>>>>> primitive called by some streams' #atEnd.
>> >>>>>>>>>>>
>> >>>>>>>>>>> That's a proposed change by me, but it hasn't been integrated
>> >>>>>>>>>>> yet.  So
>> >>>>>>>>>>> the discussion below should apply to the current stable vm
>> >>>>>>>>>>> (from August
>> >>>>>>>>>>> last year).
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>> IIRC, someone said it is implemented as 'remaining size being
>> >>>>>>>>>>>> zero'
>> >>>>>>>>>>>> and some virtual unix files like /dev/random are zero sized.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Currently, for files other than stdio (stdout, stderr, stdin)
>> >>>>>>>>>>> it is
>> >>>>>>>>>>> effectively defined as:
>> >>>>>>>>>>>
>> >>>>>>>>>>> atEnd := stream position >= stream size
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> And, as you say, plenty of virtual unix files report size 0.
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>> Now, all kinds of changes are being done image side to work
>> >>>>>>>>>>>> around this.
>> >>>>>>>>>>>
>> >>>>>>>>>>> I would phrase this slightly differently :-)
>> >>>>>>>>>>>
>> >>>>>>>>>>> Some code does the right thing, while other code doesn't.
>> >>>>>>>>>>> E.g.:
>> >>>>>>>>>>>
>> >>>>>>>>>>> MultiByteFileStream>>upToEnd is good, while
>> >>>>>>>>>>> FileStream>>contents is incorrect
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>> I am a strong believer in simple, real (i.e. infinite)
>> >>>>>>>>>>>> streams, but I
>> >>>>>>>>>>>> am not sure we are doing the right thing here.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Point is, I am not sure #next returning nil is official and
>> >>>>>>>>>>>> universal.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Consider the comments:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Stream>>#next
>> >>>>>>>>>>>> "Answer the next object accessible by the receiver."
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> ReadStream>>#next
>> >>>>>>>>>>>> "Primitive. Answer the next object in the Stream represented
>> >>>>>>>>>>>> by the
>> >>>>>>>>>>>> receiver. Fail if the collection of this stream is not an
>> >>>>>>>>>>>> Array or a
>> >>>>>>>>>>>> String.
>> >>>>>>>>>>>> Fail if the stream is positioned at its end, or if the
>> >>>>>>>>>>>> position is out
>> >>>>>>>>>>>> of
>> >>>>>>>>>>>> bounds in the collection. Optional. See Object documentation
>> >>>>>>>>>>>> whatIsAPrimitive."
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Note how there is no talk about returning nil !
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> I think we should discuss this first.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Was the low level change really correct and the right thing
>> >>>>>>>>>>>> to do ?
>> >>>>>>>>>>>
>> >>>>>>>>>>> The primitive change proposed doesn't affect this discussion.
>> >>>>>>>>>>> It will
>> >>>>>>>>>>> mean that #atEnd returns false (correctly) sometimes, while
>> >>>>>>>>>>> currently it
>> >>>>>>>>>>> returns true (incorrectly).  The end result is still
>> >>>>>>>>>>> incorrect, e.g.
>> >>>>>>>>>>> #contents returns an empty string for /proc/cpuinfo.
>> >>>>>>>>>>>
>> >>>>>>>>>>> You're correct about no mention of nil, but we have:
>> >>>>>>>>>>>
>> >>>>>>>>>>> FileStream>>next
>> >>>>>>>>>>>
>> >>>>>>>>>>>     (position >= readLimit and: [self atEnd])
>> >>>>>>>>>>>             ifTrue: [^nil]
>> >>>>>>>>>>>             ifFalse: [^collection at: (position := position +
>> >>>>>>>>>>> 1)]
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> which has been around for a long time (I suspect, before Pharo
>> >>>>>>>>>>> existed).
>> >>>>>>>>>>>
>> >>>>>>>>>>> Having said that, I think that raising an exception is a
>> >>>>>>>>>>> better
>> >>>>>>>>>>> solution, but it is a much, much bigger change than the one I
>> >>>>>>>>>>> proposed
>> >>>>>>>>>>> in https://github.com/pharo-project/pharo/pull/1180.
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> Cheers,
>> >>>>>>>>>>> Alistair
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> Hi,
>> >>>>>>>>>> yes, if you are after universal behavior encompassing Unix
>> >>>>>>>>>> streams, the
>> >>>>>>>>>> Exception might be the best way.
>> >>>>>>>>>> Because on special streams you can't always say in advance, you
>> >>>>>>>>>> have to try.
>> >>>>>>>>>> That's the solution adopted by authors of Xtreams.
>> >>>>>>>>>> But there is a runtime penalty associated with it.
>> >>>>>>>>>>
>> >>>>>>>>>> The penalty once was so high that my proposal to generalize
>> >>>>>>>>>> EndOfStream
>> >>>>>>>>>> usage was rejected a few years ago by Andreas Raab.
>> >>>>>>>>>> http://forum.world.st/EndOfStream-unused-td68806.html
>> >>>>>>>>>
>> >>>>>>>>> Thanks for this, I'll definitely take a look.
>> >>>>>>>>>
>> >>>>>>>>> Do you have a sense of how Denis' suggestion of using an
>> >>>>>>>>> EndOfStream
>> >>>>>>>>> object would compare?
>> >>>>>>>>>
>> >>>>>>>>> It would keep the same coding style, but avoid the problems with
>> >>>>>>>>> nil.
>> >>>>>>>>>
>> >>>>>>>>> Thanks,
>> >>>>>>>>> Alistair
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>> I have regularly benched Xtreams, but stopped a few years ago.
>> >>>>>>>>>> Maybe I can dig the benchmarks up and run them on a newer VM.
>> >>>>>>>>>>
>> >>>>>>>>>> In the meantime, I had experimented with a programmable end of
>> >>>>>>>>>> stream behavior
>> >>>>>>>>>> (via a block, or any other valuable)
>> >>>>>>>>>> http://www.squeaksource.com/XTream.htm
>> >>>>>>>>>> so as to reconcile performance and universality, but it was a
>> >>>>>>>>>> source of
>> >>>>>>>>>> added complexity on the implementation side.
>> >>>>>>>>>>
>> >>>>>>>>>> Nicolas
>> >>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>> Note also that Guille introduced something new, #closed
>> >>>>>>>>>>>> which is
>> >>>>>>>>>>>> related to the difference between having no more elements
>> >>>>>>>>>>>> (maybe right now,
>> >>>>>>>>>>>> like an open network stream) and never ever being able to
>> >>>>>>>>>>>> produce more data.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Sven
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>
>> >>>>
>> >>>
>> >>>
>> >> <stdio.cs>
>> >
>> >
>
>


Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Sven Van Caekenberghe-2


> On 11 Apr 2018, at 10:29, Alistair Grant <[hidden email]> wrote:
>
> Hi Denis,
>
> On 11 April 2018 at 10:02, Denis Kudriashov <[hidden email]> wrote:
>>
>> 2018-04-11 8:32 GMT+02:00 Alistair Grant <[hidden email]>:
>>>
>>>>>> Where is it being said that #next and/or #atEnd should be blocking or
>>>>>> non-blocking ?
>>>>>
>>>>> There is existing code that assumes that #atEnd is non-blocking and
>>>>> that #next is allowed to block.  I believe that we should keep those
>>>>> conditions.
>>>>
>>>> I fail to see where that is written down, either way. Can you point me
>>>> to comments stating that, I would really like to know ?
>>>
>>> I'm not aware of it being written down, just that every existing
>>> implementation I'm aware of behaves this way.
>>>
>>> On the other hand, making #atEnd blocking breaks Eliot's REPL sample
>>> (in Squeak).
>>
>>
>> Could you write here this example, please?
>
> The code is loaded in squeak using:
>
> https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/buildspurtrunkreaderimage.sh
>
> for 32 bit images.  It loads:
>
> https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/LoadReader.st
>
> which loads package CogTools-Listener in http://source.squeak.org/VMMaker
>
> An image that automatically runs the code and nothing else is created in:
>
> https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/StartReader.st
>
>
> If you want to run it interactively you can load CogTools-Listener and
> do something like:
>
> StdioListener new
>    quitOnEof: false;
>    run

What does #quitOnEof: do ? Can the StdioListener code be browsed/viewed online somewhere ?

> If you modify #atEnd to block it will result in the "squeak>" input
> prompt being printed in the terminal after the input has been entered.

How does one modify #atEnd to block ? I suppose you are talking about StdioStream>>#atEnd ?

 ^ self peek isNil

?

PS: I liked your runnable example better, I will try it later on. Thx!

> The code can be loaded in to Pharo and basically works, but the output
> tends to be hidden behind the next input prompt because it uses #cr
> instead of #lf.  You can easily modify StdioListener>>initialize to
> set the line end convention in stdout.
>
> NOTE: It is not intended to be a release quality implementation of an
> evaluation loop.  The whole purpose as I understand it is for it to be
> as simple as possible to assist in tracking down issues using the VM
> simulator.  It runs minimal code to get to the point of waiting for
> user input and then allows an expression that causes problems to be
> entered and traced using the simulator.
>
> Cheers,
> Alistair
>
>
>
>>>>>> How is this related to how EOF is signalled ?
>>>>>
>>>>> Because, combined with terminal EOF not being known until the user
>>>>> explicitly flags it (with Ctrl-D) it means that #atEnd can't be used
>>>>> for iterating over input from stdin connected to a terminal.
>>>>
>>>> This seems to me like an exception that only holds for one particular
>>>> stream in one particular scenario (interactive stdin). I might be wrong.
>>>>
>>>>>> It seems to me that there are quite a few classes of streams that are
>>>>>> 'special' in the sense that #next could be blocking and/or #atEnd could be
>>>>>> unclear - socket/network streams, serial streams, maybe stdio (interactive
>>>>>> or not). Without a message like #isDataAvailable you cannot handle those
>>>>>> without blocking.
>>>>>
>>>>> Right.  I think this is a distraction (I was trying to explain some
>>>>> details, but it's causing more confusion instead of helping).
>>>>>
>>>>> The important point is that #atEnd doesn't work for iterating over
>>>>> streams with terminal input
>>>>
>>>> Maybe you should also point to the actual code that fails. I mean you
>>>> showed a partial stack trace, but not how you got there, precisely. What does
>>>> the application reading from an interactive stdin do to get into trouble ?
>>>
>>> Included below.
>>>
>>>
>>>>>> Reading from stdin seems like a very rare case for a Smalltalk system
>>>>>> (not that it should not be possible).
>>>>>
>>>>> There's been quite a bit of discussion and several projects recently
>>>>> related to using pharo for scripting, so it may become more common.
>>>>> E.g.
>>>>>
>>>>>
>>>>> https://www.quora.com/Can-Smalltalk-be-a-batch-file-scripting-language/answer/Philippe-Back-1?share=c19bfc95
>>>>> https://github.com/rajula96reddy/pharo-cli
>>>>
>>>> Still, it is not common at all.
>>>>
>>>>>> I have a feeling that too much functionality is being pushed into too
>>>>>> small an API.
>>>>>
>>>>> This is just about how should Zinc streams be iterating over the
>>>>> underlying streams.  You didn't like checking the result of #next for
>>>>> nil since it isn't general, correctly pointing out that nil is a valid
>>>>> value for non-byte oriented streams.  But #atEnd doesn't work for
>>>>> stdin from a terminal.
>>>>>
>>>>>
>>>>> At this point I think there are three options:
>>>>>
>>>>> 1. Modify Zinc to check the return value of #next instead of using
>>>>> #atEnd.
>>>>>
>>>>> This is what all existing character / byte oriented streams in Squeak
>>>>> and Pharo do.  At that point the Zinc streams can be used on all file
>>>>> / stdio input and output.
>>>>
>>>> I agree that such code exists in many places, but there is lots of
>>>> stream reading that does not check for nils.
>>>
>>> Right.  Streams can be categorised in many ways, but for this
>>> discussion I think streams are broken in to two types:
>>>
>>> 1) Byte / Character oriented
>>> 2) All others
>>>
>>> For historical reasons, byte / character oriented streams need to
>>> check for EOF by using "stream next == nil" and all other streams
>>> should use #atEnd.
>>>
>>> This avoids the "nil being part of the domain" issue that was
>>> discussed earlier in the thread.
>>>
>>>
>>>>> 2. Modify all streams to signal EOF in some other way, i.e. a sentinel
>>>>> or notification / exception.
>>>>>
>>>>> This is what we were discussing below.  But it is a decent chunk of
>>>>> work with significant impact on the existing code base.
>>>>
>>>> Agreed. This would be a future extension.
>>>>
>>>>> 3. Require anyone who wants to read from stdin to code around Zinc's
>>>>> inability to handle terminal input.
>>>>>
>>>>> I'd prefer to avoid this option if possible.
>>>>
>>>> See higher for a more concrete usage example request.
>>>
>>>
>>> testAtEnd.st
>>> --
>>> | ch stream string stdin |
>>>
>>> 'stdio.cs' asFileReference fileIn.
>>> "stdin := FileStream stdin."
>>> stdin := ZnCharacterReadStream on:
>>>    (ZnBufferedReadStream on:
>>>        Stdio stdin).
>>> stream := (String new: 100) writeStream.
>>> ch := stdin next.
>>> [ ch == nil ] whileFalse: [
>>>    stream nextPut: ch.
>>>    ch := stdin next. ].
>>> string := stream contents.
>>> FileStream stdout
>>>    nextPutAll: string; lf;
>>>    nextPutAll: 'Characters read: ';
>>>    nextPutAll: string size asString;
>>>    lf.
>>> Smalltalk snapshot: false andQuit: true.
>>> --
>>>
>>> Execute with:
>>>
>>> ./pharo --headless Pharo7.0-64bit-e76f1a2.image testAtEnd.st
>>>
>>> and typing Ctrl-D gives:
>>>
>>>
>>> 'Errors in script loaded from testAtEnd.st'
>>> MessageNotUnderstood: receiver of "<" is nil
>>> UndefinedObject(Object)>>doesNotUnderstand: #<
>>> ZnUTF8Encoder>>nextCodePointFromStream:
>>> ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
>>> ZnCharacterReadStream>>nextElement
>>> ZnCharacterReadStream(ZnEncodedReadStream)>>next
>>> UndefinedObject>>DoIt
>>> OpalCompiler>>evaluate
>>>
>>>
>>> Using #atEnd to control the loop instead of "stdin next == nil"
>>> produces the same result.
>>>
>>> Replacing stdin with FileStream stdin makes the script work.
>>>
>>> stdio.cs fixes a bug in StdioStream which really isn't part of this
>>> discussion (PR to be submitted).
>>>
>>> Cheers,
>>> Alistair
>>>
>>>
>>>
>>>
>>>>> Does that clarify the situation?
>>>>
>>>> Yes, it helps. Thanks. But questions remain.
>>>>
>>>>> Thanks,
>>>>> Alistair
>>>>>
>>>>>
>>>>>
>>>>>>> On 10 Apr 2018, at 18:30, Alistair Grant <[hidden email]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> First a quick update:
>>>>>>>
>>>>>>> After doing some work on primitiveFileAtEnd, #atEnd now answers
>>>>>>> correctly for files that don't report their size correctly, e.g.
>>>>>>> /dev/urandom and /proc/cpuinfo, whether the files are opened directly
>>>>>>> or
>>>>>>> redirected through stdin.
>>>>>>>
>>>>>>> However determining whether stdin from a terminal has reached the end
>>>>>>> of
>>>>>>> file can't be done without making #atEnd blocking since we have to
>>>>>>> wait
>>>>>>> for the user to flag the end of file, e.g. by typing Ctrl-D.  And
>>>>>>> #atEnd
>>>>>>> is assumed to be non-blocking.
>>>>>>>
>>>>>>> So currently using ZnCharacterReadStream with stdin from a terminal
>>>>>>> will
>>>>>>> result in a stack dump similar to:
>>>>>>>
>>>>>>> MessageNotUnderstood: receiver of "<" is nil
>>>>>>> UndefinedObject(Object)>>doesNotUnderstand: #<
>>>>>>> ZnUTF8Encoder>>nextCodePointFromStream:
>>>>>>> ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
>>>>>>> ZnCharacterReadStream>>nextElement
>>>>>>> ZnCharacterReadStream(ZnEncodedReadStream)>>next
>>>>>>> UndefinedObject>>DoIt
>>>>>>>
>>>>>>>
>>>>>>> Going back through the various suggestions that have been made
>>>>>>> regarding
>>>>>>> using a sentinel object vs. raising a notification / exception, my
>>>>>>> (still to be polished) suggestion is to:
>>>>>>>
>>>>>>> 1. Add an endOfStream instance variable
>>>>>>> 2. When the end of the stream is reached answer the value of the
>>>>>>> instance variable (i.e. the result of sending #value to the
>>>>>>> variable).
>>>>>>> 3. The initial default value would be a block that raises a
>>>>>>> Deprecation
>>>>>>> warning and then returns nil.  This would allow existing code to
>>>>>>> function for a changeover period.
>>>>>>> 4. At the end of the deprecation period the default value would be
>>>>>>> changed to a unique sentinel object which would answer itself as its
>>>>>>> #value.
>>>>>>>
>>>>>>> At any time users of the stream can set their own sentinel, including
>>>>>>> a
>>>>>>> block that raises an exception.
>>>>>>>
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Alistair
>>>>>>>
>>>>>>>
>>>>>>> On 4 April 2018 at 19:24, Stephane Ducasse <[hidden email]>
>>>>>>> wrote:
>>>>>>>> Thanks for this discussion.
>>>>>>>>
>>>>>>>> On Wed, Apr 4, 2018 at 1:37 PM, Sven Van Caekenberghe <[hidden email]>
>>>>>>>> wrote:
>>>>>>>>> Alistair,
>>>>>>>>>
>>>>>>>>> First off, thanks for the discussions and your contributions, I
>>>>>>>>> really appreciate them.
>>>>>>>>>
>>>>>>>>> But I want to have a discussion at the high level of the definition
>>>>>>>>> and semantics of the stream API in Pharo.
>>>>>>>>>
>>>>>>>>>> On 4 Apr 2018, at 13:20, Alistair Grant <[hidden email]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> On 4 April 2018 at 12:56, Sven Van Caekenberghe <[hidden email]>
>>>>>>>>>> wrote:
>>>>>>>>>>> Playing a bit devil's advocate, the idea is that, in general,
>>>>>>>>>>>
>>>>>>>>>>> [ stream atEnd] whileFalse: [ stream next. "..." ].
>>>>>>>>>>>
>>>>>>>>>>> is no longer allowed ?
>>>>>>>>>>
>>>>>>>>>> It hasn't been allowed "forever" [1].  It's just been misused for
>>>>>>>>>> almost as long.
>>>>>>>>>>
>>>>>>>>>> [1] Time began when stdio stream support was introduced. :-)
>>>>>>>>>
>>>>>>>>> I am still not convinced. Another way to put it would be that the
>>>>>>>>> old #atEnd or #upToEnd do not make sense for these streams and some new loop
>>>>>>>>> is needed, based on a new test (it exists for socket streams already).
>>>>>>>>>
>>>>>>>>> [ stream isDataAvailable ] whileTrue: [ stream next ]
>>>>>>>>>
>>>>>>>>>>> And you want to replace it with
>>>>>>>>>>>
>>>>>>>>>>> [ stream next ifNil: [ false ] ifNotNil: [ :x | "..." true ]
>>>>>>>>>>> whileTrue.
>>>>>>>>>>>
>>>>>>>>>>> That is a pretty big change, no ?
>>>>>>>>>>
>>>>>>>>>> That's the way quite a bit of code already operates.
>>>>>>>>>>
>>>>>>>>>> As Denis pointed out, it's obviously problematic in the general
>>>>>>>>>> sense,
>>>>>>>>>> since nil can be embedded in non-byte oriented streams.  I suspect
>>>>>>>>>> that in practice not many people write code that reads streams
>>>>>>>>>> from
>>>>>>>>>> both byte oriented and non-byte oriented streams.
>>>>>>>>>
>>>>>>>>> Maybe yes, maybe no. As Denis' example shows there is a clear
>>>>>>>>> definition problem.
>>>>>>>>>
>>>>>>>>> And I do use streams of byte arrays or strings all the time, this
>>>>>>>>> is really important. I want my parsers to work on all kinds of streams.
>>>>>>>>>
>>>>>>>>>>> I think/feel like a proper EOF exception would be better, more
>>>>>>>>>>> correct.
>>>>>>>>>>>
>>>>>>>>>>> [ [ stream next. "..." true ] on: EOF do: [ false ] ] whileTrue.
>>>>>>>>>>
>>>>>>>>>> I agree, but the email thread Nicolas pointed to raises some
>>>>>>>>>> performance questions about this approach.  It should be
>>>>>>>>>> straightforward to do a basic performance comparison which I'll
>>>>>>>>>> get
>>>>>>>>>> around to if other objections aren't raised.
>>>>>>>>>
>>>>>>>>> Reading in bigger blocks, using #readInto:startingAt:count: (which
>>>>>>>>> is basically Unix's read(2) system call), would solve performance problems, I
>>>>>>>>> think.
>>>>>>>>>
>>>>>>>>>>> Will we throw away #atEnd then ? Do we need it if we cannot use
>>>>>>>>>>> it ?
>>>>>>>>>>
>>>>>>>>>> Unix file i/o returns EOF if the end of file has been reached OR if
>>>>>>>>>> an
>>>>>>>>>> error occurs.  You should still check #atEnd after reading past
>>>>>>>>>> the
>>>>>>>>>> end of the file to make sure no error occurred.  Another part of
>>>>>>>>>> the
>>>>>>>>>> primitive change I'm proposing is to return additional information
>>>>>>>>>> about what went wrong in the event of an error.
>>>>>>>>>
>>>>>>>>> I am sorry, but this kind of semantics (the OR) is way too complex
>>>>>>>>> at the general image level, it is too specific and based on certain
>>>>>>>>> underlying implementation details.
>>>>>>>>>
>>>>>>>>> Sven
>>>>>>>>>
>>>>>>>>>> We could modify the read primitive so that it fails if an error
>>>>>>>>>> has
>>>>>>>>>> occurred, and then #atEnd wouldn't be required.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Alistair
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>> On 4 Apr 2018, at 12:41, Alistair Grant <[hidden email]>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Nicolas,
>>>>>>>>>>>>
>>>>>>>>>>>> On 4 April 2018 at 12:36, Nicolas Cellier
>>>>>>>>>>>> <[hidden email]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2018-04-04 12:18 GMT+02:00 Alistair Grant
>>>>>>>>>>>>> <[hidden email]>:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Sven,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Apr 04, 2018 at 11:32:02AM +0200, Sven Van
>>>>>>>>>>>>>> Caekenberghe wrote:
>>>>>>>>>>>>>>> Somehow, somewhere there was a change to the implementation
>>>>>>>>>>>>>>> of the
>>>>>>>>>>>>>>> primitive called by some streams' #atEnd.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That's a proposed change by me, but it hasn't been integrated
>>>>>>>>>>>>>> yet.  So
>>>>>>>>>>>>>> the discussion below should apply to the current stable vm
>>>>>>>>>>>>>> (from August
>>>>>>>>>>>>>> last year).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> IIRC, someone said it is implemented as 'remaining size being
>>>>>>>>>>>>>>> zero'
>>>>>>>>>>>>>>> and some virtual unix files like /dev/random are zero sized.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Currently, for files other than stdio (stdout, stderr, stdin)
>>>>>>>>>>>>>> it is
>>>>>>>>>>>>>> effectively defined as:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> atEnd := stream position >= stream size
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> And, as you say, plenty of virtual unix files report size 0.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Now, all kinds of changes are being done image side to work
>>>>>>>>>>>>>>> around this.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would phrase this slightly differently :-)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Some code does the right thing, while other code doesn't.
>>>>>>>>>>>>>> E.g.:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> MultiByteFileStream>>upToEnd is good, while
>>>>>>>>>>>>>> FileStream>>contents is incorrect
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am a strong believer in simple, real (i.e. infinite)
>>>>>>>>>>>>>>> streams, but I
>>>>>>>>>>>>>>> am not sure we are doing the right thing here.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Point is, I am not sure #next returning nil is official and
>>>>>>>>>>>>>>> universal.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Consider the comments:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Stream>>#next
>>>>>>>>>>>>>>> "Answer the next object accessible by the receiver."
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ReadStream>>#next
>>>>>>>>>>>>>>> "Primitive. Answer the next object in the Stream represented
>>>>>>>>>>>>>>> by the
>>>>>>>>>>>>>>> receiver. Fail if the collection of this stream is not an
>>>>>>>>>>>>>>> Array or a
>>>>>>>>>>>>>>> String.
>>>>>>>>>>>>>>> Fail if the stream is positioned at its end, or if the
>>>>>>>>>>>>>>> position is out
>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>> bounds in the collection. Optional. See Object documentation
>>>>>>>>>>>>>>> whatIsAPrimitive."
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Note how there is no talk about returning nil !
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think we should discuss this first.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Was the low level change really correct and the right thing
>>>>>>>>>>>>>>> to do ?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The primitive change proposed doesn't affect this discussion.
>>>>>>>>>>>>>> It will
>>>>>>>>>>>>>> mean that #atEnd returns false (correctly) sometimes, while
>>>>>>>>>>>>>> currently it
>>>>>>>>>>>>>> returns true (incorrectly).  The end result is still
>>>>>>>>>>>>>> incorrect, e.g.
>>>>>>>>>>>>>> #contents returns an empty string for /proc/cpuinfo.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> You're correct about no mention of nil, but we have:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> FileStream>>next
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    (position >= readLimit and: [self atEnd])
>>>>>>>>>>>>>>            ifTrue: [^nil]
>>>>>>>>>>>>>>            ifFalse: [^collection at: (position := position +
>>>>>>>>>>>>>> 1)]
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> which has been around for a long time (I suspect, before Pharo
>>>>>>>>>>>>>> existed).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Having said that, I think that raising an exception is a
>>>>>>>>>>>>>> better
>>>>>>>>>>>>>> solution, but it is a much, much bigger change than the one I
>>>>>>>>>>>>>> proposed
>>>>>>>>>>>>>> in https://github.com/pharo-project/pharo/pull/1180.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Alistair
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>> yes, if you are after universal behavior encompassing Unix
>>>>>>>>>>>>> streams, the
>>>>>>>>>>>>> Exception might be the best way.
>>>>>>>>>>>>> Because on special streams you can't always say in advance, you
>>>>>>>>>>>>> have to try.
>>>>>>>>>>>>> That's the solution adopted by authors of Xtreams.
>>>>>>>>>>>>> But there is a runtime penalty associated with it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The penalty once was so high that my proposal to generalize
>>>>>>>>>>>>> EndOfStream
>>>>>>>>>>>>> usage was rejected a few years ago by Andreas Raab.
>>>>>>>>>>>>> http://forum.world.st/EndOfStream-unused-td68806.html
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for this, I'll definitely take a look.
>>>>>>>>>>>>
>>>>>>>>>>>> Do you have a sense of how Denis' suggestion of using an
>>>>>>>>>>>> EndOfStream
>>>>>>>>>>>> object would compare?
>>>>>>>>>>>>
>>>>>>>>>>>> It would keep the same coding style, but avoid the problems with
>>>>>>>>>>>> nil.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Alistair
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> I have regularly benched Xtreams, but stopped a few years ago.
>>>>>>>>>>>>> Maybe I can dig the benchmarks up and run them on a newer VM.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the meantime, I had experimented with a programmable end of
>>>>>>>>>>>>> stream behavior
>>>>>>>>>>>>> (via a block, or any other valuable)
>>>>>>>>>>>>> http://www.squeaksource.com/XTream.htm
>>>>>>>>>>>>> so as to reconcile performance and universality, but it was a
>>>>>>>>>>>>> source of
>>>>>>>>>>>>> added complexity on the implementation side.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Nicolas
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Note also that Guille introduced something new, #closed
>>>>>>>>>>>>>>> which is
>>>>>>>>>>>>>>> related to the difference between having no more elements
>>>>>>>>>>>>>>> (maybe right now,
>>>>>>>>>>>>>>> like an open network stream) and never ever being able to
>>>>>>>>>>>>>>> produce more data.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sven
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>> <stdio.cs>



Re: Changed #atEnd primitive - #atEnd vs #next returning nil

alistairgrant
Hi Sven,

Oh dear.  I feel as though I'm not getting my concerns across at all
well, and I'm pushing hard enough that all I'm going to do is make
people annoyed.  So let me try to restate the issue one last time
before answering your questions directly.

Pharo & Squeak have unwritten rules about stream usage that I suspect
have just emerged over time without being designed.

If you want to be able to iterate over any stream, and in particular
stdin from a terminal (which, as far as I know, is the outlier that
causes all the problems) you have to follow these rules:

1.  If the stream is character / byte oriented you have to check for
EOF using "stream next == nil".  #atEnd can be used, but you'll still
have to do the nil check.

2.  For all other streams you have to check for EOF (end of stream) using
#atEnd.  "stream next == nil" can be used, but you'll still need to
test #atEnd to determine whether nil is a value returned by the
stream.

If you write code that you want to be able to consume characters,
bytes or any other object, you'll have to test both "stream next ==
nil" and #atEnd.
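
The two rules can be sketched on in-memory streams, which stand in
here for the real file / stdio streams (an assumption made purely for
illustration; the variable names are hypothetical).  The first loop
follows rule 1 and stops when #next answers nil; the second follows
rule 2 and relies on #atEnd because nil may be a legitimate element:

```smalltalk
| chars text ch objects items |

"Rule 1: character / byte oriented stream. Stop when #next answers nil."
chars := ReadStream on: 'abc'.
text := WriteStream on: String new.
[ (ch := chars next) == nil ] whileFalse: [ text nextPut: ch ].

"Rule 2: any other stream. nil may be a real element, so test #atEnd."
objects := ReadStream on: { 1. nil. 3 }.
items := OrderedCollection new.
[ objects atEnd ] whileFalse: [ items add: objects next ].
```

Neither loop on its own is safe for both kinds of stream: the first
would stop early at the embedded nil, and the second depends on #atEnd
answering correctly, which is what fails for stdin from a terminal.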

The rules result from the original Blue Book design, in which #atEnd
should be used, combined with character input from a terminal being
added later, which always returns an EOF marker (nil) before #atEnd
answers correctly.

At the moment, ZnCharacterEncoder uses #atEnd on character / byte
streams, so fails for stdin on a terminal.

Back to your questions:

On 11 April 2018 at 11:12, Sven Van Caekenberghe <[hidden email]> wrote:

>
>
>> On 11 Apr 2018, at 10:29, Alistair Grant <[hidden email]> wrote:
>>
>> Hi Denis,
>>
>> On 11 April 2018 at 10:02, Denis Kudriashov <[hidden email]> wrote:
>>>
>>> 2018-04-11 8:32 GMT+02:00 Alistair Grant <[hidden email]>:
>>>>
>>>>>>> Where is it being said that #next and/or #atEnd should be blocking or
>>>>>>> non-blocking ?
>>>>>>
>>>>>> There is existing code that assumes that #atEnd is non-blocking and
>>>>>> that #next is allowed to block.  I believe that we should keep those
>>>>>> conditions.
>>>>>
>>>>> I fail to see where that is written down, either way. Can you point me
>>>>> to comments stating that, I would really like to know ?
>>>>
>>>> I'm not aware of it being written down, just that every existing
>>>> implementation I'm aware of behaves this way.
>>>>
>>>> On the other hand, making #atEnd blocking breaks Eliot's REPL sample
>>>> (in Squeak).
>>>
>>>
>>> Could you write here this example, please?
>>
>> The code is loaded in squeak using:
>>
>> https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/buildspurtrunkreaderimage.sh
>>
>> for 32 bit images.  It loads:
>>
>> https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/LoadReader.st
>>
>> which loads package CogTools-Listener in http://source.squeak.org/VMMaker
>>
>> An image that automatically runs the code and nothing else is created in:
>>
>> https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/StartReader.st
>>
>>
>> If you want to run it interactively you can load CogTools-Listener and
>> do something like:
>>
>> StdioListener new
>>    quitOnEof: false;
>>    run
>
> What does #quitOnEof: do ? Can the StdioListener code be browsed/viewed online somewhere ?

I just referenced this as an example of making #atEnd (really
FilePlugin>>primitiveFileAtEnd) blocking causing problems.  I wasn't
expecting people to go and look at the code or use it as a test.

If you really want to look at it (from Pharo):

1. Add http://source.squeak.org/VMMaker as a repository.
2. Browse the CogTools-Listener package


>> If you modify #atEnd to block it will result in the "squeak>" input
>> prompt being printed in the terminal after the input has been entered.
>
> How does one modify #atEnd to block ? I suppose you are talking about StdioStream>>#atEnd ?

I meant the primitive, i.e. FilePlugin>>primitiveFileAtEnd /
FilePluginPrims>>atEnd:.


>  ^ self peek isNil
>
> ?
>
> PS: I liked your runnable example better, I will try it later on. Thx!

Right.  My code is meant to be minimal and trigger the problem I'm
actually focused on - that ZnCharacterEncoder doesn't work with stdin
from a terminal.

Sven has expressed a hesitation to change the internal operation of
the Zinc streams from using #atEnd to "stream peek == nil" and this
whole discussion is really about us trying to resolve our different
perspectives on the best path forward.  I respect Sven and his work so
I'm trying to justify the change (but I'm not expressing it at all
well, obviously).
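
For comparison, the "stream peek == nil" form looks roughly like this
on an in-memory character stream.  This is only a sketch, assuming a
plain ReadStream as a stand-in; it is not the Zinc implementation, and
it relies on #peek answering nil at the end of the stream:

```smalltalk
| stream out |

"In-memory stand-in for a character stream such as stdin."
stream := ReadStream on: 'abc'.
out := WriteStream on: String new.

"Loop until #peek answers nil, instead of asking #atEnd."
[ stream peek == nil ] whileFalse: [ out nextPut: stream next ].
```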

Cheers,
Alistair



>> The code can be loaded in to Pharo and basically works, but the output
>> tends to be hidden behind the next input prompt because it uses #cr
>> instead of #lf.  You can easily modify StdioListener>>initialize to
>> set the line end convention in stdout.
>>
>> NOTE: It is not intended to be a release quality implementation of an
>> evaluation loop.  The whole purpose as I understand it is for it to be
>> as simple as possible to assist in tracking down issues using the VM
>> simulator.  It runs minimal code to get to the point of waiting for
>> user input and then allows an expression that causes problems to be
>> entered and traced using the simulator.
>>
>> Cheers,
>> Alistair
>>
>>
>>
>>>>>>> How is this related to how EOF is signalled ?
>>>>>>
>>>>>> Because, combined with terminal EOF not being known until the user
>>>>>> explicitly flags it (with Ctrl-D) it means that #atEnd can't be used
>>>>>> for iterating over input from stdin connected to a terminal.
>>>>>
>>>>> This seems to me like an exception that only holds for one particular
>>>>> stream in one particular scenario (interactive stdin). I might be wrong.
>>>>>
>>>>>>> It seems to me that there are quite a few classes of streams that are
>>>>>>> 'special' in the sense that #next could be blocking and/or #atEnd could be
>>>>>>> unclear - socket/network streams, serial streams, maybe stdio (interactive
>>>>>>> or not). Without a message like #isDataAvailable you cannot handle those
>>>>>>> without blocking.
>>>>>>
>>>>>> Right.  I think this is a distraction (I was trying to explain some
>>>>>> details, but it's causing more confusion instead of helping).
>>>>>>
>>>>>> The important point is that #atEnd doesn't work for iterating over
>>>>>> streams with terminal input
>>>>>
>>>>> Maybe you should also point to the actual code that fails. I mean you
>>>>> showed a partial stack trace, but not how you got there, precisely. What does
>>>>> the application reading from an interactive stdin do to get into trouble ?
>>>>
>>>> Included below.
>>>>
>>>>
>>>>>>> Reading from stdin seems like a very rare case for a Smalltalk system
>>>>>>> (not that it should not be possible).
>>>>>>
>>>>>> There's been quite a bit of discussion and several projects recently
>>>>>> related to using pharo for scripting, so it may become more common.
>>>>>> E.g.
>>>>>>
>>>>>>
>>>>>> https://www.quora.com/Can-Smalltalk-be-a-batch-file-scripting-language/answer/Philippe-Back-1?share=c19bfc95
>>>>>> https://github.com/rajula96reddy/pharo-cli
>>>>>
>>>>> Still, it is not common at all.
>>>>>
>>>>>>> I have a feeling that too much functionality is being pushed into too
>>>>>>> small an API.
>>>>>>
>>>>>> This is just about how should Zinc streams be iterating over the
>>>>>> underlying streams.  You didn't like checking the result of #next for
>>>>>> nil since it isn't general, correctly pointing out that nil is a valid
>>>>>> value for non-byte oriented streams.  But #atEnd doesn't work for
>>>>>> stdin from a terminal.
>>>>>>
>>>>>>
>>>>>> At this point I think there are three options:
>>>>>>
>>>>>> 1. Modify Zinc to check the return value of #next instead of using
>>>>>> #atEnd.
>>>>>>
>>>>>> This is what all existing character / byte oriented streams in Squeak
>>>>>> and Pharo do.  At that point the Zinc streams can be used on all file
>>>>>> / stdio input and output.
>>>>>
>>>>> I agree that such code exists in many places, but there is lots of
>>>>> stream reading that does not check for nils.
>>>>
>>>> Right.  Streams can be categorised in many ways, but for this
>>>> discussion I think streams are broken into two types:
>>>>
>>>> 1) Byte / Character oriented
>>>> 2) All others
>>>>
>>>> For historical reasons, byte / character oriented streams need to
>>>> check for EOF by using "stream next == nil" and all other streams
>>>> should use #atEnd.
>>>>
>>>> This avoids the "nil being part of the domain" issue that was
>>>> discussed earlier in the thread.
>>>>
>>>>
>>>>>> 2. Modify all streams to signal EOF in some other way, i.e. a sentinel
>>>>>> or notification / exception.
>>>>>>
>>>>>> This is what we were discussing below.  But it is a decent chunk of
>>>>>> work with significant impact on the existing code base.
>>>>>
>>>>> Agreed. This would be a future extension.
>>>>>
>>>>>> 3. Require anyone who wants to read from stdin to code around Zinc's
>>>>>> inability to handle terminal input.
>>>>>>
>>>>>> I'd prefer to avoid this option if possible.
>>>>>
>>>>> See higher for a more concrete usage example request.
>>>>
>>>>
>>>> testAtEnd.st
>>>> --
>>>> | ch stream string stdin |
>>>>
>>>> 'stdio.cs' asFileReference fileIn.
>>>> "stdin := FileStream stdin."
>>>> stdin := ZnCharacterReadStream on:
>>>>    (ZnBufferedReadStream on:
>>>>        Stdio stdin).
>>>> stream := (String new: 100) writeStream.
>>>> ch := stdin next.
>>>> [ ch == nil ] whileFalse: [
>>>>    stream nextPut: ch.
>>>>    ch := stdin next. ].
>>>> string := stream contents.
>>>> FileStream stdout
>>>>    nextPutAll: string; lf;
>>>>    nextPutAll: 'Characters read: ';
>>>>    nextPutAll: string size asString;
>>>>    lf.
>>>> Smalltalk snapshot: false andQuit: true.
>>>> --
>>>>
>>>> Execute with:
>>>>
>>>> ./pharo --headless Pharo7.0-64bit-e76f1a2.image testAtEnd.st
>>>>
>>>> and type Ctrl-D gives:
>>>>
>>>>
>>>> 'Errors in script loaded from testAtEnd.st'
>>>> MessageNotUnderstood: receiver of "<" is nil
>>>> UndefinedObject(Object)>>doesNotUnderstand: #<
>>>> ZnUTF8Encoder>>nextCodePointFromStream:
>>>> ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
>>>> ZnCharacterReadStream>>nextElement
>>>> ZnCharacterReadStream(ZnEncodedReadStream)>>next
>>>> UndefinedObject>>DoIt
>>>> OpalCompiler>>evaluate
>>>>
>>>>
>>>> Using #atEnd to control the loop instead of "stdin next == nil"
>>>> produces the same result.
>>>>
>>>> Replacing stdin with FileStream stdin makes the script work.
>>>>
>>>> stdio.cs fixes a bug in StdioStream which really isn't part of this
>>>> discussion (PR to be submitted).
>>>>
>>>> Cheers,
>>>> Alistair
>>>>
>>>>
>>>>
>>>>
>>>>>> Does that clarify the situation?
>>>>>
>>>>> Yes, it helps. Thanks. But questions remain.
>>>>>
>>>>>> Thanks,
>>>>>> Alistair
>>>>>>
>>>>>>
>>>>>>
>>>>>>>> On 10 Apr 2018, at 18:30, Alistair Grant <[hidden email]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> First a quick update:
>>>>>>>>
>>>>>>>> After doing some work on primitiveFileAtEnd, #atEnd now answers
>>>>>>>> correctly for files that don't report their size correctly, e.g.
>>>>>>>> /dev/urandom and /proc/cpuinfo, whether the files are opened directly
>>>>>>>> or
>>>>>>>> redirected through stdin.
>>>>>>>>
>>>>>>>> However determining whether stdin from a terminal has reached the end
>>>>>>>> of
>>>>>>>> file can't be done without making #atEnd blocking since we have to
>>>>>>>> wait
>>>>>>>> for the user to flag the end of file, e.g. by typing Ctrl-D.  And
>>>>>>>> #atEnd
>>>>>>>> is assumed to be non-blocking.
>>>>>>>>
>>>>>>>> So currently using ZnCharacterReadStream with stdin from a terminal
>>>>>>>> will
>>>>>>>> result in a stack dump similar to:
>>>>>>>>
>>>>>>>> MessageNotUnderstood: receiver of "<" is nil
>>>>>>>> UndefinedObject(Object)>>doesNotUnderstand: #<
>>>>>>>> ZnUTF8Encoder>>nextCodePointFromStream:
>>>>>>>> ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
>>>>>>>> ZnCharacterReadStream>>nextElement
>>>>>>>> ZnCharacterReadStream(ZnEncodedReadStream)>>next
>>>>>>>> UndefinedObject>>DoIt
>>>>>>>>
>>>>>>>>
>>>>>>>> Going back through the various suggestions that have been made
>>>>>>>> regarding
>>>>>>>> using a sentinel object vs. raising a notification / exception, my
>>>>>>>> (still to be polished) suggestion is to:
>>>>>>>>
>>>>>>>> 1. Add an endOfStream instance variable
>>>>>>>> 2. When the end of the stream is reached answer the value of the
>>>>>>>> instance variable (i.e. the result of sending #value to the
>>>>>>>> variable).
>>>>>>>> 3. The initial default value would be a block that raises a
>>>>>>>> Deprecation
>>>>>>>> warning and then returns nil.  This would allow existing code to
>>>>>>>> function for a changeover period.
>>>>>>>> 4. At the end of the deprecation period the default value would be
>>>>>>>> changed to a unique sentinel object which would answer itself as its
>>>>>>>> #value.
>>>>>>>>
>>>>>>>> At any time users of the stream can set their own sentinel, including
>>>>>>>> a
>>>>>>>> block that raises an exception.
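
A minimal workspace sketch of the suggestion above, assuming a plain ReadStream over an array in place of a file stream. The names endOfStream, nextOrEnd and sentinel are illustrative only; the proposal itself would keep endOfStream in an instance variable of the stream class rather than in a block closure, and the Deprecation default is left out here.

```smalltalk
| source sentinel endOfStream nextOrEnd seen item |
source := ReadStream on: #(1 nil 3).
sentinel := Object new.

"Any valuable works here: [ nil ] mimics today's behaviour, and a block
 that signals an error turns end of stream into an exception."
endOfStream := [ sentinel ].

"Stand-in for #next: answer the value of endOfStream once the data runs out."
nextOrEnd := [ source atEnd
	ifTrue: [ endOfStream value ]
	ifFalse: [ source next ] ].

seen := OrderedCollection new.
[ (item := nextOrEnd value) == sentinel ] whileFalse: [ seen add: item ].
seen size.  "3, the embedded nil is read as data, not mistaken for the end"
```

Because the loop compares against the sentinel rather than nil, the same loop shape would work for byte, character and object streams.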
>>>>>>>>
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Alistair
>>>>>>>>
>>>>>>>>
>>>>>>>> On 4 April 2018 at 19:24, Stephane Ducasse <[hidden email]>
>>>>>>>> wrote:
>>>>>>>>> Thanks for this discussion.
>>>>>>>>>
>>>>>>>>> On Wed, Apr 4, 2018 at 1:37 PM, Sven Van Caekenberghe <[hidden email]>
>>>>>>>>> wrote:
>>>>>>>>>> Alistair,
>>>>>>>>>>
>>>>>>>>>> First off, thanks for the discussions and your contributions, I
>>>>>>>>>> really appreciate them.
>>>>>>>>>>
>>>>>>>>>> But I want to have a discussion at the high level of the definition
>>>>>>>>>> and semantics of the stream API in Pharo.
>>>>>>>>>>
>>>>>>>>>>> On 4 Apr 2018, at 13:20, Alistair Grant <[hidden email]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 4 April 2018 at 12:56, Sven Van Caekenberghe <[hidden email]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>> Playing a bit devil's advocate, the idea is that, in general,
>>>>>>>>>>>>
>>>>>>>>>>>> [ stream atEnd] whileFalse: [ stream next. "..." ].
>>>>>>>>>>>>
>>>>>>>>>>>> is no longer allowed ?
>>>>>>>>>>>
>>>>>>>>>>> It hasn't been allowed "forever" [1].  It's just been misused for
>>>>>>>>>>> almost as long.
>>>>>>>>>>>
>>>>>>>>>>> [1] Time began when stdio stream support was introduced. :-)
>>>>>>>>>>
>>>>>>>>>> I am still not convinced. Another way to put it would be that the
>>>>>>>>>> old #atEnd or #upToEnd do not make sense for these streams and some new loop
>>>>>>>>>> is needed, based on a new test (it exists for socket streams already).
>>>>>>>>>>
>>>>>>>>>> [ stream isDataAvailable ] whileTrue: [ stream next ]
>>>>>>>>>>
>>>>>>>>>>>> And you want to replace it with
>>>>>>>>>>>>
>>>>>>>>>>>> [ stream next ifNil: [ false ] ifNotNil: [ :x | "..." true ] ]
>>>>>>>>>>>> whileTrue.
>>>>>>>>>>>>
>>>>>>>>>>>> That is a pretty big change, no ?
>>>>>>>>>>>
>>>>>>>>>>> That's the way quite a bit of code already operates.
>>>>>>>>>>>
>>>>>>>>>>> As Denis pointed out, it's obviously problematic in the general
>>>>>>>>>>> sense,
>>>>>>>>>>> since nil can be embedded in non-byte oriented streams.  I suspect
>>>>>>>>>>> that in practice not many people write code that reads streams
>>>>>>>>>>> from
>>>>>>>>>>> both byte oriented and non-byte oriented streams.
>>>>>>>>>>
>>>>>>>>>> Maybe yes, maybe no. As Denis' example shows there is a clear
>>>>>>>>>> definition problem.
>>>>>>>>>>
>>>>>>>>>> And I do use streams of byte arrays or strings all the time, this
>>>>>>>>>> is really important. I want my parsers to work on all kinds of streams.
>>>>>>>>>>
>>>>>>>>>>>> I think/feel like a proper EOF exception would be better, more
>>>>>>>>>>>> correct.
>>>>>>>>>>>>
>>>>>>>>>>>> [ [ stream next. "..." true ] on: EOF do: [ false ] ] whileTrue.
>>>>>>>>>>>
>>>>>>>>>>> I agree, but the email thread Nicolas pointed to raises some
>>>>>>>>>>> performance questions about this approach.  It should be
>>>>>>>>>>> straightforward to do a basic performance comparison which I'll
>>>>>>>>>>> get
>>>>>>>>>>> around to if other objections aren't raised.
>>>>>>>>>>
>>>>>>>>>> Reading in bigger blocks, using #readInto:startingAt:count: (which
>>>>>>>>>> is basically Unix's (2) Read sys call), would solve performance problems, I
>>>>>>>>>> think.
>>>>>>>>>>
>>>>>>>>>>>> Will we throw away #atEnd then ? Do we need it if we cannot use
>>>>>>>>>>>> it ?
>>>>>>>>>>>
>>>>>>>>>>> Unix file i/o returns EOF if the end of file has been reached OR if
>>>>>>>>>>> an
>>>>>>>>>>> error occurs.  You should still check #atEnd after reading past
>>>>>>>>>>> the
>>>>>>>>>>> end of the file to make sure no error occurred.  Another part of
>>>>>>>>>>> the
>>>>>>>>>>> primitive change I'm proposing is to return additional information
>>>>>>>>>>> about what went wrong in the event of an error.
>>>>>>>>>>
>>>>>>>>>> I am sorry, but this kind of semantics (the OR) is way too complex
>>>>>>>>>> at the general image level, it is too specific and based on certain
>>>>>>>>>> underlying implementation details.
>>>>>>>>>>
>>>>>>>>>> Sven
>>>>>>>>>>
>>>>>>>>>>> We could modify the read primitive so that it fails if an error
>>>>>>>>>>> has
>>>>>>>>>>> occurred, and then #atEnd wouldn't be required.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Alistair
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>> On 4 Apr 2018, at 12:41, Alistair Grant <[hidden email]>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Nicolas,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 4 April 2018 at 12:36, Nicolas Cellier
>>>>>>>>>>>>> <[hidden email]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2018-04-04 12:18 GMT+02:00 Alistair Grant
>>>>>>>>>>>>>> <[hidden email]>:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Sven,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Apr 04, 2018 at 11:32:02AM +0200, Sven Van
>>>>>>>>>>>>>>> Caekenberghe wrote:
>>>>>>>>>>>>>>>> Somehow, somewhere there was a change to the implementation
>>>>>>>>>>>>>>>> of the
>>>>>>>>>>>>>>>> primitive called by some streams' #atEnd.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That's a proposed change by me, but it hasn't been integrated
>>>>>>>>>>>>>>> yet.  So
>>>>>>>>>>>>>>> the discussion below should apply to the current stable vm
>>>>>>>>>>>>>>> (from August
>>>>>>>>>>>>>>> last year).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> IIRC, someone said it is implemented as 'remaining size being
>>>>>>>>>>>>>>>> zero'
>>>>>>>>>>>>>>>> and some virtual unix files like /dev/random are zero sized.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Currently, for files other than stdio (stdout, stderr, stdin)
>>>>>>>>>>>>>>> it is
>>>>>>>>>>>>>>> effectively defined as:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> atEnd := stream position >= stream size
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> And, as you say, plenty of virtual unix files report size 0.
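
A rough workspace model of the size-based check described above. It does not open a real file: reportedSize is a made-up variable standing in for the size a virtual file reports, and a ReadStream over a string stands in for the file's actual data.

```smalltalk
| data reportedSize source viaSize viaNext |
data := 'processor : 0'.
reportedSize := 0.   "a virtual file reports size 0 even though it has data"

"Size-based test (position >= size): true immediately, so nothing is read."
source := ReadStream on: data.
viaSize := String streamContents: [ :out |
	[ source position >= reportedSize ] whileFalse: [ out nextPut: source next ] ].

"Read until #next answers nil: independent of the reported size."
source := ReadStream on: data.
viaNext := String streamContents: [ :out |
	| ch |
	[ (ch := source next) == nil ] whileFalse: [ out nextPut: ch ] ].

viaSize isEmpty.  "true"
viaNext = data.   "true"
```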
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Now, all kinds of changes are being done image side to work
>>>>>>>>>>>>>>>> around this.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I would phrase this slightly differently :-)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Some code does the right thing, while other code doesn't.
>>>>>>>>>>>>>>> E.g.:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> MultiByteFileStream>>upToEnd is good, while
>>>>>>>>>>>>>>> FileStream>>contents is incorrect
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am a strong believer in simple, real (i.e. infinite)
>>>>>>>>>>>>>>>> streams, but I
>>>>>>>>>>>>>>>> am not sure we are doing the right thing here.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Point is, I am not sure #next returning nil is official and
>>>>>>>>>>>>>>>> universal.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Consider the comments:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Stream>>#next
>>>>>>>>>>>>>>>> "Answer the next object accessible by the receiver."
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ReadStream>>#next
>>>>>>>>>>>>>>>> "Primitive. Answer the next object in the Stream represented
>>>>>>>>>>>>>>>> by the
>>>>>>>>>>>>>>>> receiver. Fail if the collection of this stream is not an
>>>>>>>>>>>>>>>> Array or a
>>>>>>>>>>>>>>>> String.
>>>>>>>>>>>>>>>> Fail if the stream is positioned at its end, or if the
>>>>>>>>>>>>>>>> position is out
>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>> bounds in the collection. Optional. See Object documentation
>>>>>>>>>>>>>>>> whatIsAPrimitive."
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Note how there is no talk about returning nil !
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think we should discuss this first.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Was the low level change really correct and the right thing
>>>>>>>>>>>>>>>> to do ?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The primitive change proposed doesn't affect this discussion.
>>>>>>>>>>>>>>> It will
>>>>>>>>>>>>>>> mean that #atEnd returns false (correctly) sometimes, while
>>>>>>>>>>>>>>> currently it
>>>>>>>>>>>>>>> returns true (incorrectly).  The end result is still
>>>>>>>>>>>>>>> incorrect, e.g.
>>>>>>>>>>>>>>> #contents returns an empty string for /proc/cpuinfo.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> You're correct about no mention of nil, but we have:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> FileStream>>next
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>    (position >= readLimit and: [self atEnd])
>>>>>>>>>>>>>>>            ifTrue: [^nil]
>>>>>>>>>>>>>>>            ifFalse: [^collection at: (position := position + 1)]
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> which has been around for a long time (I suspect, before Pharo
>>>>>>>>>>>>>>> existed).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Having said that, I think that raising an exception is a
>>>>>>>>>>>>>>> better
>>>>>>>>>>>>>>> solution, but it is a much, much bigger change than the one I
>>>>>>>>>>>>>>> proposed
>>>>>>>>>>>>>>> in https://github.com/pharo-project/pharo/pull/1180.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> Alistair
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>> yes, if you are after universal behavior encompassing Unix
>>>>>>>>>>>>>> streams, the
>>>>>>>>>>>>>> Exception might be the best way.
>>>>>>>>>>>>>> Because on special streams you can't always say in advance, you
>>>>>>>>>>>>>> have to try.
>>>>>>>>>>>>>> That's the solution adopted by authors of Xtreams.
>>>>>>>>>>>>>> But there is a runtime penalty associated to it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The penalty once was so high that my proposal to generalize
>>>>>>>>>>>>>> EndOfStream
>>>>>>>>>>>>>> usage was rejected a few years ago by Andreas Raab.
>>>>>>>>>>>>>> http://forum.world.st/EndOfStream-unused-td68806.html
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for this, I'll definitely take a look.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Do you have a sense of how Denis' suggestion of using an
>>>>>>>>>>>>> EndOfStream
>>>>>>>>>>>>> object would compare?
>>>>>>>>>>>>>
>>>>>>>>>>>>> It would keep the same coding style, but avoid the problems with
>>>>>>>>>>>>> nil.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Alistair
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have regularly benchmarked Xtreams, but stopped a few years ago.
>>>>>>>>>>>>>> Maybe I can dig the benchmarks out and run them on a newer VM.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In the meantime, I had experimented with a programmable end of
>>>>>>>>>>>>>> stream behavior
>>>>>>>>>>>>>> (via a block, or any other valuable)
>>>>>>>>>>>>>> http://www.squeaksource.com/XTream.htm
>>>>>>>>>>>>>> so as to reconcile performance and universality, but it was a
>>>>>>>>>>>>>> source of
>>>>>>>>>>>>>> added complexity on the implementation side.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Nicolas
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Note also that Guille introduced something new, #closed
>>>>>>>>>>>>>>>> which is
>>>>>>>>>>>>>>>> related to the difference between having no more elements
>>>>>>>>>>>>>>>> (maybe right now,
>>>>>>>>>>>>>>>> like an open network stream) and never ever being able to
>>>>>>>>>>>>>>>> produce more data.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sven
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> <stdio.cs>
>
>


Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Denis Kudriashov
Hi Alistair.

I don't think anybody is annoyed by you. You are doing a really good job, and it is nice that you are patient enough to continue :)

What I am trying to understand is why a blocking #atEnd is bad.
Here is code from VMMaker:

[stdin atEnd] whileFalse:
    [| nextChunk |
     stdout nextPutAll: 'squeak> '; flush.
     nextChunk := stdin nextChunkNoTag.
     [nextChunk notEmpty and: [nextChunk first isSeparator]] whileTrue:
        [nextChunk := nextChunk allButFirst].
     Transcript cr; nextPutAll: nextChunk; cr; flush.
     [stdout print: (Compiler evaluate: nextChunk); cr; flush]
        on: Error
        do: [:ex| self logError: ex description inContext: ex signalerContext to: stderr]].
quitOnEof ifTrue:
    [SourceFiles at: 2 put: nil.
     Smalltalk snapshot: false andQuit: true]
 
I don't see why it breaks with a blocking #atEnd. Can you explain?
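
The explanation quoted further down (a blocking #atEnd makes the "squeak>" prompt appear after the input has been entered) can be modelled without a terminal. This is only a rough workspace sketch: the runLoop block and the log strings are made up for illustration, a ReadStream over a one-element array stands in for stdin, and the real listener is the VMMaker code above.

```smalltalk
| runLoop |
"atEndBlocks = true models an #atEnd that has to wait for the user
 (input or Ctrl-D) before it can answer; false models the current
 behaviour, where only the read itself waits."
runLoop := [ :atEndBlocks |
	| log input |
	log := OrderedCollection new.
	input := ReadStream on: #('3 + 4').
	[ atEndBlocks ifTrue: [ log add: 'wait for user' ].
	  input atEnd ] whileFalse: [
		log add: 'print prompt'.
		atEndBlocks ifFalse: [ log add: 'wait for user' ].
		input next ].
	log asArray ].

runLoop value: false.  "#('print prompt' 'wait for user')"
runLoop value: true.   "#('wait for user' 'print prompt' 'wait for user')"
```

In the blocking case the loop condition waits for the user before the body ever prints the prompt, so the prompt shows up only after something has been typed.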

2018-04-11 11:41 GMT+02:00 Alistair Grant <[hidden email]>:
Hi Sven,

Oh dear.  I feel as though I'm not getting my concerns across at all
well, and I'm pushing hard enough that all I'm going to do is make
people annoyed.  So let me try to restate the issue one last time
before answering your questions directly.

Pharo & Squeak have unwritten rules about stream usage that I suspect
have just emerged over time without being designed.

If you want to be able to iterate over any stream, and in particular
stdin from a terminal (which, as far as I know, is the outlier that
causes all the problems) you have to follow these rules:

1.  If the stream is character / byte oriented you have to check for
EOF using "stream next == nil".  #atEnd can be used, but you'll still
have to do the nil check.

2.  All other streams have to check for EOF (end of stream) using
#atEnd.  "stream next == nil" can be used, but you'll still need to
test #atEnd to determine whether nil is a value returned by the
stream.

If you write code that you want to be able to consume characters,
bytes or any other object, you'll have to test both "stream next ==
nil" and #atEnd.

The rules are the result of the original blue book design being that
#atEnd should be used, and then character input from a terminal being
added later, but always returning an EOF character (nil) before #atEnd
answers correctly.

At the moment, ZnCharacterEncoder uses #atEnd on character / byte
streams, so fails for stdin on a terminal.
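
A small workspace sketch of why the two rules differ, using a ReadStream over a literal array as a stand-in for an object stream. It does not involve stdin or Zinc, and the variable names are illustrative only.

```smalltalk
| objects stream item viaNil viaAtEnd |
objects := #(1 nil 3).

"Rule 1 style loop (nil means EOF): stops early at the embedded nil."
viaNil := OrderedCollection new.
stream := ReadStream on: objects.
[ (item := stream next) == nil ] whileFalse: [ viaNil add: item ].

"Rule 2 style loop (#atEnd): sees all three elements."
viaAtEnd := OrderedCollection new.
stream := ReadStream on: objects.
[ stream atEnd ] whileFalse: [ viaAtEnd add: stream next ].

viaNil size.    "1"
viaAtEnd size.  "3"
```

For stdin on a terminal the situation is reversed: #atEnd cannot answer correctly until the user has flagged EOF, so only the nil check of rule 1 is reliable there.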

Back to your questions:

On 11 April 2018 at 11:12, Sven Van Caekenberghe <[hidden email]> wrote:
>
>
>> On 11 Apr 2018, at 10:29, Alistair Grant <[hidden email]> wrote:
>>
>> Hi Denis,
>>
>> On 11 April 2018 at 10:02, Denis Kudriashov <[hidden email]> wrote:
>>>
>>> 2018-04-11 8:32 GMT+02:00 Alistair Grant <[hidden email]>:
>>>>
>>>>>>> Where is it being said that #next and/or #atEnd should be blocking or
>>>>>>> non-blocking ?
>>>>>>
>>>>>> There is existing code that assumes that #atEnd is non-blocking and
>>>>>> that #next is allowed to block.  I believe that we should keep those
>>>>>> conditions.
>>>>>
>>>>> I fail to see where that is written down, either way. Can you point me
>>>>> to comments stating that, I would really like to know ?
>>>>
>>>> I'm not aware of it being written down, just that every existing
>>>> implementation I'm aware of behaves this way.
>>>>
>>>> On the other hand, making #atEnd blocking breaks Eliot's REPL sample
>>>> (in Squeak).
>>>
>>>
>>> Could you write here this example, please?
>>
>> The code is loaded in squeak using:
>>
>> https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/buildspurtrunkreaderimage.sh
>>
>> for 32 bit images.  It loads:
>>
>> https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/LoadReader.st
>>
>> which loads package CogTools-Listener in http://source.squeak.org/VMMaker
>>
>> An image that automatically runs the code and nothing else is created in:
>>
>> https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/StartReader.st
>>
>>
>> If you want to run it interactively you can load CogTools-Listener and
>> do something like:
>>
>> StdioListener new
>>    quitOnEof: false;
>>    run
>
> What does #quitOnEof: do ? Can the StdioListener code be browsed/viewed online somewhere ?

I just referenced this as an example of making #atEnd (really
FilePlugin>>primitiveFileAtEnd) blocking causing problems.  I wasn't
expecting people to go and look at the code or use it as a test.

If you really want to look at it (from Pharo):

1. Add http://source.squeak.org/VMMaker as a repository.
2. Browse the CogTools-Listener package


>> If you modify #atEnd to block it will result in the "squeak>" input
>> prompt being printed in the terminal after the input has been entered.
>
> How does one modify #atEnd to block ? I suppose you are talking about StdioStream>>#atEnd ?

I meant the primitive, i.e. FilePlugin>>primitiveFileAtEnd /
FilePluginPrims>>atEnd:.


>  ^ self peek isNil
>
> ?
>
> PS: I liked your runnable example better, I will try it later on. Thx!

Right.  My code is meant to be minimal and trigger the problem I'm
actually focused on - that ZnCharacterEncoder doesn't work with stdin
from a terminal.

Sven has expressed a hesitation to change the internal operation of
the Zinc streams from using #atEnd to "stream peek == nil" and this
whole discussion is really about us trying to resolve our different
perspectives on the best path forward.  I respect Sven and his work so
I'm trying to justify the change (but I'm not expressing it at all
well, obviously).

Cheers,
Alistair



>> The code can be loaded into Pharo and basically works, but the output
>> tends to be hidden behind the next input prompt because it uses #cr
>> instead of #lf.  You can easily modify StdioListener>>initialize to
>> set the line end convention in stdout.
>>
>> NOTE: It is not intended to be a release quality implementation of an
>> evaluation loop.  The whole purpose as I understand it is for it to be
>> as simple as possible to assist in tracking down issues using the VM
>> simulator.  It runs minimal code to get to the point of waiting for
>> user input and then allows an expression that causes problems to be
>> entered and traced using the simulator.
>>
>> Cheers,
>> Alistair
>>
>>
>>
>>>>>>> How is this related to how EOF is signalled ?
>>>>>>
>>>>>> Because, combined with terminal EOF not being known until the user
>>>>>> explicitly flags it (with Ctrl-D) it means that #atEnd can't be used
>>>>>> for iterating over input from stdin connected to a terminal.
>>>>>
>>>>> This seems to me like an exception that only holds for one particular
>>>>> stream in one particular scenario (interactive stdin). I might be wrong.
>>>>>
>>>>>>> It seems to me that there are quite a few classes of streams that are
>>>>>>> 'special' in the sense that #next could be blocking and/or #atEnd could be
>>>>>>> unclear - socket/network streams, serial streams, maybe stdio (interactive
>>>>>>> or not). Without a message like #isDataAvailable you cannot handle those
>>>>>>> without blocking.
>>>>>>
>>>>>> Right.  I think this is a distraction (I was trying to explain some
>>>>>> details, but it's causing more confusion instead of helping).
>>>>>>
>>>>>> The important point is that #atEnd doesn't work for iterating over
>>>>>> streams with terminal input
>>>>>
>>>>> Maybe you should also point to the actual code that fails. I mean you
>>>>> showed a partial stack trace, but not how you got there, precisely. What does
>>>>> the application reading from an interactive stdin do to get into trouble ?
>>>>
>>>> Included below.
>>>>
>>>>
>>>>>>> Reading from stdin seems like a very rare case for a Smalltalk system
>>>>>>> (not that it should not be possible).
>>>>>>
>>>>>> There's been quite a bit of discussion and several projects recently
>>>>>> related to using pharo for scripting, so it may become more common.
>>>>>> E.g.
>>>>>>
>>>>>>
>>>>>> https://www.quora.com/Can-Smalltalk-be-a-batch-file-scripting-language/answer/Philippe-Back-1?share=c19bfc95
>>>>>> https://github.com/rajula96reddy/pharo-cli
>>>>>
>>>>> Still, it is not common at all.
>>>>>
>>>>>>> I have a feeling that too much functionality is being pushed into too
>>>>>>> small an API.
>>>>>>
>>>>>> This is just about how should Zinc streams be iterating over the
>>>>>> underlying streams.  You didn't like checking the result of #next for
>>>>>> nil since it isn't general, correctly pointing out that nil is a valid
>>>>>> value for non-byte oriented streams.  But #atEnd doesn't work for
>>>>>> stdin from a terminal.
>>>>>>
>>>>>>
>>>>>> At this point I think there are three options:
>>>>>>
>>>>>> 1. Modify Zinc to check the return value of #next instead of using
>>>>>> #atEnd.
>>>>>>
>>>>>> This is what all existing character / byte oriented streams in Squeak
>>>>>> and Pharo do.  At that point the Zinc streams can be used on all file
>>>>>> / stdio input and output.
>>>>>
>>>>> I agree that such code exists in many places, but there is lots of
>>>>> stream reading that does not check for nils.
>>>>
>>>> Right.  Streams can be categorised in many ways, but for this
>>>> discussion I think streams are broken into two types:
>>>>
>>>> 1) Byte / Character oriented
>>>> 2) All others
>>>>
>>>> For historical reasons, byte / character oriented streams need to
>>>> check for EOF by using "stream next == nil" and all other streams
>>>> should use #atEnd.
>>>>
>>>> This avoids the "nil being part of the domain" issue that was
>>>> discussed earlier in the thread.
>>>>
>>>>
>>>>>> 2. Modify all streams to signal EOF in some other way, i.e. a sentinel
>>>>>> or notification / exception.
>>>>>>
>>>>>> This is what we were discussing below.  But it is a decent chunk of
>>>>>> work with significant impact on the existing code base.
>>>>>
>>>>> Agreed. This would be a future extension.
>>>>>
>>>>>> 3. Require anyone who wants to read from stdin to code around Zinc's
>>>>>> inability to handle terminal input.
>>>>>>
>>>>>> I'd prefer to avoid this option if possible.
>>>>>
>>>>> See higher for a more concrete usage example request.
>>>>
>>>>
>>>> testAtEnd.st
>>>> --
>>>> | ch stream string stdin |
>>>>
>>>> 'stdio.cs' asFileReference fileIn.
>>>> "stdin := FileStream stdin."
>>>> stdin := ZnCharacterReadStream on:
>>>>    (ZnBufferedReadStream on:
>>>>        Stdio stdin).
>>>> stream := (String new: 100) writeStream.
>>>> ch := stdin next.
>>>> [ ch == nil ] whileFalse: [
>>>>    stream nextPut: ch.
>>>>    ch := stdin next. ].
>>>> string := stream contents.
>>>> FileStream stdout
>>>>    nextPutAll: string; lf;
>>>>    nextPutAll: 'Characters read: ';
>>>>    nextPutAll: string size asString;
>>>>    lf.
>>>> Smalltalk snapshot: false andQuit: true.
>>>> --
>>>>
>>>> Execute with:
>>>>
>>>> ./pharo --headless Pharo7.0-64bit-e76f1a2.image testAtEnd.st
>>>>
>>>> and type Ctrl-D gives:
>>>>
>>>>
>>>> 'Errors in script loaded from testAtEnd.st'
>>>> MessageNotUnderstood: receiver of "<" is nil
>>>> UndefinedObject(Object)>>doesNotUnderstand: #<
>>>> ZnUTF8Encoder>>nextCodePointFromStream:
>>>> ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
>>>> ZnCharacterReadStream>>nextElement
>>>> ZnCharacterReadStream(ZnEncodedReadStream)>>next
>>>> UndefinedObject>>DoIt
>>>> OpalCompiler>>evaluate
>>>>
>>>>
>>>> Using #atEnd to control the loop instead of "stdin next == nil"
>>>> produces the same result.
>>>>
>>>> Replacing stdin with FileStream stdin makes the script work.
>>>>
>>>> stdio.cs fixes a bug in StdioStream which really isn't part of this
>>>> discussion (PR to be submitted).
>>>>
>>>> Cheers,
>>>> Alistair
>>>>
>>>>
>>>>
>>>>
>>>>>> Does that clarify the situation?
>>>>>
>>>>> Yes, it helps. Thanks. But questions remain.
>>>>>
>>>>>> Thanks,
>>>>>> Alistair
>>>>>>
>>>>>>
>>>>>>
>>>>>>>> On 10 Apr 2018, at 18:30, Alistair Grant <[hidden email]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> First a quick update:
>>>>>>>>
>>>>>>>> After doing some work on primitiveFileAtEnd, #atEnd now answers
>>>>>>>> correctly for files that don't report their size correctly, e.g.
>>>>>>>> /dev/urandom and /proc/cpuinfo, whether the files are opened directly
>>>>>>>> or
>>>>>>>> redirected through stdin.
>>>>>>>>
>>>>>>>> However determining whether stdin from a terminal has reached the end
>>>>>>>> of
>>>>>>>> file can't be done without making #atEnd blocking since we have to
>>>>>>>> wait
>>>>>>>> for the user to flag the end of file, e.g. by typing Ctrl-D.  And
>>>>>>>> #atEnd
>>>>>>>> is assumed to be non-blocking.
>>>>>>>>
>>>>>>>> So currently using ZnCharacterReadStream with stdin from a terminal
>>>>>>>> will
>>>>>>>> result in a stack dump similar to:
>>>>>>>>
>>>>>>>> MessageNotUnderstood: receiver of "<" is nil
>>>>>>>> UndefinedObject(Object)>>doesNotUnderstand: #<
>>>>>>>> ZnUTF8Encoder>>nextCodePointFromStream:
>>>>>>>> ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
>>>>>>>> ZnCharacterReadStream>>nextElement
>>>>>>>> ZnCharacterReadStream(ZnEncodedReadStream)>>next
>>>>>>>> UndefinedObject>>DoIt
>>>>>>>>
>>>>>>>>
>>>>>>>> Going back through the various suggestions that have been made
>>>>>>>> regarding using a sentinel object vs. raising a notification /
>>>>>>>> exception, my (still to be polished) suggestion is to:
>>>>>>>>
>>>>>>>> 1. Add an endOfStream instance variable.
>>>>>>>> 2. When the end of the stream is reached, answer the value of the
>>>>>>>>    instance variable (i.e. the result of sending #value to the
>>>>>>>>    variable).
>>>>>>>> 3. The initial default value would be a block that raises a
>>>>>>>>    Deprecation warning and then returns nil.  This would allow
>>>>>>>>    existing code to function for a changeover period.
>>>>>>>> 4. At the end of the deprecation period the default value would be
>>>>>>>>    changed to a unique sentinel object which would answer itself as
>>>>>>>>    its #value.
>>>>>>>>
>>>>>>>> At any time users of the stream can set their own sentinel, including
>>>>>>>> a block that raises an exception.
>>>>>>>>
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Alistair
>>>>>>>>
>>>>>>>>
>>>>>>>> On 4 April 2018 at 19:24, Stephane Ducasse <[hidden email]>
>>>>>>>> wrote:
>>>>>>>>> Thanks for this discussion.
>>>>>>>>>
>>>>>>>>> On Wed, Apr 4, 2018 at 1:37 PM, Sven Van Caekenberghe <[hidden email]>
>>>>>>>>> wrote:
>>>>>>>>>> Alistair,
>>>>>>>>>>
>>>>>>>>>> First off, thanks for the discussions and your contributions, I
>>>>>>>>>> really appreciate them.
>>>>>>>>>>
>>>>>>>>>> But I want to have a discussion at the high level of the definition
>>>>>>>>>> and semantics of the stream API in Pharo.
>>>>>>>>>>
>>>>>>>>>>> On 4 Apr 2018, at 13:20, Alistair Grant <[hidden email]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 4 April 2018 at 12:56, Sven Van Caekenberghe <[hidden email]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>> Playing a bit devil's advocate, the idea is that, in general,
>>>>>>>>>>>>
>>>>>>>>>>>> [ stream atEnd] whileFalse: [ stream next. "..." ].
>>>>>>>>>>>>
>>>>>>>>>>>> is no longer allowed ?
>>>>>>>>>>>
>>>>>>>>>>> It hasn't been allowed "forever" [1].  It's just been misused for
>>>>>>>>>>> almost as long.
>>>>>>>>>>>
>>>>>>>>>>> [1] Time began when stdio stream support was introduced. :-)
>>>>>>>>>>
>>>>>>>>>> I am still not convinced. Another way to put it would be that the
>>>>>>>>>> old #atEnd or #upToEnd do not make sense for these streams and some new loop
>>>>>>>>>> is needed, based on a new test (it exists for socket streams already).
>>>>>>>>>>
>>>>>>>>>> [ stream isDataAvailable ] whileTrue: [ stream next ]
>>>>>>>>>>
>>>>>>>>>>>> And you want to replace it with
>>>>>>>>>>>>
>>>>>>>>>>>> [ stream next ifNil: [ false ] ifNotNil: [ :x | "..." true ]
>>>>>>>>>>>> whileTrue.
>>>>>>>>>>>>
>>>>>>>>>>>> That is a pretty big change, no ?
>>>>>>>>>>>
>>>>>>>>>>> That's the way quite a bit of code already operates.
>>>>>>>>>>>
>>>>>>>>>>> As Denis pointed out, it's obviously problematic in the general
>>>>>>>>>>> sense, since nil can be embedded in non-byte oriented streams.  I
>>>>>>>>>>> suspect that in practice not many people write code that reads
>>>>>>>>>>> from both byte oriented and non-byte oriented streams.
>>>>>>>>>>
>>>>>>>>>> Maybe yes, maybe no. As Denis' example shows there is a clear
>>>>>>>>>> definition problem.
>>>>>>>>>>
>>>>>>>>>> And I do use streams of byte arrays or strings all the time, this
>>>>>>>>>> is really important. I want my parsers to work on all kinds of streams.
>>>>>>>>>>
>>>>>>>>>>>> I think/feel like a proper EOF exception would be better, more
>>>>>>>>>>>> correct.
>>>>>>>>>>>>
>>>>>>>>>>>> [ [ stream next. "..." true ] on: EOF do: [ false ] ] whileTrue.
>>>>>>>>>>>
>>>>>>>>>>> I agree, but the email thread Nicolas pointed to raises some
>>>>>>>>>>> performance questions about this approach.  It should be
>>>>>>>>>>> straightforward to do a basic performance comparison, which I'll
>>>>>>>>>>> get around to if other objections aren't raised.
>>>>>>>>>>
>>>>>>>>>> Reading in bigger blocks, using #readInto:startingAt:count: (which
>>>>>>>>>> is basically Unix's read(2) system call), would solve performance
>>>>>>>>>> problems, I think.
>>>>>>>>>>
>>>>>>>>>>>> Will we throw away #atEnd then ? Do we need it if we cannot use
>>>>>>>>>>>> it ?
>>>>>>>>>>>
>>>>>>>>>>> Unix file i/o returns EOF if the end of file has been reached OR if
>>>>>>>>>>> an error occurs.  You should still check #atEnd after reading past
>>>>>>>>>>> the end of the file to make sure no error occurred.  Another part of
>>>>>>>>>>> the primitive change I'm proposing is to return additional
>>>>>>>>>>> information about what went wrong in the event of an error.
>>>>>>>>>>
>>>>>>>>>> I am sorry, but this kind of semantics (the OR) is way too complex
>>>>>>>>>> at the general image level, it is too specific and based on certain
>>>>>>>>>> underlying implementation details.
>>>>>>>>>>
>>>>>>>>>> Sven
>>>>>>>>>>
>>>>>>>>>>> We could modify the read primitive so that it fails if an error
>>>>>>>>>>> has occurred, and then #atEnd wouldn't be required.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Alistair
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>> On 4 Apr 2018, at 12:41, Alistair Grant <[hidden email]>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Nicolas,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 4 April 2018 at 12:36, Nicolas Cellier
>>>>>>>>>>>>> <[hidden email]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2018-04-04 12:18 GMT+02:00 Alistair Grant
>>>>>>>>>>>>>> <[hidden email]>:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Sven,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Apr 04, 2018 at 11:32:02AM +0200, Sven Van
>>>>>>>>>>>>>>> Caekenberghe wrote:
>>>>>>>>>>>>>>>> Somehow, somewhere there was a change to the implementation
>>>>>>>>>>>>>>>> of the
>>>>>>>>>>>>>>>> primitive called by some streams' #atEnd.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That's a proposed change by me, but it hasn't been integrated
>>>>>>>>>>>>>>> yet.  So
>>>>>>>>>>>>>>> the discussion below should apply to the current stable vm
>>>>>>>>>>>>>>> (from August
>>>>>>>>>>>>>>> last year).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> IIRC, someone said it is implemented as 'remaining size being
>>>>>>>>>>>>>>>> zero'
>>>>>>>>>>>>>>>> and some virtual unix files like /dev/random are zero sized.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Currently, for files other than stdio (stdout, stderr, stdin)
>>>>>>>>>>>>>>> it is effectively defined as:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> atEnd := stream position >= stream size
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> And, as you say, plenty of virtual unix files report size 0.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Now, all kinds of changes are being done image side to work
>>>>>>>>>>>>>>>> around this.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I would phrase this slightly differently :-)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Some code does the right thing, while other code doesn't.
>>>>>>>>>>>>>>> E.g.:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> MultiByteFileStream>>upToEnd is good, while
>>>>>>>>>>>>>>> FileStream>>contents is incorrect
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am a strong believer in simple, real (i.e. infinite)
>>>>>>>>>>>>>>>> streams, but I
>>>>>>>>>>>>>>>> am not sure we are doing the right thing here.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Point is, I am not sure #next returning nil is official and
>>>>>>>>>>>>>>>> universal.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Consider the comments:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Stream>>#next
>>>>>>>>>>>>>>>> "Answer the next object accessible by the receiver."
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ReadStream>>#next
>>>>>>>>>>>>>>>> "Primitive. Answer the next object in the Stream represented
>>>>>>>>>>>>>>>> by the
>>>>>>>>>>>>>>>> receiver. Fail if the collection of this stream is not an
>>>>>>>>>>>>>>>> Array or a
>>>>>>>>>>>>>>>> String.
>>>>>>>>>>>>>>>> Fail if the stream is positioned at its end, or if the
>>>>>>>>>>>>>>>> position is out
>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>> bounds in the collection. Optional. See Object documentation
>>>>>>>>>>>>>>>> whatIsAPrimitive."
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Note how there is no talk about returning nil !
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think we should discuss this first.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Was the low level change really correct and the right thing
>>>>>>>>>>>>>>>> to do ?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The primitive change proposed doesn't affect this discussion.
>>>>>>>>>>>>>>> It will
>>>>>>>>>>>>>>> mean that #atEnd returns false (correctly) sometimes, while
>>>>>>>>>>>>>>> currently it
>>>>>>>>>>>>>>> returns true (incorrectly).  The end result is still
>>>>>>>>>>>>>>> incorrect, e.g.
>>>>>>>>>>>>>>> #contents returns an empty string for /proc/cpuinfo.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> You're correct about no mention of nil, but we have:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> FileStream>>next
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>    (position >= readLimit and: [self atEnd])
>>>>>>>>>>>>>>>            ifTrue: [^nil]
>>>>>>>>>>>>>>>            ifFalse: [^collection at: (position := position +
>>>>>>>>>>>>>>> 1)]
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> which has been around for a long time (I suspect, before Pharo
>>>>>>>>>>>>>>> existed).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Having said that, I think that raising an exception is a
>>>>>>>>>>>>>>> better
>>>>>>>>>>>>>>> solution, but it is a much, much bigger change than the one I
>>>>>>>>>>>>>>> proposed
>>>>>>>>>>>>>>> in https://github.com/pharo-project/pharo/pull/1180.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> Alistair
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>> yes, if you are after universal behavior encompassing Unix
>>>>>>>>>>>>>> streams, the Exception might be the best way.
>>>>>>>>>>>>>> Because on special streams you can't always say in advance, you
>>>>>>>>>>>>>> have to try.
>>>>>>>>>>>>>> That's the solution adopted by the authors of Xtreams.
>>>>>>>>>>>>>> But there is a runtime penalty associated with it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The penalty once was so high that my proposal to generalize
>>>>>>>>>>>>>> EndOfStream usage was rejected a few years ago by Andreas Raab.
>>>>>>>>>>>>>> http://forum.world.st/EndOfStream-unused-td68806.html
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for this, I'll definitely take a look.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Do you have a sense of how Denis' suggestion of using an
>>>>>>>>>>>>> EndOfStream
>>>>>>>>>>>>> object would compare?
>>>>>>>>>>>>>
>>>>>>>>>>>>> It would keep the same coding style, but avoid the problems with
>>>>>>>>>>>>> nil.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Alistair
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I used to benchmark Xtreams regularly, but stopped a few years ago.
>>>>>>>>>>>>>> Maybe I can dig the benchmarks out and run them on a newer VM.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In the meantime, I had experimented with a programmable end of
>>>>>>>>>>>>>> stream behavior (via a block, or any other valuable)
>>>>>>>>>>>>>> http://www.squeaksource.com/XTream.htm
>>>>>>>>>>>>>> so as to reconcile performance and universality, but it was a
>>>>>>>>>>>>>> source of added complexity on the implementation side.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Nicolas
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Note also that Guille introduced something new, #closed,
>>>>>>>>>>>>>>>> which is related to the difference between having no more
>>>>>>>>>>>>>>>> elements (maybe right now, like an open network stream) and
>>>>>>>>>>>>>>>> never ever being able to produce more data.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sven
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> <stdio.cs>
>
>



Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Sven Van Caekenberghe-2


> On 11 Apr 2018, at 12:04, Denis Kudriashov <[hidden email]> wrote:
>
> I don't think anybody is annoyed by you. You are doing a really good job. And it is nice that you are patient enough to continue :)

Yes, Alistair, you are a top notch open source contributor !

For me, this discussion is about the difference between looking from low level details/issues/changes up, vs, from a higher level down.



Re: Changed #atEnd primitive - #atEnd vs #next returning nil

alistairgrant
In reply to this post by Denis Kudriashov
Hi Sven & Denis,

On 11 April 2018 at 12:04, Denis Kudriashov <[hidden email]> wrote:
> Hi Alistair.
>
> I don't think anybody is annoyed by you. You are doing a really good job. And
> it is nice that you are patient enough to continue :)
>

On 11 April 2018 at 12:13, Sven Van Caekenberghe <[hidden email]> wrote:
>
> Yes, Alistair, you are a top notch open source contributor !
>
> For me, this discussion is about the difference between looking from low level details/issues/changes up, vs, from a higher level down.

Thanks for your kind words.



> What I am trying to understand is why a blocking #atEnd is bad.
> Here is code from VMMaker:
>
> [stdin atEnd] whileFalse:
>     [| nextChunk |
>     stdout nextPutAll: 'squeak> '; flush.
>     nextChunk := stdin nextChunkNoTag.
>     [nextChunk notEmpty and: [nextChunk first isSeparator]] whileTrue:
>         [nextChunk := nextChunk allButFirst].
>     Transcript cr; nextPutAll: nextChunk; cr; flush.
>     [stdout print: (Compiler evaluate: nextChunk); cr; flush]
>         on: Error
>         do: [:ex| self logError: ex description inContext: ex signalerContext to: stderr]].
> quitOnEof ifTrue:
>     [SourceFiles at: 2 put: nil.
>     Smalltalk snapshot: false andQuit: true]
>
>
> I don't see why it breaks with a blocking #atEnd. Can you explain?


First consider the case where #atEnd doesn't block and we just want to
evaluate 4+3:

1. #atEnd will return false
2. the loop will print the prompt
3. wait for input (stdin nextChunkNoTag)
4. print the result
5. goto 1.

So the screen will look like:

squeak> 4+3!
7
squeak> [cursor here]

Which is what we expect (prompt, input, result, prompt).

If #atEnd is blocking the VM will hang at step 1 until the user enters
something in the terminal.  In Ubuntu at least terminal input appears to
be line buffered, so for the example above the terminal will look like:

4+3!
squeak> 7
[cursor here]

We don't get the prompt when the program is started, the result is
printed after the prompt, and then there's just a cursor sitting at the
start of the next line.

Obviously the program could be re-written to have the correct output
with #atEnd blocking.  But I'm arguing that this program is
representative of many others, and we don't want to break backward
compatibility in this case.
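
As a rough illustration of such a rewrite (a sketch, not the
CogTools-Listener code): printing the prompt inside the loop condition,
before #atEnd is sent, keeps the prompt ahead of the input even if #atEnd
has to wait for the terminal.  The in-memory streams, the variable names
and the use of #upTo: in place of #nextChunkNoTag are assumptions, made so
that the snippet can be tried in a Playground or Workspace:

```smalltalk
"Sketch only: in-memory stand-ins for stdin and stdout."
| in out |
in := ReadStream on: '4+3!5*6!'.
out := WriteStream on: String new.
[ "Prompt first, then ask whether any input is left."
  out nextPutAll: 'squeak> '.
  in atEnd ] whileFalse: [
	| chunk |
	"Read one chunk, up to (and consuming) the $! terminator."
	chunk := in upTo: $!.
	"A real listener would evaluate the chunk; here it is just echoed."
	out nextPutAll: chunk; cr ].
out contents
```

Note that this ordering prints one more prompt after the last chunk, just
before the loop exits.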

Cheers,
Alistair








> 2018-04-11 11:41 GMT+02:00 Alistair Grant <[hidden email]>:
>>
>> Hi Sven,
>>
>> Oh dear.  I feel as though I'm not getting my concerns across at all
>> well, and I'm pushing hard enough that all I'm going to do is make
>> people annoyed.  So let me try to restate the issue one last time
>> before answering your questions directly.
>>
>> Pharo & Squeak have unwritten rules about stream usage that I suspect
>> have just emerged over time without being designed.
>>
>> If you want to be able to iterate over any stream, and in particular
>> stdin from a terminal (which, as far as I know, is the outlier that
>> causes all the problems) you have to follow these rules:
>>
>> 1.  If the stream is character / byte oriented you have to check for
>> EOF using "stream next == nil".  #atEnd can be used, but you'll still
>> have to do the nil check.
>>
>> 2.  All other streams have to check for EOF (end of stream) using
>> #atEnd.  "stream next == nil" can be used, but you'll still need to
>> test #atEnd to determine whether nil is a value returned by the
>> stream.
>>
>> If you write code that you want to be able to consume characters,
>> bytes or any other object, you'll have to test both "stream next ==
>> nil" and #atEnd.
>>
>> The rules are the result of the original blue book design being that
>> #atEnd should be used, and then character input from a terminal being
>> added later, but always returning an EOF character (nil) before #atEnd
>> answers correctly.
>>
>> At the moment, ZnCharacterEncoder uses #atEnd on character / byte
>> streams, so fails for stdin on a terminal.
>>
>> Back to your questions:
>>
>> On 11 April 2018 at 11:12, Sven Van Caekenberghe <[hidden email]> wrote:
>> >
>> >
>> >> On 11 Apr 2018, at 10:29, Alistair Grant <[hidden email]> wrote:
>> >>
>> >> Hi Denis,
>> >>
>> >> On 11 April 2018 at 10:02, Denis Kudriashov <[hidden email]>
>> >> wrote:
>> >>>
>> >>> 2018-04-11 8:32 GMT+02:00 Alistair Grant <[hidden email]>:
>> >>>>
>> >>>>>>> Where is it being said that #next and/or #atEnd should be blocking
>> >>>>>>> or
>> >>>>>>> non-blocking ?
>> >>>>>>
>> >>>>>> There is existing code that assumes that #atEnd is non-blocking and
>> >>>>>> that #next is allowed to block.  I believe that we should keep those
>> >>>>>> conditions.
>> >>>>>
>> >>>>> I fail to see where that is written down, either way. Can you point
>> >>>>> me
>> >>>>> to comments stating that, I would really like to know ?
>> >>>>
>> >>>> I'm not aware of it being written down, just that every existing
>> >>>> implementation I'm aware of behaves this way.
>> >>>>
>> >>>> On the other hand, making #atEnd blocking breaks Eliot's REPL sample
>> >>>> (in Squeak).
>> >>>
>> >>>
>> >>> Could you write here this example, please?
>> >>
>> >> The code is loaded in squeak using:
>> >>
>> >>
>> >> https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/buildspurtrunkreaderimage.sh
>> >>
>> >> for 32 bit images.  It loads:
>> >>
>> >>
>> >> https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/LoadReader.st
>> >>
>> >> which loads package CogTools-Listener in
>> >> http://source.squeak.org/VMMaker
>> >>
>> >> An image that automatically runs the code and nothing else is created
>> >> in:
>> >>
>> >>
>> >> https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/StartReader.st
>> >>
>> >>
>> >> If you want to run it interactively you can load CogTools-Listener and
>> >> do something like:
>> >>
>> >> StdioListener new
>> >>    quitOnEof: false;
>> >>    run
>> >
>> > What does #quitOnEof: do ? Can the StdioListener code be browsed/viewed
>> > online somewhere ?
>>
>> I just referenced this as an example of making #atEnd (really
>> FilePlugin>>primitiveFileAtEnd) blocking causing problems.  I wasn't
>> expecting people to go and look at the code or use it as a test.
>>
>> If you really want to look at it (from Pharo):
>>
>> 1. Add http://source.squeak.org/VMMaker as a repository.
>> 2. Browse the CogTools-Listener package
>>
>>
>> >> If you modify #atEnd to block it will result in the "squeak>" input
>> >> prompt being printed in the terminal after the input has been entered.
>> >
>> > How does one modify #atEnd to block ? I suppose you are talking about
>> > StdioStream>>#atEnd ?
>>
>> I meant the primitive, i.e. FilePlugin>>primitiveFileAtEnd /
>> FilePluginPrims>>atEnd:.
>>
>>
>> >  ^ self peek isNil
>> >
>> > ?
>> >
>> > PS: I liked your runnable example better, I will try it later on. Thx!
>>
>> Right.  My code is meant to be minimal and trigger the problem I'm
>> actually focused on - that ZnCharacterEncoder doesn't work with stdin
>> from a terminal.
>>
>> Sven has expressed a hesitation to change the internal operation of
>> the Zinc streams from using #atEnd to "stream peek == nil" and this
>> whole discussion is really about us trying to resolve our different
>> perspective of the best path forward.  I respect Sven and his work so
>> I'm trying to justify the change (but I'm not expressing it at all
>> well, obviously).
>>
>> Cheers,
>> Alistair
>>
>>
>>
>> >> The code can be loaded in to Pharo and basically works, but the output
>> >> tends to be hidden behind the next input prompt because it uses #cr
>> >> instead of #lf.  You can easily modify StdioListener>>initialize to
>> >> set the line end convention in stdout.
>> >>
>> >> NOTE: It is not intended to be a release quality implementation of an
>> >> evaluation loop.  The whole purpose as I understand it is for it to be
>> >> as simple as possible to assist in tracking down issues using the VM
>> >> simulator.  It runs minimal code to get to the point of waiting for
>> >> user input and then allows an expression that causes problems to be
>> >> entered and traced using the simulator.
>> >>
>> >> Cheers,
>> >> Alistair
>> >>
>> >>
>> >>
>> >>>>>>> How is this related to how EOF is signalled ?
>> >>>>>>
>> >>>>>> Because, combined with terminal EOF not being known until the user
>> >>>>>> explicitly flags it (with Ctrl-D) it means that #atEnd can't be
>> >>>>>> used
>> >>>>>> for iterating over input from stdin connected to a terminal.
>> >>>>>
>> >>>>> This seems to me like an exception that only holds for one
>> >>>>> particular
>> >>>>> stream in one particular scenario (interactive stdin). I might be
>> >>>>> wrong.
>> >>>>>
>> >>>>>>> It seems to me that there are quite a few classes of streams that
>> >>>>>>> are
>> >>>>>>> 'special' in the sense that #next could be blocking and/or #atEnd
>> >>>>>>> could be
>> >>>>>>> unclear - socket/network streams, serial streams, maybe stdio
>> >>>>>>> (interactive
>> >>>>>>> or not). Without a message like #isDataAvailable you cannot handle
>> >>>>>>> those
>> >>>>>>> without blocking.
>> >>>>>>
>> >>>>>> Right.  I think this is a distraction (I was trying to explain some
>> >>>>>> details, but it's causing more confusion instead of helping).
>> >>>>>>
>> >>>>>> The important point is that #atEnd doesn't work for iterating over
>> >>>>>> streams with terminal input
>> >>>>>
>> >>>>> Maybe you should also point to the actual code that fails. I mean
>> >>>>> you showed a partial stack trace, but not how you got there,
>> >>>>> precisely. What does the application reading from an interactive
>> >>>>> stdin do to get into trouble ?
>> >>>>
>> >>>> Included below.
>> >>>>
>> >>>>
>> >>>>>>> Reading from stdin seems like a very rare case for a Smalltalk
>> >>>>>>> system
>> >>>>>>> (not that it should not be possible).
>> >>>>>>
>> >>>>>> There's been quite a bit of discussion and several projects
>> >>>>>> recently
>> >>>>>> related to using pharo for scripting, so it may become more common.
>> >>>>>> E.g.
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> https://www.quora.com/Can-Smalltalk-be-a-batch-file-scripting-language/answer/Philippe-Back-1?share=c19bfc95
>> >>>>>> https://github.com/rajula96reddy/pharo-cli
>> >>>>>
>> >>>>> Still, it is not common at all.
>> >>>>>
>> >>>>>>> I have a feeling that too much functionality is being pushed into
>> >>>>>>> too
>> >>>>>>> small an API.
>> >>>>>>
>> >>>>>> This is just about how should Zinc streams be iterating over the
>> >>>>>> underlying streams.  You didn't like checking the result of #next
>> >>>>>> for
>> >>>>>> nil since it isn't general, correctly pointing out that nil is a
>> >>>>>> valid
>> >>>>>> value for non-byte oriented streams.  But #atEnd doesn't work for
>> >>>>>> stdin from a terminal.
>> >>>>>>
>> >>>>>>
>> >>>>>> At this point I think there are three options:
>> >>>>>>
>> >>>>>> 1. Modify Zinc to check the return value of #next instead of using
>> >>>>>> #atEnd.
>> >>>>>>
>> >>>>>> This is what all existing character / byte oriented streams in
>> >>>>>> Squeak
>> >>>>>> and Pharo do.  At that point the Zinc streams can be used on all
>> >>>>>> file
>> >>>>>> / stdio input and output.
>> >>>>>
>> >>>>> I agree that such code exists in many places, but there is lots of
>> >>>>> stream reading that does not check for nils.
>> >>>>
>> >>>> Right.  Streams can be categorised in many ways, but for this
>> >>>> discussion I think streams are broken in to two types:
>> >>>>
>> >>>> 1) Byte / Character oriented
>> >>>> 2) All others
>> >>>>
>> >>>> For historical reasons, byte / character oriented streams need to
>> >>>> check for EOF by using "stream next == nil" and all other streams
>> >>>> should use #atEnd.
>> >>>>
>> >>>> This avoids the "nil being part of the domain" issue that was
>> >>>> discussed earlier in the thread.
>> >>>>
>> >>>>
>> >>>>>> 2. Modify all streams to signal EOF in some other way, i.e. a
>> >>>>>> sentinel
>> >>>>>> or notification / exception.
>> >>>>>>
>> >>>>>> This is what we were discussing below.  But it is a decent chunk of
>> >>>>>> work with significant impact on the existing code base.
>> >>>>>
>> >>>>> Agreed. This would be a future extension.
>> >>>>>
>> >>>>>> 3. Require anyone who wants to read from stdin to code around
>> >>>>>> Zinc's
>> >>>>>> inability to handle terminal input.
>> >>>>>>
>> >>>>>> I'd prefer to avoid this option if possible.
>> >>>>>
>> >>>>> See above for a more concrete usage example request.
>> >>>>
>> >>>>
>> >>>> testAtEnd.st
>> >>>> --
>> >>>> | ch stream string stdin |
>> >>>>
>> >>>> 'stdio.cs' asFileReference fileIn.
>> >>>> "stdin := FileStream stdin."
>> >>>> stdin := ZnCharacterReadStream on:
>> >>>>    (ZnBufferedReadStream on:
>> >>>>        Stdio stdin).
>> >>>> stream := (String new: 100) writeStream.
>> >>>> ch := stdin next.
>> >>>> [ ch == nil ] whileFalse: [
>> >>>>    stream nextPut: ch.
>> >>>>    ch := stdin next. ].
>> >>>> string := stream contents.
>> >>>> FileStream stdout
>> >>>>    nextPutAll: string; lf;
>> >>>>    nextPutAll: 'Characters read: ';
>> >>>>    nextPutAll: string size asString;
>> >>>>    lf.
>> >>>> Smalltalk snapshot: false andQuit: true.
>> >>>> --
>> >>>>
>> >>>> Execute with:
>> >>>>
>> >>>> ./pharo --headless Pharo7.0-64bit-e76f1a2.image testAtEnd.st
>> >>>>
>> >>>> and typing Ctrl-D gives:
>> >>>>
>> >>>>
>> >>>> 'Errors in script loaded from testAtEnd.st'
>> >>>> MessageNotUnderstood: receiver of "<" is nil
>> >>>> UndefinedObject(Object)>>doesNotUnderstand: #<
>> >>>> ZnUTF8Encoder>>nextCodePointFromStream:
>> >>>> ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
>> >>>> ZnCharacterReadStream>>nextElement
>> >>>> ZnCharacterReadStream(ZnEncodedReadStream)>>next
>> >>>> UndefinedObject>>DoIt
>> >>>> OpalCompiler>>evaluate
>> >>>>
>> >>>>
>> >>>> Using #atEnd to control the loop instead of "stdin next == nil"
>> >>>> produces the same result.
>> >>>>
>> >>>> Replacing stdin with FileStream stdin makes the script work.
>> >>>>
>> >>>> stdio.cs fixes a bug in StdioStream which really isn't part of this
>> >>>> discussion (PR to be submitted).
>> >>>>
>> >>>> Cheers,
>> >>>> Alistair
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>>> Does that clarify the situation?
>> >>>>>
>> >>>>> Yes, it helps. Thanks. But questions remain.
>> >>>>>
>> >>>>>> Thanks,
>> >>>>>> Alistair


Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Sven Van Caekenberghe-2
In reply to this post by Sven Van Caekenberghe-2


> On 11 Apr 2018, at 11:12, Sven Van Caekenberghe <[hidden email]> wrote:
>
> How does one modify #atEnd to block ? I suppose you are talking about StdioStream>>#atEnd ?
>
> ^ self peek isNil
>
> ?

Still the same question, how do you implement a blocking #atEnd for stdin ?

I have seen your stdio.cs which is indeed needed as the current StdioStream>>#atEnd is bogus for sure.

But that is still a non-blocking one, right ?

Since there is a peekBuffer in StdioStream, why can't that be used ?

I have run your example testAtEnd.st now, and it works/fails as advertised.
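
For illustration, a peek-buffer based #atEnd could look roughly like the sketch below. It is not the StdioStream implementation: blocks and local variables stand in for the methods and for the peekBuffer instance variable, and an in-memory ReadStream stands in for the file. What it tries to show is that such an #atEnd has to read one element ahead, so on a terminal it would wait for as long as #next does:

```smalltalk
"Sketch only: names are illustrative, not taken from StdioStream."
| source peekBuffer atEnd next out |
source := ReadStream on: #[104 105].
peekBuffer := nil.
"atEnd: fill the buffer if it is empty; nil means nothing more could be read."
atEnd := [
	peekBuffer ifNil: [ peekBuffer := source next ].
	peekBuffer isNil ].
"next: answer the buffered element (reading it first if needed) and clear the buffer."
next := [
	| answer |
	atEnd value.
	answer := peekBuffer.
	peekBuffer := nil.
	answer ].
"Copy everything from the source using only the two blocks."
out := WriteStream on: ByteArray new.
[ atEnd value ] whileFalse: [ out nextPut: next value ].
out contents
```

Because nil doubles as the "nothing read" marker, this only suits byte or character streams, which is the same limitation raised elsewhere in the thread.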

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Denis Kudriashov
In reply to this post by alistairgrant
Thanks for the explanation.

I think it would be the same scenario for a socket stream, where #atEnd is not blocking. So I agree that this is the expected behaviour.

The example is general enough that one would expect it to work for any given pair of in and out streams. So our streams should support this.


> 2018-04-11 11:41 GMT+02:00 Alistair Grant <[hidden email]>:
>>
>> Hi Sven,
>>
>> Oh dear.  I feel as though I'm not getting my concerns across at all
>> well, and I'm pushing hard enough that all I'm going to do is make
>> people annoyed.  So let me try to restate the issue one last time
>> before answering your questions directly.
>>
>> Pharo & Squeak have unwritten rules about stream usage that I suspect
>> have just emerged over time without being designed.
>>
>> If you want to be able to iterate over any stream, and in particular
>> stdin from a terminal (which, as far as I know, is the outlier that
>> causes all the problems) you have to follow these rules:
>>
>> 1.  If the stream is character / byte oriented you have to check for
>> EOF using "stream next == nil".  #atEnd can be used, but you'll still
>> have to do the nil check.
>>
>> 2.  All other streams have to check for EOF (end of stream) using
>> #atEnd.  "stream next == nil" can be used, but you'll still need to
>> test #atEnd to determine whether nil is a value returned by the
>> stream.
>>
>> If you write code that you want to be able to consume characters,
>> bytes or any other object, you'll have to test both "stream next ==
>> nil" and #atEnd.
>>
>> The rules are the result of the original blue book design being that
>> #atEnd should be used, and then character input from a terminal being
>> added later, but always returning an EOF character (nil) before #atEnd
>> answers correctly.
>>
>> At the moment, ZnCharacterEncoder uses #atEnd on character / byte
>> streams, so fails for stdin on a terminal.
>>
>> Back to your questions:
>>
>> On 11 April 2018 at 11:12, Sven Van Caekenberghe <[hidden email]> wrote:
>> >
>> >
>> >> On 11 Apr 2018, at 10:29, Alistair Grant <[hidden email]> wrote:
>> >>
>> >> Hi Denis,
>> >>
>> >> On 11 April 2018 at 10:02, Denis Kudriashov <[hidden email]>
>> >> wrote:
>> >>>
>> >>> 2018-04-11 8:32 GMT+02:00 Alistair Grant <[hidden email]>:
>> >>>>
>> >>>>>>> Where is it being said that #next and/or #atEnd should be blocking
>> >>>>>>> or
>> >>>>>>> non-blocking ?
>> >>>>>>
>> >>>>>> There is existing code that assumes that #atEnd is non-blocking and
>> >>>>>> that #next is allowed to block.  I believe that we should keep those
>> >>>>>> conditions.
>> >>>>>
>> >>>>> I fail to see where that is written down, either way. Can you point
>> >>>>> me
>> >>>>> to comments stating that, I would really like to know ?
>> >>>>
>> >>>> I'm not aware of it being written down, just that every existing
>> >>>> implementation I'm aware of behaves this way.
>> >>>>
>> >>>> On the other hand, making #atEnd blocking breaks Eliot's REPL sample
>> >>>> (in Squeak).
>> >>>
>> >>>
>> >>> Could you write here this example, please?
>> >>
>> >> The code is loaded in squeak using:
>> >>
>> >>
>> >> https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/buildspurtrunkreaderimage.sh
>> >>
>> >> for 32 bit images.  It loads:
>> >>
>> >>
>> >> https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/LoadReader.st
>> >>
>> >> which loads package CogTools-Listener in
>> >> http://source.squeak.org/VMMaker
>> >>
>> >> An image that automatically runs the code and nothing else is created
>> >> in:
>> >>
>> >>
>> >> https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/StartReader.st
>> >>
>> >>
>> >> If you want to run it interactively you can load CogTools-Listener and
>> >> do something like:
>> >>
>> >> StdioListener new
>> >>    quitOnEof: false;
>> >>    run
>> >
>> > What does #quitOnEof: do ? Can the StdioListener code be browsed/viewed
>> > online somewhere ?
>>
>> I just referenced this as an example of making #atEnd (really
>> FilePlugin>>primitiveFileAtEnd) blocking causing problems.  I wasn't
>> expecting people to go and look at the code or use it as a test.
>>
>> If you really want to look at it (from Pharo):
>>
>> 1. Add http://source.squeak.org/VMMaker as a repository.
>> 2. Browse the CogTools-Listener package
>>
>>
>> >> If you modify #atEnd to block it will result in the "squeak>" input
>> >> prompt being printed in the terminal after the input has been entered.
>> >
>> > How does one modify #atEnd to block ? I suppose you are talking about
>> > StdioStream>>#atEnd ?
>>
>> I meant the primitive, i.e. FilePlugin>>primitiveFileAtEnd /
>> FilePluginPrims>>atEnd:.
>>
>>
>> >  ^ self peek isNil
>> >
>> > ?
>> >
>> > PS: I liked your runnable example better, I will try it later on. Thx!
>>
>> Right.  My code is meant to be minimal and trigger the problem I'm
>> actually focused on - that ZnCharacterEncoder doesn't work with stdin
>> from a terminal.
>>
>> Sven has expressed a hesitation to change the internal operation of
>> the Zinc streams from using #atEnd to "stream peek == nil" and this
>> whole discussion is really about us trying to resolve our different
>> perspectives on the best path forward.  I respect Sven and his work so
>> I'm trying to justify the change (but I'm not expressing it at all
>> well, obviously).
>>
>> Cheers,
>> Alistair
>>
>>
>>
>> >> The code can be loaded into Pharo and basically works, but the output
>> >> tends to be hidden behind the next input prompt because it uses #cr
>> >> instead of #lf.  You can easily modify StdioListener>>initialize to
>> >> set the line end convention in stdout.
>> >>
>> >> NOTE: It is not intended to be a release quality implementation of an
>> >> evaluation loop.  The whole purpose as I understand it is for it to be
>> >> as simple as possible to assist in tracking down issues using the VM
>> >> simulator.  It runs minimal code to get to the point of waiting for
>> >> user input and then allows an expression that causes problems to be
>> >> entered and traced using the simulator.
>> >>
>> >> Cheers,
>> >> Alistair
>> >>
>> >>
>> >>
>> >>>>>>> How is this related to how EOF is signalled ?
>> >>>>>>
>> >>>>>> Because, combined with terminal EOF not being known until the user
>> >>>>>> explicitly flags it (with Ctrl-D) it means that #atEnd can't be
>> >>>>>> used
>> >>>>>> for iterating over input from stdin connected to a terminal.
>> >>>>>
>> >>>>> This seems to me like an exception that only holds for one
>> >>>>> particular
>> >>>>> stream in one particular scenario (interactive stdin). I might be
>> >>>>> wrong.
>> >>>>>
>> >>>>>>> It seems to me that there are quite a few classes of streams that
>> >>>>>>> are
>> >>>>>>> 'special' in the sense that #next could be blocking and/or #atEnd
>> >>>>>>> could be
>> >>>>>>> unclear - socket/network streams, serial streams, maybe stdio
>> >>>>>>> (interactive
>> >>>>>>> or not). Without a message like #isDataAvailable you cannot handle
>> >>>>>>> those
>> >>>>>>> without blocking.
>> >>>>>>
>> >>>>>> Right.  I think this is a distraction (I was trying to explain some
>> >>>>>> details, but it's causing more confusion instead of helping).
>> >>>>>>
>> >>>>>> The important point is that #atEnd doesn't work for iterating over
>> >>>>>> streams with terminal input
>> >>>>>
>> >>>>> Maybe you should also point to the actual code that fails. I mean
>> >>>>> you
>> >>>>> showed a partial stack trace, but not how you got there, precisely.
>> >>>>> What does
>> >>>>> the application reading from an interactive stdin do to get into
>> >>>>> trouble ?
>> >>>>
>> >>>> Included below.
>> >>>>
>> >>>>
>> >>>>>>> Reading from stdin seems like a very rare case for a Smalltalk
>> >>>>>>> system
>> >>>>>>> (not that it should not be possible).
>> >>>>>>
>> >>>>>> There's been quite a bit of discussion and several projects
>> >>>>>> recently
>> >>>>>> related to using Pharo for scripting, so it may become more common.
>> >>>>>> E.g.
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> https://www.quora.com/Can-Smalltalk-be-a-batch-file-scripting-language/answer/Philippe-Back-1?share=c19bfc95
>> >>>>>> https://github.com/rajula96reddy/pharo-cli
>> >>>>>
>> >>>>> Still, it is not common at all.
>> >>>>>
>> >>>>>>> I have a feeling that too much functionality is being pushed into
>> >>>>>>> too
>> >>>>>>> small an API.
>> >>>>>>
>> >>>>>> This is just about how Zinc streams should iterate over the
>> >>>>>> underlying streams.  You didn't like checking the result of #next
>> >>>>>> for
>> >>>>>> nil since it isn't general, correctly pointing out that nil is a
>> >>>>>> valid
>> >>>>>> value for non-byte oriented streams.  But #atEnd doesn't work for
>> >>>>>> stdin from a terminal.
>> >>>>>>
>> >>>>>>
>> >>>>>> At this point I think there are three options:
>> >>>>>>
>> >>>>>> 1. Modify Zinc to check the return value of #next instead of using
>> >>>>>> #atEnd.
>> >>>>>>
>> >>>>>> This is what all existing character / byte oriented streams in
>> >>>>>> Squeak
>> >>>>>> and Pharo do.  At that point the Zinc streams can be used on all
>> >>>>>> file
>> >>>>>> / stdio input and output.
>> >>>>>
>> >>>>> I agree that such code exists in many places, but there is lots of
>> >>>>> stream reading that does not check for nils.
>> >>>>
>> >>>> Right.  Streams can be categorised in many ways, but for this
>> >>>> discussion I think streams fall into two types:
>> >>>>
>> >>>> 1) Byte / Character oriented
>> >>>> 2) All others
>> >>>>
>> >>>> For historical reasons, byte / character oriented streams need to
>> >>>> check for EOF by using "stream next == nil" and all other streams
>> >>>> should use #atEnd.
>> >>>>
>> >>>> This avoids the "nil being part of the domain" issue that was
>> >>>> discussed earlier in the thread.
>> >>>>
>> >>>>
>> >>>>>> 2. Modify all streams to signal EOF in some other way, i.e. a
>> >>>>>> sentinel
>> >>>>>> or notification / exception.
>> >>>>>>
>> >>>>>> This is what we were discussing below.  But it is a decent chunk of
>> >>>>>> work with significant impact on the existing code base.
>> >>>>>
>> >>>>> Agreed. This would be a future extension.
>> >>>>>
>> >>>>>> 3. Require anyone who wants to read from stdin to code around
>> >>>>>> Zinc's
>> >>>>>> inability to handle terminal input.
>> >>>>>>
>> >>>>>> I'd prefer to avoid this option if possible.
>> >>>>>
>> >>>>> See above for my request for a more concrete usage example.
>> >>>>
>> >>>>
>> >>>> testAtEnd.st
>> >>>> --
>> >>>> | ch stream string stdin |
>> >>>>
>> >>>> 'stdio.cs' asFileReference fileIn.
>> >>>> "stdin := FileStream stdin."
>> >>>> stdin := ZnCharacterReadStream on:
>> >>>>    (ZnBufferedReadStream on:
>> >>>>        Stdio stdin).
>> >>>> stream := (String new: 100) writeStream.
>> >>>> ch := stdin next.
>> >>>> [ ch == nil ] whileFalse: [
>> >>>>    stream nextPut: ch.
>> >>>>    ch := stdin next. ].
>> >>>> string := stream contents.
>> >>>> FileStream stdout
>> >>>>    nextPutAll: string; lf;
>> >>>>    nextPutAll: 'Characters read: ';
>> >>>>    nextPutAll: string size asString;
>> >>>>    lf.
>> >>>> Smalltalk snapshot: false andQuit: true.
>> >>>> --
>> >>>>
>> >>>> Execute with:
>> >>>>
>> >>>> ./pharo --headless Pharo7.0-64bit-e76f1a2.image testAtEnd.st
>> >>>>
>> >>>> and typing Ctrl-D gives:
>> >>>>
>> >>>>
>> >>>> 'Errors in script loaded from testAtEnd.st'
>> >>>> MessageNotUnderstood: receiver of "<" is nil
>> >>>> UndefinedObject(Object)>>doesNotUnderstand: #<
>> >>>> ZnUTF8Encoder>>nextCodePointFromStream:
>> >>>> ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
>> >>>> ZnCharacterReadStream>>nextElement
>> >>>> ZnCharacterReadStream(ZnEncodedReadStream)>>next
>> >>>> UndefinedObject>>DoIt
>> >>>> OpalCompiler>>evaluate
>> >>>>
>> >>>>
>> >>>> Using #atEnd to control the loop instead of "stdin next == nil"
>> >>>> produces the same result.
>> >>>>
>> >>>> Replacing stdin with FileStream stdin makes the script work.
>> >>>>
>> >>>> stdio.cs fixes a bug in StdioStream which really isn't part of this
>> >>>> discussion (PR to be submitted).
>> >>>>
>> >>>> Cheers,
>> >>>> Alistair
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>>> Does that clarify the situation?
>> >>>>>
>> >>>>> Yes, it helps. Thanks. But questions remain.
>> >>>>>
>> >>>>>> Thanks,
>> >>>>>> Alistair
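
As an illustration of the two EOF rules quoted above, here is a minimal sketch using in-memory streams. It assumes, as described in this thread, that #next answers nil once the stream is exhausted. The variable names are made up for the example, and no real stdin stream is involved.

```smalltalk
"Sketch only: in-memory ReadStreams stand in for real streams.
 All variable names are illustrative."
| objects seen bytes count byte |

"Rule 2: nil is a legitimate element here, so only #atEnd can signal the end."
objects := ReadStream on: #(1 nil 3).
seen := OrderedCollection new.
[ objects atEnd ] whileFalse: [ seen add: objects next ].
"seen now holds 1, nil and 3. A 'next == nil' test would have stopped after 1."

"Rule 1: a byte is never nil, so a nil answer from #next can mark EOF.
 Assumption: #next answers nil past the end of the stream."
bytes := ReadStream on: #[72 105].
count := 0.
[ (byte := bytes next) == nil ] whileFalse: [ count := count + 1 ].
"count is now 2"
```

Code that has to accept both kinds of stream would need both checks, which is the awkward combination the quoted message describes.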



Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Nicolas Cellier
In reply to this post by alistairgrant
Hi Alistair,
I must take my share of the responsibility too: I suggested that we could use a getc/ungetc pair to know whether we are atEnd(OfData), but this obviously works well only with AsyncFileIO; otherwise it blocks.
For files this generally isn't a problem (except perhaps on network-mounted partitions) because the latency is bearable, but for sockets and pipes it is not the right thing to do.

2018-04-11 14:56 GMT+02:00 Alistair Grant <[hidden email]>:
Hi Sven & Denis,

On 11 April 2018 at 12:04, Denis Kudriashov <[hidden email]> wrote:
> Hi Alistair.
>
> I don't think anybody is annoyed by you. You are doing a really good job. And
> it's nice that you are patient enough to continue :)
>

On 11 April 2018 at 12:13, Sven Van Caekenberghe <[hidden email]> wrote:
>
> Yes, Alistair, you are a top notch open source contributor !
>
> For me, this discussion is about the difference between looking from low level details/issues/changes up, vs, from a higher level down.

Thanks for your kind words.



> What I am trying to understand is why a blocking #atEnd is bad.
> Here is code from VMMaker:
>
> [stdin atEnd] whileFalse:
>     [| nextChunk |
>     stdout nextPutAll: 'squeak> '; flush.
>     nextChunk := stdin nextChunkNoTag.
>     [nextChunk notEmpty and: [nextChunk first isSeparator]] whileTrue:
>         [nextChunk := nextChunk allButFirst].
>     Transcript cr; nextPutAll: nextChunk; cr; flush.
>     [stdout print: (Compiler evaluate: nextChunk); cr; flush]
>         on: Error
>         do: [:ex| self logError: ex description inContext: ex signalerContext to: stderr]].
> quitOnEof ifTrue:
>     [SourceFiles at: 2 put: nil.
>     Smalltalk snapshot: false andQuit: true]
>
>
> I do not see why it breaks with a blocking #atEnd. Can you explain?


First consider the case where #atEnd doesn't block and we just want to
evaluate 4+3:

1. #atEnd will return false
2. the loop will print the prompt
3. wait for input (stdin nextChunkNoTag)
4. print the result
5. goto 1.

So the screen will look like:

squeak> 4+3!
7
squeak> [cursor here]

Which is what we expect (prompt, input, result, prompt).

If #atEnd is blocking the VM will hang at step 1 until the user enters
something in the terminal.  In Ubuntu at least terminal input appears to
be line buffered, so for the example above the terminal will look like:

4+3!
squeak> 7
[cursor here]

We don't get the prompt when the program is started, the result is
printed after the prompt, and then there's just a cursor sitting at the
start of the next line.

Obviously the program could be re-written to have the correct output
with #atEnd blocking.  But I'm arguing that this program is
representative of many others, and we don't want to break backward
compatibility in this case.

Cheers,
Alistair
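
The two orderings described above can be modelled without any real terminal I/O. This is only a toy sketch: the block `run` and its `atEndBlocks` argument are hypothetical, and the "screen" is just a collection recording what would appear in which order. It assumes a line-buffered terminal, and that the user's line shows up at whichever call blocks first. The chunk terminator is left out of the simulated input.

```smalltalk
"Toy model of one pass through the REPL loop; no real stdin is read."
| run |
run := [ :atEndBlocks |
    | screen typed waitForUser |
    screen := OrderedCollection new.
    typed := false.
    "The user's line is recorded at the first call that has to wait for input."
    waitForUser := [ typed ifFalse: [ typed := true. screen add: '4+3' ] ].
    atEndBlocks ifTrue: [ waitForUser value ].   "stdin atEnd"
    screen add: 'squeak> '.                      "print the prompt"
    waitForUser value.                           "stdin nextChunkNoTag"
    screen add: '7'.                             "print the result"
    screen asArray ].

run value: false.  "non-blocking #atEnd: prompt, input, result"
run value: true.   "blocking #atEnd: input, prompt, result"
```

With a non-blocking #atEnd the prompt is recorded before the input. With a blocking one the input comes first, which matches the garbled terminal output shown above.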











Re: Changed #atEnd primitive - #atEnd vs #next returning nil

David T. Lewis
In reply to this post by Denis Kudriashov
OS pipes are a similar case. On Pharo, you can run CommandShellTestCase to
provide some test coverage for this.

Dave

On Wed, Apr 11, 2018 at 03:13:35PM +0200, Denis Kudriashov wrote:

> Thanks for the explanation.
>
> I think it would be the same scenario for a socket stream, where #atEnd is
> not blocking. So I agree that it is expected behaviour.
>
> The example is general enough that we should expect it to work for any
> given pair of input and output streams. So our streams should support this.
>
>
> 2018-04-11 14:56 GMT+02:00 Alistair Grant <[hidden email]>:
>
> > Hi Sven & Dennis,
> >
> > On 11 April 2018 at 12:04, Denis Kudriashov <[hidden email]> wrote:
> > > Hi Alistair.
> > >
> > > I don't think anybody is annoyed by you. You are doing really good job.
> > And
> > > nice thing that you are super patient to continue :)
> > >
> >
> > On 11 April 2018 at 12:13, Sven Van Caekenberghe <[hidden email]> wrote:
> > >
> > > Yes, Alistair, you are a top notch open source contributor !
> > >
> > > For me, this discussion is about the difference between looking from low
> > level details/issues/changes up, vs, from a higher level down.
> >
> > Thanks for your kind words.
> >
> >
> >
> > > What I try to understand is why blocking atEnd is bad?
> > > Here is code from VMMaker:
> > >
> > > [stdin atEnd] whileFalse:
> > > [| nextChunk |
> > > stdout nextPutAll: 'squeak> '; flush.
> > > nextChunk := stdin nextChunkNoTag.
> > > [nextChunk notEmpty and: [nextChunk first isSeparator]] whileTrue:
> > > [nextChunk := nextChunk allButFirst].
> > > Transcript cr; nextPutAll: nextChunk; cr; flush.
> > > [stdout print: (Compiler evaluate: nextChunk); cr; flush]
> > > on: Error
> > > do: [:ex| self logError: ex description inContext: ex signalerContext to:
> > > stderr]].
> > > quitOnEof ifTrue:
> > > [SourceFiles at: 2 put: nil.
> > > Smalltalk snapshot: false andQuit: true]
> > >
> > >
> > > I am not see why it breaks with blocking #atEnd. Can you explain?
> >
> >
> > First consider the case where #atEnd doesn't block and we just want to
> > evaluate 4+3:
> >
> > 1. #atEnd will return false
> > 2. the loop will print the prompt
> > 3. wait for input (stdin nextChunkNoTag)
> > 4. print the result
> > 5. goto 1.
> >
> > So the screen will look like:
> >
> > squeak> 4+3!
> > 7
> > squeak> [cursor here]
> >
> > Which is what we expect (prompt, input, result, prompt).
> >
> > If #atEnd is blocking the VM will hang at step 1 until the user enters
> > something in the terminal.  In Ubuntu at least terminal input appears to
> > be line buffered, so for the example above the terminal will look like:
> >
> > 4+3!
> > squeak> 7
> > [cursor here]
> >
> > We don't get the prompt when the program is started, the result is
> > printed after the prompt, and then there's just a cursor sitting at the
> > start of the next line.
> >
> > Obviously the program could be re-written to have the correct output
> > with #atEnd blocking.  But I'm arguing that this program is
> > representative of many others, and we don't want to break backward
> > compatibility in this case.
> >
> > Cheers,
> > Alistair
> >
> >
> >
> >
> >
> >
> >
> >
> > > 2018-04-11 11:41 GMT+02:00 Alistair Grant <[hidden email]>:
> > >>
> > >> Hi Sven,
> > >>
> > >> Oh dear.  I feel as though I'm not getting my concerns across at all
> > >> well, and I'm pushing hard enough that all I'm going to do is make
> > >> people annoyed.  So let me try to restate the issue one last time
> > >> before answering your questions directly.
> > >>
> > >> Pharo & Squeak have unwritten rules about stream usage that I suspect
> > >> have just emerged over time without being designed.
> > >>
> > >> If you want to be able to iterate over any stream, and in particular
> > >> stdin from a terminal (which, as far as I know, is the outlier that
> > >> causes all the problems) you have to follow these rules:
> > >>
> > >> 1.  If the stream is character / byte oriented you have to check for
> > >> EOF using "stream next == nil".  #atEnd can be used, but you'll still
> > >> have to do the nil check.
> > >>
> > >> 2.  All other streams have to check for EOF (end of stream) using
> > >> #atEnd.  "stream next == nil" can be used, but you'll still need to
> > >> test #atEnd to determine whether nil is a value returned by the
> > >> stream.
> > >>
> > >> If you write code that you want to be able to consume characters,
> > >> bytes or any other object, you'll have to test both "stream next ==
> > >> nil" and #atEnd.
> > >>
> > >> The rules are the result of the original blue book design being that
> > >> #atEnd should be used, and then character input from a terminal being
> > >> added later, but always returning an EOF character (nil) before #atEnd
> > >> answers correctly.
> > >>
> > >> At the moment, ZnCharacterEncoder uses #atEnd on character / byte
> > >> streams, so fails for stdin on a terminal.
> > >>
> > >> Back to your questions:
> > >>
> > >> On 11 April 2018 at 11:12, Sven Van Caekenberghe <[hidden email]> wrote:
> > >> >
> > >> >
> > >> >> On 11 Apr 2018, at 10:29, Alistair Grant <[hidden email]>
> > wrote:
> > >> >>
> > >> >> Hi Denis,
> > >> >>
> > >> >> On 11 April 2018 at 10:02, Denis Kudriashov <[hidden email]>
> > >> >> wrote:
> > >> >>>
> > >> >>> 2018-04-11 8:32 GMT+02:00 Alistair Grant <[hidden email]>:
> > >> >>>>
> > >> >>>>>>> Where is it being said that #next and/or #atEnd should be
> > blocking
> > >> >>>>>>> or
> > >> >>>>>>> non-blocking ?
> > >> >>>>>>
> > >> >>>>>> There is existing code that assumes that #atEnd is non-blocking
> > and
> > >> >>>>>> that #next is allowed block.  I believe that we should keep those
> > >> >>>>>> conditions.
> > >> >>>>>
> > >> >>>>> I fail to see where that is written down, either way. Can you
> > point
> > >> >>>>> me
> > >> >>>>> to comments stating that, I would really like to know ?
> > >> >>>>
> > >> >>>> I'm not aware of it being written down, just that ever existing
> > >> >>>> implementation I'm aware of behaves this way.
> > >> >>>>
> > >> >>>> On the other hand, making #atEnd blocking breaks Eliot's REPL
> > sample
> > >> >>>> (in Squeak).
> > >> >>>
> > >> >>>
> > >> >>> Could you write here this example, please?
> > >> >>
> > >> >> The code is loaded in squeak using:
> > >> >>
> > >> >>
> > >> >> https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/
> > buildspurtrunkreaderimage.sh
> > >> >>
> > >> >> for 32 bit images.  It loads:
> > >> >>
> > >> >>
> > >> >> https://github.com/OpenSmalltalk/opensmalltalk-
> > vm/blob/Cog/image/LoadReader.st
> > >> >>
> > >> >> which loads package CogTools-Listener in
> > >> >> http://source.squeak.org/VMMaker
> > >> >>
> > >> >> An image that automatically runs the code and nothing else is created
> > >> >> in:
> > >> >>
> > >> >>
> > >> >> https://github.com/OpenSmalltalk/opensmalltalk-
> > vm/blob/Cog/image/StartReader.st
> > >> >>
> > >> >>
> > >> >> If you want to run it interactively you can load CogTools-Listener
> > and
> > >> >> do something like:
> > >> >>
> > >> >> StdioListener new
> > >> >>    quitOnEof: false;
> > >> >>    run
> > >> >
> > >> > What does #quitOnEof: do ? Can the StdioListener code be
> > browsed/viewed
> > >> > online somewhere ?
> > >>
> > >> I just referenced this as an example of making #atEnd (really
> > >> FilePlugin>>primitiveFileAtEnd) blocking causing problems.  I wasn't
> > >> expecting people to go and look at the code or use it as a test.
> > >>
> > >> If you really want to look at it (from Pharo):
> > >>
> > >> 1. Add http://source.squeak.org/VMMaker as a repository.
> > >> 2. Browse the CogTools-Listener package
> > >>
> > >>
> > >> >> If you modify #atEnd to block it will result in the "squeak>" input
> > >> >> prompt being printed in the terminal after the input has been entered.
> > >> >
> > >> > How does one modify #atEnd to block ? I suppose you are talking about
> > >> > StdioStream>>#atEnd ?
> > >>
> > >> I meant the primitive, i.e. FilePlugin>>primitiveFileAtEnd /
> > >> FilePluginPrims>>atEnd:.
> > >>
> > >>
> > >> >  ^ self peek isNil
> > >> >
> > >> > ?
> > >> >
> > >> > PS: I liked your runnable example better, I will try it later on. Thx!
> > >>
> > >> Right.  My code is meant to be minimal and trigger the problem I'm
> > >> actually focused on - that ZnCharacterEncoder doesn't work with stdin
> > >> from a terminal.
> > >>
> > >> Sven has expressed a hesitation to change the internal operation of
> > >> the Zinc streams from using #atEnd to "stream peek == nil" and this
> > >> whole discussion is really about us trying to resolve our different
> > >> perspective of the best path forward.  I respect Sven and his work so
> > >> I'm trying to justify the change (but I'm not expressing it at all
> > >> well, obviously).
> > >>
> > >> Cheers,
> > >> Alistair
> > >>
> > >>
> > >>
> > >> >> The code can be loaded into Pharo and basically works, but the output
> > >> >> tends to be hidden behind the next input prompt because it uses #cr
> > >> >> instead of #lf.  You can easily modify StdioListener>>initialize to
> > >> >> set the line end convention in stdout.
> > >> >>
> > >> >> NOTE: It is not intended to be a release quality implementation of an
> > >> >> evaluation loop.  The whole purpose as I understand it is for it to be
> > >> >> as simple as possible to assist in tracking down issues using the VM
> > >> >> simulator.  It runs minimal code to get to the point of waiting for
> > >> >> user input and then allows an expression that causes problems to be
> > >> >> entered and traced using the simulator.
> > >> >>
> > >> >> Cheers,
> > >> >> Alistair
> > >> >>
> > >> >>
> > >> >>
> > >> >>>>>>> How is this related to how EOF is signalled ?
> > >> >>>>>>
> > >> >>>>>> Because, combined with terminal EOF not being known until the user
> > >> >>>>>> explicitly flags it (with Ctrl-D) it means that #atEnd can't be used
> > >> >>>>>> for iterating over input from stdin connected to a terminal.
> > >> >>>>>
> > >> >>>>> This seems to me like an exception that only holds for one particular
> > >> >>>>> stream in one particular scenario (interactive stdin). I might be wrong.
> > >> >>>>>
> > >> >>>>>>> It seems to me that there are quite a few classes of streams that are
> > >> >>>>>>> 'special' in the sense that #next could be blocking and/or #atEnd could be
> > >> >>>>>>> unclear - socket/network streams, serial streams, maybe stdio (interactive
> > >> >>>>>>> or not). Without a message like #isDataAvailable you cannot handle those
> > >> >>>>>>> without blocking.
> > >> >>>>>>
> > >> >>>>>> Right.  I think this is a distraction (I was trying to explain some
> > >> >>>>>> details, but it's causing more confusion instead of helping).
> > >> >>>>>>
> > >> >>>>>> The important point is that #atEnd doesn't work for iterating over
> > >> >>>>>> streams with terminal input
> > >> >>>>>
> > >> >>>>> Maybe you should also point to the actual code that fails. I mean you
> > >> >>>>> showed a partial stack trace, but not how you got there, precisely.
> > >> >>>>> How does the application reading from an interactive stdin do to get into
> > >> >>>>> trouble ?
> > >> >>>>
> > >> >>>> Included below.
> > >> >>>>
> > >> >>>>
> > >> >>>>>>> Reading from stdin seems like a very rare case for a Smalltalk system
> > >> >>>>>>> (not that it should not be possible).
> > >> >>>>>>
> > >> >>>>>> There's been quite a bit of discussion and several projects recently
> > >> >>>>>> related to using Pharo for scripting, so it may become more common.
> > >> >>>>>> E.g.
> > >> >>>>>>
> > >> >>>>>> https://www.quora.com/Can-Smalltalk-be-a-batch-file-scripting-language/answer/Philippe-Back-1?share=c19bfc95
> > >> >>>>>> https://github.com/rajula96reddy/pharo-cli
> > >> >>>>>
> > >> >>>>> Still, it is not common at all.
> > >> >>>>>
> > >> >>>>>>> I have a feeling that too much functionality is being pushed into too
> > >> >>>>>>> small an API.
> > >> >>>>>>
> > >> >>>>>> This is just about how Zinc streams should iterate over the
> > >> >>>>>> underlying streams.  You didn't like checking the result of #next for
> > >> >>>>>> nil since it isn't general, correctly pointing out that nil is a valid
> > >> >>>>>> value for non-byte oriented streams.  But #atEnd doesn't work for
> > >> >>>>>> stdin from a terminal.
> > >> >>>>>>
> > >> >>>>>>
> > >> >>>>>> At this point I think there are three options:
> > >> >>>>>>
> > >> >>>>>> 1. Modify Zinc to check the return value of #next instead of using
> > >> >>>>>> #atEnd.
> > >> >>>>>>
> > >> >>>>>> This is what all existing character / byte oriented streams in Squeak
> > >> >>>>>> and Pharo do.  At that point the Zinc streams can be used on all file
> > >> >>>>>> / stdio input and output.
> > >> >>>>>
> > >> >>>>> I agree that such code exists in many places, but there is lots of
> > >> >>>>> stream reading that does not check for nils.
> > >> >>>>
> > >> >>>> Right.  Streams can be categorised in many ways, but for this
> > >> >>>> discussion I think streams fall into two types:
> > >> >>>>
> > >> >>>> 1) Byte / Character oriented
> > >> >>>> 2) All others
> > >> >>>>
> > >> >>>> For historical reasons, byte / character oriented streams need to
> > >> >>>> check for EOF by using "stream next == nil" and all other streams
> > >> >>>> should use #atEnd.
> > >> >>>>
> > >> >>>> This avoids the "nil being part of the domain" issue that was
> > >> >>>> discussed earlier in the thread.
> > >> >>>>
> > >> >>>>
> > >> >>>>>> 2. Modify all streams to signal EOF in some other way, i.e. a sentinel
> > >> >>>>>> or notification / exception.
> > >> >>>>>>
> > >> >>>>>> This is what we were discussing below.  But it is a decent chunk of
> > >> >>>>>> work with significant impact on the existing code base.
> > >> >>>>>
> > >> >>>>> Agreed. This would be a future extension.
> > >> >>>>>
> > >> >>>>>> 3. Require anyone who wants to read from stdin to code around Zinc's
> > >> >>>>>> inability to handle terminal input.
> > >> >>>>>>
> > >> >>>>>> I'd prefer to avoid this option if possible.
> > >> >>>>>
> > >> >>>>> See higher for a more concrete usage example request.
> > >> >>>>
> > >> >>>>
> > >> >>>> testAtEnd.st
> > >> >>>> --
> > >> >>>> | ch stream string stdin |
> > >> >>>>
> > >> >>>> 'stdio.cs' asFileReference fileIn.
> > >> >>>> "stdin := FileStream stdin."
> > >> >>>> stdin := ZnCharacterReadStream on:
> > >> >>>>    (ZnBufferedReadStream on:
> > >> >>>>        Stdio stdin).
> > >> >>>> stream := (String new: 100) writeStream.
> > >> >>>> ch := stdin next.
> > >> >>>> [ ch == nil ] whileFalse: [
> > >> >>>>    stream nextPut: ch.
> > >> >>>>    ch := stdin next. ].
> > >> >>>> string := stream contents.
> > >> >>>> FileStream stdout
> > >> >>>>    nextPutAll: string; lf;
> > >> >>>>    nextPutAll: 'Characters read: ';
> > >> >>>>    nextPutAll: string size asString;
> > >> >>>>    lf.
> > >> >>>> Smalltalk snapshot: false andQuit: true.
> > >> >>>> --
> > >> >>>>
> > >> >>>> Execute with:
> > >> >>>>
> > >> >>>> ./pharo --headless Pharo7.0-64bit-e76f1a2.image testAtEnd.st
> > >> >>>>
> > >> >>>> and type Ctrl-D gives:
> > >> >>>>
> > >> >>>>
> > >> >>>> 'Errors in script loaded from testAtEnd.st'
> > >> >>>> MessageNotUnderstood: receiver of "<" is nil
> > >> >>>> UndefinedObject(Object)>>doesNotUnderstand: #<
> > >> >>>> ZnUTF8Encoder>>nextCodePointFromStream:
> > >> >>>> ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
> > >> >>>> ZnCharacterReadStream>>nextElement
> > >> >>>> ZnCharacterReadStream(ZnEncodedReadStream)>>next
> > >> >>>> UndefinedObject>>DoIt
> > >> >>>> OpalCompiler>>evaluate
> > >> >>>>
> > >> >>>>
> > >> >>>> Using #atEnd to control the loop instead of "stdin next == nil"
> > >> >>>> produces the same result.
> > >> >>>>
> > >> >>>> Replacing stdin with FileStream stdin makes the script work.
> > >> >>>>
> > >> >>>> stdio.cs fixes a bug in StdioStream which really isn't part of this
> > >> >>>> discussion (PR to be submitted).
> > >> >>>>
> > >> >>>> Cheers,
> > >> >>>> Alistair
> > >> >>>>
> > >> >>>>
> > >> >>>>
> > >> >>>>
> > >> >>>>>> Does that clarify the situation?
> > >> >>>>>
> > >> >>>>> Yes, it helps. Thanks. But questions remain.
> > >> >>>>>
> > >> >>>>>> Thanks,
> > >> >>>>>> Alistair
> >
> >

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

alistairgrant
In reply to this post by Sven Van Caekenberghe-2
On 11 April 2018 at 15:11, Sven Van Caekenberghe <[hidden email]> wrote:

>
>
>> On 11 Apr 2018, at 11:12, Sven Van Caekenberghe <[hidden email]> wrote:
>>
>> How does one modify #atEnd to block ? I suppose you are talking about StdioStream>>#atEnd ?
>>
>> ^ self peek isNil
>>
>> ?
>
> Still the same question, how do you implement a blocking #atEnd for stdin ?
>
> I have seen your stdio.cs which is indeed needed as the current StdioStream>>#atEnd is bogus for sure.
>
> But that is still a non-blocking one, right ?
>
> Since there is a peekBuffer in StdioStream, why can't that be used ?

I think you've created a chicken-and-egg problem with this question,
but ignoring that for now:


StdioStream>>peek
"Answer what would be returned if the message next were sent to the
receiver. If the receiver is at the end, answer nil.  "

    self atEnd ifTrue: [^ nil ].

    peekBuffer ifNotNil: [ ^ peekBuffer ].

    ^ peekBuffer := self next.



So when we first start the program, i.e. the user hasn't entered any
input yet, and #peek is called:

1. #atEnd returns false because Ctrl-D (or similar) hasn't been
entered (assuming it is non-blocking).
2. peekBuffer is nil because we haven't previously called #peek.
3. The system now blocks on "self next".
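
The three steps can be modelled roughly in C. This is only a sketch under stated assumptions: `PeekStream`, `ps_at_end` and `ps_peek` are names made up for illustration (they are not part of the VM or the image), a plain FILE* stands in for StdioStream, and EOF plays the role of nil.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for StdioStream: a FILE* plus a one-slot peek buffer. */
typedef struct {
    FILE *fp;
    int   has_peek;    /* non-zero when peek_char holds a buffered character */
    int   peek_char;
} PeekStream;

/* Non-blocking end test: only reports the EOF flag set by an earlier read. */
static int ps_at_end(PeekStream *s)
{
    assert(s != NULL && s->fp != NULL);
    return feof(s->fp) != 0;
}

/* Mirrors the three steps: test atEnd, test the buffer, then read.
   Answers EOF where the Smalltalk version answers nil. */
static int ps_peek(PeekStream *s)
{
    if (ps_at_end(s))
        return EOF;              /* step 1: the flag was already set */
    if (s->has_peek)
        return s->peek_char;     /* step 2: a character is buffered */
    int c = fgetc(s->fp);        /* step 3: the call that blocks on a terminal */
    if (c != EOF) {
        s->has_peek = 1;
        s->peek_char = c;
    }
    return c;
}
```

On an empty input, `ps_at_end` answers false until `ps_peek` has actually attempted a read, which is the same ordering problem as in the Smalltalk method.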


Just a reminder: for terminal input the end-of-file isn't reached
until the user explicitly enters the end of file key (Ctrl-D).

So, if there is no buffered input (either none has been entered yet,
or all input has been consumed), #atEnd (after the patch) calls
#primAtEnd:.

At the moment, #primAtEnd: ends up calling the libc function feof(),
which is non-blocking and answers the end-of-file flag for the FILE*.
Since the user hasn't entered Ctrl-D, that's false.

If we want to control iteration over the stream and ensure that we
don't need to do a "stream next == nil" check, then #primAtEnd: is
going to have to peek for the next character, and that means waiting
for the user to enter that character.

In C that is typically done using:

atEnd = (ungetc(fgetc(fp), fp) == EOF);

and fgetc() will block until the user enters something.
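
The difference between the two kinds of end test can be sketched as a pair of C functions. The names `at_end_flag` and `at_end_peek` are hypothetical and exist only for this illustration; the sketch uses nothing beyond standard libc calls.

```c
#include <assert.h>
#include <stdio.h>

/* Non-blocking: reports only the end-of-file flag set by a previous read. */
static int at_end_flag(FILE *fp)
{
    assert(fp != NULL);
    return feof(fp) != 0;
}

/* Blocking: reads one character to find out, then pushes it back.
   On a terminal the fgetc() call waits for the user. */
static int at_end_peek(FILE *fp)
{
    assert(fp != NULL);
    int c = fgetc(fp);
    if (c == EOF)
        return 1;        /* nothing left (or a read error) */
    ungetc(c, fp);       /* one character of pushback is guaranteed */
    return 0;
}
```

After the last character has been consumed, `at_end_flag` still answers false, because no read has failed yet; `at_end_peek` answers true, at the cost of performing the read.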

> I have run your example testAtEnd.st now, and it works/fails as advertised.

:-)


Cheers,
Alistair

Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Sven Van Caekenberghe-2
I can make your example, using the Zn variants, work with the following change:

StdioStream>>#atEnd
  ^ peekBuffer isNil or: [ (peekBuffer := self next) isNil ]

Which is a literal implementation of your statement that you can only know that you are atEnd by reading (and thus waiting/blocking) and checking for nil, which seems logical to me, given the fact that you *are* waiting for user input.

BTW, at least on macOS you have to enter ctrl-D (^D) on a separate line, I am not sure how relevant that is, but that is probably another argument that stdin is special (being line-buffered by the OS, EOF needing to be on a separate line).
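
For background: on POSIX terminals in canonical (line-buffered) mode, Ctrl-D hands whatever has been typed so far to the reader, so it is only seen as end-of-file when nothing is pending, i.e. at the start of a line. A program can at least detect whether it is in this situation. The following is a minimal sketch, assuming a POSIX system; `is_interactive` is a made-up helper name, not an existing API.

```c
#define _POSIX_C_SOURCE 200809L  /* expose fileno() under strict ISO C modes */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>              /* isatty(): POSIX, not ISO C */

/* Hypothetical helper: non-zero when the stream is attached to a terminal. */
static int is_interactive(FILE *fp)
{
    assert(fp != NULL);
    return isatty(fileno(fp)) != 0;
}
```

Calling `is_interactive(stdin)` would answer true when the program is run from a terminal and false when stdin is redirected from a file or a pipe.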

And FWIW, I have been writing networking code in Pharo for years, and I have never had issues with unclear semantics of these primitives (#atEnd, #next, #peek) on network streams, either the classic SocketStream or the Zdc* streams (TLS or not). That is why I think we have to be careful.

That being said, it is important to continue this discussion, I find it very interesting. I am trying to write some test code using stdin myself, to better understand the topic.



Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Sven Van Caekenberghe-2


> On 11 Apr 2018, at 16:36, Sven Van Caekenberghe <[hidden email]> wrote:
>
> I can make your example, using the Zn variants, work with the following change:
>
> StdioStream>>#atEnd
>  ^ peekBuffer isNil or: [ (peekBuffer := self next) isNil ]

Argh, make that

atEnd
  ^ peekBuffer isNil and: [ (peekBuffer := self next) isNil ]

but I am still testing, this is probably not the final answer/solution.
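
To see what such a blocking #atEnd buys, here is a rough C model of the idea. It is a sketch only: `BlockingStream`, `bs_at_end`, `bs_next` and `bs_copy_all` are invented names, EOF stands in for nil, and the assumption that #next drains the peek buffer first is mine, since the snippet above does not show that part.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical model: a FILE* plus a one-slot buffer filled by the end test. */
typedef struct {
    FILE *fp;
    int   peeked;    /* buffered character, or EOF when the slot is empty */
} BlockingStream;

/* At end only if the slot is empty AND an actual read finds nothing. */
static int bs_at_end(BlockingStream *s)
{
    assert(s != NULL && s->fp != NULL);
    if (s->peeked != EOF)
        return 0;
    s->peeked = fgetc(s->fp);    /* blocks on a terminal */
    return s->peeked == EOF;
}

/* Drains the slot first, so the character read by bs_at_end is not lost. */
static int bs_next(BlockingStream *s)
{
    assert(s != NULL && s->fp != NULL);
    if (s->peeked != EOF) {
        int c = s->peeked;
        s->peeked = EOF;
        return c;
    }
    return fgetc(s->fp);
}

/* Iteration controlled by the end test alone, with no check on bs_next. */
static size_t bs_copy_all(BlockingStream *s, char *out, size_t cap)
{
    size_t n = 0;
    while (n < cap && !bs_at_end(s))
        out[n++] = (char)bs_next(s);
    return n;
}
```

The loop in `bs_copy_all` never inspects the value returned by `bs_next`; the price is that every call to `bs_at_end` may wait for input.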

>


Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Denis Kudriashov

2018-04-11 17:02 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:

>> On 11 Apr 2018, at 16:36, Sven Van Caekenberghe <[hidden email]> wrote:
>>
>> I can make your example, using the Zn variants, work with the following change:
>>
>> StdioStream>>#atEnd
>>  ^ peekBuffer isNil or: [ (peekBuffer := self next) isNil ]
>
> Argh, make that
>
> atEnd
>   ^ peekBuffer isNil and: [ (peekBuffer := self next) isNil ]

But the discussion is exactly about "self next isNil": how to avoid it.

> but I am still testing, this is probably not the final answer/solution.




Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Sven Van Caekenberghe-2


> On 11 Apr 2018, at 17:16, Denis Kudriashov <[hidden email]> wrote:
>
>
> 2018-04-11 17:02 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
>
>
> > On 11 Apr 2018, at 16:36, Sven Van Caekenberghe <[hidden email]> wrote:
> >
> > I can make your example, using the Zn variants, work with the following change:
> >
> > StdioStream>>#atEnd
> >  ^ peekBuffer isNil or: [ (peekBuffer := self next) isNil ]
>
> Argh, make that
>
> atEnd
>   ^ peekBuffer isNil and: [ (peekBuffer := self next) isNil ]
>
> But discussion exactly about "self next isNil": how to avoid it.

I know, but like this it could/might become an implementation detail.

The more things that I try, the more that I feel that stdin is so special that it does not fit in the rest of the stream zoo.



Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Sven Van Caekenberghe-2
I think we have to reset this whole discussion.

  FileStream stdin

and

  Stdio stdin

are completely different !

We'll have to check that first, before talking about the issues raised in this thread.

And BTW these terminal streams are a real pain to test ;-)

> On 11 Apr 2018, at 17:20, Sven Van Caekenberghe <[hidden email]> wrote:
>
>
>
>> On 11 Apr 2018, at 17:16, Denis Kudriashov <[hidden email]> wrote:
>>
>>
>> 2018-04-11 17:02 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
>>
>>
>>> On 11 Apr 2018, at 16:36, Sven Van Caekenberghe <[hidden email]> wrote:
>>>
>>> I can make your example, using the Zn variants, work with the following change:
>>>
>>> StdioStream>>#atEnd
>>> ^ peekBuffer isNil or: [ (peekBuffer := self next) isNil ]
>>
>> Argh, make that
>>
>> atEnd
>>  ^ peekBuffer isNil and: [ (peekBuffer := self next) isNil ]
>>
>> But discussion exactly about "self next isNil": how to avoid it.
>
> I know, but like this it could/might become an implementation detail.
>
> The more things that I try, the more that I feel that stdin is so special that it does not fit in the rest of the stream zoo.
>
>> but I am still testing, this is probably not the final answer/solution.
>>
>>> Which is a literal implementation of your statement that you can only know that you are atEnd by reading (and thus waiting/blocking) and checking for nil, which seems logical to me, given the fact that you *are* waiting for user input.
>>>
>>> BTW, at least on macOS you have to enter ctrl-D (^D) on a separate line, I am not sure how relevant that is, but that is probably another argument that stdin is special (being line-buffered by the OS, EOF needing to be on a separate line).
>>>
>>> And FWIW, I have been writing networking code in Pharo for years, and I have never had issues with unclear semantics of these primitives (#atEnd, #next, #peek) on network streams, either the classic SocketStream or the Zdc* streams (TLS or not). That is why I think we have to be careful.
>>>
>>> That being said, it is important to continue this discussion, I find it very interesting. I am trying to write some test code using stdin myself, to better understand the topic.
>>>
>>>> On 11 Apr 2018, at 16:06, Alistair Grant <[hidden email]> wrote:
>>>>
>>>> On 11 April 2018 at 15:11, Sven Van Caekenberghe <[hidden email]> wrote:
>>>>>
>>>>>
>>>>>> On 11 Apr 2018, at 11:12, Sven Van Caekenberghe <[hidden email]> wrote:
>>>>>>
>>>>>> How does one modify #atEnd to block ? I suppose you are talking about StdioStream>>#atEnd ?
>>>>>>
>>>>>> ^ self peek isNil
>>>>>>
>>>>>> ?
>>>>>
>>>>> Still the same question, how do you implement a blocking #atEnd for stdin ?
>>>>>
>>>>> I have seen your stdio.cs which is indeed needed as the current StdioStream>>#atEnd is bogus for sure.
>>>>>
>>>>> But that is still a non-blocking one, right ?
>>>>>
>>>>> Since there is a peekBuffer in StdioStream, why can't that be used ?
>>>>
>>>> I think you've created a chicken-and-egg problem with this question,
>>>> but ignoring that for now:
>>>>
>>>>
>>>> StdioStream>>peek
>>>> "Answer what would be returned if the message next were sent to the
>>>> receiver. If the receiver is at the end, answer nil.  "
>>>>
>>>>  self atEnd ifTrue: [^ nil ].
>>>>
>>>>  peekBuffer ifNotNil: [ ^ peekBuffer ].
>>>>
>>>>  ^ peekBuffer := self next.
>>>>
>>>>
>>>>
>>>> So when we first start the program, i.e. the user hasn't entered any
>>>> input yet, and #peek is called:
>>>>
>>>> 1. #atEnd returns false because Ctrl-D (or similar) hasn't been
>>>> entered (assuming it is non-blocking).
>>>> 2. peekBuffer is nil because we haven't previously called #peek.
>>>> 3. The system now blocks on "self next".
>>>>
>>>>
>>>> Just a reminder: for terminal input the end-of-file isn't reached
>>>> until the user explicitly enters the end of file key (Ctrl-D).
>>>>
>>>> So, if there is no buffered input (either none has been entered yet,
>>>> or all input has been consumed)
>>>>
>>>> #atEnd (after the patch) calls #primAtEnd:.
>>>>
>>>> At the moment, #primAtEnd: ends up calling the libc function feof(),
>>>> which is non-blocking and answers the end-of-file flag for the FILE*.
>>>> Since the user hasn't entered Ctrl-D, that's false.
>>>>
>>>> If we want to control iteration over the stream and ensure that we
>>>> don't need to do a "stream next == nil" check, then #primAtEnd: is
>>>> going to have to peek for the next character, and that means waiting
>>>> for the user to enter that character.
>>>>
>>>> In C that is typically done using:
>>>>
>>>> atEnd = (ungetc(fgetc(fp), fp) == EOF);
>>>>
>>>> and fgetc() will block until the user enters something.
>>>>
>>>>> I have run your example testAtEnd.st now, and it works/fails as advertised.
>>>>
>>>> :-)
>>>>
>>>>
>>>> Cheers,
>>>> Alistair
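
The difference between the two end tests described in the quoted message can be sketched in C as follows. This is only an illustration under stated assumptions, not the FilePlugin primitive: the function names at_end_flag_only, at_end_by_peeking and demo_at_end are hypothetical, and a temporary file stands in for stdin so the sketch never waits on a terminal. With a terminal, the fgetc() call in the peeking variant is the one that would block until the user types something or presses Ctrl-D.

```c
#include <assert.h>
#include <stdio.h>

/* Non-blocking test: only reports the end-of-file flag that an earlier
   read has already set on fp. It never reads from the stream. */
static int at_end_flag_only(FILE *fp)
{
    return feof(fp) != 0;
}

/* Blocking test: read one character and push it back.
   fgetc() may wait for input. If it returns EOF (end of file or a read
   error), ungetc(EOF, fp) fails, returns EOF and leaves the stream as is. */
static int at_end_by_peeking(FILE *fp)
{
    return ungetc(fgetc(fp), fp) == EOF;
}

/* Hypothetical self-check on a temporary file holding the single byte '1'. */
static int demo_at_end(void)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    fputc('1', fp);
    rewind(fp);

    assert(!at_end_flag_only(fp));   /* nothing read yet, flag is clear */
    assert(!at_end_by_peeking(fp));  /* '1' is read and pushed back */
    assert(fgetc(fp) == '1');        /* the peek did not consume it */
    assert(!at_end_flag_only(fp));   /* all data consumed, flag still clear */
    assert(at_end_by_peeking(fp));   /* the peek itself hits end of file */
    assert(at_end_flag_only(fp));    /* only now is the flag set */

    fclose(fp);
    return 0;
}
```

The fourth assertion is the point of the argument: after the last byte has been consumed, the flag-based test still answers false until some read has actually run into the end of the input.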



Re: Changed #atEnd primitive - #atEnd vs #next returning nil

alistairgrant
In reply to this post by Denis Kudriashov
Hi Denis,

On 11 April 2018 at 17:16, Denis Kudriashov <[hidden email]> wrote:

>
> 2018-04-11 17:02 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
>>
>>
>>
>> > On 11 Apr 2018, at 16:36, Sven Van Caekenberghe <[hidden email]> wrote:
>> >
>> > I can make your example, using the Zn variants, work with the following
>> > change:
>> >
>> > StdioStream>>#atEnd
>> >  ^ peekBuffer isNil or: [ (peekBuffer := self next) isNil ]
>>
>> Argh, make that
>>
>> atEnd
>>   ^ peekBuffer isNil and: [ (peekBuffer := self next) isNil ]
>
>
> But the discussion is exactly about "self next isNil": how to avoid it.

Apologies in advance for being pedantic, but...

Do you really mean this (that the discussion is about how to avoid
testing "self next isNil")?

My argument has been that without making #atEnd blocking it is not
possible to avoid the test (and we don't want to make #atEnd
blocking).  All the existing stream code that deals with character /
byte streams does this test (see my "unwritten rules" from a previous
reply).  We don't want to make #atEnd blocking, so we need to keep the
test (and my personal opinion is that changing the Zinc streams to
adopt this approach does not add any significant architectural
complexity).
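
A C analogue may make the point concrete (a sketch only; the function names are hypothetical and feof() stands in for a non-blocking #atEnd, it is not the Pharo stream code). A loop driven by the result of the read sees exactly the available characters, while a loop driven only by the non-blocking end test runs its body once more with a failed read:

```c
#include <assert.h>
#include <stdio.h>

/* Counts characters by testing the result of the read itself,
   the C counterpart of checking "stream next" for nil. */
static long count_by_testing_read(FILE *fp)
{
    long n = 0;
    int c;
    while ((c = fgetc(fp)) != EOF)
        n++;
    return n;
}

/* Counts loop iterations when only the non-blocking end test drives the
   loop. The flag is set by the read that fails, so the body runs one
   extra time and that last fgetc() returns EOF. */
static long count_by_testing_flag(FILE *fp)
{
    long n = 0;
    while (!feof(fp)) {
        (void)fgetc(fp);
        n++;
    }
    return n;
}

/* Hypothetical self-check on a temporary file holding "123". */
static int demo_iteration(void)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    fputs("123", fp);
    rewind(fp);

    assert(count_by_testing_read(fp) == 3);
    rewind(fp);                              /* also clears the EOF flag */
    assert(count_by_testing_flag(fp) == 4);  /* three bytes plus one failed read */

    fclose(fp);
    return 0;
}
```

The second loop is only correct if its body also checks the value it read, which is the same test again.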

Assuming we reach agreement on the above, we do need to make the
"unwritten rules" written. If we reach a different agreement we should
document that.


Cheers,
Alistair


Re: Changed #atEnd primitive - #atEnd vs #next returning nil

alistairgrant
In reply to this post by Sven Van Caekenberghe-2
Hi Sven,

On 11 April 2018 at 17:33, Sven Van Caekenberghe <[hidden email]> wrote:

> I think we have to reset this whole discussion.
>
>   FileStream stdin
>
> and
>
>   Stdio stdin
>
> are completely different !
>
> We'll have to check that first, before talking about the issues raised in this thread.

Are you sure you're comparing (roughly) equal things?

I would compare:

FileStream stdin

(very roughly) to:

ZnCharacterReadStream on:
    (ZnBufferedReadStream on:
        Stdio stdin).

Actually it is still not a fair comparison because MultiByteFileStream
attempts to be writable as well.  If you could notionally do:

FileStream stdin
  readOnly;
  binary;
  unbuffered;
  yourself.

you could compare FileStream and Stdio :-)

One important similarity:  At the bottom they both use the same set of
primitives to communicate with the OS stdio streams (FilePlugin).

HTH,
Alistair



> And BTW these terminal streams are a real pain to test ;-)


Re: Changed #atEnd primitive - #atEnd vs #next returning nil

Sven Van Caekenberghe-2
Something is off (and/or I am going crazy, probably both).

$ ./pharo --headless Pharo.image eval '(FileStream stdin binary; next: 3)'
123
#[49]

??

This should return #[49 50 51] AFAIK.
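
For reference, the expected behaviour can be written down as a small C sketch. It assumes "next: 3" means "answer three bytes if they are available", uses the hypothetical names read_up_to and demo_read_three, and reads from a temporary file rather than a terminal. It does not explain why the image answers only one byte.

```c
#include <assert.h>
#include <stdio.h>

/* Reads up to n bytes into buf, stopping early only at end of file or on
   a read error. Returns the number of bytes stored. */
static size_t read_up_to(FILE *fp, unsigned char *buf, size_t n)
{
    size_t got = 0;
    int c;
    while (got < n && (c = fgetc(fp)) != EOF)
        buf[got++] = (unsigned char)c;
    return got;
}

/* Hypothetical self-check: the input "123\n" should yield 49 50 51. */
static int demo_read_three(void)
{
    unsigned char buf[3] = {0};
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    fputs("123\n", fp);
    rewind(fp);

    size_t got = read_up_to(fp, buf, 3);
    assert(got == 3);
    assert(buf[0] == 49 && buf[1] == 50 && buf[2] == 51);

    fclose(fp);
    return 0;
}
```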

> On 11 Apr 2018, at 18:26, Alistair Grant <[hidden email]> wrote:
>
> Hi Sven,
>
> On 11 April 2018 at 17:33, Sven Van Caekenberghe <[hidden email]> wrote:
>> I think we have to reset this whole discussion.
>>
>>  FileStream stdin
>>
>> and
>>
>>  Stdio stdin
>>
>> are completely different !
>>
>> We'll have to check that first, before talking about the issues raised in this thread.
>
> Are you sure you're comparing (roughly) equal things?
>
> I would compare:
>
> FileStream stdin
>
> (very roughly) to:
>
> ZnCharacterReadStream on:
>    (ZnBufferedReadStream on:
>        Stdio stdin).
>
> Actually it is still not a fair comparison because MultiByteFileStream
> attempts to be writable as well.  If you could notionally do:
>
> FileStream stdin
>  readOnly;
>  binary;
>  unbuffered;
>  yourself.
>
> you could compare FileStream and Stdio :-)
>
> One important similarity:  At the bottom they both use the same set of
> primitives to communicate with the OS stdio streams (FilePlugin).
>
> HTH,
> Alistair
>
>
>
>> And BTW these terminal streams are a real pain to test ;-)

