[ANN] CoroutineReadStream (again)


Stephen Pair
This version fixes a bug in #close that causes the coroutine to not get unwound when the stream is closed.  Also, it now uses the stream itself as a marker instead of a special string (a suggestion from Eliot).  I've also created unit tests (which is how I caught the problem with close).  Below is the class comment:

-----

I provide a ReadStream interface for objects that offer an enumeration protocol but no read stream interface.  For example, I can adapt any object that supports #do: for enumeration:

CoroutineReadStream on: #(1 2 3) iterator: #do:

The implementation makes use of coroutines to avoid the need to create and schedule a separate Process.  Sending #close will abandon the coroutine and unwind all contexts in the coroutine.  The coroutine is not invoked until the first request for an object.  The iterator is finished when there is an attempt to read past the end (via #next or #peek), or when there is an #atEnd test after the last object was retrieved.

A CoroutineReadStream is not particularly efficient due to its use of thisContext and how stacks are optimized in most Smalltalk implementations.
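
As a rough sketch of the behaviour described above: the snippet assumes the attached CoroutineReadStream class is loaded and uses only the selectors named in this comment (#on:iterator:, #peek, #next, #close). The notes in the code comments restate the class comment and do not come from a verified run.

```smalltalk
| stream |
"Adapt a collection that only offers #do: to the ReadStream protocol."
stream := CoroutineReadStream on: #(1 2 3) iterator: #do:.

"Nothing has been enumerated yet; the coroutine is invoked on the first request."
stream peek.    "looks at the first element without consuming it"
stream next.    "consumes the first element"
stream next.    "consumes the second element"

"Abandon the rest of the enumeration; the contexts of the pending #do: are unwound."
stream close.
```

Closing before the end is the case the #close fix in this version addresses, since the enumeration is still suspended part-way through at that point.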

Instance Variables:
	nextValue <Object | noObjectMarker>  Holds the next value to be answered when #next is called (needed to support #peek and #atEnd). If it holds the noObjectMarker, it indicates that the coroutine needs to be invoked to get the next object.
	suspendedContext <ContextPart>  This is either the context of the iterating stack (when not actively retrieving the next object for a client) or the stack of the user of the stream (when retrieving the next object).
	homeContext <MethodContext>  Holds the context of #initializeForBlock: called during instantiation and is used to ensure that exception propagation and handling works as one would expect.

Copyright (c) 2009 Stephen Pair

 Permission is hereby granted, free of charge, to any person
 obtaining a copy of this software and associated documentation
 files (the "Software"), to deal in the Software without
 restriction, including without limitation the rights to use,
 copy, modify, merge, publish, distribute, sublicense, and/or sell
 copies of the Software, and to permit persons to whom the
 Software is furnished to do so, subject to the following
 conditions:

 The above copyright notice and this permission notice shall be
 included in all copies or substantial portions of the Software.

 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
 OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
 WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
 OTHER DEALINGS IN THE SOFTWARE.



- Stephen



Attachments: CoroutineReadStream.st (11K), CoroutineReadStreamTest.st (7K)

Re: [ANN] CoroutineReadStream (again)

Colin Putney

On 2009-12-13, at 5:22 PM, Stephen Pair wrote:

> This version fixes a bug in #close that causes the coroutine to not get unwound when the stream is closed.  Also, it now uses the stream itself as a marker instead of a special string (a suggestion from Eliot).  I've also created unit tests (which is how I caught the problem with close).  Below is the class comment:

Awesome. I've been meaning to implement this for Filesystem. Thanks!

Colin

Re: [ANN] CoroutineReadStream (again)

Stephen Pair
On Mon, Dec 14, 2009 at 12:56 PM, Colin Putney <[hidden email]> wrote:

> On 2009-12-13, at 5:22 PM, Stephen Pair wrote:
>
>> This version fixes a bug in #close that causes the coroutine to not get unwound when the stream is closed.  Also, it now uses the stream itself as a marker instead of a special string (a suggestion from Eliot).  I've also created unit tests (which is how I caught the problem with close).  Below is the class comment:
>
> Awesome. I've been meaning to implement this for Filesystem. Thanks!
>
> Colin

Can you describe how it would be used with Filesystem?

- Stephen



Re: [ANN] CoroutineReadStream (again)

Nicolas Cellier
2009/12/14 Stephen Pair <[hidden email]>:

> Can you describe how it would be used with Filesystem?
> - Stephen

Maybe an encoder/decoder pattern?
Example: convert cr-lf pairs to cr only
CoroutineReadStream
        onBlock:
                [ :outputStream |
                inputStream do: [ :ea |
                        outputStream acceptNextObject: ea.
                        ea == Character cr ifTrue: [inputStream peekFor: Character lf]]]

Nicolas


Re: [ANN] CoroutineReadStream (again)

Igor Stasenko
2009/12/14 Nicolas Cellier <[hidden email]>:

> Maybe an encoder/decoder pattern?
> Example: convert cr-lf pairs to cr only
> CoroutineReadStream
>        onBlock:
>                [ :outputStream |
>                inputStream do: [ :ea |
>                        outputStream acceptNextObject: ea.
>                        ea == Character cr ifTrue: [inputStream peekFor: Character lf]]]
>

Nope, stream -> stream chaining is a natural thing. Of course you can
use CoroutineReadStream wherever you want, but its purpose is a little
different IMO.

To me, CoroutineReadStream is a nice wrapper that provides a
transformation from block iterators to streams. It switches from 'push'
logic (where the provider pushes values to the consumer) to 'pull'
(where the consumer asks the provider for the next value(s) when it is
ready to process them).

An iteration block usually doesn't decide whether or when it gets the
next value (hence 'push'), and as a consequence it's hard to switch
the iterator's logic in the middle of an iteration.

With streams it's easy. A pattern like:

(stream next = $x) ifTrue: [
   self decodeFooFrom: stream ].
stream upTo: $y.
self decodeBarFrom: stream.

is easy to do with streams, but would be awkward to implement inside
an iterator block.
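
To make the push/pull contrast concrete, here is a small sketch. It assumes the CoroutineReadStream class from the first post is loaded and uses only selectors mentioned in this thread (#on:iterator:, #next, #atEnd); the sample array and the variable names are made up for illustration.

```smalltalk
| source stream header rest |
source := #(#header 1 2 3).

"Push: #do: drives the loop, so the block is handed every element in turn
 and would have to track for itself where it is in the input."
source do: [:each | Transcript show: each printString; cr].

"Pull: the consumer asks for elements when it wants them, so it can treat
 the first element differently and then switch logic for the rest."
stream := CoroutineReadStream on: source iterator: #do:.
header := stream next.
rest := OrderedCollection new.
[stream atEnd] whileFalse: [rest add: stream next].
```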





--
Best regards,
Igor Stasenko AKA sig.


Re: [ANN] CoroutineReadStream (again)

Stephen Pair
On Mon, Dec 14, 2009 at 4:28 PM, Igor Stasenko <[hidden email]> wrote:
> To me, CoroutineReadStream is a nice wrapper that provides a
> transformation from block iterators to streams. It switches from 'push'
> logic (where the provider pushes values to the consumer) to 'pull'
> (where the consumer asks the provider for the next value(s) when it is
> ready to process them).
>
> An iteration block usually doesn't decide whether or when it gets the
> next value (hence 'push'), and as a consequence it's hard to switch
> the iterator's logic in the middle of an iteration.

Yes, precisely.  I find it particularly handy when dealing with parsers.  Parsers (i.e. combinatorial parsers) usually use the activation stack to keep track of their progress and do their work.  This means they need to drive execution.  For those that are familiar with PEGs and grammars that don't make use of a lexer or tokenizer, you may know that dealing with whitespace adds a lot of clutter to a grammar.  Since we can have PEGs that are designed to work with arbitrary objects and not just characters, you could solve this problem by first creating a token parser that transforms characters into tokens, omitting whitespace.  Next, you have the parser for your language that transforms the tokens into whatever you like.  Due to the usage of the activation stack by the token parser I mentioned, it's not easy to chain the parsers together.  CoroutineReadStream makes it simpler to do. 

Here's an overly simplified and contrived illustration:

streamOfTokens :=
        CoroutineReadStream onBlock:
                [ :outputStream |
                tokenParser
                        parse: someCharacterStream
                        into:
                                [ :token |
                                outputStream acceptNextObject: token]].

someLanguageParser
        parse: streamOfTokens
        into:
                [ :parseNodeOrSomething |
                self doSomethingWith: parseNodeOrSomething].


- Stephen



Re: [ANN] CoroutineReadStream (again)

Colin Putney
In reply to this post by Stephen Pair

On 2009-12-14, at 12:10 PM, Stephen Pair wrote:

> Can you describe how it would be used with Filesystem?

Filesystem makes use of a Guides and Visitors pattern for traversing directory trees. The guides drive execution, walking around the filesystem in a specific order, and sending #visitDirectory: and #visitFile: to their visitor. You implement directory-tree operations by implementing a subclass of FSVisitor.

That all works fine, but I'd like to provide a bit of flexibility in the interface. The idea was to provide a CoroutineVisitor that would transform the visiting callbacks into a ReadStream interface. Then you could do something like this:

stream := FSBreadthFirstGuide readStreamOn: aReference.

Now, the guides that Filesystem provides are written in such a way that it wouldn't be too hard to invert control and provide a stream interface without coroutines. But I want it to be easy for client applications to implement guides, and not being able to keep state on the activation stack would make things difficult.

Colin