Squeak XTream + COG


Squeak XTream + COG

Nicolas Cellier
My own Squeak XTream development has been somewhat frozen...
The arrival of COG is an excellent pretext to wake it up.

In short, the essence of this mail is that XTream performs well in COG
(as well as Stream, or better).
Moreover, it shows that some speed-ups are still possible for
files despite Levente's Stream optimizations.
If interested, read the rest; if not, apologies for the overly long mail.

Nicolas



What Squeak XTream is not
======================

First, I'd like to renew my apologies to the VW fellows (Michael Lucas
Smith et al.) for hijacking the name.
Squeak XTream is not a port of VW XTream, though it shares some ideas
and inspiration.
If someone wants to port VW XTream to Squeak, then I shall give the name back.
In the interim, I keep it, it's too nice (any idea for a rename is welcome).

Squeak XTream is much less extreme than VW: for example, it preserves
most message names.
As such, it's just a possible replacement/clean-up of Squeak Stream.
Squeak XTream is also far less extensive (no parser etc...).

What Squeak XTream is
===================

Did you ever browse the Squeak Stream hierarchy? Then you know what Squeak
XTream is about!
It's about reducing complexity and increasing quality by using simple
uniform concepts.
Not sure the goal is reached yet, but we must keep it in mind.

The 1st idea behind Squeak XTream is to use Wrappers rather than subclasses.
Wrappers act somewhat like filters with an input and an output (think of
a Unix pipe).

Note there are other alternatives like composition by Traits - see Nile.
Nile implements some Wrappers too though.

The 2nd idea is to separate ReadXtream and WriteXtream as much as possible.
Well, Squeak XTream is not as extreme as VW in this respect either...
It still has a read/write stream subclass (not nice).
I think this was the main starting point of Nile.

The 3rd idea is to uniformly provide the #readInto:startingAt:count: and
#next:putAll:startingAt: API.
This is essential to increase throughput when possible (this is
the well-known buffering).
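
As a rough illustration of why a bulk API helps, here is a workspace
sketch that uses the classic Squeak WriteStream rather than XTream
itself. It assumes that WriteStream implements #next:putAll:startingAt:,
and the variable names are only illustrative. A single send moves a
whole slice of the source collection instead of one #nextPut: per element.

```smalltalk
"Workspace sketch with a plain WriteStream, not an XTream."
| source out |
source := 'abcdefghij'.
out := WriteStream on: (String new: 4).
"Append 4 characters of source, starting at index 3, in a single send
 instead of four separate #nextPut: sends."
out next: 4 putAll: source startingAt: 3.
out contents   "'cdef'"
```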

The 4th idea is to offer parametric endOfStream handling for read streams.
It can be a simple ^nil, raising an Exception, or evaluating a Block...
(anything responding to value).
In Squeak, streaming over a collection containing nil can be problematic,
because nil is also what #next answers at end of stream.
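
The nil ambiguity can be reproduced with a classic ReadStream. This is
a minimal workspace sketch with illustrative variable names; it does not
use XTream. A loop that tests the result of #next against nil stops at
the first nil element, even though the stream is not at its end.

```smalltalk
"Workspace sketch: a nil element looks like end of stream."
| stream seen item |
stream := ReadStream on: (Array with: 1 with: nil with: 3).
seen := OrderedCollection new.
[(item := stream next) == nil] whileFalse: [seen add: item].
"The loop stopped at the nil element: only 1 was collected,
 and the element 3 was never read."
seen size.      "1"
stream atEnd    "false"
```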

Performance
===========

We have justified Squeak XTream by quality, but performance counts too,
essentially because Streams are used everywhere deep in the Kernel
operations (File read/write, character encoding/decoding,
Compiler/Parser, text processing etc...).

With the traditional VM, Squeak XTream performs well, except for #next sends,
because it does not implement a primitive.
Good news: as already indicated by Eliot, the #next and #nextPut:
primitives are absolutely not necessary with COG, better throw them
out!

    {
    [| tmp |
        tmp := (String new: 10000) writeStream.
        1 to: 10000 do: [:i | tmp nextPut: $0]] bench.
    [| tmp |
        tmp := (String new: 10000) writeXtream.
        1 to: 10000 do: [:i | tmp nextPut: $0]] bench.
    }
    #('1,200 per second.' '1,310 per second.')
    #('1,180 per second.' '1,320 per second.')

    {
    [| tmp |
        tmp := (String new: 10000 withAll: $0) readStream.
        1 to: 10000 do: [:i | tmp next]] bench.
    [| tmp |
        tmp := (String new: 10000 withAll: $0) readXtream.
        1 to: 10000 do: [:i | tmp next]] bench.
    }
   #('2,470 per second.' '2,470 per second.')
   #('2,490 per second.' '2,480 per second.')


This now makes XTream performance similar to Stream for every common message.
This includes performance on all kinds of simple loops.
    [ tmp := aStream next. tmp==nil] whileFalse: [ tmp doSomething ].
    [ aStream atEnd] whileFalse: [ aStream next doSomething ].
    aStream do: [:next | next doSomething ].
XTream adds one more possibility, though.
    aStream endOfStreamAction: [ ^nil].
    [ aStream next doSomething. true ] whileTrue. "Don't use repeat in Squeak, it's not inlined"

    | str |
    str := String new: 1000 withAll: $a.
    {
        [| tmp | tmp := str readStream. [tmp next==nil] whileFalse] bench.
        [| tmp | tmp := str readXtream. [tmp next==nil] whileFalse] bench.
    }
    #('34,800 per second.' '37,000 per second.')
    #('36,000 per second.' '37,000 per second.')

    | str |
    str := String new: 1000 withAll: $a.
    {
        [| tmp | tmp := str readStream. [tmp atEnd] whileFalse: [tmp next]] bench.
        [| tmp | tmp := str readXtream. [tmp atEnd] whileFalse: [tmp next]] bench.
    }
    #('27,000 per second.' '27,500 per second.')
    #('26,600 per second.' '27,100 per second.')

This is also the case for major messages such as upTo:, upToAnyOf:, nextPutAll: etc...

    | str |
    str := String new: 1000 withAll: $a.
    {
        [str readStream upTo: $b] bench.
        [str readXtream upTo: $b] bench.
    }
    #('294,000 per second.' '297,000 per second.')
    #('296,000 per second.' '293,000 per second.')

Now, what about files?
Squeak FileStream has recently (4.1) seen a major speed-up thanks to
the hard work of Levente, who backported several experiments to the
Squeak Stream hierarchy with 100% backward compatibility and a smooth
transition. Bravo!
This naturally reduces one of the advantages of Squeak XTream,
which was buffering optimizations.

    {
    [| tmp |
        tmp := (StandardFileStream readOnlyFileNamed: (SourceFiles at: 2) name).
        [tmp next==nil] whileFalse. tmp close] timeToRun.
    [| tmp |
        tmp := (StandardFileStream readOnlyFileNamed: (SourceFiles at: 2) name)
                readXtream buffered.
        [tmp next==nil] whileFalse. tmp close] timeToRun.
    }
    #(1497 1164)
    #(1426 1132)

The speed-up is not null, but not major either.
Still, some old Stream messages deserve optimization, as
demonstrated here:

    {
    [| tmp |
        tmp := (StandardFileStream readOnlyFileNamed: (SourceFiles at: 2) name) ascii.
        [tmp upTo: Character cr. tmp atEnd] whileFalse. tmp close] timeToRun.
    [| tmp |
        tmp := (StandardFileStream readOnlyFileNamed:
        (SourceFiles at: 2) name) readXtream ascii buffered.
        [tmp upTo: Character cr. tmp atEnd] whileFalse. tmp close] timeToRun.
    }
    #(8854 716)
    #(8859 716)

    {
    [| tmp |
        tmp := (StandardFileStream readOnlyFileNamed: (SourceFiles at: 2) name) ascii.
        [tmp upToAnyOf: (CharacterSet crlf). tmp atEnd] whileFalse.
        tmp close] timeToRun.
    [| tmp |
        tmp := (StandardFileStream readOnlyFileNamed: (SourceFiles at: 2) name)
                readXtream ascii buffered.
        [tmp upToAnyOf: (CharacterSet crlf). tmp atEnd] whileFalse.
        tmp close] timeToRun.
    }
    #(9138 857)
    #(9157 837)

One more important subject is the MultiByteFileStream bottleneck.
Internationalisation is an essential feature, many thanks to Yoshiki
for bringing it to life.
But the performance price is high.
Also, this is a place where Squeak badly requires clean-ups (you
know all the basicNext and the like are just hackish).
Now, once again, since Levente's buffering, the difference is not that high:

    {
    [| tmp |
        tmp := (MultiByteFileStream readOnlyFileNamed: (SourceFiles at: 2) name) ascii;
                wantsLineEndConversion: false; converter: UTF8TextConverter new.
        1 to: 20000 do: [:i | tmp upTo: Character cr].
        tmp close] timeToRun.
    [| tmp |
        tmp := (StandardFileStream readOnlyFileNamed: (SourceFiles at: 2) name)
                readXtream binary buffered ~> UTF8Decoder.
        1 to: 20000 do: [:i | tmp upTo: Character cr].
        tmp close] timeToRun.
    }
    #(152 120 )
    #(150 119 )

But wait, the file was buffered (bytes are fetched from the file in packets),
but the decoder was not! All decoding is performed char by char.
That's bad, because when only a few bytes require decoding and the
majority can be copied unchanged to a String, there is potentially a
major speed-up from simply using a sub-array copy primitive. We have known
this since #squeakToUTF8, many thanks to Andreas.
To benefit from buffering in the decoder too, just send one more message to wrap it:

    {
    [| tmp |
        tmp := (MultiByteFileStream readOnlyFileNamed: (SourceFiles at: 2) name) ascii;
                wantsLineEndConversion: false; converter: UTF8TextConverter new.
        1 to: 20000 do: [:i | tmp upTo: Character cr].
        tmp close] timeToRun.
    [| tmp |
        tmp := ((StandardFileStream readOnlyFileNamed: (SourceFiles at: 2) name)
                readXtream binary buffered ~> UTF8Decoder) buffered.
        1 to: 20000 do: [:i | tmp upTo: Character cr].
        tmp close] timeToRun.
    }
   #(152 18)
   #(152 19)

Bingo, now the speed-up is there too! 7.5x is not a bad score after all.
That's not surprising: the change log is essentially made of ASCII and
rarely requires any UTF8 translation at all.
Of course, if you handle files full of Chinese code points, don't
expect any speed-up at all!
But for a decent proportion of Latin character users, the potential
speed-up is there, right under our Streams.
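
The sub-array copy for mostly-ASCII input can be sketched in a
workspace as follows. This is not the code of the XTream decoder; it is
only an illustration with made-up variable names, and it assumes that
ByteString>>replaceFrom:to:with:startingAt: accepts a ByteArray source
(as ByteArray>>asString relies on). The leading run of bytes below 128
is copied in one send, and only the remainder would need real UTF8
decoding.

```smalltalk
"Workspace sketch: copy the leading pure-ASCII run in one bulk operation."
| bytes firstHigh runEnd result |
bytes := #[104 105 195 169].   "UTF8 for 'hi' followed by one 2-byte character"
"Index of the first byte that needs decoding, or 0 if there is none."
firstHigh := bytes findFirst: [:b | b >= 128].
runEnd := firstHigh = 0 ifTrue: [bytes size] ifFalse: [firstHigh - 1].
result := String new: runEnd.
"Bulk copy instead of a char-by-char loop."
result replaceFrom: 1 to: runEnd with: bytes startingAt: 1.
result   "'hi'"
```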

Nicolas (again)


Re: Squeak XTream + COG

Levente Uzonyi-2
On Sat, 11 Sep 2010, Nicolas Cellier wrote:

> With the traditional VM, Squeak XTream performs well, except for #next sends,
> because it does not implement a primitive.
> Good news: as already indicated by Eliot, the #next and #nextPut:
> primitives are absolutely not necessary with COG, better throw them
> out!

That's right, but we should keep them, because the SqueakVM is still
the most widely used VM and that's significantly slower without the
primitives.

>
>    {
>    [| tmp |
>        tmp := (String new: 10000) writeStream.
>        1 to: 10000 do: [:i | tmp nextPut: $0]] bench.
>    [| tmp |
>        tmp := (String new: 10000) writeXtream.
>        1 to: 10000 do: [:i | tmp nextPut: $0]] bench.
>    }
>    #('1,200 per second.' '1,310 per second.')
>    #('1,180 per second.' '1,320 per second.')

XTreams are a bit slower if you add a WideCharacter to the stream, because
ByteString is swapped to WideString with #becomeForward:, while
WriteStream >> #nextPut: has a string specific hack for this.

>    {
>    [| tmp |
>        tmp := (StandardFileStream readOnlyFileNamed: (SourceFiles at: 2) name).
>        [tmp next==nil] whileFalse. tmp close] timeToRun.
>    [| tmp |
>        tmp := (StandardFileStream readOnlyFileNamed: (SourceFiles at: 2) name)
>                readXtream buffered.
>        [tmp next==nil] whileFalse. tmp close] timeToRun.
>    }
>    #(1497 1164)
>    #(1426 1132)
>
> The speed-up is not null, but not major either.

If you use #basicNext instead of #next for StandardFileStream and remove
the primitive from it, the performance will be the same. So getting rid of
the #basic* methods can give us a bit more speed besides cleaner code.
I have an idea to implement this, but I'm not sure the transition to the
new code would be smooth enough. I'm also not sure it's worth putting more
effort in this, using another stream implementation may be a better idea.

> Still, some old Stream messages deserve optimization, as
> demonstrated here:
>
>    {
>    [| tmp |
>        tmp := (StandardFileStream readOnlyFileNamed: (SourceFiles at: 2) name) ascii.
>        [tmp upTo: Character cr. tmp atEnd] whileFalse. tmp close] timeToRun.
>    [| tmp |
>        tmp := (StandardFileStream readOnlyFileNamed:
>        (SourceFiles at: 2) name) readXtream ascii buffered.
>        [tmp upTo: Character cr. tmp atEnd] whileFalse. tmp close] timeToRun.
>    }
>    #(8854 716)
>    #(8859 716)
>
>    {
>    [| tmp |
>        tmp := (StandardFileStream readOnlyFileNamed: (SourceFiles at: 2) name) ascii.
>        [tmp upToAnyOf: (CharacterSet crlf). tmp atEnd] whileFalse.
>        tmp close] timeToRun.
>    [| tmp |
>        tmp := (StandardFileStream readOnlyFileNamed: (SourceFiles at: 2) name)
>                readXtream ascii buffered.
>        [tmp upToAnyOf: (CharacterSet crlf). tmp atEnd] whileFalse.
>        tmp close] timeToRun.
>    }
>    #(9138 857)
>    #(9157 837)

So far I couldn't find the real cause of this difference, maybe the vm
profiler will tell more. MessageTally doesn't seem to be useful.

> Bingo, now the speed-up is there too! 7.5x is not a bad score after all.
> That's not surprising: the change log is essentially made of ASCII and
> rarely requires any UTF8 translation at all.
> Of course, if you handle files full of Chinese code points, don't
> expect any speed-up at all!
> But for a decent proportion of Latin character users, the potential
> speed-up is there, right under our Streams.

This could also be handled with my MultiByteStream idea (mentioned above
without the name).

So what's the conclusion? Should we consider adding XTream to Squeak and
evolving the system to use it instead of Streams?


Levente



Re: Squeak XTream + COG

Nicolas Cellier
2010/9/12 Levente Uzonyi <[hidden email]>:

> XTreams are a bit slower if you add a WideCharacter to the stream, because
> ByteString is swapped to WideString with #becomeForward:, while
> WriteStream >> #nextPut: has a string specific hack for this.

Ah, yes, sure, becomeForward: is expensive, but it happens once per
Stream at most.
Without a primitive for nextPut:, checking at each put would be too expensive.

> If you use #basicNext instead of #next for StandardFileStream and remove the
> primitive from it, the performance will be the same. So getting rid of the
> #basic* methods can give us a bit more speed besides cleaner code.
> I have an idea to implement this, but I'm not sure the transition to the new
> code would be smooth enough. I'm also not sure it's worth putting more
> effort in this, using another stream implementation may be a better idea.
>

Switching is difficult too, unless you provide 100% of old API (in
which case, you did not clean that much...).
Maybe evolution of old code is worth a try.

> So what's the conclusion? Should we consider adding XTream to Squeak and
> evolving the system to use it instead of Streams?
>

I don't think it is possible yet. Stream has so many messages that I'm not
sure I should reproduce in XTream...
I just tried this:

    WriteStream class>>on: aCollection
        self == WriteStream ifTrue: [^WriteXtream on: aCollection].
        ^super on: aCollection

My image did not survive :(

On the other hand, changing the whole hierarchy smoothly is a challenge too...

Nicolas
