Smalltalk › Squeak › Squeak - Dev

EndOfStream unused

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

25 messages Options

Paolo Bonzini-2

Re: EndOfStream unused

> Why would it kill performance? The block is created at the first
> call, and thereafter a reference to it is past, no? That sounds
> pretty fast to me.

If you run this:

[ self next == nil ] whileTrue: [ ... ]

and Stream>>next is this

^self nextIfAtEnd: [ ^ nil ]

every time you invoke #next, it you create a new
BlockContext/BlockClosure, and that makes it impossible to exploit LIFO
behavior for thisContext. It kills performance.

Paolo

Nicolas Cellier-3

Re: EndOfStream unused

To close the nextIfAtEnd: alternative see micro benchmarks (source and
results) at http://bugs.squeak.org/view.php?id=6755.

Paolo Bonzini a écrit :

>
>> Why would it kill performance? The block is created at the first
>> call, and thereafter a reference to it is past, no? That sounds
>> pretty fast to me.
>
> If you run this:
>
> [ self next == nil ] whileTrue: [ ... ]
>
> and Stream>>next is this
>
> ^self nextIfAtEnd: [ ^ nil ]
>
> every time you invoke #next, it you create a new
> BlockContext/BlockClosure, and that makes it impossible to exploit LIFO
> behavior for thisContext. It kills performance.
>

Yeah, implementing default next like this is definitely a no go

> Paolo
>
>

Nicolas Cellier-3

EndOfStream performance cost [was EndOfStream unused]

In reply to this post by Andreas.Raab

Hi Andreas

To answer these important questions:
- what performance drop/gain can we expect (trade off)?
- should we have to rewrite anything apart core #next methods?

you can see I completed your tests at
http://bugs.squeak.org/view.php?id=6755

Here I take rationale again with a similar more systematic approach.

PROCEDURE:
=========

Analyzing residual costs with Micro bench should tell us the price:
1) == nil test costs = p (like pastEndTest)
2) atEnd test costs = a
3) exception handling cost = e when not handled and = h when handled

4) stream size is n

We get three main ways to write loops:
5) (P) a pastEnd loop costs (residual only)
n*p before,
n*p+e after EndOfStream change.
6) (A) a atEnd test loop costs
n*a before,
n*a after.
7) (E) an exception loop
is impossible before,
costs h after.

RESULTS:
=======

On my machine:

[200000 timesRepeat: []] timeToRun. 148
[200000 timesRepeat: [$a == nil ifFalse: [nil]]] timeToRun. 167

[5000 timesRepeat: []] timeToRun. 3
[5000 timesRepeat: [EndOfStream signal]] timeToRun. 159

[5000 timesRepeat: []] timeToRun. 3
[5000 timesRepeat: [[EndOfStream signal] on: EndOfStream do: [:exc | exc
return]]] timeToRun. 209

s := ReadStream on: (1 to: 1000) asArray.
[200000 timesRepeat: []] timeToRun. 148
[200000 timesRepeat: [s atEnd ifFalse: [nil]]] timeToRun. 223

e=0.03 h=0.04 a=0.00037 p=0.0001

IS IT WORTH REWRITING (A) loops ?
---------------------------------

Obviously, no hurry...
Since no exception is raised, cost is unchanged
It would be worth only if average n > h/a.
(ON MY MACHINE n>120)

WHAT THE CHANGE COST ON (P) loops?
---------------------------------

- Introducing the change costs a penalty e on each pastEnd loop (P)
- This is neglectable as soon as n >> e/p
- But does de-optimize small pastEnd loops as noticed by Andreas.

IS IT WORTH REWRITING (P) loops ?
---------------------------------

- rewrite (P) in (E) for small pastEnd loops is no use
because n*p + e and h are same for small n
- rewrite (P) in (A) would be viable if n*p+e > n*a, n < e/(a-p)
(ON MY MACHINE n<110)
- rewrite (P) in (E) would be worth for n > h/p.
(ON MY MACHINE n>400)
But no hurry, we have seen that penalty is small for such loops.

CONCLUSIONS
-----------

Introducing the change potentially cost a penalty on pastEnd short
stream tight loops.
If necessary, we might re-optimize a little most critical loops with a
(P)->(A) rewrite, but without reaching previous performance.

Suggested by Paolo, my guess however is that:
- Most pastEnd loops are written for files. files are long and
performance drop would be neglectable. A rewrite would benefit but no hurry.
- Most String processing loops use atEnd loop (upper level ReadStream
does not use == nil trick, client code might however uses few).

POSSIBLE PLAN to confirm: trap EndOfStream to trace usage of pastEnd
loops and run maximum activity in squeak...

Introducing this change does not seem worth per se, except maybe for
huge files.
However - TO BE CONFIRMED, it might as well be neutral toward
performances with current VMs.

It would be worth in this case for enabling Stream extensions having
costly atEnd tests and no possible == nil trick.

Apart core #next, no rewrite of basic method is necessary.
No hurry to do massive rewrite of code anyway.

Cheers

Nicolas

Paolo Bonzini-2

Re: EndOfStream performance cost [was EndOfStream unused]

> - Most String processing loops use atEnd loop (upper level ReadStream
> does not use == nil trick, client code might however uses few).

I found only one that doesn't, in InflateStream>>#atEnd, using this code:

(CompiledMethod allInstances
select: [ :m | | s |
s := m getSource asString.
('*next*' match: s ) and: [
('*next == nil*' match: s ) or: [
('*next isNil' match: s ) or: [
'*next notNil' match: s ]]]])

Here is the source code:

atEnd
"Note: It is possible that we have a few bits left,
representing just the EOB marker. To check for
this we must force decompression of the next
block if at end of data."
super atEnd ifFalse:[^false]. "Primitive test"
(position >= readLimit and:[state = StateNoMoreData])
ifTrue:[^true].
self next == nil ifTrue:[^true].
position := position - 1.
^false

and in this same class, we have:

pastEndRead
state = StateNoMoreData ifTrue:[^nil].
...
^self next

next
<primitive: 65>
position >= readLimit
ifTrue: [^self pastEndRead]
ifFalse: [^collection at: (position := position + 1)]

So, the "self next == nil" could be written "self pastEndRead == nil".
Or even better, one could simply rewrite "atEnd and next" like this:

atEnd
position < readLimit ifFalse: [^false].
"for speed, we inline pastEndRead with these changes:"
state = StateNoMoreData ifTrue:[^true].
...
^false

next
<primitive: 65>
"position >= readLimit test inlined for speed"
(position >= readLimit and: [self atEnd]) ifTrue: [ ^nil ].
^collection at: (position := position + 1)

Actually, this implementation of #next could be moved up to ReadStream,
so that no override is necessary in InflateStream. So, it looks like
the only explicit usage (in my image) of "self next == nil" ought to be
eliminated anyway.

In some cases, the "self next == nil" test is used implicitly, for example:

Number>>readExponent:base:
('edq' includes: aStream next) ifFalse: [^ nil].
...

This code however does not seem to be written with speed in mind.

Paolo

Nicolas Cellier-3

Re: EndOfStream performance cost [was EndOfStream unused]

Yes as expected most are FileStream kind
However, using:

SystemNavigation new browseAllCallsOn: #next and: #==

I found some other cases that might read pastEnd like for example:
- Number class>>#readExponent:base:from: might read pastEnd
- PositionableStream>>#next:into:startingAt: which has some senders...
- etc...

So I'm now trying to instrument with a tally...
It's hacky, but seems to work:

tally := MessageTally new.
tally spyEvery: 100 on: ['em ezilaitini ot si siht' reverse].
tally class: World class method: #doOneCycle.
"middle lines begin"
tallyEnd := false.
[(Delay forSeconds: 300) wait. tallyEnd := true] fork.
[[World doOneCycle. Processor yield.tallyEnd] whileFalse]
on: EndOfStream
do: [:exc | tally tally: exc signalerContext by: 1.
exc resume].
(StringHolder new contents:
(String streamContents: [:s | tally report: s]))
openLabel: 'EndOfStream Spy Results'.
"middle lines end"
tally close.

If you do not execute tally close, and execute middle lines again, I
think results can be cumulated.

I effectively find Number readFrom: used from Compiler. But got only 40
tallies from ReadStream in 5 minutes...
Gasp, did not find the gold mine of pastEnd...

If someone wants to try to help, load ConnectEndOfStream-M6755-nice.1.cs
(http://bugs.squeak.org/file_download.php?file_id=3064&type=bug) from
http://bugs.squeak.org/view.php?id=6755 and execute above hack in a
workspace, then play your favourite activities...

In case of severe slow down please post results to me or the list.
Thanks

Nicolas

Paolo Bonzini a écrit :

>> - Most String processing loops use atEnd loop (upper level ReadStream
>> does not use == nil trick, client code might however uses few).
>
> I found only one that doesn't, in InflateStream>>#atEnd, using this code:
>
> (CompiledMethod allInstances
> select: [ :m | | s |
> s := m getSource asString.
> ('*next*' match: s ) and: [
> ('*next == nil*' match: s ) or: [
> ('*next isNil' match: s ) or: [
> '*next notNil' match: s ]]]])
>
> Here is the source code:
>
> atEnd
> "Note: It is possible that we have a few bits left,
> representing just the EOB marker. To check for
> this we must force decompression of the next
> block if at end of data."
> super atEnd ifFalse:[^false]. "Primitive test"
> (position >= readLimit and:[state = StateNoMoreData])
> ifTrue:[^true].
> self next == nil ifTrue:[^true].
> position := position - 1.
> ^false
>
> and in this same class, we have:
>
> pastEndRead
> state = StateNoMoreData ifTrue:[^nil].
> ...
> ^self next
>
> next
> <primitive: 65>
> position >= readLimit
> ifTrue: [^self pastEndRead]
> ifFalse: [^collection at: (position := position + 1)]
>
> So, the "self next == nil" could be written "self pastEndRead == nil".
> Or even better, one could simply rewrite "atEnd and next" like this:
>
> atEnd
> position < readLimit ifFalse: [^false].
> "for speed, we inline pastEndRead with these changes:"
> state = StateNoMoreData ifTrue:[^true].
> ...
> ^false
>
> next
> <primitive: 65>
> "position >= readLimit test inlined for speed"
> (position >= readLimit and: [self atEnd]) ifTrue: [ ^nil ].
> ^collection at: (position := position + 1)
>
> Actually, this implementation of #next could be moved up to ReadStream,
> so that no override is necessary in InflateStream. So, it looks like
> the only explicit usage (in my image) of "self next == nil" ought to be
> eliminated anyway.
>
> In some cases, the "self next == nil" test is used implicitly, for example:
>
> Number>>readExponent:base:
> ('edq' includes: aStream next) ifFalse: [^ nil].
> ...
>
> This code however does not seem to be written with speed in mind.
>
> Paolo
>
>