> Why would it kill performance? The block is created at the first > call, and thereafter a reference to it is past, no? That sounds > pretty fast to me. If you run this: [ self next == nil ] whileTrue: [ ... ] and Stream>>next is this ^self nextIfAtEnd: [ ^ nil ] every time you invoke #next, it you create a new BlockContext/BlockClosure, and that makes it impossible to exploit LIFO behavior for thisContext. It kills performance. Paolo |
To close the nextIfAtEnd: alternative see micro benchmarks (source and
results) at http://bugs.squeak.org/view.php?id=6755. Paolo Bonzini a écrit : > >> Why would it kill performance? The block is created at the first >> call, and thereafter a reference to it is past, no? That sounds >> pretty fast to me. > > If you run this: > > [ self next == nil ] whileTrue: [ ... ] > > and Stream>>next is this > > ^self nextIfAtEnd: [ ^ nil ] > > every time you invoke #next, it you create a new > BlockContext/BlockClosure, and that makes it impossible to exploit LIFO > behavior for thisContext. It kills performance. > Yeah, implementing default next like this is definitely a no go > Paolo > > |
In reply to this post by Andreas.Raab
Hi Andreas
To answer these important questions: - what performance drop/gain can we expect (trade off)? - should we have to rewrite anything apart core #next methods? you can see I completed your tests at http://bugs.squeak.org/view.php?id=6755 Here I take rationale again with a similar more systematic approach. PROCEDURE: ========= Analyzing residual costs with Micro bench should tell us the price: 1) == nil test costs = p (like pastEndTest) 2) atEnd test costs = a 3) exception handling cost = e when not handled and = h when handled 4) stream size is n We get three main ways to write loops: 5) (P) a pastEnd loop costs (residual only) n*p before, n*p+e after EndOfStream change. 6) (A) a atEnd test loop costs n*a before, n*a after. 7) (E) an exception loop is impossible before, costs h after. RESULTS: ======= On my machine: [200000 timesRepeat: []] timeToRun. 148 [200000 timesRepeat: [$a == nil ifFalse: [nil]]] timeToRun. 167 [5000 timesRepeat: []] timeToRun. 3 [5000 timesRepeat: [EndOfStream signal]] timeToRun. 159 [5000 timesRepeat: []] timeToRun. 3 [5000 timesRepeat: [[EndOfStream signal] on: EndOfStream do: [:exc | exc return]]] timeToRun. 209 s := ReadStream on: (1 to: 1000) asArray. [200000 timesRepeat: []] timeToRun. 148 [200000 timesRepeat: [s atEnd ifFalse: [nil]]] timeToRun. 223 e=0.03 h=0.04 a=0.00037 p=0.0001 IS IT WORTH REWRITING (A) loops ? --------------------------------- Obviously, no hurry... Since no exception is raised, cost is unchanged It would be worth only if average n > h/a. (ON MY MACHINE n>120) WHAT THE CHANGE COST ON (P) loops? --------------------------------- - Introducing the change costs a penalty e on each pastEnd loop (P) - This is neglectable as soon as n >> e/p - But does de-optimize small pastEnd loops as noticed by Andreas. IS IT WORTH REWRITING (P) loops ? --------------------------------- - rewrite (P) in (E) for small pastEnd loops is no use because n*p + e and h are same for small n - rewrite (P) in (A) would be viable if n*p+e > n*a, n < e/(a-p) (ON MY MACHINE n<110) - rewrite (P) in (E) would be worth for n > h/p. (ON MY MACHINE n>400) But no hurry, we have seen that penalty is small for such loops. CONCLUSIONS ----------- Introducing the change potentially cost a penalty on pastEnd short stream tight loops. If necessary, we might re-optimize a little most critical loops with a (P)->(A) rewrite, but without reaching previous performance. Suggested by Paolo, my guess however is that: - Most pastEnd loops are written for files. files are long and performance drop would be neglectable. A rewrite would benefit but no hurry. - Most String processing loops use atEnd loop (upper level ReadStream does not use == nil trick, client code might however uses few). POSSIBLE PLAN to confirm: trap EndOfStream to trace usage of pastEnd loops and run maximum activity in squeak... Introducing this change does not seem worth per se, except maybe for huge files. However - TO BE CONFIRMED, it might as well be neutral toward performances with current VMs. It would be worth in this case for enabling Stream extensions having costly atEnd tests and no possible == nil trick. Apart core #next, no rewrite of basic method is necessary. No hurry to do massive rewrite of code anyway. Cheers Nicolas |
> - Most String processing loops use atEnd loop (upper level ReadStream
> does not use == nil trick, client code might however uses few). I found only one that doesn't, in InflateStream>>#atEnd, using this code: (CompiledMethod allInstances select: [ :m | | s | s := m getSource asString. ('*next*' match: s ) and: [ ('*next == nil*' match: s ) or: [ ('*next isNil' match: s ) or: [ '*next notNil' match: s ]]]]) Here is the source code: atEnd "Note: It is possible that we have a few bits left, representing just the EOB marker. To check for this we must force decompression of the next block if at end of data." super atEnd ifFalse:[^false]. "Primitive test" (position >= readLimit and:[state = StateNoMoreData]) ifTrue:[^true]. self next == nil ifTrue:[^true]. position := position - 1. ^false and in this same class, we have: pastEndRead state = StateNoMoreData ifTrue:[^nil]. ... ^self next next <primitive: 65> position >= readLimit ifTrue: [^self pastEndRead] ifFalse: [^collection at: (position := position + 1)] So, the "self next == nil" could be written "self pastEndRead == nil". Or even better, one could simply rewrite "atEnd and next" like this: atEnd position < readLimit ifFalse: [^false]. "for speed, we inline pastEndRead with these changes:" state = StateNoMoreData ifTrue:[^true]. ... ^false next <primitive: 65> "position >= readLimit test inlined for speed" (position >= readLimit and: [self atEnd]) ifTrue: [ ^nil ]. ^collection at: (position := position + 1) Actually, this implementation of #next could be moved up to ReadStream, so that no override is necessary in InflateStream. So, it looks like the only explicit usage (in my image) of "self next == nil" ought to be eliminated anyway. In some cases, the "self next == nil" test is used implicitly, for example: Number>>readExponent:base: ('edq' includes: aStream next) ifFalse: [^ nil]. ... This code however does not seem to be written with speed in mind. Paolo |
Yes as expected most are FileStream kind
However, using: SystemNavigation new browseAllCallsOn: #next and: #== I found some other cases that might read pastEnd like for example: - Number class>>#readExponent:base:from: might read pastEnd - PositionableStream>>#next:into:startingAt: which has some senders... - etc... So I'm now trying to instrument with a tally... It's hacky, but seems to work: tally := MessageTally new. tally spyEvery: 100 on: ['em ezilaitini ot si siht' reverse]. tally class: World class method: #doOneCycle. "middle lines begin" tallyEnd := false. [(Delay forSeconds: 300) wait. tallyEnd := true] fork. [[World doOneCycle. Processor yield.tallyEnd] whileFalse] on: EndOfStream do: [:exc | tally tally: exc signalerContext by: 1. exc resume]. (StringHolder new contents: (String streamContents: [:s | tally report: s])) openLabel: 'EndOfStream Spy Results'. "middle lines end" tally close. If you do not execute tally close, and execute middle lines again, I think results can be cumulated. I effectively find Number readFrom: used from Compiler. But got only 40 tallies from ReadStream in 5 minutes... Gasp, did not find the gold mine of pastEnd... If someone wants to try to help, load ConnectEndOfStream-M6755-nice.1.cs (http://bugs.squeak.org/file_download.php?file_id=3064&type=bug) from http://bugs.squeak.org/view.php?id=6755 and execute above hack in a workspace, then play your favourite activities... In case of severe slow down please post results to me or the list. Thanks Nicolas Paolo Bonzini a écrit : >> - Most String processing loops use atEnd loop (upper level ReadStream >> does not use == nil trick, client code might however uses few). > > I found only one that doesn't, in InflateStream>>#atEnd, using this code: > > (CompiledMethod allInstances > select: [ :m | | s | > s := m getSource asString. > ('*next*' match: s ) and: [ > ('*next == nil*' match: s ) or: [ > ('*next isNil' match: s ) or: [ > '*next notNil' match: s ]]]]) > > Here is the source code: > > atEnd > "Note: It is possible that we have a few bits left, > representing just the EOB marker. To check for > this we must force decompression of the next > block if at end of data." > super atEnd ifFalse:[^false]. "Primitive test" > (position >= readLimit and:[state = StateNoMoreData]) > ifTrue:[^true]. > self next == nil ifTrue:[^true]. > position := position - 1. > ^false > > and in this same class, we have: > > pastEndRead > state = StateNoMoreData ifTrue:[^nil]. > ... > ^self next > > next > <primitive: 65> > position >= readLimit > ifTrue: [^self pastEndRead] > ifFalse: [^collection at: (position := position + 1)] > > So, the "self next == nil" could be written "self pastEndRead == nil". > Or even better, one could simply rewrite "atEnd and next" like this: > > atEnd > position < readLimit ifFalse: [^false]. > "for speed, we inline pastEndRead with these changes:" > state = StateNoMoreData ifTrue:[^true]. > ... > ^false > > next > <primitive: 65> > "position >= readLimit test inlined for speed" > (position >= readLimit and: [self atEnd]) ifTrue: [ ^nil ]. > ^collection at: (position := position + 1) > > Actually, this implementation of #next could be moved up to ReadStream, > so that no override is necessary in InflateStream. So, it looks like > the only explicit usage (in my image) of "self next == nil" ought to be > eliminated anyway. > > In some cases, the "self next == nil" test is used implicitly, for example: > > Number>>readExponent:base: > ('edq' includes: aStream next) ifFalse: [^ nil]. > ... > > This code however does not seem to be written with speed in mind. > > Paolo > > |
Free forum by Nabble | Edit this page |