[squeak-dev] Speeding up Float Numbers reading

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

[squeak-dev] Speeding up Float Numbers reading

Nicolas Cellier-3
Following Andreas ReadStream>>nextFloat track, I did improve
SqNumberParser speed too, just for fun.

With minor changes, a speed factor > 2 in Float crunching has been achieved.
I could not reach Andreas efficiency, though.
SqNumberParser is still 2 to 4 times behind.

A difference is that SqNumberParser does return the nearest floating
point value, while ReadStream>>nextFloat does return an approximate.
This explains some of the bad scores of the former which cannot avoid
LargeInteger arithmetic when the latter does (cases of exponent with
high values).

This can be marginally improved with timesTwoPower: trick in the
Fraction case, which would sometimes avoid LargeInteger arithmetic.
But the implications with Float overflow and underflow must be handled,
and it's not really worth...

Remaining performance is mostly explained by method inlining, which is
not desirable in SqNumberParser. It could however be subclassed with an
OptimizedSqueakFloatParser (I did not...).

Experiments are at http://bugs.squeak.org/view.php?id=6976 if anyone
interested.

I remind there are also tricks that SqNumberParser uselessly deals with
like:
| s |
s := '1' , (String new: 400 withAll: $0) , '.1e-400'.
{s readStream nextFloat. SqNumberParser parse: s readStream. Number
readFrom: s readStream}.

Last thing, speeding up Float reading is not premature optimization,
large files made of formatted Float are widely used in some areas.
Sure an atof() primitive would probably outperform if that really matters...

Nicolas


Reply | Threaded
Open this post in threaded view
|

[squeak-dev] Re: Speeding up Float Numbers reading

Andreas.Raab
nicolas cellier wrote:
> A difference is that SqNumberParser does return the nearest floating
> point value, while ReadStream>>nextFloat does return an approximate.
> This explains some of the bad scores of the former which cannot avoid
> LargeInteger arithmetic when the latter does (cases of exponent with
> high values).

Interesting. Do you have an example for this? I went to great length to
ensure that #nextFloat answers consistent results to Float>>readFrom:
(otherwise I could have optimized various cases even more aggressively)
so I'd be interested in seeing cases where it differs.

Test cases would probably be best but in absence of those some general
guidance about how to create the problem would be appreciated.

> Last thing, speeding up Float reading is not premature optimization,
> large files made of formatted Float are widely used in some areas.
> Sure an atof() primitive would probably outperform if that really
> matters...

Absolutely. I was considering just that except that I wasn't quite sure
whether the various C runtime libs would implement atof in a
bit-identical manner (as a matter of fact I'm pretty sure they don't).
Since this was a requirement for Croquet I opted for optimizing the read
with the means I had available.

Cheers,
   - Andreas


Reply | Threaded
Open this post in threaded view
|

[squeak-dev] Re: Speeding up Float Numbers reading

Andreas.Raab
In reply to this post by Nicolas Cellier-3
nicolas cellier wrote:
> Remaining performance is mostly explained by method inlining, which is
> not desirable in SqNumberParser. It could however be subclassed with an
> OptimizedSqueakFloatParser (I did not...).

Indeed. You should *very* seriously consider this. I'm including the
tallies of the reader below (done with the Qwaq profiler which does
sub-millisecond primitive sampling) and look how much time you spend
elsewhere: Character>>digitValue is a bitch (going through
EncodedCharSet every time) and PositionableStream>>atEnd is called a
gigazillion times. Compare this with the streamlined version in
ReadStream>>nextFloat and you can see where the difference is in inlining.

Cheers,
   - Andreas




----------------------- Float>>readFrom: --------------------------

100.0 (10,406)  Compiler  evaluate:in:to:notifying:ifFail:logged:
   100.0 (10,406)  TXmlConstructor class  DoIt
     100.0 (10,406)  TXmlConstructor class  runFloatTest:
      94.7 (9,854)  Float class  readFrom:
       |94.6 (9,844)  Float class [Number class]  readFrom:
       |  50.3 (5,234)  Float class [Number class]
readRemainderOf:from:base:withSign:
       |    |26.0 (2,706)  Integer class  readFrom:base:
       |    |  |10.1 (1,051)  Character  digitValue
       |    |  |  |9.7 (1,009)  EncodedCharSet class  charsetAt:
       |    |  |  |  9.7 (1,009)  Array [SequenceableCollection]
at:ifAbsent:
       |    |  |  |    9.1 (947)  Object  at:
       |    |  |7.5 (780)  ReadStream [PositionableStream]  peekFor:
       |    |  |  |5.8 (604)  PositionableStream  atEnd
       |    |  |  |1.7 (177)  ReadStream  next
       |    |  |5.1 (531)  PositionableStream  atEnd
       |    |  |3.2 (333)  ReadStream  next
       |    |  |  1.7 (177)  ByteString  at:
       |    |9.2 (957)  SmallInteger  asFloat
       |    |4.1 (427)  Float class [Number class]
canParseExponentFor:base:from:
       |    |  |2.9 (302)  ByteString [SequenceableCollection]  includes:
       |    |  |  |2.9 (302)  ByteString [String]  indexOf:
       |    |  |  |  2.9 (302)  ByteString class
indexOfAscii:inString:startingAt:
       |    |  |1.2 (125)  ReadStream [PositionableStream]  peek
       |    |3.9 (406)  ReadStream [PositionableStream]  peekFor:
       |    |  |3.5 (364)  PositionableStream  atEnd
       |    |2.7 (281)  Character  digitValue
       |    |  |2.6 (271)  EncodedCharSet class  charsetAt:
       |    |  |  2.6 (271)  Array [SequenceableCollection]  at:ifAbsent:
       |    |  |    2.1 (219)  Object  at:
       |    |2.2 (229)  Float class [Number class]
canParseAsScaledDecimal...digits:base:sign:from:
       |    |  |1.8 (187)  ReadStream [PositionableStream]  peek
       |    |1.1 (114)  PositionableStream  atEnd
       |  30.3 (3,153)  Integer class  readFrom:base:
       |    |13.1 (1,363)  Character  digitValue
       |    |  |12.6 (1,311)  EncodedCharSet class  charsetAt:
       |    |  |  12.5 (1,301)  Array [SequenceableCollection]  at:ifAbsent:
       |    |  |    11.5 (1,197)  Object  at:
       |    |7.1 (739)  PositionableStream  atEnd
       |    |5.9 (614)  ReadStream [PositionableStream]  peekFor:
       |    |  |5.4 (562)  PositionableStream  atEnd
       |    |4.1 (427)  ReadStream  next
       |    |  2.0 (208)  ByteString  at:
       |  7.6 (791)  ReadStream [PositionableStream]  peekFor:
       |    |6.3 (656)  PositionableStream  atEnd
       |    |1.2 (125)  ReadStream  next
       |  6.5 (676)  ReadStream [Stream]  nextMatchAll:
       |    4.5 (468)  ByteString [SequenceableCollection]  do:
       |      |4.4 (458)  ArrayedCollection  size
       |    1.7 (177)  ReadStream  next
      3.7 (385)  ReadStream [PositionableStream]  skipSeparators
       |2.8 (291)  PositionableStream  atEnd
      1.6 (166)  PositionableStream  atEnd

**Leaves**
42.7 (4,443)  PositionableStream  atEnd
23.7 (2,466)  Object  at:
9.6 (999)  SmallInteger  asFloat
6.0 (624)  ByteString  at:
4.8 (499)  ArrayedCollection  size
4.7 (489)  ReadStream  next
3.0 (312)  ByteString class  indexOfAscii:inString:startingAt:

**Memory**
        old +0 bytes
        young +399,516 bytes
        used +399,516 bytes
        free -399,516 bytes

**GCs**
        full 0 totalling 0ms (0.0% uptime)
        incr 170 totalling 690ms (7.0% uptime), avg 4.0ms
        tenures 0
        root table 0 overflows





------------------------ Readstream>>nextFloat ------------------------

100.0 (744)  Compiler  evaluate:in:to:notifying:ifFail:logged:
   100.0 (744)  TXmlConstructor class  DoIt
     100.0 (744)  TXmlConstructor class  runFloatTest:
       100.0 (744)  ReadStream  nextFloat
         74.8 (557)  ByteString  byteAt:
         13.6 (101)  ByteString class  findFirstInString:inSet:startingAt:
         11.5 (86)  SmallInteger  asFloat

**Leaves**
74.9 (557)  ByteString  byteAt:
13.6 (101)  ByteString class  findFirstInString:inSet:startingAt:
11.5 (86)  SmallInteger  asFloat

**Memory**
        old +0 bytes
        young -562,888 bytes
        used -562,888 bytes
        free +562,888 bytes

**GCs**
        full 0 totalling 0ms (0.0% uptime)
        incr 13 totalling 18ms (2.0% uptime), avg 1.0ms
        tenures 0
        root table 0 overflows

Reply | Threaded
Open this post in threaded view
|

[squeak-dev] Re: Speeding up Float Numbers reading

Nicolas Cellier-3
In reply to this post by Andreas.Raab
Andreas Raab <andreas.raab <at> gmx.de> writes:

>
> nicolas cellier wrote:
> > A difference is that SqNumberParser does return the nearest floating
> > point value, while ReadStream>>nextFloat does return an approximate.
> > This explains some of the bad scores of the former which cannot avoid
> > LargeInteger arithmetic when the latter does (cases of exponent with
> > high values).
>
> Interesting. Do you have an example for this? I went to great length to
> ensure that #nextFloat answers consistent results to Float>>readFrom:
> (otherwise I could have optimized various cases even more aggressively)
> so I'd be interested in seeing cases where it differs.
>
> Test cases would probably be best but in absence of those some general
> guidance about how to create the problem would be appreciated.
>

Hi Andreas,
If you say nextFloat implementation is bit identicle, i trust you.
What i do not trust is Number class>>readFrom: to answer nearest Floating point.
It cumulates several rounding errors (see below).

An example is this one:

self assert:
((SqNumberParser parse: '1.2345678901234567980') asTrueFraction
  - (12345678901234567980/10000000000000000000)) abs
<
((Number readFrom: '1.2345678901234567980') asTrueFraction
  - (12345678901234567980/10000000000000000000)) abs.


You need a 3.9/3.10 image or patches from:
http://bugs.squeak.org/view.php?id=3564
http://bugs.squeak.org/view.php?id=3568
http://bugs.squeak.org/view.php?id=3512

Or you can just play with asTrueFraction:
(SqNumberParser parse: '1.2345678901234567980') asTrueFraction
  storeStringBase: 16.
'(16r13C0CA428C59FB/16r10000000000000)'

(Number readFrom: '1.2345678901234567980') asTrueFraction
  storeStringBase: 16.
'(16r4F03290A3167F/16r4000000000000)'
This is a one bit difference - multiply num and den by 4:
'(16r13C0CA428C59FC/16r10000000000000)'

Sure we can construct some cases with 2 bits off or more given the number of
successive inexact operations:

fraction := fractionPart asFloat
    "asFloat is inexact at about 15 decimal digits"
/ (base raisedTo: aStream position - fracpos).
  "10 raisedTo: is inexact if more than 23 digits
  (with asFloat patch, otherwise worse), and division is inexact too"

fractionDigits := aStream position - fracpos.
value := value asFloat + fraction.
  "Addition is inexact, that's already fourth
inexact operation before exponent operation"

And lastly:

baseValue * (base raisedTo: exp)
  "That's two more inexact ops, implicit asFloat conversion if exponent > 23,
  or negative, and Float multiply"

Nicolas