[BUG][FIX] Float conversion #asFloat and #asTrueFraction

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

[BUG][FIX] Float conversion #asFloat and #asTrueFraction

Nicolas Cellier-3
There are several problems in Dolphin Float/Fraction conversion

1) Float asTrueFraction does not handle denormalized number
   (gradual underflow)

2) Large fractions asFloat do overflow (due to intermediate steps)

3) Integer and Fraction #asFloat do not round to nearest Floating
   point value (current algorithm accumulate round off errors)

These floating point problems are hanging around in major Smalltalk
dialects for so long...

They all did exist in ST-80, but 2) has been corrected in Squeak and
VW.

I recently corrected 1) and 3) in Squeak.

This is general Smalltalk improvement that i will port to other
commercial and free dialects.
Consider licence as MIT or no licence at all if you prefer, and feel
free to modify/uncomment/incorporate in Dolphin distribution or any
other Smalltalk.

I join changed methods plus a TestCase inline since i do not know how
to join a file through google groups... Is there a better way ?

Nicolas





"Filed out from Dolphin Smalltalk X6"!

TestCase subclass: #FloatTest
        instanceVariableNames: ''
        classVariableNames: ''
        poolDictionaries: ''
        classInstanceVariableNames: ''!
FloatTest guid: (GUID fromString:
'{B3BD6664-DD98-4FD0-A2E6-481B53583567}')!
FloatTest comment: ''!
!FloatTest categoriesForClass!Unclassified! !
!FloatTest methodsFor!

testFloatAsTrueFraction
        "check if gradual underflow is correctly handled"

        self assert: 1.0e-316 asFraction asFloat = 1.0e-316!

testFractionAsFloat
        "use a random test to be sure that fractions are rounded to
nearest float"

        | r frac err collec |
        r := RandomLinearCongruential
                newModulus: (2 raisedTo: 54)
                multiplier: (2 raisedTo: 53) - (2 raisedTo: 47) - (2
raisedTo: 33) - 1
                increment: 1234567.
        r seed: 1234567.
        1000 timesRepeat: [
                frac := ((r next ; seed) * (r next ; seed) + 1) / ((r
next; seed) * (r next; seed) + 1).
                err := (frac - frac asFloat asTrueFraction) * frac
reciprocal * (1 bitShift: Float precision - 1).
                self assert: err < (1/2)].

        collec := #(
                16r010000000000000 16r01FFFFFFFFFFFFF 1 2
                w.pdf                                     1707
                16r020000000000000 16r0020000000000001
                     t.st
                16r3FFFFFFFFFFFFF 16r3FFFFFFFFFFFFE 16r3FFFFFFFFFFFFD).
        collec do: [:num |
                collec do: [:den |
                        frac := Fraction numerator: num denominator:
den.
                        err := (frac - frac asFloat asTrueFraction) *
frac reciprocal * (1 bitShift: Float precision - 1).
                        self assert: err <= (1/2)]].!
                 es.st                                      115

                      ip
testIntegerAsFloat
                        ge
        "assert IEEE 754 round to nearest even mode is honoured"
                          anges

        self deny: 16r1FFFFFFFFFFFF0801 asFloat = 16r1FFFFFFFFFFFF0800
asFloat. "this test is on 65 bits"
        self deny: 16r1FFFFFFFFFFFF0802 asFloat = 16r1FFFFFFFFFFFF0800
asFloat. "this test is on 64 bits"
        self assert: 16r1FFFFFFFFFFF1F800 asFloat =
16r1FFFFFFFFFFF20000 asFloat. "nearest even is upper"
        self assert: 16r1FFFFFFFFFFFF0800 asFloat =
16r1FFFFFFFFFFFF0000 asFloat. "nearest even is lower"
! !
!FloatTest categoriesFor: #testFloatAsTrueFraction!public!Testing! !
!FloatTest categoriesFor: #testFractionAsFloat!public!Testing! !
!FloatTest categoriesFor: #testIntegerAsFloat!public!Testing! !





"Filed out from Dolphin Smalltalk X6"!

!Float categoriesForClass!Magnitude-Numbers! !
!Float methodsFor!

asTrueFraction
        "Answer a <rational> that precisely represents the binary
fractional
        value of the receiver using all available bits of the double
precision
        IEEE floating point representation. Note that because <Float>
is
        an imprecise representation, the result may have more precision
than
        appropriate.  For example the decimal number 0.1 cannot be
represented
        precisely as a binary floating point number, and hence the
<Float>
        representation is itself only approximate. When <Float>
representation of
        0.1 is converted using this method the result is a precisely
equivalent Fraction
        that is very close to (1/10), but not actually equal to 0.1."

        " Extract the bits of an IEEE double float "

        | shifty sign expPart exp fraction fractionPart |
        shifty := VMLibrary default makeLargeUnsigned: self.

        " Extract the sign and the biased exponent "
        sign := (shifty bitShift: -63) == 0 ifTrue: [1] ifFalse: [-1].
        expPart := (shifty bitShift: -52) bitAnd: 16r7FF.

        " Extract fractional part; answer 0 if this is a true 0.0 value
"
        fractionPart := shifty bitAnd: 16r000FFFFFFFFFFFFF.
        (expPart == 0 and: [fractionPart = 0]) ifTrue: [^0].

        "Add implied leading 1 into fraction"
        "2006/06/02 nice: denormalized numbers (gradual underflow) does
not have leading 1"
        fraction := expPart = 0
                ifTrue: [fractionPart]
                ifFalse: [fractionPart bitOr: 16r0010000000000000].

        "Unbias exponent: 16r3FF is bias; 52 is fraction width"
        exp := ##(16r3FF + 52) - expPart.

        "Form the result. When exp>52, the exponent is adjusted by
          the number of trailing zero bits in the fraction to minimize
          the (huge) time otherwise spent in #gcd:. "
        ^exp negative
                ifTrue: [sign * fraction bitShift: exp negated]
                ifFalse:
                        [| zeroBitsCount |
                        zeroBitsCount := fraction lowBit - 1.
                        exp := exp - zeroBitsCount.
                        exp <= 0
                                ifTrue:
                                        [zeroBitsCount := zeroBitsCount
+ exp.
                                        sign * fraction bitShift:
zeroBitsCount negated]
                                ifFalse:
                                        [Fraction numerator: (sign *
fraction bitShift: zeroBitsCount negated)
                                                denominator: (1
bitShift: exp)]]! !

!Float categoriesFor: #asTrueFraction!converting!public! !







"Filed out from Dolphin Smalltalk X6"!

!LargeInteger categoriesForClass!Magnitude-Numbers! !
!LargeInteger methodsFor!

asFloat
        "Answer the floating point representation of the receiver. Some
precision
        may be lost.

        Primitive Failure Reasons:
                1       -       The receiver has more than 64-bits
        "

        | result nTruncatedBits exponent mask trailingBits carry |
        <primitive: 167>

        "2006/06/02 nice: previous implementation might accumulate
rounding errors.
        This implementation does honour IEEE default rounding mode (to
nearest even)"
        result := self abs.
        nTruncatedBits := result highBit - Float precision.
        nTruncatedBits > 0
                ifTrue:
                        [mask := (1 bitShift: nTruncatedBits) - 1.
                        trailingBits := result bitAnd: mask.
                        "inexact := trailingBits isZero not."
                        carry := trailingBits bitShift: 1 -
nTruncatedBits.
                        result := result bitShift: nTruncatedBits
negated.
                        exponent := nTruncatedBits.
                        (carry isZero
                                        or: [(trailingBits bitAnd:
(mask bitShift: -1)) isZero
                                                        and: [result
even]])
                                ifFalse: [result := result + 1].
                        ^ self positive
                                ifTrue: [result asFloat timesTwoPower:
exponent]
                                ifFalse: [result asFloat negated
timesTwoPower: exponent]].

        "We should never reach this code if primitive works properly"
        result := 0.0.
        self basicSize to: 1 by: -1 do: [ :i | result := result * 256.0
+ (self byteAt: i) asFloat].
        ^result! !

!LargeInteger categoriesFor: #asFloat!converting!public! !







"Filed out from Dolphin Smalltalk X6"!

!Fraction categoriesForClass!Magnitude-Numbers! !
!Fraction methodsFor!

asFloat
        "Answer the receiver as a Float"

        "2006/0602 nice: naive algorithm might overflow prematurely
                ^numerator asFloat / denominator asFloat
        This implementation will answer the closest floating point
number to the receiver.
        It uses the IEEE 754 round to nearest even mode."

        | a b q r exponent floatExponent nBits ha hb hq q1 |
        a := numerator abs.
        b := denominator abs.
        ha := a highBit.
        hb := b highBit.

        "If both numerator and denominator are represented exactly in
floating point number,
        then fastest thing to do is to use hardwired float division"
        nBits := Float precision + 1.
        (ha < nBits and: [hb < nBits]) ifTrue: [^numerator asFloat /
denominator asFloat].

        "Try and obtain a mantissa with 54 bits by integer division.
        This is 53 bits of IEEE 754 mantissa plus 1 bit for rounding
        First guess is rough, we might get one more bit or one less"
        exponent := ha - hb - nBits.
        exponent > 0
                ifTrue: [b := b bitShift: exponent]
                ifFalse: [a := a bitShift: exponent negated].
        q := a quo: b.
        r := a - (q * b).
        hq := q highBit.

        "check for gradual underflow, in which case we should use less
bits"
        floatExponent := exponent + hq.
        floatExponent >= Float emin
                ifFalse: [nBits := nBits + floatExponent - Float emin].

        "Use exactly nBits"
        hq > nBits
                ifTrue:
                        [exponent := exponent + hq - nBits.
                        r := (q bitAnd: (1 bitShift: nBits - hq) - 1) *
b + r.
                        q := q bitShift: nBits - hq].
        hq < nBits
                ifTrue:
                        [exponent := exponent + hq - nBits.
                        q1 := (r bitShift: nBits - hq) quo: b.
                        q := (q bitShift: nBits - hq) bitAnd: q1.
                        r := r - (q1 * b)].

        "check if we should round upward.
        The case of exact half (q bitAnd: 1) = 1 & (r isZero)
        will be handled by Integer>>asFloat"
        ((q bitAnd: 1) isZero or: [r isZero])
                ifFalse: [q := q + 1].

        "build the float"
        ^ (self positive
                ifTrue: [q asFloat]
                ifFalse: [q asFloat negated])
                timesTwoPower: exponent! !


!Fraction categoriesFor: #asFloat!converting!public! !


Reply | Threaded
Open this post in threaded view
|

Re: [BUG][FIX] Float conversion #asFloat and #asTrueFraction

Blair McGlashan-4
"nicolas cellier" <[hidden email]> wrote in message
news:[hidden email]...

> There are several problems in Dolphin Float/Fraction conversion
>
> 1) Float asTrueFraction does not handle denormalized number
>   (gradual underflow)
>
> 2) Large fractions asFloat do overflow (due to intermediate steps)
>
> 3) Integer and Fraction #asFloat do not round to nearest Floating
>   point value (current algorithm accumulate round off errors)
>
> These floating point problems are hanging around in major Smalltalk
> dialects for so long...
>
> They all did exist in ST-80, but 2) has been corrected in Squeak and
> VW.
>
> I recently corrected 1) and 3) in Squeak.
>
> This is general Smalltalk improvement that i will port to other
> commercial and free dialects.
> Consider licence as MIT or no licence at all if you prefer, and feel
> free to modify/uncomment/incorporate in Dolphin distribution or any
> other Smalltalk.

Thank you for your contribution Nicolas. We will certainly include this in
the next patch release.

Regards

Blair