Numerics question: reading floating point constants


Numerics question: reading floating point constants

Andreas.Raab
Hi Folks,

This post is one for the numerics faction. I was just in the process
of transcribing some interesting numerical code from C to Squeak when
I ran into an issue with floating point constants. The code that I
have contains constants like this:

   pio2_2  =  6.07710050630396597660e-11, /* 0x3DD0B461, 0x1A600000 */

but naively transcribing this into, e.g.,

   pio22 := 6.07710050630396597660e-11.

turned out to get some of the expected results wrong. Why? Because the
bit pattern is different, e.g., what we get is this:

   6.07710050630396597660e-11 hex => '3DD0B4611A5FFFFF'

Notice how the last hex digits are different.

My question here is the following: While I can understand why *printing*
might give different results based on the algorithm chosen (e.g., what
amount of error do we allow when printing floating point numbers etc) is
it to be expected that *reading* has similar properties? In particular
considering that, e.g.,

   (607710050630396597660 * 1.0e-31) hex => '3DD0B4611A600000'

gives the "correct" bit pattern makes me think that there might be
something broken in our way of reading floating point numbers.
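
The comparison can be reproduced outside Squeak. As a sketch (in
Python, whose reader happens to be correctly rounded, standing in for
the Smalltalk expressions above), both a correctly rounded parse and a
single exact-rational conversion yield the bit pattern from the C
comment:

```python
import struct
from fractions import Fraction

def hex_bits(x):
    """64-bit IEEE 754 pattern of a double, analogous to Float>>hex above."""
    return struct.pack('>d', x).hex().upper()

# A correctly rounded reader recovers the bits from the C comment:
parsed = float("6.07710050630396597660e-11")
print(hex_bits(parsed))                                 # 3DD0B4611A600000

# Converting the exact rational value in a single step agrees:
exact = float(Fraction(607710050630396597660, 10**31))
print(hex_bits(exact))                                  # 3DD0B4611A600000
```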

Obviously, in a case like mine the best way to work around this
problem is to construct the floats from the bit patterns directly,
but I can't help but wonder what the impact of a subtle bug like this
might be on code which does *not* include the hex denotations for the
relevant bit patterns.

So. Is this a bug?

Cheers,
   - Andreas


Re: Numerics question: reading floating point constants

stéphane ducasse-2
Hi Andreas,

I know you to usually be much stronger in your statements :) so why
wouldn't it be a bug? Do you mean that there could be an
implementation justification for this ad-hoc behavior when reading
floats? I'm puzzled, because reading seems to me the only easy thing
about floats :)

Stef

> [...]



Re: Numerics question: reading floating point constants

Boris.Gaertner
In reply to this post by Andreas.Raab
From: "Andreas Raab" <[hidden email]> wrote:


> [...]
A tricky question, indeed.
I do not pretend to have the answers to all problems that floats
can introduce into computer science, but I think that the method
Number class>>readRemainderOf:from:base:withSign:
has some questionable features.
Here is what happens when we read a constant of the form
<intPart>.<fractionalPart>e<expPart> :
 1. The <intPart> is read. This is a nonnegative integer.
 2. The <fractionalPart> is read. This is also a nonnegative integer.
 3. The <fractionalPart> is converted into a float. A floating point
division follows to scale that value. The floating point division
requires a second #asFloat to bring the integer value of
   (base raisedTo: aStream position - fracpos)
into a float.
 4. Next, the <intPart> is converted into a float (this is the third
#asFloat) and added to the float representation of the
<fractionalPart>.
 5. Finally, the <expPart> is read. The scale factor
   (base raisedTo: <expPart>)
is multiplied into the result of step (4); this multiplication
requires again an #asFloat, the fourth one in our algorithm.

Well, I really think that four calls of #asFloat are too much.
We introduce approximated number representations far too early,
and we do so without need.

The line

>    (607710050630396597660 * 1.0e-31) hex => '3DD0B4611A600000'

of your post clearly shows that it is possible to introduce floats
much later in the computation.
Attached you find a proposal (for 3.9 #6713) that avoids three of the
four conversions that we use now.
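
The idea of the proposal, accumulate the parts exactly and convert
only once at the end, can be sketched like this (a Python
illustration of the principle, not the attached change set;
`read_float` is a hypothetical helper):

```python
import struct
from fractions import Fraction

def read_float(int_part, frac_part, exp_part):
    """Combine <intPart>.<fracPart>e<expPart> exactly; convert once at the end."""
    digits = int(int_part + frac_part)   # all significant digits, as one integer
    scale = exp_part - len(frac_part)    # net power of ten
    exact = Fraction(digits) * Fraction(10) ** scale
    return float(exact)                  # the single, correctly rounded #asFloat

v = read_float("6", "07710050630396597660", -11)
print(struct.pack('>d', v).hex().upper())   # 3DD0B4611A600000
```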

> [...]
So much for now; in the afternoon I will
 * read your comments
 * look into that stuff again
 * write some test cases
 * write some test cases

Cheers, Boris

As always, comments are very welcome



[Attachment: ReadFloatProposal.1.cs (4K)]

Re: Numerics question: reading floating point constants

stéphane ducasse-2
Cool I learned something today.... :)
Better than filling some ugly rtf files :).


> [...]



Re: Numerics question: reading floating point constants

Tony Garnock-Jones-2
In reply to this post by Andreas.Raab
For reference, one of the best-known algorithms for reading
floating-point numbers accurately is specified in
"How to Read Floating Point Numbers Accurately" (1990), William D. Clinger
        http://citeseer.ist.psu.edu/clinger90how.html

Cheers,
  Tony

Andreas Raab wrote:

> [...]



Re: Numerics question: reading floating point constants

Andres Valloud-2
In reply to this post by Andreas.Raab
Hello Andreas,

Saturday, April 8, 2006, 8:12:22 PM, you wrote:

AR> Notice how the last hex digits are different.

Floating point numbers work in base 2, while source code writes
constants in base 10.  Since 10 = 2 * 5, and 5 is coprime with 2, a
fraction like 1/5 has no finite base-2 expansion and cannot be stored
exactly in a binary floating point number.  This is probably what you
are running into.

It would work if the constants were expressed as

  bits * 2^exponent

Then it would be fine.  But

  digits * 10^exponent

will typically cause approximation errors.
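
Andres's point can be checked directly; a small sketch in Python (any
language with IEEE 754 doubles behaves the same):

```python
import math
from fractions import Fraction

# 1/5 has no finite base-2 expansion, so the literal 0.2 stores a nearby value:
print(Fraction(0.2) == Fraction(1, 5))   # False: the double only approximates 1/5

# whereas a bits * 2^exponent decomposition round-trips exactly:
m, e = math.frexp(0.2)                   # 0.2 == m * 2**e with 0.5 <= m < 1
print(math.ldexp(m, e) == 0.2)           # True: reassembling loses nothing
```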

--
Best regards,
 Andres                            mailto:[hidden email]



Re: Numerics question: reading floating point constants

Nicolas Cellier-3
1 point for Andres: conversion from base 10 to base 2 is inexact.

1 point for Boris: successive inexact operations are not only costly,
they can degrade accuracy.

I wonder if the conversion shouldn't be made with respect to the IEEE
rounding mode, and if I remember correctly, this rounding mode is a
parameter of the underlying hardware.

In the past, I tried several algorithms and finally, lazily, decided
to use a call to sprintf/sscanf from within VW.
But with a good reference such as Tony's (1 point) we should be able
to do it in Smalltalk.

Nicolas



Re: Numerics question: reading floating point constants

stéphane ducasse-2
Do you know a good book or lectures on the problems related to number
representation in our nice computers?

Stef

On 9 avr. 06, at 18:18, nicolas cellier wrote:

> [...]



Re: Numerics question: reading floating point constants

Klaus D. Witzel
Hi Stef,

there are some books mentioned at

- http://scholar.google.com/scholar?q=number+representation+errors

although I don't know how good they are (maybe someone else knows
them?).

On Sun, 09 Apr 2006 18:23:23 +0200, stéphane ducasse  
<[hidden email]> wrote:

> Do you know a good book or lectures on the problems related to number  
> representation in our nice
> computers?
>
> Stef
...



Re: Numerics question: reading floating point constants

Wolfgang Helbig-2
In reply to this post by Andreas.Raab
>Do you know a good book or lectures on the problems related to number  
>representation in our nice
>computers?
>
>Stef

The book "Apple Numerics Manual" (Addison-Wesley, 1986) describes
SANE, the Standard Apple Numerics Environment. This book is a
comprehensive description of an IEEE 754 implementation.

If you are programming more than "occasionally", as they say, then
Volume 2, "Seminumerical Algorithms", of Knuth's "The Art of Computer
Programming" is for you. It tells you everything about floating point
numbers, their representation, and software implementations of their
basic operations. It does not, however, treat the implementation of
transcendental functions like sin or exp. So if any of you knows
where to find information about that, I'd like to hear from you.

Both of those books are fairly seasoned, but the material they cover
hasn't changed much over time, at least to my knowledge, which hasn't
changed much over time either :-).

Greetings,
Wolfgang
--
Weniger, aber besser.



Re: Numerics question: reading floating point constants

Nicolas Cellier-3
In reply to this post by stéphane ducasse-2
On Sunday, 9 April 2006, at 18:23, stéphane ducasse wrote:
> Do you know a good book or lectures on the problems related to number  
> representation in our nice
> computers?
>
> Stef
>

Hi Stef,
All I know was taken from Sun workstation documentation, which was
not bad, but I no longer have access to it.

Tony's reference does not seem bad either, but it does not seem to
deal with the IEEE rounding mode; the algorithm rounds to nearest.

I do not know what printf/scanf do.
Documentation should exist for sure in the GNU project (gcc, glibc,
etc. all deal with this problem), but I have no direct link.

BTW, our nice computers are doing close to their best: most real
numbers are simply not computable by an algorithm... Read this link
taken from Wikipedia:
http://turnbull.mcs.st-and.ac.uk/~history/HistTopics/Real_numbers_3.html

Nicolas



Re: Numerics question: reading floating point constants

Boris.Gaertner
In reply to this post by Andreas.Raab
Hi folks,

here is a first version of the promised test cases.
Examples 4, 5 and 6 illustrate open problems.
I think that I will have to spend more time to find better solutions,
but I want to show you some problems now. I also think that tomorrow
I should prepare a Mantis report.

(Example 4 is related to the written form of a ScaledDecimal; in 3.9
we fail to read constants like
   123s    123.0s
which, to the best of my knowledge, are mentioned in the ANSI
standard. At least they were mentioned in several drafts.)

Examples 5 and 6 check whether extremely long integer and fractional
parts are handled in a clever way.

Greetings,
Boris


[Attachment: FloatTests.2.cs (4K)]

Re: Numerics question: reading floating point constants

Howard Stearns
In reply to this post by stéphane ducasse-2
There's the classic Dragon algorithm in Steele & White:
http://portal.acm.org/citation.cfm?id=93559&coll=portal&dl=ACM&CFID=12707099&CFTOKEN=91916204
This is the flip side of Clinger.

Note that conforming ANSI Common Lisp implementations are required to
round-trip correctly when the programmer requests that numbers be
bit-identical after going through PRINT and READ.  As far as I know,
everyone just uses the same old code for this, which is available in
open-source Lisps.

On Apr 9, 2006, at 11:23 AM, stéphane ducasse wrote:

> [...]



Re: Numerics question: reading floating point constants

Tony Garnock-Jones-2
In reply to this post by stéphane ducasse-2
stéphane ducasse wrote:
> Do you know a good book or lectures on the problems related to number
> representation in our nice
> computers?

Besides Clinger's paper on reading floats accurately, there's a
companion by Burger and Dybvig on "Printing Floating Point Numbers
Quickly and Accurately": http://citeseer.ist.psu.edu/28233.html
http://www.cs.indiana.edu/~dyb/FP-Printing-PLDI96.ps.gz

Regards,
  Tony
--
 [][][] Tony Garnock-Jones     | Mob: +44 (0)7905 974 211
   [][] LShift Ltd             | Tel: +44 (0)20 7729 7060
 []  [] http://www.lshift.net/ | Email: [hidden email]


Re: Numerics question: reading floating point constants

Tim Olson-3

On Apr 10, 2006, at 4:50 AM, Tony Garnock-Jones wrote:

> stéphane ducasse wrote:
>> Do you know a good book or lectures on the problems related to number
>> representation in our nice
>> computers?
>
> Besides Clinger's paper on reading floats accurately, there's a
> companion by Burger and Dybvig on "Printing Floating Point Numbers
> Quickly and Accurately": http://citeseer.ist.psu.edu/28233.html
> http://www.cs.indiana.edu/~dyb/FP-Printing-PLDI96.ps.gz

I used this algorithm when I wrote the Float>>absPrintOn:base: and
Float>>absPrintExactlyOn:base: routines which are used for printing
Floats in Squeak.  However, I never revamped the Number>>readFrom:
routines which read floats.  The read routines can use the same basic
method for generating exact values as the print routines do (the
bases are just swapped).
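
Tim's remark can be made concrete. Here is a brute-force version, in
the spirit of Clinger's AlgorithmM (exact integer arithmetic, one
final rounding; no subnormal or overflow handling), as an
illustration rather than the Squeak code:

```python
import math
import struct

def algorithm_m(num, den):
    """Round the exact rational num/den (both positive) to the nearest double,
    ties to even: scale by powers of two until the quotient has exactly 53
    significant bits, then round once."""
    k = 0
    while num // den < 2**52:            # too few bits: scale the value up
        num *= 2
        k -= 1
    while num // den >= 2**53:           # too many bits: scale it down
        den *= 2
        k += 1
    q, r = divmod(num, den)
    if 2 * r > den or (2 * r == den and q % 2 == 1):
        q += 1                           # round to nearest, ties to even
        if q == 2**53:                   # rounding carried into the next binade
            q //= 2
            k += 1
    return math.ldexp(q, k)

v = algorithm_m(607710050630396597660, 10**31)   # the constant from this thread
print(struct.pack('>d', v).hex().upper())        # 3DD0B4611A600000
```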

        -- Tim Olson



Re: Numerics question: reading floating point constants

Milan Zimmermann-2
In reply to this post by stéphane ducasse-2
On 2006 April 9 12:23, stéphane ducasse wrote:
> Do you know a good book or lectures on the problems related to number
> representation in our nice
> computers?

Perhaps the best resource is the web site of William Kahan, "father"
of IEEE 754, here:

http://www.cs.berkeley.edu/~wkahan/

It is a fascinating read about the IEEE 754 battle. At the bottom of
his site, he has recent (extremely interesting) articles; this is
just one example:

http://www.cs.berkeley.edu/~wkahan/ARITH_17.pdf

In all those recent articles, he describes the efforts to maintain
hardware manufacturers' interest in supporting IEEE 754 while
software support lags badly: even 20 years later, apparently only C99
and Fortran (2000-something) are compliant, the Cell chips have
already sacrificed their hardware support, and other interesting
stuff. It really feels sad for this industry... but anyway, that
seems the best place to look.

Also this site is interesting:

http://babbage.cs.qc.edu/

In terms of books, perhaps this:

http://www.cs.nyu.edu/overton/book/

(but I did not read it)

I was looking for software that would test how much a language (and
its libraries) adheres to IEEE 754, thinking it would be interesting
to translate it to Squeak. There is stuff like:

http://www.tybor.com/

But I ran out of understanding (and time).

Sorry to flood with links, I was just recently reading about this stuff and
saved it all on Flock :)

Milan


> [...]