lunacy regarding FP


lunacy regarding FP

David P. Reed
Having spent many years of my life doing floating point simulations, let
me observe that floating point algorithms are deterministic functions of
their inputs, given proper floating point implementations.  Of course
some computers don't have the same implementations, but that is a
problem for languages like FORTRAN that don't have well-specified
semantic models.

Reproducibility of FP calculation is, for example, what makes
multi-computer (including massively parallel) simulations work at all.
Croquet was not designed as a programming language for analog computers,
in which "divergent" simulations arise under identical inputs.  It was
designed for *replicated* simulations.  Replication means *exact*
replication.  One can do that with floating point.

Squeak uses IEEE 754 standard floating point numerics, which specifies
reproducible, exact results accurate to the last bit.  It doesn't
generate random results from non-random inputs, even when those inputs
are approximations to analog inputs.





Re: lunacy regarding FP

Andreas.Raab
It is worthwhile to note, however, that it's very easy to get a
different impression when you look at the current FP environments (in
particular in C; I can't really speak about the situation in Fortran).
Ian, John, and I had a hell of two weeks tweaking FP computations
literally "down to the last bit" in the VMs. The main problems we found
were:
- Different algorithms are used for the same functions, resulting in
different approximations depending on the platform libraries (we solved
this by using fdlibm for everything, which, luckily, is freely available)
- Different "default" settings in the platform/FP environment (for
example, the default on x87 is 80-bit extended internal precision, which
yields results different from 64-bit precision)
- Different choices of FP instructions, with different effects, by the
compiler; most notably the use of fused multiply-add, which computes a
result with higher internal accuracy but consequently makes that result
differ from performing the operations independently
- Compiler optimizations which, while perfectly valid for integer
computations, are simply inadequate for FP computations (such as some
re-orderings and common-subexpression-elimination problems)
All of these taken together make for a hard problem for anyone who really
tries to get identical results across the various platforms, even if we
rely on (fortunately well-defined) IEEE 754 FP environments.

BTW, Croquet comes with a set of tests to ensure all the platforms
compute (what Ian, John, and I agreed were ;-) the "correct" results.
For using Croquet in heterogeneous environments, it is paramount that
the VM pass these tests, to ensure it will compute the same results as
any other Croquet client (this is all in CroquetVMTests).

Cheers,
   - Andreas

David P. Reed wrote:

> Having spent many years of my life doing floating point simulations, let
> me observe that floating point algorithms are deterministic functions of
> their inputs, given proper floating point implementations.  Of course
> some computers actually don't have the same implementations, but that is
> a problem for languages like FORTRAN that don't actually have
> well-specified semantic models.
>
> Reproducibility of FP calculation is, for example, why multi-computer
> simulations work at all, so that massively parallel simulations work.  
> Croquet was not designed as a programming language for analog computers,
> in which "divergent" simulations arise under identical inputs.   It was
> designed for *replicated* simulations.   Replication means *exact*
> replication.  One can do that with floating point.
>
> Squeak uses IEEE 754 standard floating point numerics, which specifies
> reproducible, exact results accurate to the last bit.  It doesn't
> generate random results from non-random inputs, even when those inputs
> are approximations to analog inputs.


Re: lunacy regarding FP

Eugen Leitl
In reply to this post by David P. Reed
On Mon, May 01, 2006 at 11:35:27AM -0400, David P. Reed wrote:

> Having spent many years of my life doing floating point simulations, let
> me observe that floating point algorithms are deterministic functions of
> their inputs, given proper floating point implementations.  Of course

But the input is not deterministic, given that optimizing compilers
rearrange the order of instruction execution depending on compiler
version and optimization flags -- even on the "same" architecture.
Changing the order of execution of instructions subject to rounding
errors is a source of noise, which will be exponentially amplified in a
nonlinear system. Most physical simulations are nonlinear. Game worlds
attempt to mimic reality, and hence realistic game worlds are very
large scale simulations, whether we like it or not. See e.g.
http://www.gamasutra.com/features/20060320/carless_01.shtml
for a description of Second Life (2 kNodes, 160 kUsers, 10 M in-game
objects, 15 TBytes of user-created data, 2 TFlops of crunch to make it
run). Second Life relies on centralism to prevent divergence. With
loosely coupled, spatially distributed, synchronized instances of worlds
run locally, great care must be taken to eliminate local noise sources,
which will otherwise result in simulation divergence. Divergence is hard
to detect (multiple measurement points are required), and once detected
it would require expensive resynchronizations, which are a nightmare for
several systems coupled over a WAN with differing, widely varying
latency and bandwidth conditions.

How many visitors would a node on residential broadband take before
QoS issues make it prohibitive?

> some computers actually don't have the same implementations, but that is
> a problem for languages like FORTRAN that don't actually have
> well-specified semantic models.
>
> Reproducibility of FP calaculation is, for example, why multi-computer
> simulations work at all, so that massively parallel simulations work.  

If two observers look at a particle cloud, for example, it can be quite
important that the locations of individual particles in two or more
instances of the same simulation match exactly.

> Croquet was not designed as a programming language for analog computers,
> in which "divergent" simulations arise under identical inputs.   It was
> designed for *replicated* simulations.   Replication means *exact*
> replication.  One can do that with floating point.

You might look into the reproducibility of MD (molecular dynamics)
simulations on conventional computers. You might be surprised. The
large-scale metrics extracted from the trajectory are the same, of
course -- within the error of the measurement.
 
> Squeak uses IEEE 754 standard floating point numerics, which specifies
> reproducible, exact results accurate to the last bit.  It doesn't
> generate random results from non-random inputs, even when those inputs
> are approximations to analog inputs.

Sorry if I belabor the obvious, but IMO it is a very important and
hard problem.


Re: lunacy regarding FP

David P. Reed
In reply to this post by David P. Reed
Eugen Leitl wrote:

> On Mon, May 01, 2006 at 11:35:27AM -0400, David P. Reed wrote:
>
>  
>> Having spent many years of my life doing floating point simulations, let
>> me observe that floating point algorithms are deterministic functions of
>> their inputs, given proper floating point implementations.  Of course
>>    
>
> But the input is not deterministic, given that optimizing compilers
> rearrange the order of instruction execution, depending on compiler version,
> optimization flags -- even on the "same" architecture.
Again, this is a compiler *bug*.  Rearranging floating point operations
from the way the author wrote them, and throwing away significance, is
not a "legal optimization"; it is an error.
> Changing
> order of execution on instructions subject to rounding errors is
> a source of noise, which will become exponentially amplified in
> a nonlinear system. Most physical simulations are that. Game worlds
> attempt to mimick reality, and hence realistic game worlds are
> very large scale simulations, whether we like it, or not.
I agree with your point here.

> See e.g.
> http://www.gamasutra.com/features/20060320/carless_01.shtml
> for a description of Second Life (2 kNodes, 160 kUsers, 10 M in-game objects,
> 15 TBytes of user-created data, 2 TFlops of crunch to make it run).
> Second Life relies on centralism to prevent divergence. With a loosely
> coupled spatially distributed synchronized instances of worlds run locally
> great care must be taken to eliminate local noise sources which will
> result in simulation divergence, which is hard to detect (multiple
> measurement points required) and once detected would require expensive
> resynchronizations which are a nightmare for several systems coupled
> over a WAN with differing, widely varying latency and bandwidth
> conditions.
>  
Simulation divergence does not happen if you write code that does the
same thing on each platform.  A+B = A+B everywhere when the same
definition of a floating point system is used everywhere, and it is in
the case of well-specified languages like Squeak.  If it doesn't equal
B+A, then perhaps, just perhaps, the compiler is in error to replace
A+B with B+A.
>
> If two observers look at a particle cloud e.g., it can be quite important
> that the individual locations on individual particles on two or more
> instances of the same simulation match exactly.
>
>  
And they will, if the code is replicated properly.  I'm sorry, FPUs are
reliable: they are digital, not analog, devices.  You may have some
fantasy that FPUs generate fuzzy answers that are only approximately
determined by their inputs.  That is never true in digital computers in
the real world.  It might be true when idiots write bad compilers and
you compare the results of two compilers on different architectures.
It is not true for Squeak (or, for that matter, for most other
algorithmic languages).
>
> You might look into reproducibility of MD simulations, e.g. on conventional
> computers. You might be surprised. The large-scale metrics extracted from
> the trajectory are the same, of course -- within error of the measurement.
>  
I am fully aware of the issues you refer to, having been in the
computing business since long before you were born, probably.  However,
they are due to bad compilers and compiler optimizations, or in some
cases to uncaught hardware faults (gamma rays).
>
>
> Sorry if I belabor the obvious, but IMO it is a very important and
> hard problem.
>  
It's actually a trivial or imaginary problem, caused by the programmers
themselves being lazy or uninformed.



Re: lunacy regarding FP

Ralph Johnson
In reply to this post by David P. Reed
If you run a Fortran program on two different computers, you are not guaranteed to get the same result, because they may implement floating point operations differently, might perform different optimizations, etc.  If you run a Squeak program that does the same thing on two different computers, you are supposed to get the same result.  Squeak is different from Fortran, and its implementation is more oriented toward accuracy than performance.
 
The original question was about having some sort of attached processor to do physics calculations.  Could Croquet off-load physics computation to these attached processors?  The answer is: only if they produce identical results on every machine.  It is easy to imagine that they would not produce identical results, but it is possible to imagine that they would.  It depends on whether the designers of these attached processors are more concerned with speed or accuracy.  Until we see the hardware, we don't know for sure.
 
A related question is what happens if some people have the hardware and others do not.  Does an island (teaparty) have to run at the speed of the slowest computer in it?  I think so, which means that all the participants would have to have the extra hardware or it wouldn't make any sense.
 
-Ralph