Having spent many years of my life doing floating point simulations, let me observe that floating point algorithms are deterministic functions of their inputs, given proper floating point implementations. Of course, some computers don't have the same implementations, but that is a problem for languages like FORTRAN that don't have well-specified semantic models.

Reproducibility of FP calculation is, for example, why multi-computer simulations work at all, and why massively parallel simulations work. Croquet was not designed as a programming language for analog computers, in which "divergent" simulations arise under identical inputs. It was designed for *replicated* simulations. Replication means *exact* replication. One can do that with floating point.

Squeak uses IEEE 754 standard floating point numerics, which specifies reproducible, exact results accurate to the last bit. It doesn't generate random results from non-random inputs, even when those inputs are approximations to analog inputs.
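A minimal C sketch of that claim (illustrative, not from the thread): on any implementation doing IEEE 754 double arithmetic with round-to-nearest-even, the sum below has one specific bit pattern, down to the last bit.

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    int main(void) {
        double sum = 0.1 + 0.2;           /* operands and the add are IEEE 754 doubles */
        uint64_t bits;
        memcpy(&bits, &sum, sizeof bits); /* inspect the exact bit pattern */
        /* Round-to-nearest-even double arithmetic yields exactly
           0x3fd3333333333334 here on every conforming platform. */
        printf("%016llx\n", (unsigned long long)bits);
        return 0;
    }

The platform behaviors that can break even this (x87 extended precision, compiler contraction) are exactly what the next message describes.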
It is worthwhile to note, however, that it's very easy to get a different impression when you look at the current FP environments (in particular in C; I can't really speak to the situation in Fortran). Ian, John, and I had a hell of two weeks tweaking FP computations literally "down to the last bit" in the VMs. The main problems we found were:

- Different algorithms are used for the same functions, resulting in different approximations depending on the platform libraries (we solved this by using fdlibm for everything, which -luckily- is freely available).
- Different "default" settings based on the platform/FP environment (for example, the standard on x87 is to use the extended 80-bit internal accuracy, which produces results that differ from 64-bit accuracy).
- Different choices of FP instructions, with different effects, made by the compiler; most notably the use of fused multiply-add, which computes a result with higher internal accuracy but consequently makes that result differ from performing the multiply and the add independently.
- Compiler optimizations which -while perfectly valid for integer computations- are simply inadequate for FP computations (such as some re-orderings and CSE problems).

All of these taken together make for a hard problem for anyone who really tries to get identical results across the various platforms, even though we can rely on (fortunately, well-defined) IEEE 754 FP environments.

BTW, Croquet comes with a set of tests to ensure all the platforms compute (what Ian, John, and I agreed were ;-) the "correct" results. For using Croquet in heterogeneous environments it is paramount that the VM passes these tests, to ensure it will compute the same results as any other Croquet client (this is all in the CroquetVMTests).

Cheers,
  - Andreas
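To make the last two of the problems listed above concrete, here is a small C sketch (illustrative, not from the Croquet sources) showing how re-ordering and fused multiply-add each change the rounded result:

    #include <stdio.h>
    #include <math.h>

    int main(void) {
        /* Re-ordering: IEEE 754 addition is not associative, so a compiler
           that rewrites (a + b) + c as a + (b + c) changes the answer. */
        double a = 1e16, b = -1e16, c = 1.0;
        printf("(a+b)+c = %.17g\n", (a + b) + c);     /* prints 1 */
        printf("a+(b+c) = %.17g\n", a + (b + c));     /* prints 0 */

        /* Fused multiply-add: fma() keeps the full product internally, so it
           differs from the separately rounded x*x + z.  Compile with
           contraction disabled (e.g. -ffp-contract=off), or the compiler may
           silently turn x*x + z into an fma itself -- which is the point. */
        double x = 1.0 + ldexp(1.0, -27);             /* 1 + 2^-27, exact */
        double z = -(1.0 + ldexp(1.0, -26));          /* -(1 + 2^-26), exact */
        printf("x*x + z    = %.17g\n", x * x + z);    /* 0: product rounded first */
        printf("fma(x,x,z) = %.17g\n", fma(x, x, z)); /* 2^-54: exact product kept */
        return 0;
    }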
On Mon, May 01, 2006 at 11:35:27AM -0400, David P. Reed wrote:
> Having spent many years of my life doing floating point simulations, let
> me observe that floating point algorithms are deterministic functions of
> their inputs, given proper floating point implementations. Of course

But the input is not deterministic, given that optimizing compilers rearrange the order of instruction execution, depending on compiler version and optimization flags -- even on the "same" architecture. Changing the order of execution of instructions subject to rounding errors is a source of noise, which will become exponentially amplified in a nonlinear system. Most physical simulations are nonlinear. Game worlds attempt to mimic reality, and hence realistic game worlds are very large scale simulations, whether we like it or not.

See e.g. http://www.gamasutra.com/features/20060320/carless_01.shtml for a description of Second Life (2 kNodes, 160 kUsers, 10 M in-game objects, 15 TBytes of user-created data, 2 TFlops of crunch to make it run). Second Life relies on centralism to prevent divergence. With loosely coupled, spatially distributed, synchronized instances of worlds run locally, great care must be taken to eliminate local noise sources which will result in simulation divergence. Such divergence is hard to detect (multiple measurement points are required), and once detected it would require expensive resynchronizations, which are a nightmare for several systems coupled over a WAN with differing, widely varying latency and bandwidth conditions. How many visitors would a node on residential broadband take before QoS issues made it prohibitive?

> some computers actually don't have the same implementations, but that is
> a problem for languages like FORTRAN that don't actually have
> well-specified semantic models.
>
> Reproducibility of FP calculation is, for example, why multi-computer
> simulations work at all, and why massively parallel simulations work.

If two observers look at a particle cloud, e.g., it can be quite important that the individual locations of individual particles on two or more instances of the same simulation match exactly.

> Croquet was not designed as a programming language for analog computers,
> in which "divergent" simulations arise under identical inputs. It was
> designed for *replicated* simulations. Replication means *exact*
> replication. One can do that with floating point.

You might look into the reproducibility of MD simulations on conventional computers, e.g. You might be surprised. The large-scale metrics extracted from the trajectory are the same, of course -- within the error of the measurement.

> Squeak uses IEEE 754 standard floating point numerics, which specifies
> reproducible, exact results accurate to the last bit. It doesn't
> generate random results from non-random inputs, even when those inputs
> are approximations to analog inputs.

Sorry if I belabor the obvious, but IMO it is a very important and hard problem.

--
Eugen* Leitl leitl http://leitl.org
ICBM: 48.07100, 11.36820 http://www.ativel.com
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
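To see how fast such last-bit noise blows up, here is a minimal C sketch (illustrative; the logistic map stands in for a real nonlinear physics step): two copies of the same iteration started one ulp apart, mimicking a single rounding difference introduced by reordering.

    #include <stdio.h>
    #include <math.h>

    int main(void) {
        double x = 0.4;
        double y = nextafter(0.4, 1.0);   /* 0.4 plus one ulp of "noise" */
        for (int i = 1; i <= 60; i++) {
            x = 3.9 * x * (1.0 - x);      /* logistic map, chaotic at r = 3.9 */
            y = 3.9 * y * (1.0 - y);
            if (i % 10 == 0)
                printf("step %2d: |x - y| = %.3e\n", i, fabs(x - y));
        }
        return 0;
    }

The gap starts near 1e-17 and grows exponentially; within a few dozen steps the two "replicas" disagree in their leading digits.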
Eugen Leitl wrote:
> On Mon, May 01, 2006 at 11:35:27AM -0400, David P. Reed wrote:
>> Having spent many years of my life doing floating point simulations, let
>> me observe that floating point algorithms are deterministic functions of
>> their inputs, given proper floating point implementations. Of course
>
> But the input is not deterministic, given that optimizing compilers
> rearrange the order of instruction execution, depending on compiler version
> and optimization flags -- even on the "same" architecture.

Rearranging floating point operations away from the way the author wrote them, and throwing away significance, is not a "legal optimization"; it is an error.

> Changing the order of execution of instructions subject to rounding errors
> is a source of noise, which will become exponentially amplified in a
> nonlinear system. Most physical simulations are nonlinear. Game worlds
> attempt to mimic reality, and hence realistic game worlds are very large
> scale simulations, whether we like it or not.

I agree with your point here.

> See e.g.
> http://www.gamasutra.com/features/20060320/carless_01.shtml
> for a description of Second Life (2 kNodes, 160 kUsers, 10 M in-game objects,
> 15 TBytes of user-created data, 2 TFlops of crunch to make it run).
> Second Life relies on centralism to prevent divergence. With loosely
> coupled, spatially distributed, synchronized instances of worlds run
> locally, great care must be taken to eliminate local noise sources which
> will result in simulation divergence. Such divergence is hard to detect
> (multiple measurement points are required), and once detected it would
> require expensive resynchronizations, which are a nightmare for several
> systems coupled over a WAN with differing, widely varying latency and
> bandwidth conditions.

A correct implementation computes the same thing on each platform. A+B = A+B everywhere, when the same definition of a floating point system is used everywhere, and it is in the case of well-specified languages like Squeak. If it doesn't equal B+A, perhaps, just perhaps, the compiler is in error to replace A+B with B+A.

> If two observers look at a particle cloud, e.g., it can be quite important
> that the individual locations of individual particles on two or more
> instances of the same simulation match exactly.

And they will, if the code is replicated properly. I'm sorry, FPUs are reliable - they are digital, not analog, devices. You may have in your fantasy some idea that FPUs generate fuzzy answers that are only approximately determined by their inputs. That is never true in digital computers in the real world. It might be true when idiots write bad compilers and you compare the results of two compilers on different architectures. It is not true for Squeak (or, for that matter, for most other algorithmic languages).

> You might look into the reproducibility of MD simulations on conventional
> computers, e.g. You might be surprised. The large-scale metrics extracted
> from the trajectory are the same, of course -- within the error of the
> measurement.

I am fully aware of the issues you refer to, having been in the computing business since long before you were born, probably. However, they are due to bad compilers and compiler optimizations, or in some cases, uncaught hardware faults (gamma rays).

> Sorry if I belabor the obvious, but IMO it is a very important and
> hard problem.

It's actually a trivial or imaginary problem, caused by the programmers themselves being lazy or uninformed.
If you run a Fortran program on two different computers, you are not guaranteed to get the same result, because they may implement floating point operations differently, might perform different optimizations, etc. If you run a Squeak program that does the same thing on two different computers, you are supposed to get the same result. Squeak is different from Fortran, and its implementation is more oriented toward accuracy than performance.
The original question was about having some sort of attached processor to do physics calculations. Could Croquet off-load physics computation to these attached processors? The answer is: only if they produce identical results on every machine. It is easy to imagine that they would not produce identical results, but it is possible to imagine that they would. It depends on whether the designers of these attached processors are more concerned with speed or accuracy. Until we see the hardware, we don't know for sure; a sketch of the kind of bit-exact check required follows this message.
A related question is what happens if some people have the hardware and others do not. Does an island (teaparty) have to run at the speed of the slowest computer in it? I think so, which means that all the participants would have to have the extra hardware or it wouldn't make any sense.
-Ralph
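A conformance check of the kind Andreas mentions (CroquetVMTests) could, in spirit, look like the C sketch below. The golden bit patterns here are simply the round-to-nearest IEEE 754 results for three sample operations; they are illustrative, not Croquet's actual test set.

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    #include <math.h>

    /* Raw bit pattern of a double, for exact last-bit comparison. */
    static uint64_t bits_of(double d) {
        uint64_t b;
        memcpy(&b, &d, sizeof b);
        return b;
    }

    int main(void) {
        /* Golden values as they would be recorded on a reference machine. */
        struct { double actual; uint64_t expected; const char *name; } tests[] = {
            { 0.1 + 0.2, UINT64_C(0x3FD3333333333334), "0.1 + 0.2" },
            { 1.0 / 3.0, UINT64_C(0x3FD5555555555555), "1.0 / 3.0" },
            { sqrt(2.0), UINT64_C(0x3FF6A09E667F3BCD), "sqrt(2.0)" },
        };
        int failed = 0;
        for (size_t i = 0; i < sizeof tests / sizeof tests[0]; i++) {
            if (bits_of(tests[i].actual) != tests[i].expected) {
                printf("FAIL %s: got %016llx\n", tests[i].name,
                       (unsigned long long)bits_of(tests[i].actual));
                failed = 1;
            }
        }
        puts(failed ? "platform diverges from reference" : "platform matches reference");
        return 0;
    }

An attached physics processor would have to pass the same kind of bit-exact suite before any island could trust it to stay in sync.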