failing/errors Pharo Tests with CogVM

Re: failing/errors Pharo Tests with CogVM

Nicolas Cellier

2010/9/17 Andreas Raab <[hidden email]>:

>
> On 9/16/2010 2:44 PM, Nicolas Cellier wrote:
>>
>> Of course, like Andreas and Igor, I also prefer a frank exception to
>> creeping NaNs...
>>
>> The main reason for having NaN is compatibility with the external world
>> indeed...
>> Some alien code will produce some NaN and we have to deal with it.
>
> Right. But that's not inconsistent from my perspective. I'm not saying
> disallow NaN altogether; I'm saying don't allow silently introducing it in
> arithmetic operations. Thus, if you have external code that produces NaN you
> can't use that easily to produce further NaN's by means of just arithmetic.
>
> BTW, I'd be *really* curious to see what kind of code people would really
> argue to have propagating NaNs for. It seems we're all violently agreeing
> that none of us want silent NaN propagation so why exactly are we arguing
> again? ;-)
>

It all depends on whether the arithmetic unit can easily deliver
exceptions and whether the programming language can easily handle them.
While the former is a requirement of IEEE 754, the latter was far from
obvious 30 years ago.
In this context, my understanding of the rationale is that it can be
more efficient to test for an exceptional result once after a batch of
computations, rather than at each arithmetic operation.
With the advent of decent exception handling in most languages, I can't
really agree with this rationale (or my understanding of it), but
that's easy in 2010 ;)
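
As an illustration of the rationale described above, here is a minimal Python sketch (not Smalltalk, and not VM code) that contrasts a test after every operation with a single test after the batch. The function names `dot_checked_each_step` and `dot_checked_once` are made up for this example. It relies on NaN propagating through `+` and `*`, so a NaN produced anywhere in the accumulation is still visible in the final sum.

```python
import math


def dot_checked_each_step(xs, ys):
    """Dot product that fails at the first operation yielding NaN."""
    total = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        term = x * y
        if math.isnan(term):
            raise ArithmeticError(f"NaN produced at index {i}: {x!r} * {y!r}")
        total += term
        if math.isnan(total):
            raise ArithmeticError(f"NaN produced while accumulating index {i}")
    return total


def dot_checked_once(xs, ys):
    """Dot product that lets NaN propagate and tests the result once."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y  # a NaN term makes every later sum NaN as well
    if math.isnan(total):
        raise ArithmeticError("NaN produced somewhere in the batch")
    return total


if __name__ == "__main__":
    xs = [1.0, math.inf, 2.0]
    ys = [3.0, 0.0, 4.0]  # inf * 0.0 is NaN under IEEE 754
    for f in (dot_checked_each_step, dot_checked_once):
        try:
            f(xs, ys)
        except ArithmeticError as err:
            print(f.__name__, "->", err)
```

The single test is cheaper, but it can only report that something went wrong in the batch, not which operation introduced the NaN.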

Of course, my own rationale is only valid in a restricted set of
architectures...
Note that with the advent of parallel FPUs, a single exception is
signalled after a batch of computations too... This complicates the
exception handling somewhat, and other hardware choices could
complicate it further...
BTW, good luck to those trying to use a GPU as an FPU, as long as these
are not governed by a standard of the quality of IEEE 754.

Nicolas

> Cheers,
>  - Andreas
>
>>
>> Nicolas
>>
>> 2010/9/16 Eliot Miranda<[hidden email]>:
>>>
>>>
>>>
>>> On Thu, Sep 16, 2010 at 1:25 PM, Andreas Raab<[hidden email]>
>>>  wrote:
>>>>
>>>> On 9/16/2010 12:32 PM, Eliot Miranda wrote:
>>>>>
>>>>>     I need to check these carefully.  One thing that does differ in
>>>>> current Cog is that the machine-code arithmetic float primitives don't
>>>>> fail if they produce a NaN result; they simply answer a NaN result.
>>>>> IMO what needs to be done is two-fold.
>>>>>
>>>>> a) we need a NaN mode flag in the VM, that persists across snapshots
>>>>> and e.g. is queryable/settable via vmParameterAt:put:, that puts the
>>>>> floating-point primitives into a state where NaNs are answered instead of
>>>>> primitives failing.
>>>>
>>>> FWIW, I don't think we need that flag. Failing the primitive instead of
>>>> producing something that is specifically declared not to be a number in an
>>>> arithmetic computation is *always* the right thing to do. The problem with
>>>> NaNs is that they propagate. So you start with an isolated NaN as the result
>>>> of a division by an underflow number and may be able to catch it. But you
>>>> don't because it's silent and then it propagates into a matrix because
>>>> you're scaling the matrix by that number. Now you've got a matrix full of
>>>> NaNs. As you push your geometry through that matrix everything becomes
>>>> complete and utter NaN garbage. And of course, NaNs break reflexivity,
>>>> symmetry and transitivity of comparisons.
>>>
>>> We went through a long series of discussions with customers taking
>>> exactly the position you are and finally capitulated because some customers
>>> required IEEE behavior. NaNs do all that you say, but for good reason.  If
>>> they appear in one's calculations then one's calculations are unsafe.  NaNs
>>> exist anyway; they creep in through the FFI even if the VM refuses to
>>> produce them.  So they're hard to sweep under the carpet.  One principled
>>> position is to allow them and deal with them correctly.
>>> FWIW, I'm with you.  I would rather the primitives always failed.  But
>>> I've lost that argument against people who knew what they were talking about
>>> (I'm no floating point expert).  So I like the flag because it keeps
>>> people's options open.
>>>>
>>>> As a consequence we should never allow them to be introduced silently by
>>>> the VM. If the error handling code for some arithmetic primitive decides
>>>> that against all reasoning you'd like to produce an NaN as the result
>>>> regardless, that's fine, you have been warned. But having the VM introduce
>>>> NaNs silently is wrong, wrong, wrong.
>>>
>>> The current situation in Cog is certainly wrong.  But as discussed above
>>> I don't think it's wrong for the VM to introduce them if it is explicitly in
>>> such a mode and I know from experience that users want and even need such a
>>> mode.
>>> best,
>>> Eliot
>>>>
>>>> Cheers,
>>>>  - Andreas
>>>>
>>>>> b) the Cog code generator needs to respect this flag and arrange that
>>>>> when in the default mode (current behavior) the machine-code arithmetic
>>>>> float primitives also fail if they produce a NaN result.
>>>>>
>>>>> We can then decide at a later date whether to change the primitive
>>>>> behavior to answer NaNs or not.  This is also what we did in
>>>>> VisualWorks; there's an IEEE arithmetic mode and in recent releases
>>>>> VW's floating-point arithmetic will produce NaNs.
>>>>>
>>>>> Anyone interested in taking a look at this is very welcome.  It's
>>>>> probably a week long project at most.
>>>>>
>>>>> best,
>>>>> Eliot
>>>>>
>>>>> On Thu, Sep 16, 2010 at 12:19 PM, Nicolas Cellier <[hidden email]> wrote:
>>>>>
>>>>>    I mean the M7260-primitiveSmallIntegerCompareNan-Patch-nice.1.cs part,
>>>>>    the rest has already been applied in COG.
>>>>>
>>>>>    Nicolas
>>>>>
>>>>>    2010/9/16 Nicolas Cellier <[hidden email]>:
>>>>>     >  I see http://bugs.squeak.org/view.php?id=7260 was not integrated in
>>>>>     >  COG, which was the cause of most of the Floating point failures in old
>>>>>     >  VM, but maybe it's now more complex than that ?
>>>>>     >
>>>>>     >  Nicolas
>>>>>     >
>>>>>     >  2010/9/16 Mariano Martinez Peck <[hidden email]>:
>>>>>     >>
>>>>>     >>  Hi Eliot. I took a Pharo 1.1.1 image (which has included the
>>>>>     >>  changes to run Cog) and I ran all the tests with the build r2219
>>>>>     >>
>>>>>     >>  And these are the results:
>>>>>     >>
>>>>>     >>  9768 run, 9698 passes, 53 expected failures, 15 failures, 2
>>>>>     >>  errors, 0 unexpected passes
>>>>>     >>  Failures:
>>>>>     >>  FloatTest>>#testRaisedTo
>>>>>     >>  MCInitializationTest>>#testWorkingCopy
>>>>>     >>  FloatTest>>#testReciprocal
>>>>>     >>  ReleaseTest>>#testUndeclared
>>>>>     >>  FloatTest>>#testDivide
>>>>>     >>  MethodContextTest>>#testClosureRestart
>>>>>     >>  FloatTest>>#testCloseTo
>>>>>     >>  FloatTest>>#testHugeIntegerCloseTo
>>>>>     >>  FloatTest>>#testInfinityCloseTo
>>>>>     >>  WeakRegistryTest>>#testFinalization
>>>>>     >>  PCCByLiteralsTest>>#testSwitchPrimCallOffOn
>>>>>     >>  AllocationTest>>#testOneGigAllocation
>>>>>     >>  FloatTest>>#testNaNCompare
>>>>>     >>  FileStreamTest>>#testPositionPastEndIsAtEnd
>>>>>     >>  NumberTest>>#testRaisedToIntegerWithFloats
>>>>>     >>
>>>>>     >>  Errors:
>>>>>     >>  MessageTallyTest>>#testSampling1
>>>>>     >>  WeakSetInspectorTest>>#testSymbolTableM6812
>>>>>     >>
>>>>>     >>
>>>>>     >>
>>>>>     >>  I think that most of these problems were fixed in the latest
>>>>>     >>  official SqueakVM. I guess they were integrated in VMMaker in
>>>>>     >>  versions later than the one you used for Cog. Maybe you can
>>>>>     >>  integrate them and create a new version?
>>>>>     >>
>>>>>     >>  I am not a VM expert so please, if you can help us with these
>>>>>     >>  tests it would be cool.
>>>>>     >>
>>>>>     >>  Thanks
>>>>>     >>
>>>>>     >>  Mariano
>>>>>     >>
>>>>>     >>
>>>>>     >
>>>>>
>>>>>
>>>
>>>
>>>
>>
>
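
The remark in the quoted text above that NaNs break the usual properties of comparisons can be checked with a few lines of Python, which follows IEEE 754 for float comparisons. This is only a sketch of the general IEEE behaviour; the variable names are made up, and it does not attempt to reproduce any Squeak-specific comparison behaviour.

```python
import math

nan = math.nan
inf = math.inf
a, b = 1.0, 2.0

# Reflexivity: x == x should hold for every number, but not for NaN.
nan_equals_itself = (nan == nan)

# Trichotomy: exactly one of <, ==, > should hold; for NaN none does.
nan_is_comparable = (nan < a) or (nan == a) or (nan > a)

# Transitivity of "not less than": a is not less than NaN, and NaN is not
# less than b, yet a is less than b.
premises_hold = (not (a < nan)) and (not (nan < b))
conclusion_holds = not (a < b)

# Infinities, by contrast, compare like ordinary extended reals.
inf_is_ordered = (-inf < a < inf) and (inf == inf)

print(nan_equals_itself, nan_is_comparable, premises_hold,
      conclusion_holds, inf_is_ordered)
```

Any code that sorts or searches on the assumption of a total order can therefore misbehave once a NaN is among its inputs.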

Re: failing/errors Pharo Tests with CogVM

Eliot Miranda-2
In reply to this post by Andreas.Raab
 


On Thu, Sep 16, 2010 at 10:54 PM, Andreas Raab <[hidden email]> wrote:

> On 9/16/2010 2:44 PM, Eliot Miranda wrote:
>>     I realise we need to be more precise.  Are you talking about NaNs
>> specifically or NaN and Inf?
>
> NaN only. +-Inf are fine as they have a well-defined mathematical relationship over the set of numbers. NaN does not.
>
>> The Squeak VM happily answers Inf from its
>> float primitives.  In fact the only guard against a NaN or Inf result
>> being produced by the floating-point primitives is the guard against
>> dividing by zero.  But e.g. in the interpreter (1.0e300 / 1.0e-300)
>> isInfinite and there is no failure.  So specifically failing for aFloat
>> / 0.0 seems a bit of a fig leaf to me.
>>
>> So what would your ideal semantics be?
>> a) - fail whenever the result is Inf or NaN?
>> b) - fail whenever the result is NaN and allow aFloat / 0.0 to answer Inf
>> c) - fail whenever the result is NaN but fail aFloat / 0.0
>> d) - the Interpreter status quo, fail only for aFloat / 0.0
>> e) - never fail and answer NaN and Inf as specified in IEEE 754
>>
>> The situation with VW before IEEE was that it did a) and we changed it
>> so that the mode switch selected either a) or e), with, IIRC, the
>> current default being e).
>
> f) Fail whenever the result is NaN or when dividing by zero.

OK, to be pedantic that's c) above.  But fine. This is a reasonable choice.

> My preference for f) is that division by zero should be consistent between floating point numbers and integers. It would be strange if "1 / 0" => boom but "1.0 / 0" => Inf or "1 / 0.0" => Inf. *However* underflow isn't division by zero and may silently result in Inf. In other words:
>
>        self should: [1.0 / 0.0] raise: ZeroDivide.
>
> but (#successor produces the smallest float larger than the receiver)
>
>        self shouldnt: [1.0 / 0.0 successor] raise: Error.
>        self assert: (1.0 / 0.0 successor) = Float infinity.
>
> Cheers,
>  - Andreas
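
To make the options a) to e) listed above easier to compare, here is a small Python model of a float division "primitive" with a selectable failure policy. It is a sketch under assumptions, not VM code: `PrimitiveFailed`, `ieee_divide`, `divide` and the single-letter policy strings are all invented for this example, and option a) is read literally (any Inf or NaN result fails).

```python
import math


class PrimitiveFailed(ArithmeticError):
    """Hypothetical stand-in for a failing VM float primitive."""


def ieee_divide(x, y):
    """Division with IEEE 754 default results: never raises."""
    if y == 0.0:
        if x == 0.0 or math.isnan(x):
            return math.nan
        # The sign of the infinity combines the signs of both operands.
        return math.copysign(math.inf, x) * math.copysign(1.0, y)
    # Python only raises for a zero divisor; overflow answers inf here.
    return x / y


def divide(x, y, policy):
    """Divide x by y, failing according to option 'a' .. 'e'."""
    result = ieee_divide(x, y)
    zero_divisor = (y == 0.0)
    is_nan = math.isnan(result)
    should_fail = {
        "a": is_nan or math.isinf(result),  # fail on Inf or NaN
        "b": is_nan,                        # fail on NaN, x / 0.0 answers Inf
        "c": is_nan or zero_divisor,        # fail on NaN and on x / 0.0
        "d": zero_divisor,                  # fail only on x / 0.0
        "e": False,                         # never fail
    }[policy]
    if should_fail:
        raise PrimitiveFailed(f"{x!r} / {y!r} fails under option {policy})")
    return result


if __name__ == "__main__":
    smallest = 5e-324  # smallest positive double, like "0.0 successor"
    for policy in "abcde":
        for x, y in ((1.0, 0.0), (1.0, smallest), (0.0, 0.0)):
            try:
                print(policy, x, y, "->", divide(x, y, policy))
            except PrimitiveFailed as err:
                print(policy, x, y, "-> fails:", err)
```

Under option c) this model agrees with the three assertions above: `divide(1.0, 0.0, "c")` fails, while `divide(1.0, 5e-324, "c")` answers infinity without failing.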


Re: failing/errors Pharo Tests with CogVM

Mariano Martinez Peck
 
OK... thanks for all the responses. The thread is now far beyond my knowledge, so I will naively ask: is it worth creating a Pharo 1.1.1 one-click with CogVM?

Marcus released a PharoCore 1.1.1 with changes for CogVM and some important fixes. I took that core image and created a new dev image. But before releasing, I ran the tests and found these failures.

It seems you all discussed the Float problems... but what about the others? I remember them, and I think they were fixed in the newest versions of VMMaker.

So the question now is: do the failures/errors in those tests mean that Pharo is "unreleasable" with CogVM? Should we wait for the fix, or should we release anyway?

Thanks in advance,

Mariano

On Fri, Sep 17, 2010 at 8:20 PM, Eliot Miranda <[hidden email]> wrote:
> [...]


Re: failing/errors Pharo Tests with CogVM

Mariano Martinez Peck
 
Hi. Today I ran the tests with the new build 2312 and the results are better; at least the Float tests are passing:


9715 run, 9708 passes, 0 expected failures, 5 failures, 2 errors, 0 unexpected passes
Failures:
FileStreamTest>>#testPositionPastEndIsAtEnd
PCCByLiteralsTest>>#testSwitchPrimCallOffOn
AllocationTest>>#testOneGigAllocation
MethodContextTest>>#testClosureRestart
ReleaseTest>>#testUndeclared

Errors:
MessageTallyTest>>#testSampling1
WeakSetInspectorTest>>#testSymbolTableM6812


Thanks

Mariano

On Mon, Sep 20, 2010 at 4:18 PM, Mariano Martinez Peck <[hidden email]> wrote:
> [...]