FloatingPointException

FloatingPointException

Brad Fuller-4
After running Croquet for a while, an FP exception occurred. So I
restarted it from scratch, started "Croquet (Master)", and left it
to run. I did not enter the world or move anything around. Eventually,
the FP exception occurred again.

I searched the developers archives for floating point exception, but
couldn't find anything.

I cannot debug it much because it runs very slowly, but I took a
screenshot and attached it. It might tell someone something. I'm going
to attempt to attach the screenshot to this message, hoping that this
mailing list accepts attachments.

This is the standard off-the-shelf SDK 1.0 from the croquetproject
website on Linux:

$ uname -a
Linux IVES 2.6.25-gentoo-r7-a #3 SMP PREEMPT Wed Oct 1 14:45:40 PDT
2008 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 5000+
AuthenticAMD GNU/Linux

--
Brad Fuller


Attachment: FPE.png (29K)

Re: FloatingPointException

Steve Wart
That's kind of interesting.

The problem seems to be that a 3D vector is somehow reporting a negative squared length. Taking the square root of a negative number causes Croquet to throw an exception.

The squared length comes from the dot: method on FloatArray, which is backed by a primitive and has this Smalltalk fallback:

dot: aFloatVector
    "Primitive. Return the dot product of the receiver and the argument.
    Fail if the argument is not of the same size as the receiver."
    | result |
    <primitive:'primitiveDotProduct' module: 'FloatArrayPlugin'>
    self size = aFloatVector size ifFalse:[^self error:'Must be equal size'].
    result := 0.0.
    1 to: self size do:[:i|
        result := result + ((self at: i) * (aFloatVector at: i)).
    ].
    ^result

If the C code is doing what the Smalltalk claims to be doing, the dot product of a vector with itself should never be negative, unless someone's implemented complex floats :)
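To make the argument concrete, here is a minimal Python sketch of the same fallback loop. It is a port for illustration only, not the FloatArrayPlugin C source; the function name `dot` and the use of plain lists are assumptions. Dotting a vector with itself sums squares, so for finite inputs the result cannot be negative, while a dot product of two different vectors can be.

```python
def dot(a, b):
    """Dot product of two equal-length sequences, mirroring the Smalltalk fallback."""
    if len(a) != len(b):
        raise ValueError("Must be equal size")
    result = 0.0
    for x, y in zip(a, b):
        result += x * y
    return result


v = [3.0, -4.0, 12.0]
squared_length = dot(v, v)           # 9 + 16 + 144
print(squared_length)                # 169.0
print(dot(v, [-1.0, 0.0, 0.0]))      # -3.0: only the self dot product is guaranteed non-negative
```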

Steve

Re: FloatingPointException

David P. Reed
In reply to this post by Brad Fuller-4
Brad - very likely you have one of two problems:

1) A bad Croquet/Squeak VM.
2) a serious bug in your Linux kernel.

The stack trace suggests that the code which computes the length of a 3D
vector is trying to take the sqrt of a number less than zero.   This
should NEVER happen.

What could cause it to happen is a processor floating point exception
flag that is left set and which triggers a faux exception in the sqrt
operation.

I don't have your machine or the particular SDK.  Try running it on
another machine with a different Linux kernel...

(There have been many conflicts in the last few months about using the
new GCC to compile the Linux kernel.  There is assembly code in the x86
Linux kernel that expects GCC to treat the flag register one way, and GCC
decided to go another way, based on the C Standard, though who knows
whether it has affected your case.)


Re: FloatingPointException

Brad Fuller-4
thanks all. I'll see what I can do.

brad


Re: FloatingPointException

Les Howell
In reply to this post by Brad Fuller-4
Somehow you took the square root of a negative number.  Sorry if I am
late to the party in responding; I've been busy.

Regards,
Les H

Re: FloatingPointException

Joshua Gargus-2
Les wrote:
> somehow you took the square root of a negative number.  Sorry if I am
> late to the party in responding, I've been busy.

Yeah, but the weird thing is that the dot-product of a vector with itself is never negative. 

It might be enlightening to compute the dot products of some random vectors and verify that the results are sane.  For example, we would expect "1@2@3 dot: 4@5@6" to evaluate to 32.0; does it?  A pair of vectors that gives an incorrect result would be a very helpful test case.
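The kind of check suggested here could look like the sketch below. It is written in Python rather than Squeak so it can be run anywhere, and everything in it is an assumption for illustration: `dot` is a port of the Smalltalk fallback, `check_random_vectors` is a hypothetical helper, and `array("f")` stands in for 32-bit FloatArray storage. In practice the function under test would be the real primitive.

```python
import random
from array import array


def dot(a, b):
    """Port of the Smalltalk fallback loop (illustration only)."""
    if len(a) != len(b):
        raise ValueError("Must be equal size")
    result = 0.0
    for x, y in zip(a, b):
        result += x * y
    return result


def check_random_vectors(dot_fn, trials=1000, seed=0):
    """Return the list of (a, b, got, expected) cases where dot_fn looks wrong."""
    rng = random.Random(seed)
    failures = []
    for _ in range(trials):
        # 32-bit storage, like a FloatArray; values are read back as Python floats.
        a = array("f", (rng.uniform(-1e3, 1e3) for _ in range(3)))
        b = array("f", (rng.uniform(-1e3, 1e3) for _ in range(3)))
        terms = [a[i] * b[i] for i in range(3)]
        expected = terms[0] + terms[1] + terms[2]
        tolerance = 1e-4 * sum(abs(t) for t in terms) + 1e-12
        got = dot_fn(a, b)
        wrong_value = not (abs(got - expected) <= tolerance)
        negative_self_dot = not (dot_fn(a, a) >= 0.0)
        if wrong_value or negative_self_dot:
            failures.append((list(a), list(b), got, expected))
    return failures


print(dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))   # 32.0
print(len(check_random_vectors(dot)))          # 0
```

Any entry in the returned list is a candidate test case of the sort asked for above. The `not (... >= 0.0)` form is deliberate so that a NaN result is also reported.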

Cheers,
Josh




Re: FloatingPointException

David P. Reed
In reply to this post by Les Howell
If you look at the stack, the sqrt in question is in a calculation of
the length of a vector.

This calculation is:  sqrt(delta_x^2 + delta_y^2 + delta_z^2)

The ^2 makes all summands positive or zero.   Thus, the square root
NEVER has a negative argument in this case.

However, the FSQRT instruction in the x87 instruction subset does not clear
the FPE exception flag.   If the exception flag is checked only *after* the
FSQRT (as is the convention in most C compilers), any prior instruction
that set the flag can appear to have caused the exception, and in Linux
these days, the flag can actually be inherited from other parallel
processes, because of the bug I mentioned.
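The "sticky flag" effect can be demonstrated without x87 hardware. The sketch below is only an analogy, using the status flags of Python's `decimal` module rather than the FPU status word: with traps disabled, an invalid operation sets a flag that stays set until explicitly cleared, so a check made after a later, perfectly valid square root still sees it. Nothing here is taken from the Squeak VM or the Linux kernel.

```python
import decimal

ctx = decimal.Context(traps=[])              # no traps: errors only set flags

ctx.sqrt(decimal.Decimal(-1))                # earlier, unrelated invalid operation (returns NaN)

# ... much later, in code that knows nothing about the operation above ...
length = ctx.sqrt(decimal.Decimal(169))      # a valid square root
flag_after_valid_sqrt = bool(ctx.flags[decimal.InvalidOperation])
print(length, flag_after_valid_sqrt)         # 13 True

# Clearing the flags before the operation removes the false attribution.
ctx.clear_flags()
ctx.sqrt(decimal.Decimal(169))
flag_after_clear = bool(ctx.flags[decimal.InvalidOperation])
print(flag_after_clear)                      # False
```

A checker that looks at the flag only after the second square root would blame it for an error raised elsewhere, which is the kind of misattribution described above.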

So in fact, the bug may actually be caused far, far away from where it
appears if the problem is in the VM or the Linux kernel.   And the
Croquet call history is a path that is repeatedly recomputed (many times
per second) and always gives the same answer based on the same data if
the user is not moving the mouse, so the likelihood that the symptom
occurs only after a long idle time due to "local" effects is near zero.




Re: FloatingPointException

Bob Arning
FWIW, the version of #primitiveSqrt I have seems to generate the  
exception only if self < 0.0. While the FSQRT bug may be a problem  
elsewhere, it doesn't look like the culprit here.
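As a rough illustration of that point, a guard of this shape depends only on the value it is handed, not on any hardware status flag. This is a hypothetical Python stand-in, not the Squeak source; the name `checked_sqrt` and the choice of `ArithmeticError` are assumptions.

```python
import math


def checked_sqrt(x):
    """Raise only when the receiver is genuinely negative, otherwise take the root."""
    if x < 0.0:
        raise ArithmeticError("undefined if less than zero")
    return math.sqrt(x)


print(checked_sqrt(169.0))   # 13.0
try:
    checked_sqrt(-1.0)
except ArithmeticError as err:
    print("raised:", err)    # raised: undefined if less than zero
```

If the real primitive behaves like this, the exception in the screenshot implies that a negative value actually reached the square root.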

Cheers,
Bob



Re: FloatingPointException

David P. Reed
That's why it is a mystery!  The code in question cannot pass a negative
number.  Ever.


Re: FloatingPointException

Steve Wart
Unless the compiler is broken or the primitive code is being incorrectly generated.

Can someone reproduce this problem in a debugger?


Re: FloatingPointException

Bob Arning
Short of seeing this in a debugger, the SqueakDebug.log file has a
lot of clues. Seeing the file from Brad's case might tell us a lot.

Cheers,
Bob


FloatingPointException: undefined if less than zero.
7 November 2008 11:34:51 am

VM: Mac OS - a SmalltalkImage
Image: Croquet1.0beta [latest update: #2]

SecurityManager state:
Restricted: false
FileAccess: true
SocketAccess: true
Working Dir /Users/bob/Desktop/Miscellaneous/CroquetSDK-1.0.18
Trusted Dir /Users/bob/Desktop/Miscellaneous/CroquetSDK-1.0.18
Untrusted Dir foobar/tooBar/forSqueak/bogus

Float>>primitiveSqrt
        Receiver: -1.0
        Arguments and temporary variables:
                exp: nil
                guess: nil
                eps: nil
                delta: nil
        Receiver's instance variables:
-1.0

Float>>sqrt
        Receiver: -1.0
        Arguments and temporary variables:

        Receiver's instance variables:
-1.0

SmallInteger(Number)>>sqrt
        Receiver: -1
        Arguments and temporary variables:

        Receiver's instance variables:
-1

UndefinedObject>>DoIt
        Receiver: nil
        Arguments and temporary variables:

        Receiver's instance variables:
nil


Re: FloatingPointException

Les Howell
I was just examining the code again, and noticed the loop went from 1 to
size.  I haven't played with Squeak or Cobalt, so the array sizes may
not be what I think, but in C and most other modern languages, arrays
start at element 0 and go to size-1.  If that is the case, could there
be an "empty" number at the end of the array?  If so, could code have
changed that number to a negative, or could an overflow have happened?
Also, the code says fail if sizes don't match; what is the fail
condition?  If a fail returns a minus 1, that could be the source of the
error.
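For readers coming from C: Smalltalk collections are indexed from 1, so "1 to: self size" visits every element and nothing past the end. The Python sketch below (a hypothetical `dot_one_based`, not Squeak code) walks the same 1-based indices and records which 0-based slots they correspond to. On the second question, the fallback code posted earlier in the thread signals an error on a size mismatch ("Must be equal size") rather than returning a sentinel such as -1, and the sketch mirrors that.

```python
def dot_one_based(a, b):
    """Walk indices 1..size as the Smalltalk loop does, mapping each to a 0-based slot."""
    if len(a) != len(b):
        raise ValueError("Must be equal size")   # an error, not a -1 return value
    result = 0.0
    visited = []
    for i in range(1, len(a) + 1):               # Smalltalk: 1 to: self size
        slot = i - 1                             # same element in 0-based terms
        visited.append(slot)
        result += a[slot] * b[slot]
    return result, visited


result, visited = dot_one_based([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(result, visited)   # 32.0 [0, 1, 2]
```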

Regards,
Les H

Re: FloatingPointException

Russell M. Taylor II
And if what is past the end of the array happens to be a -inf or NaN,
it may be that the negative would survive the squaring operation (not
sure about what happens in these corner cases, but I think that NaN
may pass every comparison test).
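These corner cases are easy to check under IEEE 754 arithmetic. The sketch below uses Python floats (doubles), not Squeak's FloatArray, and the helper name `corner_cases` is made up for illustration. It shows that squaring -inf gives +inf, so the sign does not survive, and that NaN makes every ordered comparison false, so a plain "< 0.0" guard would not fire on it.

```python
import math


def corner_cases():
    """Collect what IEEE 754 arithmetic does with -inf and NaN when squared and compared."""
    neg_inf = float("-inf")
    nan = float("nan")
    return {
        "neg_inf_squared": neg_inf * neg_inf,        # +inf, not negative
        "nan_squared_is_nan": math.isnan(nan * nan),
        "nan_less_than_zero": nan < 0.0,
        "nan_at_least_zero": nan >= 0.0,
        "nan_equals_itself": nan == nan,
        "sqrt_of_nan_is_nan": math.isnan(math.sqrt(nan)),
    }


for name, value in corner_cases().items():
    print(name, value)
```

So a stray NaN or -inf read from past the end of an array would more plausibly produce a NaN or infinite length than a "less than zero" exception, assuming the guard is a simple comparison against 0.0.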

Russ

At 11:55 AM 11/11/2008, you wrote:

>I was just examining the code again, and noticed the loop went from 1 to
>size.  I haven't played with squeak or cobalt, so the array sizes may
>not be what I think, but in C and most other modern languages, arrays
>start at element 0 and go to size-1.  If that is the case, could there
>be an "empty" number at the end of the array?  If so could code have
>changed that number to a negative, or could an overflow have happened.
>Also the code says fail if sizes don't match, what is the fail
>condition?  If a fail returns a minus 1, that could be the source of the
>error.
>
>Regards,
>Les H
>On Fri, 2008-11-07 at 11:38 -0500, Bob Arning wrote:
> > Short of seeing this in a debugger , the SqueakDebug.log file has a
> > lot of clues. Seeing the file from Brad's case might tell us a lot.
> >
> > Cheers,
> > Bob
> >
> >
> > FloatingPointException: undefined if less than zero.
> > 7 November 2008 11:34:51 am
> >
> > VM: Mac OS - a SmalltalkImage
> > Image: Croquet1.0beta [latest update: #2]
> >
> > SecurityManager state:
> > Restricted: false
> > FileAccess: true
> > SocketAccess: true
> > Working Dir /Users/bob/Desktop/Miscellaneous/CroquetSDK-1.0.18
> > Trusted Dir /Users/bob/Desktop/Miscellaneous/CroquetSDK-1.0.18
> > Untrusted Dir foobar/tooBar/forSqueak/bogus
> >
> > Float>>primitiveSqrt
> >       Receiver: -1.0
> >       Arguments and temporary variables:
> >               exp:    nil
> >               guess:  nil
> >               eps:    nil
> >               delta:  nil
> >       Receiver's instance variables:
> > -1.0
> >
> > Float>>sqrt
> >       Receiver: -1.0
> >       Arguments and temporary variables:
> >
> >       Receiver's instance variables:
> > -1.0
> >
> > SmallInteger(Number)>>sqrt
> >       Receiver: -1
> >       Arguments and temporary variables:
> >
> >       Receiver's instance variables:
> > -1
> >
> > UndefinedObject>>DoIt
> >       Receiver: nil
> >       Arguments and temporary variables:
> >
> >       Receiver's instance variables:
> > nil
> >
> > On Nov 7, 2008, at 11:31 AM, Steve Wart wrote:
> >
> > > Unless the compiler is broken or the primitive code is being
> > > incorrectly generated.
> > >
> > > Can someone reproduce this problem in a debugger?
> > >
> > > On Fri, Nov 7, 2008 at 8:27 AM, David P. Reed <[hidden email]> wrote:
> > > That's why it is a mystery!  The code in question cannot pass a
> > > negative number.  Ever.
> > >
> > >
> > > Bob Arning wrote:
> > > FWIW, the version of #primitiveSqrt I have seems to generate the
> > > exception only if self < 0.0. While the FSQRT bug may be a problem
> > > elsewhere, it doesn't look like the culprit here.
> > >
> > > Cheers,
> > > Bob
> > >
> > >
> > > On Nov 7, 2008, at 10:49 AM, David P. Reed wrote:
> > >
> > > If you look at the stack, the sqrt in question is in a calculation
> > > of the length of a vector.
> > >
> > > This calculation is:  sqrt(delta_x^2 + delta_y^2 + delta_z^2)
> > >
> > > The ^2 makes all summands positive or zero.   Thus, the square root
> > > NEVER has a negative argument in this case.
> > >
> > > However, the FSQRT function in the x87 instruction subset does not
> > > clear the FPE exception.   If the exception flag is checked only
> > > *after* the FSQRT (as is the convention in most C compilers), any
> > > prior FPE setting instruction can have caused the exception, and in
> > > Linux these days, it can actually be inherited from other parallel
> > > processes, because of the bug I mentioned.
> > >
> > > So in fact, the bug may actually be caused far, far away from where
> > > it appears if the problem is in the VM or the Linux kernel.   And
> > > the Croquet call history is a path that is repeatedly recomputed
> > > (many times per second) and always gives the same answer based on
> > > the same data if the user is not moving the mouse, so the
> > > likelihood that the symptom occurs only after a long idle time due
> > > to "local" effects is near zero.
> > >
> > >
> > >
> > > Les wrote:
> > > > Somehow you took the square root of a negative number.  Sorry if I am
> > > > late to the party in responding, I've been busy.
> > >
> > > Regards,
> > > Les H
> > > On Thu, 2008-11-06 at 08:04 -0800, Brad Fuller wrote:
> > >
> > > > After running croquet for a while, a FP exception occurred. So, I
> > > > restarted it from scratch and started "Croquet (Master)" and left it
> > > > to run. I did not enter the world or move anything around. Eventually,
> > > > the FP exception occurred.
> > >
> > > I searched the developers archives for floating point exception, but
> > > couldn't find anything.
> > >
> > >  I can not debug it much, it runs very slowly. But, i took a screen
> > > shot of it and attached it. It might tell someone something. I'm going
> > > to attempt to attach the screenshot to this msg, hoping that this
> > > mailing list accepts attachments.
> > >
> > > This is the standard off-the-shelf SDK 1.0 from the croquetproject
> > > website on Linux:
> > >
> > > $ uname -a
> > > Linux IVES 2.6.25-gentoo-r7-a #3 SMP PREEMPT Wed Oct 1 14:45:40 PDT
> > > 2008 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 5000+
> > > AuthenticAMD GNU/Linux
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> >

---
Russell M. Taylor II, Ph.D.                           [hidden email]
CB #3175, Sitterson Hall                        www.cs.unc.edu/~taylorr
University of North Carolina,                     Voice: (919) 962-1701
Chapel Hill, NC 27599-3175                        FAX:   (919) 962-1799


Re: FloatingPointException

Carole Dodd
In reply to this post by Brad Fuller-4
Re: [croquet-dev] FloatingPointException

Please take me off your email list as of today.

Thank you,
Carole Dodd
Sent from the NEW CURVE Blackberry from Telus

----- Original Message -----
From: Russell M. Taylor II <[hidden email]>
To: [hidden email] <[hidden email]>; Les <[hidden email]>
Sent: Tue Nov 11 15:05:16 2008
Subject: Re: [croquet-dev] FloatingPointException

And if what is past the end of the array happens to be a -inf or NaN,
it may be that the negative would survive the squaring operation (not
sure about what happens in these corner cases, but I think that NaN
may pass every comparison test).

Russ



Re: FloatingPointException

Bob Arning
In reply to this post by Russell M. Taylor II

On Nov 11, 2008, at 3:05 PM, Russell M. Taylor II wrote:

> And if what is past the end of the array happens to be a -inf or  
> NaN, it may be that the negative would survive the squaring  
> operation (not sure about what happens in these corner cases, but I  
> think that NaN may pass every comparison test).
>
> Russ


NaN could well be what's happening:

        (Vector3 x: Float nan y: 1 z: 1)  length

will produce the error first reported---

FloatingPointException: undefined if less than zero.
11 November 2008 11:04:59 pm

VM: Mac OS - a SmalltalkImage
Image: Croquet1.0beta [latest update: #2]

SecurityManager state:
Restricted: false
FileAccess: true
SocketAccess: true
Working Dir /Users/bob/Desktop/Miscellaneous/CroquetSDK-1.0.18
Trusted Dir /Users/bob/Desktop/Miscellaneous/CroquetSDK-1.0.18
Untrusted Dir foobar/tooBar/forSqueak/bogus

Float>>primitiveSqrt
        Receiver: NaN
        Arguments and temporary variables:
                exp: nil
                guess: nil
                eps: nil
                delta: nil
        Receiver's instance variables:
NaN

Float>>sqrt
        Receiver: NaN
        Arguments and temporary variables:

        Receiver's instance variables:
NaN

Vector3(FloatArray)>>length
        Receiver: a Vector3(NaN 1.0 1.0)
        Arguments and temporary variables:

        Receiver's instance variables:
a Vector3(NaN 1.0 1.0)
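
The IEEE 754 rules behind Russ's question can be checked outside Squeak. What follows is a minimal Python sketch, not Croquet code: length_squared and guard_fires are hypothetical helpers, and whether the Squeak VM compares NaN the same way is an assumption that has not been verified here. Under IEEE 754, a NaN component makes the sum of squares NaN, -inf squares to +inf (so a negative sign does not survive squaring), and NaN fails every ordered comparison, so only a guard written as "not (x >= 0)" catches it.

```python
import math

def length_squared(components):
    """Sum of squared components, i.e. the dot product of a vector with itself."""
    total = 0.0
    for c in components:
        total += c * c
    return total

def guard_fires(x):
    """Evaluate two styles of 'argument below zero' guard in front of a sqrt."""
    return {
        "x < 0": x < 0.0,                # ordered comparison: False for NaN
        "not (x >= 0)": not (x >= 0.0),  # negated comparison: True for NaN
    }

if __name__ == "__main__":
    samples = (
        [float("nan"), 1.0, 1.0],   # one NaN component
        [float("-inf"), 1.0, 1.0],  # one negative-infinity component
        [3.0, 4.0, 0.0],            # ordinary vector
    )
    for vec in samples:
        s = length_squared(vec)
        print(vec, "->", s, guard_fires(s), "isnan:", math.isnan(s))
```

This would be consistent with the log above, where the receiver of sqrt is NaN rather than a negative number, but it does not explain where a NaN would come from in the first place.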



Re: FloatingPointException

Joshua Gargus-2
In reply to this post by Russell M. Taylor II
Russell M. Taylor II wrote:
> And if what is past the end of the array happens to be a -inf or NaN,
> it may be that the negative would survive the squaring operation (not
> sure about what happens in these corner cases, but I think that NaN
> may pass every comparison test).
>

We seem to be veering off into wild speculation.  Let's see if we can
avoid doing so...

There are two methods that Les might be talking about when he was "just
examining the code again":
    - FloatArray>>dot:
    - FloatArrayPlugin>>primitiveDotProduct:

I'm assuming that we're talking about FloatArray>>dot:, since that's the
one that is indexed from 1 to size.  It's easy to verify that this is
the way that FloatArray is supposed to be used.  Try evaluating the
following expressions:

#(1 2 3) asFloatArray at: 1 "answers 1.0"
#(1 2 3) asFloatArray at: 2 "answers 2.0"
#(1 2 3) asFloatArray at: 3 "answers 3.0"
#(1 2 3) asFloatArray at: 0 "error"
#(1 2 3) asFloatArray at: 4 "error"

So, if we are executing the Smalltalk code in FloatArray>>dot:, we can't
do an out-of-bounds access.

However, note that under normal conditions, the Smalltalk code in
FloatArray>>dot: will never be executed.  Instead, the VM will execute the
primitive specified by FloatArrayPlugin>>primitiveDotProduct:.  If
you're not familiar with Squeak plugins, the gist is that
#primitiveDotProduct: is translated to C, compiled with a C compiler,
and dynamically linked into Croquet.  The Smalltalk code in
FloatArray>>dot: is evaluated only if the primitive fails (for example,
if the two vectors have different lengths).  Note that the primitive
does bounds-checking; the problem doesn't appear to be there.
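
The primitive-then-fallback arrangement can be sketched as follows. This is a Python illustration, not the actual plugin: PrimitiveFailed, primitive_dot and dot are hypothetical names standing in for the compiled plugin function and for FloatArray>>dot:, and the only failure condition modelled is the size mismatch mentioned above.

```python
class PrimitiveFailed(Exception):
    """Stand-in for a VM primitive reporting failure to the calling method."""

def primitive_dot(a, b):
    """Fast path: plays the role of the compiled plugin code (0-based, like C)."""
    if len(a) != len(b):
        raise PrimitiveFailed()
    result = 0.0
    for i in range(len(a)):
        result += a[i] * b[i]
    return result

def dot(a, b):
    """Plays the role of FloatArray>>dot:, whose body runs only on primitive failure."""
    try:
        return primitive_dot(a, b)
    except PrimitiveFailed:
        # Fallback body, mirroring the Smalltalk: check sizes, then loop 1..size.
        if len(a) != len(b):
            raise ValueError("Must be equal size")
        result = 0.0
        for i in range(1, len(a) + 1):
            result += a[i - 1] * b[i - 1]
        return result

if __name__ == "__main__":
    print(dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

Both paths touch only indices that exist in both arrays, whether the loop counts from 0 or from 1, so neither reads past the end of the array.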

My guess is still that the problem is something REALLY WEIRD.  If it were
a straightforward bug in such heavily-used code, then users like Qwaq
would have found it long ago.

Cheers,
Josh





Re: FloatingPointException

Les Howell
Of course you are right, Josh.  I didn't realize that the function in
question was a library function, which is why I made the suggestion I
did.

Regards,
Les H
On Wed, 2008-11-12 at 01:05 -0800, Joshua Gargus wrote:

> However, note that under normal conditions, the Smalltalk code in
> FloatArray>>dot: will never be executed.  Instead, the VM will execute the
> primitive specified by FloatArrayPlugin>>primitiveDotProduct:.  If
> you're not familiar with Squeak plugins, the gist is that
> #primitiveDotProduct: is translated to C, compiled with a C compiler,
> and dynamically linked into Croquet.  The Smalltalk code in
> FloatArray>>dot: is evaluated only if the primitive fails (for example,
> if the two vectors have different lengths).  Note that the primitive
> does bounds-checking; the problem doesn't appear to be there.