Large Fractions overflow on asFloat and answer Float infinity:

    ((11 raisedTo: 400) / 2) asFloat = Float infinity. "is true"

So far nothing is wrong, except that I would prefer an exception, but that was another discussion.

But then they also equal Float infinity, and that sounds strange to me:

    ((11 raisedTo: 400) / 2) = Float infinity. "is true"

What is bad in this behavior? You no longer have the transitivity property of equality, and that is a flaw:

    ((11 raisedTo: 400) / 2) = Float infinity. "is true"
    ((13 raisedTo: 400) / 2) = Float infinity. "is true"
    ((11 raisedTo: 400) / 2) = ((13 raisedTo: 400) / 2). "is false"

Then you can expect very weird bugs again in Sets. Add these three objects to a Set. Since transitivity is broken, the size of the Set will vary according to the order in which you add the objects, the hash code algorithm and the Set capacity. Something very nasty.

Sure, few people use large Fractions, and knowing this, they would rather not, but this is not the right argument.

Also, the hash correction I am proposing is likely to make things worse. So we have to fix this one. Any idea other than testing for infinity in adaptToFraction:andSend: / adaptToFloat:andSend: ?

This is another reason why I really prefer arithmetic exceptions: they handle such cases. But I cannot move to arithmetic exceptions alone.
|
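The order dependence described in the post above can be reproduced outside Squeak. The following is only a sketch in Python, used here as neutral executable notation: `as_float`, `Coerced` and `set_size` are hypothetical names, and the coerce-to-float equality is an emulation of the behaviour reported in the post, not Squeak's actual code. The constant hash is a deliberate simplification so that every lookup ends in an equality test.

```python
from fractions import Fraction
import math


def as_float(x):
    """Convert to float, answering an infinity on overflow.

    Assumption: this mimics the asFloat behaviour reported in the post;
    Python itself raises OverflowError for such large Fractions.
    """
    try:
        return float(x)
    except OverflowError:
        return math.inf if x > 0 else -math.inf


class Coerced:
    """Hypothetical wrapper: mixed exact/float equality goes through as_float."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Coerced):
            return NotImplemented
        a, b = self.value, other.value
        if isinstance(a, float) != isinstance(b, float):
            return as_float(a) == as_float(b)  # lossy, coerced comparison
        return a == b  # same kind on both sides: compare directly

    def __hash__(self):
        return 0  # constant hash, so every lookup ends in an equality test


def set_size(items):
    """Add the items to a set in the given order and answer the final size."""
    result = set()
    for item in items:
        result.add(item)
    return len(result)


big11 = Coerced(Fraction(11**400, 2))
big13 = Coerced(Fraction(13**400, 2))
inf = Coerced(math.inf)
```

With these definitions, `set_size([big11, big13, inf])` answers 2 while `set_size([inf, big11, big13])` answers 1: the same three objects give sets of different sizes depending only on insertion order.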
Wow. Great catch. Clearly, this is broken. I think we need to change this coercion to something that deals with the issue properly (e.g., responding false to the comparison in question). Any ideas how to fix that?

Cheers,
  - Andreas

nicolas cellier wrote:
> Large Fraction asFloat do overflow and answer Float infinity.
> ((11 raisedTo: 400) / 2) asFloat = Float infinity. "is true"
> [...]
|
In reply to this post by Nicolas Cellier-3
Well, the problem is not only with infinity (overflow). It is also true for underflow:

    (2 / (11 raisedTo: 400)) = 0.0. "is true"
    (2 / (11 raisedTo: 400)) = 0. "is false"

In fact, the equality transitivity problem can show up for any Fraction not exactly representable in IEEE floating point, that is, most Fractions. Since the set of Fractions is dense in the set of real numbers, I can find an infinite number of Fractions lying between two Floats. Since IEEE floating point arithmetic has only a finite number of possible values, for each floating point value there exists an infinite number of Fractions that will be equal to this Float (in the current implementation, not mathematically), but the Fractions will not be equal to each other, thanks to exact arithmetic.

Example:

    | a b c |
    a := (16rFFFFFFFFFFFFF11 / 16r1000000000000000).
    b := (16rFFFFFFFFFFFFF10 / 16r1000000000000000).
    c := a asFloat.
    {a = b. a = c. b = c.}. "is {false . true . true}"

Knowing this, I am not sure that ((1/3) asFloat = (1/3)) should answer true. Maybe that should be the case only for exact representations like 1/2, 1/4, 3/4, etc. But in that case we also have to redefine coercion to coerce to Fraction instead of Float, because Fractions are more general (yes, Float is a subset of Fraction). If we do not coerce to exact arithmetic, we will have ((1/3) asFloat - (1/3)) isZero, and we might still be caught by some form of equality problem...

So are we ready to exchange our fast floating point algorithms for slower Fractions, with a huge number of useless digits, each time somebody introduces a Fraction? Are we really sure we vote for this perfect system? I am not sure at all.

And that is also the case between floating point numbers and integers.

Sure, other dialects also have the equality problem, maybe not for infinity, but for less trivial cases of inexact arithmetic. It all comes from these languages calling floating point numbers "real". Real is more general than fraction, ah yes, we have been fooled by this one.

What should we do? What is your opinion?
|
In reply to this post by Andreas.Raab
Hi Andreas,
I agree we should answer false to the equality test, unless the arithmetic is exact. Note that (1/3) asFloat asFraction = (1/3) answers false, since asTrueFraction is used, but (1/2) asFloat asFraction = (1/2) is true, and that is exactly how we like it.

I think we should change the coercion algorithm for the equality test to rely on asTrueFraction, but not touch + * - /, since that would break a lot of code. This could be something like:

    Float>>adaptToFraction: rcvr andSend: selector
        ^selector = #=
            ifTrue: [[rcvr = self asTrueFraction]
                "have to handle NaN and Infinity"
                on: Error do: [:exc | exc return: false]]
            ifFalse: [rcvr asFloat perform: selector with: self].

and the same in Fraction>>adaptToFloat: rcvr andSend: selector.

This also applies to < > <= >= ~= (some are inherited and need not be handled; some must be handled the same way as =, except for the error handling block, because we cannot compare to NaN and should raise an error...). So we have to make the above method a bit more complex. If it becomes too complex, we will have to use more specialized selectors.

Note that the same has to be done for Float/Integer coercion:

    | a b c |
    a := 16rFFFFFFFFFFFFF81.
    b := 16rFFFFFFFFFFFFF82.
    c := a asFloat.
    {a = b. a = c. b = c.}

Maybe we can also expect advice from other Smalltalkers, and make the responses of the various dialects more uniform on such a kernel subject. I went to the vwnc mailing list with these, but no answer yet.

Agree?

Nicolas

On Tuesday 28 March 2006 12:45, Andreas Raab wrote:
> Wow. Great catch. Clearly, this is broken. I think we need to change
> this coercion to something that deals with the issue properly (e.g.,
> responding false to the comparison in question). Any ideas how to fix that?
>
> Cheers,
>   - Andreas
|
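The idea proposed above (compare in exact arithmetic and answer false for NaN and the infinities) can be sketched as follows. This is an illustration in Python rather than the Squeak change itself; `exact_equal` is a hypothetical name. It relies on the fact that Python's `Fraction(aFloat)` converts a float to the exact rational it represents, which plays the role asTrueFraction plays in the post.

```python
from fractions import Fraction
import math


def exact_equal(exact, flt):
    """Compare an int or Fraction with a float without rounding the exact side."""
    if math.isnan(flt) or math.isinf(flt):
        return False  # no finite rational equals NaN or an infinity
    return Fraction(exact) == Fraction(flt)  # Fraction(float) is an exact conversion


# The Float/Integer example from the post: a and b round to the same float c.
a = 0xFFFFFFFFFFFFF81
b = 0xFFFFFFFFFFFFF82
c = float(a)
```

Under this rule `exact_equal(Fraction(1, 2), 0.5)` is true, while `exact_equal(Fraction(1, 3), 1 / 3)`, `exact_equal(a, c)` and `exact_equal(b, c)` are all false, so a float can no longer be equal to two different exact numbers at once.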
Nicolas
Thanks for all these emails. Keep going :)

For compatibility between dialects we should not dream (otherwise we would not have new calling initialize, which is extremely cool for a teacher and for a programmer too). My feeling is that if we can improve Squeak + document the problem with tests + evaluate how we can move without breaking too much existing code, then we win :)

Stef

> Hi Andreas,
>
> I agree we should answer false to equality test, unless exact
> arithmetic.
> [...]
|
OK, I put it on http://bugs.impara.de/view.php?id=3374 and have started to implement my response.

Nicolas

On Tuesday 28 March 2006 22:39, you wrote:
> Nicolas
> Thanks for all these emails.
> Keep going :)
> [...]
|
In reply to this post by Nicolas Cellier-3
Hello ncellier,
Tuesday, March 28, 2006, 7:25:46 AM, you wrote:

nic> [number hashing is broken]
nic> What is your opinion ?

IMHO... sometimes it may be convenient to have 2 = 2.0, but since they represent different intentions, I would not expect that to happen. Why should a perfectly specified integer be equal to, possibly, the result of carrying calculations without infinite precision? What would it mean if theGreatDoublePrecisionResult = 2? Wouldn't it be more interesting (and precise) to ask laysWithin: epsilon from: 2?

In other words, comparing constants makes it look much simpler than what it really is. Reading something like anInteger = 2.0 would be, from my point of view, highly questionable because it is an assertion that an approximation has an *exact* value. Nonsense.

From a more pragmatic point of view, there is also the issue of 2 = 2.0, but things like

    (1 bitShift: 1000) - 1

cannot be equal to any floating point number supported by common hardware. Thus, exploiting anInteger = aFloat is intention obscuring by definition since it may or may not work depending on the integer. Again, highly questionable.

In short: floating point numbers *may* be equal to integers, but the behavior of #= cannot be determined a priori. Since the behavior of #= does not imply a relationship of equivalence, the behavior of #hash is inconsequential.

Adding integers and floats to a set gets messed up. So we have two options... a) don't do it because it is intention obscuring... or b) make integers never be equal to floating point numbers.

Same deal with fractions and the like.

--
Best regards,
 Andres                            mailto:[hidden email]
|
I wholeheartedly agree. We have used tolerance tests for floats since 1962 because equality tests are meaningless. = for floats should give an error.

Cheers
--Trygve

At 06:01 29.03.2006, Andres wrote:
>Hello ncellier,
>
>Tuesday, March 28, 2006, 7:25:46 AM, you wrote:
>
>nic> [number hashing is broken]
>nic> What is your opinion ?
>[...]

--
Trygve Reenskaug      mailto: [hidden email]
Morgedalsvn. 5A       http://heim.ifi.uio.no/~trygver
N-0378 Oslo           Tel: (+47) 22 49 57 27
Norway
|
In reply to this post by Nicolas Cellier-3
Thanks Trygve and Andres,
I also agree that comparing floats is bad practice. It is even worse outside Squeak, because of the extra bits in Intel registers, for example (the result will depend on whether a register is used or not, and hence on compiler optimization and neighbouring code).

The question of whether (2 = 2.0) and (2.0 = 2.0) should be true is interesting... In my opinion, there are cases where float comparison is useful: as a precondition to guard against an exception (zero divide or whatever).

The first solution I was proposing is to continue to answer true to that, since it can be valid as long as no inexact arithmetic took place (it depends on how we made the 2.0, but the conversion from a string is OK). I even happened to use float arithmetic once just to have cheap integer arithmetic on 53 bits (OK, shame on me, but it works). Hence we can modify = so as to answer true only in the exact arithmetic sense ((1/2) = 0.5 but (1/10) ~= 0.1). Concerning Set and hash codes, we can make them work accordingly, using Float asTrueFraction. But there is a performance penalty (a factor of 100 on hash before optimization).

I am not totally opposed to the more radical solution you seem to be suggesting: (2 = 2.0) never answers true. In this case, we can still have an efficient hash for comparing (2.0 = 2.0), and do not have to bother with Integer and Fraction. There are some cases where I would like such behavior. For example, I want to use the compiler to repeatedly evaluate an automatically generated mathematical formula, but I hit the byte code limit on the number of literals. I can reduce the number of literals by eliminating duplicate entries, and use a Set for that. Here, I do not want the literal #(2.0) to be replaced with #(2), and in this case I would prefer 2 ~= 2.0... (Yes, I can have array literals in my formula because of matrices.)

The only penalty of the second solution is that it breaks all bad-practice code... It is always possible to force a conversion toward exact or inexact arithmetic. But people may have programmed with automatic conversion in mind... for example, protecting against zero divide with a ~= 0... (yes, a isZero would be better in this case). Is there a lot of such code in Squeak? Hard to say... We must investigate that and ask other Squeakers for their opinion. It is not a small change...

Nicolas

On Wednesday 29 March 2006 08:13, Trygve Reenskaug wrote:
> I wholeheartedly agree. We have used tolerance tests for floats since 1962
> because equality tests are meaningless. = for floats should give error.
> [...]
|
In reply to this post by Trygve
same here
I always use

    aFloat closeTo: bFloat

Stef

Trygve Reenskaug wrote:
> I wholeheartedly agree. We have used tolerance tests for floats since
> 1962 because equality tests are meaningless. = for floats should give
> error.
|
> same here
>
> I always use
> aFloat closeTo: bFloat
>
> Stef

closeTo: is not bad, but the 0.0001 accuracy is really something arbitrary. An optional accuracy argument with a default value of 0.0001 would be better, like Andres suggested:

> laysWithin: epsilon from: 2

Back to Andres' very clever mail:

> In other words, comparing constants makes it look much simpler than
> what it really is. Reading something like anInteger = 2.0 would be,
> from my point of view, highly questionable because it is an assertion
> that an approximation has an *exact* value. Nonsense.

That is the very right answer to something I said but did not find satisfying, if not stupid: I said Fractions are more general than Floats, so Floats could eventually convert to Fractions for doing arithmetic... ksss

The other point of view is to say that Floats are inexact arithmetic, while Fractions and Integers are exact. In this case, adding something exact to something inexact gives me something inexact. So this is a good reason for converting to Float.

The other points were already discussed; we now have all the elements in mind and must decide between the three possibilities:

a) leave things as they are, with broken Set and hash (meaning: go to hell with your mixed Set)
b) change int=float to never be true (very easy to implement, possibly nasty side effects in existing code)
c) change int=float to be true only if the exact representations are equal (implementation already on Mantis, but a great slowdown of the hash code)

What is the right place for such a vote? Squeak chat? squeak-dev? Mantis?

Nicolas
|
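The "optional accuracy argument with a default value of 0.0001" mentioned above could look like the sketch below. It is written in Python for illustration, `close_to` is a hypothetical name, and it assumes a plain absolute tolerance for simplicity; the rule actually used by Squeak's closeTo: may differ.

```python
def close_to(a, b, epsilon=0.0001):
    """Answer True when a and b differ by at most epsilon (absolute tolerance)."""
    return abs(a - b) <= epsilon
```

A caller who is happy with the default writes `close_to(x, 2)`; a caller who knows the scale of the problem states the tolerance explicitly, for example `close_to(x, 2, epsilon=0.01)`.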
In reply to this post by Andres Valloud
Andres Valloud wrote:
> IMHO... sometimes it may be convenient to have 2 = 2.0, but since they
> represent different intentions, I would not expect that to happen.

What about magnitude comparisons (<, >, <=, >=)? For example when using (perfectly well-defined) binary search algorithms on mixed number representations it may be more than merely convenient to have 2 = 2.0.

Cheers,
  - Andreas
|
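The binary search concern raised above can be made concrete with a small sketch. It is in Python for illustration only: `sorted_contains`, `usual_equal` and `strict_equal` are hypothetical names, and `strict_equal` models the "integers are never equal to floats" rule discussed in this thread, not any existing implementation.

```python
import bisect
from fractions import Fraction


def sorted_contains(sorted_values, key, equal):
    """Binary-search a sorted list, then confirm the hit with the given equality."""
    i = bisect.bisect_left(sorted_values, key)
    return i < len(sorted_values) and equal(sorted_values[i], key)


def usual_equal(a, b):
    """Equality where 2 = 2.0 holds."""
    return a == b


def strict_equal(a, b):
    """Hypothetical rule: an exact number is never equal to a float."""
    return isinstance(a, float) == isinstance(b, float) and a == b


values = [1, 1.5, 2.0, Fraction(5, 2), 3]  # mixed representations, sorted by magnitude
```

The ordering step works the same under both rules, since `<` is unaffected. With `usual_equal` the search for the integer 2 finds the stored 2.0; with `strict_equal` the search lands on the same position but reports that the value is absent.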
Hello Andreas,
Wednesday, March 29, 2006, 6:36:47 PM, you wrote:

AR> What about magnitude comparisons (<, >, <=, >=)? For example when
AR> using (perfectly well-defined) binary search algorithms on mixed
AR> number representations it may be more than merely convenient to
AR> have 2 = 2.0.

Well... you can use aFloat > 2.0 as well, and in this case expressing intentions clearly would take just two extra characters. I find it hard to justify lack of proper expression because of unwillingness to type ".0". And besides it will be faster! :)

--
Best regards,
 Andres                            mailto:[hidden email]
|
In reply to this post by Andreas.Raab
The Scheme community has a process called SRFIs, modelled on IETF RFCs, to explore models and extensions to the standard language (I was one of the founding editors of SRFIs). Although Scheme is not (usually) OO, it is dynamically typed and has many of the same issues as Smalltalk.

In particular, they have recently been discussing many of these issues as part of SRFI-77, at:
    http://srfi.schemers.org/srfi-77/

In the document, look at the Issues under the Design Rationale section.

You may also find much of the archived discussion relevant.

../Dave
|
Thanks for this interesting link. I see we do share the same problems, and maybe your thinking is more advanced on the subject.

I do not think Smalltalkers are ready for duplicated arithmetic operators (float vs fix, exact vs inexact); they would rather prefer an explicit conversion. So maybe we won't share all our respective solutions, but we may at least share the generic arithmetic part.

As an interesting example, I never answered the question for complex numbers with a null imaginary part: should I keep the result complex or convert it to real? Squeak keeps it complex, VisualWorks converts it. The exact/inexact distinction you make sounds like a good proposition to me. But things are not so obvious... For example, (* (1.0+0i) (1.0+0i)) might well answer false to real?, unless (* 1.0 0) results in an exact zero (which is what it should do, if you think about it).

Nicolas

On Thursday 30 March 2006 18:52, Dave Mason wrote:
> The Scheme community has a process called SRFIs which models on IETF
> RFCs to explore models and extensions to the standard language (I was
> one of the founding editors of SRFIs).
> [...]
|
In reply to this post by Nicolas Cellier-3
On Wednesday 29 March 2006 10:22, nicolas cellier wrote:
> I am not totally opposed to more radical solution you seem suggesting: (2 =
> 2.0) never answer true.
> In this case, we can still have efficient hash for comparing (2.0=2.0), and
> do not have to bother with Integer and Fraction.

I must answer no to myself, because nobody did: since we will have neither (2 > 2.0) nor (2 < 2.0), if we do not have (2 = 2.0) either, then Numbers are unordered... Something I dislike and would not vote for.

True, 2.0 might be inexact, maybe it is 1.9999, maybe 2.0001, but if I want such behaviour, I would rather use interval arithmetic: 2.0 +/- 0.0001. Such intervals cannot be fully ordered.

Nicolas
|