Hi all,
I accidentally stumbled across some problems in the Magnitude area and wonder why they are there. These doits all break in one way or another, which I would not expect:

- 1e39 inspect
  No calculation actually happens; the value is just not representable as a Float, while it is easily representable as a LargeInteger or a Double. Since there seems to be no problem jumping from SmallInteger to LargeInteger, I do not understand why a Double would not be a valid return value.

- 1d309 inspect
  Fails similarly, although the value is still representable as a LargeInteger.

- (10 raisedTo: 40) < 1e6
  The LargeInteger is about to be transformed into a Float in Float>>lessFromInteger:, but no Float could ever be bigger than it. My guess is that more double dispatching to Integer and its subclasses would help a lot.

- (10 raisedTo: 40) log
  Fails because it takes a detour through Float. There are a lot more functions like that, typically involving sends of #asLimitedPrecisionReal, which in turn only covers Float and not Double. For my taste, either a Double coercion might help (pushing the limits further) or subclass implementations for LargeInteger are missing here.

Regards
Jan
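A hedged sketch of the kind of double dispatch guard suggested here — the selector Float>>lessFromInteger: comes from the message above; everything else is hypothetical and not the actual VisualWorks implementation:

    Float>>lessFromInteger: anInteger
        "Answer whether anInteger < self, as double dispatched from Integer>><.
        Hypothetical guard: an integer beyond Float's finite range decides the
        comparison without an overflowing coercion. Assumes self is finite;
        NaN and the infinities would need extra cases."
        anInteger abs > self class fmax truncated
            ifTrue: [^anInteger negative].
        ^anInteger asFloat < self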
Let's see...
1e39 inspect. As per the syntax, 1e39 is supposed to be a float. However, the value specified in float syntax is beyond the reach of floats. Something similar happens with 1d309. Check out Float class>>fmax and Double class>>fmax for the largest possible representable values.

(10 raisedTo: 40) < 1e6. This works in 7.7.1. Other comparisons were broken similarly, but that is no longer the case :).

(10 raisedTo: 40) log. This one still breaks because the default precision for floating point is single precision. Changing asLimitedPrecisionReal is not that simple, though. If you want to use doubles, you might want to try something like (10.0d raisedTo: 40) log.

Andres.
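A few workspace expressions make the limits Andres refers to concrete. The printed values below are approximate and assume IEEE 754 single and double precision:

    Float fmax.                  "about 3.4e38 - which is why 1e39 cannot be a Float"
    Double fmax.                 "about 1.8e308 - which is why 1d309 cannot be a Double"
    (10.0d raisedTo: 40) log.    "40.0 - works because the receiver stays a Double"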
Hi Andres!
I somehow expected that you would respond to this :)

Valloud, Andres wrote:
> 1e39 inspect. As per the syntax, 1e39 is supposed to be a float.
> However, the value specified in float syntax is beyond the reach of
> floats. Something similar happens with 1d309. Check out Float
> class>>fmax and Double class>>fmax for the largest possible
> representable values.

I am quite aware of the syntax and its implications. My point is that it does not have to fail, because the exact same number is still representable as an object of some other class. And since this is not Java, and we already juggle Integers and LargeIntegers in one go, why is it different for limited precision reals? Well, I suppose somewhere in an X-colored book there is the root of this *insert appropriate term here*.

> (10 raisedTo: 40) < 1e6. This works in 7.7.1. Other comparisons were
> broken similarly, but that is no longer the case :).

Nice, gimme.

> (10 raisedTo: 40) log. This one still breaks because the default
> precision for floating point is single precision. Changing
> asLimitedPrecisionReal is not that simple, though. If you want to use
> doubles, you might want to try something like (10.0d raisedTo: 40) log.

... and fail again at a higher #raisedTo: argument? My point here was that it would be quite easy to implement a lot of those functions for LargeInteger; #asLimitedPrecisionReal feels like a crutch to avoid implementing them for these kinds of objects.

I forgot to mention why I posted this: we tested some input validation code (and users can be quite inventive) and found out about the XeY problem of the number parser. That's where it took off to some other tests in that direction.

Regards
Jan
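Jan's analogy is easy to see in a workspace: integer arithmetic promotes transparently across classes, while the limited precision reals do not. What exactly the second expression does (an infinity or an exception) depends on the platform's floating point settings, so this is only a hedged illustration:

    SmallInteger maxVal + 1.    "a LargePositiveInteger: promotion is transparent"
    Float fmax * 2.0.           "no promotion to Double: the result overflows instead"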
In general terms, I am suspicious that the root of the insertable term is the desire not to guess what the developer meant, because invariably some guess will be wrong. And then, well... beware of the developers' wrath! Thus, let's not guess. It may seem limited. But, also, it's consistent and predictable.

Regarding implementing log and other functions for large integers... I think part of the problem is that implementing fully precise algorithms that run very quickly is not that easy for arbitrary precision arguments. For example, how much would you be willing to wait for this?

    (10 raisedTo: 400) log: 7

Should the answer be a double or a fixed point? How many decimals should you get? What happens if you want this instead?

    (10 raisedTo: 400) log: Double pi

To what precision should the calculations be carried out? Who sets the default? I think sqrtFloor and sqrtRounded are generally useful for large integers. Fortunately, in the case of logs, usually you can either approximate or use the number's own construction to figure out its logarithm in some base, e.g. log_7 (10^400) = 400 * log_7 10, or:

    LargePositiveInteger>>log
        self <= Double fmax truncated ifTrue: [^self asDouble log].
        ^Double fmax log + (self // Double fmax truncated) log

But is the potential for accumulated error acceptable in the above method? It is not possible to guess how arbitrary applications will tolerate errors in the above method. Similarly, it is not possible to guess how arbitrary applications will tolerate performance penalties due to the desire to get even the very last bit right (never mind what happens when the result is irrational and thus requires infinite decimals to express correctly). What should be done? Guess, and thus run the risk of guessing wrong? Maybe we just need a loadable package with some of these things implemented in some way so that others can extend / refine the functionality for their own purposes.

Andres.
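The range reduction Andres describes is cheap to check in a workspace; a hedged illustration of the identity, assuming Number>>log: answers a floating point approximation:

    400 * (10 log: 7).    "about 473.32, i.e. log_7 (10^400), with no
                           arbitrary precision arithmetic involved at all"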
Valloud, Andres <avalloud <at> cincom.com> writes:
> snip.. (gmane oblige)
>
> LargePositiveInteger>>log
>
>     self <= Double fmax truncated ifTrue: [^self asDouble log].
>     ^Double fmax log + (self // Double fmax truncated) log
>
> But is the potential for accumulated error acceptable in the above
> method? It is not possible to guess how arbitrary applications will
> tolerate errors in the above method. Similarly, it is not possible to
> guess how arbitrary applications will tolerate performance penalties due
> to the desire to get even the very last bit right (never mind what
> happens when the result is irrational and thus requires infinite
> decimals to express correctly). What should be done? Guess, and thus
> run the risk of guessing wrong? Maybe we just need a loadable package
> with some of these things implemented in some way so that others can
> extend / refine the functionality for their own purposes.
>
> Andres.

Hi again,

For fun, I just put a patch in the Squeak inbox similar to your proposition. But I don't know whether it is actually useful. For example, what would we expect here?

    (10 raisedTo: 40) ln exp

Concerning accuracy, isn't the Intel x86 log already up to 3 ulp off? So maybe LargeInteger>>log won't be much worse. Note that a naive algorithm can be very bad wrt accuracy for Fractions due to gradual underflow.

cheers
Nicolas
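There is no exact answer to expect from the round trip: 10^40 needs 133 significant bits, so the value is perturbed the moment it is squeezed through double precision, before ln and exp add any rounding of their own. A minimal workspace check (Integer>>asDouble is used the same way in Andres's method above):

    | n |
    n := 10 raisedTo: 40.
    n asDouble truncated = n.    "false: a double's mantissa keeps only 53 bits"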
Generally speaking, the problem with this type of method is that it is only suitable once you know the context of application. However, from the POV of a language vendor, we do not know the context of application. Thus, if we did something, we'd be forced to guess, and then we would eventually guess wrong. So, it is better not to guess.
Hi Andres!
Valloud, Andres wrote:
> In general terms, I am suspicious that the root of the insertable term
> is the desire not to guess what the developer meant, because invariably
> some guess will be wrong. And then, well... beware of the developers'
> wrath! Thus, let's not guess. It may seem limited. But, also, it's
> consistent and predictable.

Sorry about the previous verbal fit. But consistency is exactly what I argue for and do not see in the implementation. The objects are all Numbers, yet they behave differently depending on their class/value. #log is implemented at Number, so I would expect it to work for all Numbers. Predictability is only given if the "wrathful" developers know all the hinges of the implementation and catch all possible errors in places where a quick glance at the location of the implementation would not have predicted a possible failure path.

> Regarding implementing log and other functions for large integers... I
> think part of the problem is that implementing fully precise algorithms
> that run very quickly is not that easy for arbitrary precision
> arguments. For example, how much would you be willing to wait for this?
>
>     (10 raisedTo: 400) log: 7

Well, I do know that you know a lot more about mathematics than I do, but I thought a bit of bit-copying on the binary representation of a LargePositiveInteger, to convert it into a Double's mantissa, plus some basic log mathematics would have helped. I'll give it a try when I find the time. And possibly, when I want to handle large values, I might be inclined to wait a little longer than usual.

> Should the answer be a double or a fixed point?

Well, a FixedPoint right now returns a Float, so keep it that way until it overflows to Double :)

I read all of your questions and know that one really has to think about these problems if something is going to happen in that area. I am not in the position to find all valid use cases; I just commented on the problems we encountered while parsing user input. The application domain we typically handle does not care about huge precision, and calculation efficiency on reals is also not a big deal, so I expect e.g. Nicolas to be not very fond of the requirements I would specify. But stable applications are in our customers' interest. And in this area we (developers) found out (luckily while testing) that the given library throws errors in places we did not expect. Hence the mail.

> Maybe we just need a loadable package with some of these things
> implemented in some way so that others can extend / refine the
> functionality for their own purposes.

That might be an idea.

Thanks for your feedback
Jan
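Jan's bit-copy idea fits in a few lines; this is a hypothetical sketch, not tested library code (it leans on Integer>>highBit, #bitShift: and #asDouble, and on the Double literal syntax used elsewhere in this thread):

    LargePositiveInteger>>log
        "Hypothetical sketch: shift the receiver down to at most 53
        significant bits (a Double's mantissa), take the Double log,
        and add back the discarded power of two as shift * log10(2)."
        | shift reduced |
        shift := (self highBit - 53) max: 0.
        reduced := self bitShift: shift negated.
        ^reduced asDouble log + (shift * 2.0d log)

The bits dropped by the shift cost at most about one ulp of the 53-bit reduced value, so the result should be accurate to roughly double precision — which speaks to Jan's point that a reasonable default need not be expensive.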
I think the consistency we have right now is that everything except Doubles goes through Float, because that's what asLimitedPrecisionReal does. I understand it may not be the kind of consistency you want, though :). Unfortunately, changing it is not that easy because of the implications. We might have to wait until 8.0 to do that, because then we get a license to potentially break backwards compatibility. And perhaps it's not a good idea at all: imagine tests written like self assert: -1 sin = Float pi, or self assert: resultOfNumericalCalculations = someResultThatHappenedWithThePreviousCode, or you load old data from the database and find that you can't reproduce the old results, etc.

Roughly speaking, it's not clear what the behavior of veryLargePositiveInteger log should be. For example:

> Well, I do know that you know a lot more about mathematics than I do,
> but I thought a bit of bit-copying on the binary representation of a
> LargePositiveInteger, to convert it into a Double's mantissa, plus some
> basic log mathematics would have helped.

Well, but see, the problem is that to get an accurate result you may need to calculate log 2 to a lot of decimals, so that you can do things like

    veryLargePositiveInteger log
        = (2^k * smallerLargePositiveInteger) log
        = k * log 2 + smallerLargePositiveInteger log

with some degree of accuracy. What degree of accuracy are we talking about? What if a double does not have enough significant digits, so that when you do 10^(veryLargePositiveInteger log) you do not get veryLargePositiveInteger back? Since veryLargePositiveInteger does not fit in a double, should we then also implement integer (and fraction, fixed point, etc.) e^x? To what precision? With what kind of performance impact? How do we know what a reasonable answer is in the absence of actual use cases?

> And possibly, when I want to handle large values, I might be inclined
> to wait a little longer than usual.

The problem is that, generally speaking, it doesn't take a "little" longer. Rather, it takes MUCH longer. I'd expect that just emulating what a modern FPU is capable of in software, without using floating point operations, would execute at least a good 3-5x slower, if not much worse. I'd be happy predicting that doing the same with arbitrary precision will become slower in proportion to the input sizes (if not worse, like in proportion to the square of the input sizes). So an operation on 10^400 should take at least ~100 times longer than the same operation on a double: 10^400 has roughly 1330 bits, which is more than 20 times larger than a double's 52/53 bit mantissa; then we have to apply another 3-5x performance penalty factor on top of that; and we're still assuming linear performance deterioration, which I suspect is very optimistic, as well as assuming that the implementation of arbitrary precision arithmetic in Smalltalk will be as fast as in C, etc. So we should compromise somewhat. But how much can we compromise when we do not know what customers need? Or, perhaps more precisely, how much should we compromise when different customers will have completely different needs?

The problem here is that no matter what we do, I don't think we can really make all customers happy by changing the default implementation for everybody this time. Hence the loadable packages approach. Now, if there are things that make the loadable packages approach unnecessarily difficult, I'd even risk it and say the obstacles should be made easier to deal with.

Andres.
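The round trip Andres asks about necessarily fails once the integer exceeds 53 bits. A hypothetical workspace check, spelling out the Double detour explicitly (since a plain n log is exactly what is not implemented):

    | n |
    n := (10 raisedTo: 40) + 1.
    (10.0d raisedTo: n asDouble log) rounded = n.
        "false: n has 133 significant bits, the detour keeps only 53"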