Hi!
In playing around with SmallInteger>>printString (see bugs.squeak.org/view.php?id=6512) I tested this code for calculating the length of the String: self log truncated + 1 ...but that doesn't work. The current code in my ChangeSet uses an unrolled binary search (nested ifs) which is 7x faster (#decimalDigitLength) so the above code is not something we would like to use, but I still wonder why it fails and what I would have needed to do about it to get it to work: 999 log truncated ===> 2 As expected. 1000 log ===> 3.0 As expected. 1000 log truncated ===> 2 Eh, what? Not expected! ...ok, yes, I know - Floats are the "tar pits of hell" when it comes to expected results - but... can someone answer if this is a bug or if I simply can't expect this code to work? regards, Göran PS. Strongtalk does in fact do SmallInteger>>printString almost exactly like my enhancement does (didn't peek, promise!) but does it around 5x faster than Squeak. :) I fired it up to look at how it does #log but log wasn't implemented. On the other hand Bryce tested Exupery on my code and it made a real difference - at least 2x better (not sure if he only tested printString). And then he mentioned an obvious improvement in Exupery that would make it even faster. Interesting method for comparing! |
On May 30, 2007, at 13:01 , Göran Krampe wrote: > Hi! > > In playing around with SmallInteger>>printString (see > bugs.squeak.org/view.php?id=6512) I tested this code for > calculating the > length of the String: > > self log truncated + 1 > > ...but that doesn't work. The current code in my ChangeSet uses an > unrolled binary search (nested ifs) which is 7x faster > (#decimalDigitLength) so the above code is not something we would > like to > use, but I still wonder why it fails and what I would have needed > to do > about it to get it to work: > > 999 log truncated ===> 2 As expected. > > 1000 log ===> 3.0 As expected. > > 1000 log truncated ===> 2 Eh, what? Not expected! > > ...ok, yes, I know - Floats are the "tar pits of hell" when it > comes to > expected results - but... can someone answer if this is a bug or if I > simply can't expect this code to work? 1000 log - 3.0 "-4.44089209850063e-16" - Bert - |
In reply to this post by Göran Krampe
Hi Göran,
when I wrote printStringBase: I remember I tried to use x ln / base ln to get the string length with the same results as you are. Depending on floats it is just not robust. I went with Stream to keep it simple. It sure can help to have an optimized version for the major base rather then using printStringBase: 10. Alex Göran Krampe schrieb: > Hi! > > In playing around with SmallInteger>>printString (see > bugs.squeak.org/view.php?id=6512) I tested this code for calculating the > length of the String: > > self log truncated + 1 > > ...but that doesn't work. The current code in my ChangeSet uses an > unrolled binary search (nested ifs) which is 7x faster > (#decimalDigitLength) so the above code is not something we would like to > use, but I still wonder why it fails and what I would have needed to do > about it to get it to work: > > 999 log truncated ===> 2 As expected. > > 1000 log ===> 3.0 As expected. > > 1000 log truncated ===> 2 Eh, what? Not expected! > > ...ok, yes, I know - Floats are the "tar pits of hell" when it comes to > expected results - but... can someone answer if this is a bug or if I > simply can't expect this code to work? > > regards, Göran > > PS. Strongtalk does in fact do SmallInteger>>printString almost exactly > like my enhancement does (didn't peek, promise!) but does it around 5x > faster than Squeak. :) I fired it up to look at how it does #log but log > wasn't implemented. On the other hand Bryce tested Exupery on my code and > it made a real difference - at least 2x better (not sure if he only tested > printString). And then he mentioned an obvious improvement in Exupery that > would make it even faster. Interesting method for comparing! > > |
Well,
(self asFloat + 0.9) log truncated + 1 would work reliably for SmallIntegers. The nested conditionals might still win ... in particular if it was not binary search but based on frequency of numbers. - Bert - On May 30, 2007, at 15:02 , Alexander Lazarevic' wrote: > Hi Göran, > > when I wrote printStringBase: I remember I tried to use x ln / base ln > to get the string length with the same results as you are. > Depending on > floats it is just not robust. I went with Stream to keep it simple. It > sure can help to have an optimized version for the major base rather > then using printStringBase: 10. > > Alex > > Göran Krampe schrieb: >> Hi! >> >> In playing around with SmallInteger>>printString (see >> bugs.squeak.org/view.php?id=6512) I tested this code for >> calculating the >> length of the String: >> >> self log truncated + 1 >> >> ...but that doesn't work. The current code in my ChangeSet uses an >> unrolled binary search (nested ifs) which is 7x faster >> (#decimalDigitLength) so the above code is not something we would >> like to >> use, but I still wonder why it fails and what I would have needed >> to do >> about it to get it to work: >> >> 999 log truncated ===> 2 As expected. >> >> 1000 log ===> 3.0 As expected. >> >> 1000 log truncated ===> 2 Eh, what? Not expected! >> >> ...ok, yes, I know - Floats are the "tar pits of hell" when it >> comes to >> expected results - but... can someone answer if this is a bug or if I >> simply can't expect this code to work? >> >> regards, Göran >> >> PS. Strongtalk does in fact do SmallInteger>>printString almost >> exactly >> like my enhancement does (didn't peek, promise!) but does it >> around 5x >> faster than Squeak. :) I fired it up to look at how it does #log >> but log >> wasn't implemented. On the other hand Bryce tested Exupery on my >> code and >> it made a real difference - at least 2x better (not sure if he >> only tested >> printString). And then he mentioned an obvious improvement in >> Exupery that >> would make it even faster. Interesting method for comparing! >> >> > |
I used this in VW for any arbitrary large integer ,
Works even if integer cannot convert to Float because > Float maxVal. (available at public store SYSEXT-floorLog) Integer>>floorLog10 "Initial guess is based on digitLength and following approximation : (2 raisedTo: 10) > (10 raisedTo: 3)" | i j step guess n | (self negative or: [self isZero]) ifTrue: [^self error: 'Can''t take log of number <= 0']. guess := (self digitLength - 1) * 8 * 3 // 10. i := self // (10 raisedTo: guess). step := 2. [step > 0] whileTrue: [n := 10 raisedTo: step. [j := i // n. j > 0] whileTrue: [guess := guess + step. i := j]. step := step // 2]. ^guess Of course, for SmallInteger, Bert's solution might be faster. As for Float, i guess a native log10 FFI call should behave better than self ln / 10 ln, since the latter implies 3 rounding operations (and an inexact operation). I did not check the former however. Nicolas Bert Freudenberg a écrit : > Well, > > (self asFloat + 0.9) log truncated + 1 > > would work reliably for SmallIntegers. > > The nested conditionals might still win ... in particular if it was not > binary search but based on frequency of numbers. > > - Bert - > > On May 30, 2007, at 15:02 , Alexander Lazarevic' wrote: > >> Hi Göran, >> >> when I wrote printStringBase: I remember I tried to use x ln / base ln >> to get the string length with the same results as you are. Depending on >> floats it is just not robust. I went with Stream to keep it simple. It >> sure can help to have an optimized version for the major base rather >> then using printStringBase: 10. >> >> Alex >> >> Göran Krampe schrieb: >>> Hi! >>> >>> In playing around with SmallInteger>>printString (see >>> bugs.squeak.org/view.php?id=6512) I tested this code for calculating the >>> length of the String: >>> >>> self log truncated + 1 >>> >>> ...but that doesn't work. The current code in my ChangeSet uses an >>> unrolled binary search (nested ifs) which is 7x faster >>> (#decimalDigitLength) so the above code is not something we would >>> like to >>> use, but I still wonder why it fails and what I would have needed to do >>> about it to get it to work: >>> >>> 999 log truncated ===> 2 As expected. >>> >>> 1000 log ===> 3.0 As expected. >>> >>> 1000 log truncated ===> 2 Eh, what? Not expected! >>> >>> ...ok, yes, I know - Floats are the "tar pits of hell" when it comes to >>> expected results - but... can someone answer if this is a bug or if I >>> simply can't expect this code to work? >>> >>> regards, Göran >>> >>> PS. Strongtalk does in fact do SmallInteger>>printString almost exactly >>> like my enhancement does (didn't peek, promise!) but does it around 5x >>> faster than Squeak. :) I fired it up to look at how it does #log but log >>> wasn't implemented. On the other hand Bryce tested Exupery on my code >>> and >>> it made a real difference - at least 2x better (not sure if he only >>> tested >>> printString). And then he mentioned an obvious improvement in Exupery >>> that >>> would make it even faster. Interesting method for comparing! >>> >>> >> > > > > > > |
nicolas cellier wrote:
> As for Float, i guess a native log10 FFI call should behave better than > self ln / 10 ln, since the latter implies 3 rounding operations (and an > inexact operation). Indeed. Croquet gets it right (it uses a primitive from fdlibm). Cheers, - Andreas |
In reply to this post by Bert Freudenberg
Just tried this very simple one:
(I guess the smaller the merrier) floorLog10bert "from Bert Freundenberg's idea" (self negative or: [self isZero]) ifTrue: [^self error: 'Can''t take log of number <= 0']. self < 10 ifTrue: [^0]. self < 100 ifTrue: [^1]. self < 1000 ifTrue: [^2]. self < 10000 ifTrue: [^3]. self < 100000 ifTrue: [^4]. self < 1000000 ifTrue: [^5]. self < 10000000 ifTrue: [^6]. self < 100000000 ifTrue: [^7]. self < 1000000000 ifTrue: [^8]. ^9 "Based on the fact that (SmallInteger maxVal log: 10) floor = 9" Comparing universal Integer version provided in previous mail and Bert's gives a factor 5 at least Time millisecondsToRun: [1000000 timesRepeat: [SmallInteger maxVal floorLog10 ]]. 19846 Time millisecondsToRun: [1000000 timesRepeat: [SmallInteger maxVal floorLog10bert ]]. 3790 3777 Ain't that much to gain when reducing number of tests anyway: Time millisecondsToRun: [1000000 timesRepeat: [1 floorLog10bert ]]. 2563 Seems fast enough not to bother. Nicolas Bert Freudenberg a écrit : > > The nested conditionals might still win ... in particular if it was not > binary search but based on frequency of numbers. > > - Bert - > |
Hi!
nicolas cellier <[hidden email]> wrote: > Just tried this very simple one: > (I guess the smaller the merrier) > > floorLog10bert > "from Bert Freundenberg's idea" > > (self negative or: [self isZero]) > ifTrue: > [^self error: 'Can''t take log of number <= 0']. > self < 10 ifTrue: [^0]. > self < 100 ifTrue: [^1]. > self < 1000 ifTrue: [^2]. > self < 10000 ifTrue: [^3]. > self < 100000 ifTrue: [^4]. > self < 1000000 ifTrue: [^5]. > self < 10000000 ifTrue: [^6]. > self < 100000000 ifTrue: [^7]. > self < 1000000000 ifTrue: [^8]. > ^9 "Based on the fact that (SmallInteger maxVal log: 10) floor = 9" > > > Comparing universal Integer version provided in previous mail and Bert's > gives a factor 5 at least I also noted this and I also used the above approach - though with a binary search. My code is in the mantis issue I posted in the beginning. :) Anyway, since smaller numbers are likely to be much more common I would guess the ordering above is actually "smarter". And yes, it was fun to see that Strongtalk does the above and its SmallInteger>>printString is very similar to the one I ended up writing in the mantis issue. regards, Göran |
On May 30, 2007, at 23:36 , [hidden email] wrote:
> Hi! > > nicolas cellier <[hidden email]> wrote: >> Just tried this very simple one: >> (I guess the smaller the merrier) >> >> floorLog10bert >> "from Bert Freundenberg's idea" (My name occurrencesOf: $n) = 1 ;-) >> (self negative or: [self isZero]) >> ifTrue: >> [^self error: 'Can''t take log of number <= 0']. I'd use "self < 1" for the first test to avoid unnecessary message sends (and it's prettier, too). >> self < 10 ifTrue: [^0]. >> self < 100 ifTrue: [^1]. >> self < 1000 ifTrue: [^2]. >> self < 10000 ifTrue: [^3]. >> self < 100000 ifTrue: [^4]. >> self < 1000000 ifTrue: [^5]. >> self < 10000000 ifTrue: [^6]. >> self < 100000000 ifTrue: [^7]. >> self < 1000000000 ifTrue: [^8]. >> ^9 "Based on the fact that (SmallInteger maxVal log: 10) floor = 9" >> >> >> Comparing universal Integer version provided in previous mail and >> Bert's >> gives a factor 5 at least > I also noted this and I also used the above approach - though with a > binary search. My code is in the mantis issue I posted in the > beginning. > :) > > Anyway, since smaller numbers are likely to be much more common I > would > guess the ordering above is actually "smarter". And yes, it was fun to > see that Strongtalk does the above and its > SmallInteger>>printString is > very similar to the one I ended up writing in the mantis issue. I still wonder how common #printString calls are in performance- sensitive code ... I usually use printOn: if performance matters, so maybe that should be optimized, too? - Bert - |
Bert Freudenberg a écrit :
> On May 30, 2007, at 23:36 , [hidden email] wrote: > >>> floorLog10bert >>> "from Bert Freundenberg's idea" > > (My name occurrencesOf: $n) = 1 ;-) > Apologies, take it as a friendly error. >>> (self negative or: [self isZero]) >>> ifTrue: >>> [^self error: 'Can''t take log of number <= 0']. > > I'd use "self < 1" for the first test to avoid unnecessary message sends > (and it's prettier, too). > For sure, (self negative or: [self isZero]) is a stupid copy paste. It was tailored for LargeInteger implementation for which both operations are trivial. But when i think of it, LargeInteger < SmallInteger should be trivial too. I wonder which dialect might engage a costly useless arithmetic operation when asked for it... Nicolas |
In reply to this post by Bert Freudenberg
Bert Freudenberg schrieb:
> Well, > > (self asFloat + 0.9) log truncated + 1 > > would work reliably for SmallIntegers. Ah. Well yes :) Alex |
In reply to this post by Nicolas Cellier-3
nicolas cellier a écrit :
> For sure, (self negative or: [self isZero]) is a stupid copy paste. > It was tailored for LargeInteger implementation for which both > operations are trivial. > But when i think of it, LargeInteger < SmallInteger should be trivial too. > I wonder which dialect might engage a costly useless arithmetic > operation when asked for it... > > Nicolas > For my defense, just gave it a try: after defining LargePositiveInteger>>isZero ^false, i test: | val | val := (SmallInteger maxVal raisedTo: 4) + 1. {Time millisecondsToRun: [100000 timesRepeat: [val < 1]]. Time millisecondsToRun: [100000 timesRepeat: [val negative or: [val isZero]]]. Time millisecondsToRun: [100000 timesRepeat: [val strictlyPositive not]].} #(457 118 106) Hmm test < 1 is not the fastest. Since primitive 23 seems to fail, i remove LargePositiveInteger>>#< and i get #(399 121 113). Then not much to gain, #digitCompare: must be half the cost, one trivial negative and two #< tests on small integers... So i tried the double dispatching way: LargePositiveInteger>>#< aNumber ^aNumber isInteger ifTrue: [aNumber greaterThanLargePositiveInteger: self] ifFalse: [aNumber adaptToInteger: self andSend: #<] LargePositiveInteger>>#greaterThanLargePositiveInteger: aNumber ^(self digitCompare: aNumber) > 0 LargeNegativeInteger>>#greaterThanLargePositiveInteger: aNumber ^false SmallInteger>>#greaterThanLargePositiveInteger: aNumber ^false I now get #(217 111 108). Like for isZero and isNegative, I call two trivial methods but pay a penalty for intermediate #< method invocation. (would need Exupery inlining if able to handle such polymorphic case). I could also suppress the isInteger test and implement: Object>>#greaterThanLargePositiveInteger: aNumber ^self adaptToInteger: aNumber andSend: #< Well #(190 112 108) maybe does not deserve polluting Object protocol... If you don't care, or think it's too much for such a ridiculous gain, then you're probably right. Don't you optimize but in tight loops! And maybe arithmetics too... Anyway it's fun when it can be done in Smalltalk without any additionnal primitive or plugin... Nicolas |
In reply to this post by Bert Freudenberg
Hi folks!
Messing with small scale optimization can be great fun. Preamble: "Change Set: FloorLogSmallInteger Date: 31 May 2007 Author: Göran Krampe After messing with SmallInteger>>decimalDigitLength and noticing that '1000 log truncated' returns the wrong answer I ended up chasing down base-10 log algorithms. Found this nice source of tricks: http://www.hackersdelight.org/HDcode/ilog.c And based on code in there and the fact that nlz (number of leading zeros) can be computed using 'self asFloat exponent' I came up with this implementation of #floorLog for SmallIntegers. It still is not as fast as simpleminded comparisons as in decimalDigitLength though so it is not worth using. But it was fun to experiment." So the attachment is merely for amusement and/or info on the subject! :) :) regards, Göran FloorLogSmallInteger.1.cs.gz (898 bytes) Download Attachment |
Göran Krampe <goran <at> krampe.se> writes:
> > After messing with SmallInteger>>decimalDigitLength and noticing that > '1000 log truncated' returns the wrong answer I ended up chasing down > base-10 log algorithms. Found this nice source of tricks: > > http://www.hackersdelight.org/HDcode/ilog.c > > And based on code in there and the fact that nlz (number of leading zeros) > can be computed using 'self asFloat exponent' I came up with this > implementation of #floorLog for SmallIntegers. It still is not as fast as > simpleminded comparisons as in decimalDigitLength though so it is not > worth using. But it was fun to experiment." > Yes, this link is fun. Also don't forget Smalltalk has #highBit, a complement of nlz not involving Float. Nicolas |
Free forum by Nabble | Edit this page |