Nicolas Cellier uploaded a new version of Kernel to project The Trunk:
http://source.squeak.org/trunk/Kernel-nice.1402.mcz ==================== Summary ==================== Name: Kernel-nice.1402 Author: nice Time: 7 May 2021, 9:39:45.713143 pm UUID: 21ee698b-9f75-40e3-9bd3-944e73cb9e20 Ancestors: Kernel-nice.1401 Fix sqrt bugs exposed by #testSqrtNearExactTie. If we pretend to round to nearest, we must do it right, not just pretend... How was the problem found? Accidentally... I re-examined our implementation after musing in SO: https://stackoverflow.com/questions/67361541/correctly-rounded-computation-of-sqrt-of-sum-of-two-floats-handling-overflow/67426790#67426790 Musing is more powerful than dumb static and coverage tests, I wish I got more time for musing :) We deadly need evolutive testing (neural based). =============== Diff against Kernel-nice.1401 =============== Item was changed: ----- Method: LargePositiveInteger>>sqrt (in category 'mathematical functions') ----- sqrt "Answer the square root of the receiver. If the square root is exact, answer an Integer, else answer a Float approximation. Make sure the result is correctly rounded (i.e. the nearest Float to the exact square root)" + | floatResult integerResult guardBit highBit sr maybeExact | + maybeExact := self mightBeASquare. + self isAnExactFloat - | floatResult integerResult guardBit highBit sr | - (highBit := self highBit) < (Float precision * 2) ifTrue: ["the sqrt of self asFloat is correctly rounded, so use it" floatResult := self asFloat sqrt. + maybeExact ifFalse: [^floatResult]. - self mightBeASquare ifFalse: [^floatResult]. "Answer integerResult in case of perfect square" integerResult := floatResult truncated. integerResult squared = self ifTrue: [^integerResult]. ^floatResult]. + "Eventually use guard bits for handling correct rounding direction" + highBit := self highBit. - "Eventually use a guard bit for handling correct rounding direction" guardBit := highBit <= (Float precision + 1 * 2) ifTrue: + ["Add guard bits for rounding correctly" + Float precision + 1 * 2 + 1 - highBit] - ["Add one guard bit for rounding correctly" - 1] ifFalse: + [maybeExact - [self mightBeASquare ifTrue: + ["Keep all the bits in case we were a perfect square" - ["Keep all the bits in case we are a perfect square" 0] ifFalse: + ["Remove superfluous bits that won't change the Float approximation" - ["Remove superfluous bit that won't change the Float approximation" Float precision + 1 - (highBit // 2)]]. "Get truncated sqrt and remainder for the same price" sr := (self bitShift: guardBit * 2) sqrtRem. "Handle case of perfect square" integerResult := sr first. + (maybeExact and: [sr last isZero]) ifTrue: [^integerResult bitShift: guardBit negated]. - sr last isZero ifTrue: [^integerResult bitShift: guardBit negated]. "Answer the best we have which is the sqrt correctly rounded at Float precision." ^((integerResult bitShift: Float precision - integerResult highBit) + (integerResult bitAt: integerResult highBit - Float precision)) asFloat timesTwoPower: integerResult highBit - Float precision - guardBit! |
On Fri, May 07, 2021 at 07:39:50PM +0000, [hidden email] wrote:
> Nicolas Cellier uploaded a new version of Kernel to project The Trunk: > http://source.squeak.org/trunk/Kernel-nice.1402.mcz > > > Musing is more powerful than dumb static and coverage tests, I wish I got more time for musing :) > We deadly need evolutive testing (neural based). > Interesting commit comment. How might this work? Dave |
Le dim. 16 mai 2021 à 17:10, David T. Lewis <[hidden email]> a écrit :
> > On Fri, May 07, 2021 at 07:39:50PM +0000, [hidden email] wrote: > > Nicolas Cellier uploaded a new version of Kernel to project The Trunk: > > http://source.squeak.org/trunk/Kernel-nice.1402.mcz > > > > > > Musing is more powerful than dumb static and coverage tests, I wish I got more time for musing :) > > We deadly need evolutive testing (neural based). > > > > Interesting commit comment. How might this work? > > Dave > > How? This way: put enough buzzwords in commit comments to bring some academics on the subject ;) |
More seriously, there is not a single kind of test.
One category is kind of specification illustrating the expectations, and demonstrating how to use some message/class. Most of the time, our tests as specification lack the quantifiers (like the universal quantifier), that's why I name them illustrating. Ideally, we would like to have some form of formal proof, but there rarely is one accessible, unless we drastically restrict the capabilities (like recursivity and all forms of reflexivity) At least, that's my understanding of https://en.wikipedia.org/wiki/Formal_methods In some rare cases, we now have enough computing power to test an implementation exhaustively (like a function of a single float32 argument). Alternatively, we can try and test with randomly generated inputs, but that's a bit like shooting in the dark. In order to be more eager, we sometimes write tests against a specific implementation with specially crafted examples for non regression or main gotchas of the specific algorithm. I guess my efforts fall in such a category: it's kind of adversarial strategy; somehow like a game of finding the shortcomings. If we have watts to burn, I think that it would be interesting to use machine power to find and construct those adversarial examples, not based on sole randomness, but some form of analysis of algorithms and probably some set of heuristics. How could we build such machinery, I don't know, for now it's still buzzwords. Le lun. 17 mai 2021 à 12:00, Nicolas Cellier <[hidden email]> a écrit : > > Le dim. 16 mai 2021 à 17:10, David T. Lewis <[hidden email]> a écrit : > > > > On Fri, May 07, 2021 at 07:39:50PM +0000, [hidden email] wrote: > > > Nicolas Cellier uploaded a new version of Kernel to project The Trunk: > > > http://source.squeak.org/trunk/Kernel-nice.1402.mcz > > > > > > > > > Musing is more powerful than dumb static and coverage tests, I wish I got more time for musing :) > > > We deadly need evolutive testing (neural based). > > > > > > > Interesting commit comment. How might this work? > > > > Dave > > > > > Hi Dave, > How? This way: put enough buzzwords in commit comments to bring some > academics on the subject ;) |
Nicolas, are you aware of SAT-SMT solvers? (SAT/SMT by Example (sat-smt.codes)) Microsoft used Z3 to great effect in flushing out bugs in Vista. SAT-SMT's used in a thing called Concolic testing - Wikipedia, frank On Mon, 17 May 2021 at 05:14, Nicolas Cellier <[hidden email]> wrote: More seriously, there is not a single kind of test. |
Hi Frank,
Thanks for the links. Le lun. 17 mai 2021 à 20:05, Frank Shearar <[hidden email]> a écrit :
|
In reply to this post by Nicolas Cellier
On 17/05/2021 13:14, Nicolas Cellier wrote:
> Alternatively, we can try and test with randomly generated inputs, but > that's a bit like shooting in the dark. Isn't that the whole point of fuzz testing which seems quite effective in other languages? |
Hi Douglas,
If the number of failing cases is one out of 1,000,000 or more, the probability to find one is infinitesimal. The percentage of failure in the sqrt case was even smaller (it requires crafting the trailing bits of integer whose highBit > 100)... I don't believe that fuzz testing is really the effective way in this context. Only the setup is effective (very simple and cheap - though we have to bound our LargePositiveInteger...). Shooting in the dark is effective only in crowded places. White box testing is certainly much more effective in such cases. Le mar. 18 mai 2021 à 20:15, Douglas Brebner <[hidden email]> a écrit : > > On 17/05/2021 13:14, Nicolas Cellier wrote: > > Alternatively, we can try and test with randomly generated inputs, but > > that's a bit like shooting in the dark. > Isn't that the whole point of fuzz testing which seems quite effective > in other languages? > |
On 18/05/2021 20:44, Nicolas Cellier wrote:
> Hi Douglas, > If the number of failing cases is one out of 1,000,000 or more, the > probability to find one is infinitesimal. > The percentage of failure in the sqrt case was even smaller (it > requires crafting the trailing bits of integer whose highBit > 100)... > I don't believe that fuzz testing is really the effective way in this context. > Only the setup is effective (very simple and cheap - though we have to > bound our LargePositiveInteger...). > Shooting in the dark is effective only in crowded places. > White box testing is certainly much more effective in such cases. Oh yeah, in this context it may not work well. I was just pointing out that it can be useful. Not to mention that even one in a million chances are likely to be smoked out with hundreds of millions (or more) fuzz tests. |
Free forum by Nabble | Edit this page |