Aconcagua Hashing Bug?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

Aconcagua Hashing Bug?

Sean P. DeNigris
Administrator
(1/2) teaspoons hash ~= 0.5 teaspoons hash

b.t.w. Phexample caught this automatically because it checks hashes when comparing for equality... pretty cool
Cheers,
Sean
Reply | Threaded
Open this post in threaded view
|

Re: Aconcagua Hashing Bug?

Sven Van Caekenberghe-2
So ?

1/2 hash ~= 0.5 hash

I mean what would you expect, how should this have to work ?

Maybe the hashing specialists have something to say about this.

On 20 May 2014, at 01:38, Sean P. DeNigris <[hidden email]> wrote:

> (1/2) teaspoons hash ~= 0.5 teaspoons hash
>
> b.t.w. Phexample caught this automatically because it checks hashes when
> comparing for equality... pretty cool
>
>
>
> -----
> Cheers,
> Sean
> --
> View this message in context: http://forum.world.st/Aconcagua-Hashing-Bug-tp4759622.html
> Sent from the Pharo Smalltalk Developers mailing list archive at Nabble.com.
>


Reply | Threaded
Open this post in threaded view
|

Re: Aconcagua Hashing Bug?

philippeback
On Tue, May 20, 2014 at 7:28 AM, Sven Van Caekenberghe <[hidden email]> wrote:
So ?

1/2 hash ~= 0.5 hash

((1/2) hash = 0.5 hash) true
((1/2) hash ~= 0.5 hash) true

and

((1/3) hash =(1.0/3) hash) false

The implementation (in 3.0) has a special case for denominator is a power of 2:

Fraction>>hash
"Hash is reimplemented because = is implemented.
Care is taken that a Fraction equal to a Float also have an equal hash"

| tmp |
denominator isPowerOfTwo ifTrue: [
"If denominator is a power of two, I can be exactly equal to a Float"
tmp := self asFloat.
tmp isFinite ifTrue: [^tmp hash]].
"Else, I cannot be exactly equal to a Float, use own hash algorithm.
(Assume the fraction is already reduced)"
^numerator hash bitXor: denominator hash


Float>>hash
"Hash is reimplemented because = is implemented. Both words of the float are used; 8 bits are removed from each end to clear most of the exponent regardless of the byte ordering. (The bitAnd:'s ensure that the intermediate results do not become a large integer.) Slower than the original version in the ratios 12:5 to 2:1 depending on values. (DNS, 11 May, 1997)"

(self isFinite and: [self fractionPart = 0.0]) ifTrue: [^self truncated hash].
^ (((self basicAt: 1) bitAnd: 16r00FFFF00) +
  ((self basicAt: 2) bitAnd: 16r00FFFF00)) bitShift: -8


Mmmh, I am curious to know what the rationale for this is.

Getting into: 

Fraction>>isPowerOfTwo
|reduced|
reduced := self reduced.
^(reduced numerator = 1 and: [reduced denominator isPowerOfTwo])
or: [reduced denominator = 1 and: [reduced numerator isPowerOfTwo]]

calls in >>reduced, which in turn calls gcd: ...

That's quite something to be aware of.

Phil
 

I mean what would you expect, how should this have to work ?

Maybe the hashing specialists have something to say about this.

On 20 May 2014, at 01:38, Sean P. DeNigris <[hidden email]> wrote:

> (1/2) teaspoons hash ~= 0.5 teaspoons hash
>
> b.t.w. Phexample caught this automatically because it checks hashes when
> comparing for equality... pretty cool
>
>
>
> -----
> Cheers,
> Sean
> --
> View this message in context: http://forum.world.st/Aconcagua-Hashing-Bug-tp4759622.html
> Sent from the Pharo Smalltalk Developers mailing list archive at Nabble.com.
>



Reply | Threaded
Open this post in threaded view
|

Re: Aconcagua Hashing Bug?

Sven Van Caekenberghe-2

On 20 May 2014, at 08:17, [hidden email] wrote:

> On Tue, May 20, 2014 at 7:28 AM, Sven Van Caekenberghe <[hidden email]> wrote:
> So ?
>
> 1/2 hash ~= 0.5 hash
>
> ((1/2) hash = 0.5 hash) true

Yes, with the () they're equal...
Reply | Threaded
Open this post in threaded view
|

Re: Aconcagua Hashing Bug?

Guillermo Polito
I don't know about aconcagua, but as far as I know we have to keep the following invariant to have well behaving hashed collections:

 if two objects a and b are equals, they should have the same hash.


On Tue, May 20, 2014 at 8:34 AM, Sven Van Caekenberghe <[hidden email]> wrote:

On 20 May 2014, at 08:17, [hidden email] wrote:

> On Tue, May 20, 2014 at 7:28 AM, Sven Van Caekenberghe <[hidden email]> wrote:
> So ?
>
> 1/2 hash ~= 0.5 hash
>
> ((1/2) hash = 0.5 hash) true

Yes, with the () they're equal...

Reply | Threaded
Open this post in threaded view
|

Re: Aconcagua Hashing Bug?

Maximiliano Taborda
In few words, one measure is equal to another when their internal collaborators are equals between each other. And the hash of two measures are equals when the hash of their collaborators are equals too.

Then, some examples

"#="
(1/2) = 0.5 "this evaluate to true"
(1/2) ~= 0.5 "this evaluate to false"

"#hash"
(1/2) hash = 0.5 hash "this evaluate to true"
(1/2) hash ~= 0.5 hash "this evaluate to false"

Seeing the same with aconcagua

| teaspoons aHalfTeaspoon anotherHalfTeaspoon |
teaspoons := BaseUnit named: 'teaspoons'.
aHalfTeaspoon := Measure amount: 1/2 unit: teaspoons.
anotherHalfTeaspoon := Measure amount: 0.5 unit: teaspoons.

"#="
aHalfTeaspoon = anotherHalfTeaspoon "this evaluate to true"
aHalfTeaspoon ~= anotherHalfTeaspoon "this evaluate to false"

aHalfTeaspoon unit = anotherHalfTeaspoon unit "this evaluate to true"
aHalfTeaspoon unit ~= anotherHalfTeaspoon unit "this evaluate to false"

aHalfTeaspoon amount = anotherHalfTeaspoon amount "this evaluate to true"
aHalfTeaspoon amount ~= anotherHalfTeaspoon amount "this evaluate to false"

"#hash"
aHalfTeaspoon hash = anotherHalfTeaspoon hash "this evaluate to true"
aHalfTeaspoon hash ~= anotherHalfTeaspoon hash "this evaluate to false"

aHalfTeaspoon unit hash = anotherHalfTeaspoon unit hash "this evaluate to true"
aHalfTeaspoon unit hash ~= anotherHalfTeaspoon unit hash "this evaluate to false"

aHalfTeaspoon amount hash = anotherHalfTeaspoon amount hash "this evaluate to true"
aHalfTeaspoon amount hash ~= anotherHalfTeaspoon amount hash "this evaluate to false"

So, I don't see where is the bug.

Regards.
Maxi


2014-05-20 4:14 GMT-03:00 Guillermo Polito <[hidden email]>:
I don't know about aconcagua, but as far as I know we have to keep the following invariant to have well behaving hashed collections:

 if two objects a and b are equals, they should have the same hash.


On Tue, May 20, 2014 at 8:34 AM, Sven Van Caekenberghe <[hidden email]> wrote:

On 20 May 2014, at 08:17, [hidden email] wrote:

> On Tue, May 20, 2014 at 7:28 AM, Sven Van Caekenberghe <[hidden email]> wrote:
> So ?
>
> 1/2 hash ~= 0.5 hash
>
> ((1/2) hash = 0.5 hash) true

Yes, with the () they're equal...


Reply | Threaded
Open this post in threaded view
|

Re: Aconcagua Hashing Bug?

Nicolas Cellier
In reply to this post by philippeback



2014-05-20 8:17 GMT+02:00 [hidden email] <[hidden email]>:
On Tue, May 20, 2014 at 7:28 AM, Sven Van Caekenberghe <[hidden email]> wrote:
So ?

1/2 hash ~= 0.5 hash

((1/2) hash = 0.5 hash) true
((1/2) hash ~= 0.5 hash) true

and

((1/3) hash =(1.0/3) hash) false

The implementation (in 3.0) has a special case for denominator is a power of 2:

Fraction>>hash
"Hash is reimplemented because = is implemented.
Care is taken that a Fraction equal to a Float also have an equal hash"

| tmp |
denominator isPowerOfTwo ifTrue: [
"If denominator is a power of two, I can be exactly equal to a Float"
tmp := self asFloat.
tmp isFinite ifTrue: [^tmp hash]].
"Else, I cannot be exactly equal to a Float, use own hash algorithm.
(Assume the fraction is already reduced)"
^numerator hash bitXor: denominator hash


Float>>hash
"Hash is reimplemented because = is implemented. Both words of the float are used; 8 bits are removed from each end to clear most of the exponent regardless of the byte ordering. (The bitAnd:'s ensure that the intermediate results do not become a large integer.) Slower than the original version in the ratios 12:5 to 2:1 depending on values. (DNS, 11 May, 1997)"

(self isFinite and: [self fractionPart = 0.0]) ifTrue: [^self truncated hash].
^ (((self basicAt: 1) bitAnd: 16r00FFFF00) +
  ((self basicAt: 2) bitAnd: 16r00FFFF00)) bitShift: -8


Mmmh, I am curious to know what the rationale for this is.


A Float exact representation is of the form

(-1 raisedTo: signBit) * integerSignificand * (2 raisedTo: biasedExponent).

Thus it is either an Integer, or a Fraction whose denominator is a power of 2 (when biasedExponent < 0).

In Squeak and Pharo, Float equality versus other numbers is based on equality of exact representation.
I think this has been changed too in Dolphin and gst, but not in VW.

So a naive implementation of hash matching those equality rules could be
Float>>hash ^self asTrueFraction hash

But we try to avoid sending asTrueFraction because it costs (involves LargeInt arithmetic), so the logic for hash is:
if Float can be an integer, convert it to integer.
if Fraction could be a Float, convert it to Float.
Note that we could also check the other requirement (Fraction numerator highBitOfMagnitude <= Float precision).

Also note that
^ (((self basicAt: 1) bitAnd: 16r00FFFF00) +
  ((self basicAt: 2) bitAnd: 16r00FFFF00)) bitShift: -8
is the historical Squeak hash method for Float and is really POOR (many collisions, only half bits are used !!!)
I think I changed it in Squeak, but can't remember.

 
Getting into: 

Fraction>>isPowerOfTwo
|reduced|
reduced := self reduced.
^(reduced numerator = 1 and: [reduced denominator isPowerOfTwo])
or: [reduced denominator = 1 and: [reduced numerator isPowerOfTwo]]

calls in >>reduced, which in turn calls gcd: ...

That's quite something to be aware of.


I don't see no need for reduced here.
All Fraction are (should be) reduced.
replaced reduced by self, and you won't get any gcd:

 
Phil
 

I mean what would you expect, how should this have to work ?

Maybe the hashing specialists have something to say about this.

On 20 May 2014, at 01:38, Sean P. DeNigris <[hidden email]> wrote:

> (1/2) teaspoons hash ~= 0.5 teaspoons hash
>
> b.t.w. Phexample caught this automatically because it checks hashes when
> comparing for equality... pretty cool
>
>
>
> -----
> Cheers,
> Sean
> --
> View this message in context: http://forum.world.st/Aconcagua-Hashing-Bug-tp4759622.html
> Sent from the Pharo Smalltalk Developers mailing list archive at Nabble.com.
>




Reply | Threaded
Open this post in threaded view
|

Re: Aconcagua Hashing Bug?

Sean P. DeNigris
Administrator
In reply to this post by Maximiliano Taborda
Maximiliano Taborda wrote
Seeing the same with aconcagua
...
 teaspoons := BaseUnit named: 'teaspoons'.
 aHalfTeaspoon := Measure amount: 1/2 unit: teaspoons.
 anotherHalfTeaspoon := Measure amount: 0.5 unit: teaspoons.

"#hash"
 aHalfTeaspoon hash = anotherHalfTeaspoon hash "this evaluate to true"

So, I don't see where is the bug.
It seems to only appear when comparing measures with derived units...

baseUnit := BaseUnit named: 'tablespoon'.
unit := ProportionalDerivedUnit baseUnit: baseUnit conversionFactor: 1/3 named: 'teaspoon'.
m1 := Measure amount: 1/2 unit: unit.
m2 := Measure amount: 0.5 unit: unit.
m1 hash = m2 hash. "this evaluate to false"
Cheers,
Sean
Reply | Threaded
Open this post in threaded view
|

Re: Aconcagua Hashing Bug?

Eliot Miranda-2



On Tue, May 20, 2014 at 1:45 PM, Sean P. DeNigris <[hidden email]> wrote:
Maximiliano Taborda wrote
> Seeing the same with aconcagua
> ...
>  teaspoons := BaseUnit named: 'teaspoons'.
>  aHalfTeaspoon := Measure amount: 1/2 unit: teaspoons.
>  anotherHalfTeaspoon := Measure amount: 0.5 unit: teaspoons.
>
> "#hash"
>  aHalfTeaspoon hash = anotherHalfTeaspoon hash "this evaluate to true"
>
> So, I don't see where is the bug.

It seems to only appear when comparing measures with derived units...

baseUnit := BaseUnit named: 'tablespoon'.
unit := ProportionalDerivedUnit baseUnit: baseUnit conversionFactor: 1/3
named: 'teaspoon'.
m1 := Measure amount: 1/2 unit: unit.
m2 := Measure amount: 0.5 unit: unit.
m1 hash = m2 hash. "this evaluate to false"


are you surprised?

0.5 * 0.5 = (1/2 * (1/2)) true
0.5 / 3 = (1/2 * (1/3)) false

You can't divide floats by 3 and get an exact answer:

0.5 / 3 = 0.16666666666666666

(1/2)/3 = (1/6) 




-----
Cheers,
Sean
--
View this message in context: http://forum.world.st/Aconcagua-Hashing-Bug-tp4759622p4759745.html
Sent from the Pharo Smalltalk Developers mailing list archive at Nabble.com.




--
best,
Eliot
Reply | Threaded
Open this post in threaded view
|

Re: Aconcagua Hashing Bug?

Nicolas Cellier

2014-05-21 0:16 GMT+02:00 Eliot Miranda <[hidden email]>:



On Tue, May 20, 2014 at 1:45 PM, Sean P. DeNigris <[hidden email]> wrote:
Maximiliano Taborda wrote
> Seeing the same with aconcagua
> ...
>  teaspoons := BaseUnit named: 'teaspoons'.
>  aHalfTeaspoon := Measure amount: 1/2 unit: teaspoons.
>  anotherHalfTeaspoon := Measure amount: 0.5 unit: teaspoons.
>
> "#hash"
>  aHalfTeaspoon hash = anotherHalfTeaspoon hash "this evaluate to true"
>
> So, I don't see where is the bug.

It seems to only appear when comparing measures with derived units...

baseUnit := BaseUnit named: 'tablespoon'.
unit := ProportionalDerivedUnit baseUnit: baseUnit conversionFactor: 1/3
named: 'teaspoon'.
m1 := Measure amount: 1/2 unit: unit.
m2 := Measure amount: 0.5 unit: unit.
m1 hash = m2 hash. "this evaluate to false"


are you surprised?

0.5 * 0.5 = (1/2 * (1/2)) true
0.5 / 3 = (1/2 * (1/3)) false

You can't divide floats by 3 and get an exact answer:

0.5 / 3 = 0.16666666666666666

(1/2)/3 = (1/6) 

Good find!
Use a smaller spoon, like 1/4 ;)
 




-----
Cheers,
Sean
--
View this message in context: http://forum.world.st/Aconcagua-Hashing-Bug-tp4759622p4759745.html
Sent from the Pharo Smalltalk Developers mailing list archive at Nabble.com.




--
best,
Eliot

Reply | Threaded
Open this post in threaded view
|

Re: Aconcagua Hashing Bug?

Maximiliano Taborda
Or just use ScaledDecimal instead of Float

0.5 asScaledDecimal * 0.5 asScaledDecimal = (1/2 * (1/2)) true
0.5 asScaledDecimal / 3 = (1/2 * (1/3)) true


2014-05-20 19:21 GMT-03:00 Nicolas Cellier <[hidden email]>:

2014-05-21 0:16 GMT+02:00 Eliot Miranda <[hidden email]>:




On Tue, May 20, 2014 at 1:45 PM, Sean P. DeNigris <[hidden email]> wrote:
Maximiliano Taborda wrote
> Seeing the same with aconcagua
> ...
>  teaspoons := BaseUnit named: 'teaspoons'.
>  aHalfTeaspoon := Measure amount: 1/2 unit: teaspoons.
>  anotherHalfTeaspoon := Measure amount: 0.5 unit: teaspoons.
>
> "#hash"
>  aHalfTeaspoon hash = anotherHalfTeaspoon hash "this evaluate to true"
>
> So, I don't see where is the bug.

It seems to only appear when comparing measures with derived units...

baseUnit := BaseUnit named: 'tablespoon'.
unit := ProportionalDerivedUnit baseUnit: baseUnit conversionFactor: 1/3
named: 'teaspoon'.
m1 := Measure amount: 1/2 unit: unit.
m2 := Measure amount: 0.5 unit: unit.
m1 hash = m2 hash. "this evaluate to false"


are you surprised?

0.5 * 0.5 = (1/2 * (1/2)) true
0.5 / 3 = (1/2 * (1/3)) false

You can't divide floats by 3 and get an exact answer:

0.5 / 3 = 0.16666666666666666

(1/2)/3 = (1/6) 

Good find!
Use a smaller spoon, like 1/4 ;)
 




-----
Cheers,
Sean
--
View this message in context: http://forum.world.st/Aconcagua-Hashing-Bug-tp4759622p4759745.html
Sent from the Pharo Smalltalk Developers mailing list archive at Nabble.com.




--
best,
Eliot


Reply | Threaded
Open this post in threaded view
|

Re: Aconcagua Hashing Bug?

Maximiliano Taborda
If you see the code at #hash method for class Measure, your realize that this class converts the amount of the measure to its base unit and then sends hash to this converted value. So, for the case where you have 1/2 teaspoon the #hash method, after some conversions, this does:
          (1/2) * (1/3) = (1/6)
then, obtain the hash of the fraction 1/6 => 7.

The same goes for your 0.5 teaspoon. But, because floats are bad guys, this is what happens:
          0.5 * (1/3) =  0.16666666666666666 (like Eliot shows)
then, obtain the hash for this ugly float => 72362.

Its ok or not, well, thats depends. If you need a more precise number, use fractions or scaled decimals.




2014-05-20 21:07 GMT-03:00 Maximiliano Taborda <[hidden email]>:
Or just use ScaledDecimal instead of Float

0.5 asScaledDecimal * 0.5 asScaledDecimal = (1/2 * (1/2)) true
0.5 asScaledDecimal / 3 = (1/2 * (1/3)) true


2014-05-20 19:21 GMT-03:00 Nicolas Cellier <[hidden email]>:


2014-05-21 0:16 GMT+02:00 Eliot Miranda <[hidden email]>:




On Tue, May 20, 2014 at 1:45 PM, Sean P. DeNigris <[hidden email]> wrote:
Maximiliano Taborda wrote
> Seeing the same with aconcagua
> ...
>  teaspoons := BaseUnit named: 'teaspoons'.
>  aHalfTeaspoon := Measure amount: 1/2 unit: teaspoons.
>  anotherHalfTeaspoon := Measure amount: 0.5 unit: teaspoons.
>
> "#hash"
>  aHalfTeaspoon hash = anotherHalfTeaspoon hash "this evaluate to true"
>
> So, I don't see where is the bug.

It seems to only appear when comparing measures with derived units...

baseUnit := BaseUnit named: 'tablespoon'.
unit := ProportionalDerivedUnit baseUnit: baseUnit conversionFactor: 1/3
named: 'teaspoon'.
m1 := Measure amount: 1/2 unit: unit.
m2 := Measure amount: 0.5 unit: unit.
m1 hash = m2 hash. "this evaluate to false"


are you surprised?

0.5 * 0.5 = (1/2 * (1/2)) true
0.5 / 3 = (1/2 * (1/3)) false

You can't divide floats by 3 and get an exact answer:

0.5 / 3 = 0.16666666666666666

(1/2)/3 = (1/6) 

Good find!
Use a smaller spoon, like 1/4 ;)
 




-----
Cheers,
Sean
--
View this message in context: http://forum.world.st/Aconcagua-Hashing-Bug-tp4759622p4759745.html
Sent from the Pharo Smalltalk Developers mailing list archive at Nabble.com.




--
best,
Eliot



Reply | Threaded
Open this post in threaded view
|

Re: Aconcagua Hashing Bug?

Sean P. DeNigris
Administrator
In reply to this post by Eliot Miranda-2
Eliot Miranda-2 wrote
You can't divide floats by 3 and get an exact answer:
0.5 / 3 = 0.16666666666666666
Duh :) Thank you.
Cheers,
Sean