What's wrong with this statement?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
19 messages Options
Reply | Threaded
Open this post in threaded view
|

What's wrong with this statement?

Miller, Timothy J.
(2 raisedTo: 128) atRandom hex

Gives me results like:

B990880B732110000000000000000001
BFD3A7B37FA750000000000000000001
E0A6F981C14DF0000000000000000001

Somehow I'm not buying the lower-order bits on this one.  :)

-- Tim


_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners

smime.p7s (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: What's wrong with this statement?

Randal L. Schwartz
>>>>> "Timothy" == Timothy J Miller <[hidden email]> writes:

Timothy> (2 raisedTo: 128) atRandom hex
Timothy> Gives me results like:

Timothy> B990880B732110000000000000000001
Timothy> BFD3A7B37FA750000000000000000001
Timothy> E0A6F981C14DF0000000000000000001

Timothy> Somehow I'm not buying the lower-order bits on this one.  :)

It's pretty likely you have a 16-bit random generator, so there just aren't
enough bits to make the low-order bits random in any way shape or form.

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<[hidden email]> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion
_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: What's wrong with this statement?

Randal L. Schwartz
>>>>> "Randal" == Randal L Schwartz <[hidden email]> writes:

>>>>> "Timothy" == Timothy J Miller <[hidden email]> writes:
Timothy> (2 raisedTo: 128) atRandom hex
Timothy> Gives me results like:

Timothy> B990880B732110000000000000000001
Timothy> BFD3A7B37FA750000000000000000001
Timothy> E0A6F981C14DF0000000000000000001

Timothy> Somehow I'm not buying the lower-order bits on this one.  :)

Randal> It's pretty likely you have a 16-bit random generator, so there just aren't
Randal> enough bits to make the low-order bits random in any way shape or form.

Err, uh, 64 bits of random.  But still not enough for a 128-bit random
number.

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<[hidden email]> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion
_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: What's wrong with this statement?

Edgar J. De Cleene
In reply to this post by Randal L. Schwartz



El 7/31/08 4:41 PM, "Randal L. Schwartz" <[hidden email]> escribió:

>>>>>> "Timothy" == Timothy J Miller <[hidden email]> writes:
>
> Timothy> (2 raisedTo: 128) atRandom hex
> Timothy> Gives me results like:
>
> Timothy> B990880B732110000000000000000001
> Timothy> BFD3A7B37FA750000000000000000001
> Timothy> E0A6F981C14DF0000000000000000001
>
> Timothy> Somehow I'm not buying the lower-order bits on this one.  :)
>
> It's pretty likely you have a 16-bit random generator, so there just aren't
> enough bits to make the low-order bits random in any way shape or form.
>
> --
> Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
> <[hidden email]> <URL:http://www.stonehenge.com/merlyn/>
> Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
> See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion
> _______________________________________________
> Beginners mailing list
> [hidden email]
> http://lists.squeakfoundation.org/mailman/listinfo/beginners

But (2 raisedTo: 128) atRandom could genetate
49572205802560219958060582892667404289

and
49572205802560219958060582892667404289 hex also is wrong print
 '254B42724A9684000000000000000001'

And I sorry to tell all this is a bug from 3.2 times.
I know because I working on 3.2 and 3.10 derivative images now.

Edgar


_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: What's wrong with this statement?

Nicolas Cellier-3
In reply to this post by Miller, Timothy J.
Timothy J Miller <tmiller <at> mitre.org> writes:

>
> (2 raisedTo: 128) atRandom hex
>
> Gives me results like:
>
> B990880B732110000000000000000001
> BFD3A7B37FA750000000000000000001
> E0A6F981C14DF0000000000000000001
>
> Somehow I'm not buying the lower-order bits on this one.  :)
>


Yes, bad!

I recommand you use the 'debug it' menu in the workspace and execute step by
step in the debugger to understand what goes on...

The key is Random>>#nextInt: anInteger
        ^ (self next * anInteger) truncated + 1

Random>>#next uses Floating point arithmetic.
Thus you get the 53 bits precision of a Float mantissa (IEEE 754 double).
It then gets shifted (128-53) times.
There's probably a trick that rounds Float last bit to zero
(like round to nearest integer always occur).
And you might be able to understand the trailing 1.

Worse, the default Random generator does a modulo (2 raisdeTo: 31) - 1.
That is, it can generate at most (2 raisdeTo: 31) different numbers...
That happens to be the number of SmallInteger in the system, so it probably
works better for SmallIntegers.
A different strategy should be adopted for LargePositiveInteger>>atRandom...
Let us say current implementation is very limited...

I suggest you open a mantis account and report the issue at http://bugs.squeak.org

Nicolas

_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: What's wrong with this statement?

Nicolas Cellier-3
In reply to this post by Edgar J. De Cleene
Edgar J. De Cleene <edgardec2001 <at> yahoo.com.ar> writes:

>
>
> But (2 raisedTo: 128) atRandom could genetate
> 49572205802560219958060582892667404289
>
> and
> 49572205802560219958060582892667404289 hex also is wrong print
>  '254B42724A9684000000000000000001'
>

Nah! hex is an excellent print, it is printStringBase: 16

10r49572205802560219958060582892667404289 = 16r254B42724A9684000000000000000001
-> true




_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: What's wrong with this statement?

Miller, Timothy J.
In reply to this post by Edgar J. De Cleene

On Jul 31, 2008, at 2:54 PM, Edgar J. De Cleene wrote:
> But (2 raisedTo: 128) atRandom could genetate
> 49572205802560219958060582892667404289
>
> and
> 49572205802560219958060582892667404289 hex also is wrong print
> '254B42724A9684000000000000000001'

Python 2.5.1 (r251:54863, Apr 15 2008, 22:57:26)

 >>> hex(49572205802560219958060582892667404289)
'0x254b42724a9684000000000000000001L'

Hmmm.  Clearly I'm missing something.

-- Tim


_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners

smime.p7s (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Re: What's wrong with this statement?

Miller, Timothy J.
In reply to this post by Nicolas Cellier-3
On Jul 31, 2008, at 3:01 PM, nicolas cellier wrote:

> Random>>#next uses Floating point arithmetic.
> Thus you get the 53 bits precision of a Float mantissa (IEEE 754  
> double).
> It then gets shifted (128-53) times.
> There's probably a trick that rounds Float last bit to zero
> (like round to nearest integer always occur).
> And you might be able to understand the trailing 1.
>
> Worse, the default Random generator does a modulo (2 raisdeTo: 31) -  
> 1.
> That is, it can generate at most (2 raisdeTo: 31) different numbers...
> That happens to be the number of SmallInteger in the system, so it  
> probably
> works better for SmallIntegers.
> A different strategy should be adopted for  
> LargePositiveInteger>>atRandom...
> Let us say current implementation is very limited...
>
> I suggest you open a mantis account and report the issue at http://bugs.squeak.org
First I'll dig into the Cryptography package and see what they did for  
large random numbers.

It certainly is inconsistent to have an Integer method that doesn't  
invisibly handle large ints.

-- Tim


_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners

smime.p7s (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Re: What's wrong with this statement?

Randal L. Schwartz
>>>>> "Timothy" == Timothy J Miller <[hidden email]> writes:

Timothy> It certainly is inconsistent to have an Integer method that doesn't
Timothy> invisibly handle large ints.

The Integer method is fine.  The problem is the lack of bits from the PRNG
from class Random.

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<[hidden email]> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion
_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: What's wrong with this statement?

Edgar J. De Cleene
In reply to this post by Miller, Timothy J.



El 7/31/08 5:31 PM, "Timothy J Miller" <[hidden email]> escribió:

>
> On Jul 31, 2008, at 2:54 PM, Edgar J. De Cleene wrote:
>> But (2 raisedTo: 128) atRandom could genetate
>> 49572205802560219958060582892667404289
>>
>> and
>> 49572205802560219958060582892667404289 hex also is wrong print
>> '254B42724A9684000000000000000001'
>
> Python 2.5.1 (r251:54863, Apr 15 2008, 22:57:26)
>
>>>> hex(49572205802560219958060582892667404289)
> '0x254b42724a9684000000000000000001L'
>
> Hmmm.  Clearly I'm missing something.
>
> -- Tim

Sorry by not thinking.
Only see the 000000000000000001 end in your first and test 3.2 do same as
3.10
Once hex was removed from image ...
http://bugs.squeak.org/view.php?id=6441
Edgar


_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: What's wrong with this statement?

John Foster-3
In reply to this post by Miller, Timothy J.
The issue is almost certainly in atRandom:

Running

30  timesRepeat: [ | a |
    a:= (2 raisedTo: 57) atRandom.
    Transcript show: a ; show:  ' ' ; show: a even ; cr ].

Gives me 30 falses.

The odds of that are less than one in a billion (assuming uniformly
distributed integers).

This doesn't happen for 2^56 or lower, and does for all 2^n, n>= 57.

My guess is that it's either a hardware issue or something about Random.
I'm not clever enough to understand PRNGs, so I'll leave it others to work
out what the answer is, although I suspect that for more than 56 bits you
need a PRNG that uses LargeIntegers  and not Floats.

_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: What's wrong with this statement?

John Foster-3
In reply to this post by Edgar J. De Cleene

> El 7/31/08 5:31 PM, "Timothy J Miller" <[hidden email]> escribió:
>
>>
>> On Jul 31, 2008, at 2:54 PM, Edgar J. De Cleene wrote:
>>> But (2 raisedTo: 128) atRandom could genetate
>>> 49572205802560219958060582892667404289
>>>
>>> and
>>> 49572205802560219958060582892667404289 hex also is wrong print
>>> '254B42724A9684000000000000000001'
>>
>> Python 2.5.1 (r251:54863, Apr 15 2008, 22:57:26)
>>
>>>>> hex(49572205802560219958060582892667404289)
>> '0x254b42724a9684000000000000000001L'
>>
>> Hmmm.  Clearly I'm missing something.
>>
>> -- Tim
>
> Sorry by not thinking.
> Only see the 000000000000000001 end in your first and test 3.2 do same as
> 3.10
> Once hex was removed from image ...
> http://bugs.squeak.org/view.php?id=6441
> Edgar
>

The problem is not in hex (although there are problems trying to get a hex
representation of a IEEE float, which might be useful to some people).
The reason I fixed hex was that the deprecation actually caused several
problems, as I discussed on Mantis.

John P Foster.

_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: What's wrong with this statement?

Randal L. Schwartz
In reply to this post by John Foster-3
>>>>> "johnps11" == johnps11  <[hidden email]> writes:

johnps11> I'm not clever enough to understand PRNGs, so I'll leave it others
johnps11> to work out what the answer is, although I suspect that for more
johnps11> than 56 bits you need a PRNG that uses LargeIntegers and not Floats.

Yes.  There are only 56 bits in an IEEE Float.  You can't get any more
random bits from that.

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<[hidden email]> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion
_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: What's wrong with this statement?

Nicolas Cellier-3
In reply to this post by John Foster-3
 <johnps11 <at> bigpond.com> writes:

>
> The issue is almost certainly in atRandom:
>
> Running
>
> 30  timesRepeat: [ | a |
>     a:= (2 raisedTo: 57) atRandom.
>     Transcript show: a ; show:  ' ' ; show: a even ; cr ].
>
> Gives me 30 falses.
>
> The odds of that are less than one in a billion (assuming uniformly
> distributed integers).
>
> This doesn't happen for 2^56 or lower, and does for all 2^n, n>= 57.
>
> My guess is that it's either a hardware issue or something about Random.
> I'm not clever enough to understand PRNGs, so I'll leave it others to work
> out what the answer is, although I suspect that for more than 56 bits you
> need a PRNG that uses LargeIntegers  and not Floats.
>

Nothing surprising once again.
Float have a 53 bits integer mantissa m and and a biased exponent e:
    f := m * (2 raisedTo: e).
    m highBit = 53.
The PRNG has a uniform distribution on [0,1).
f < 1 means 100% of floats have e < -53.
Let us divide the uniform intervals in two recursively:

50% have e = -54 (0.5<=f<1)
50% have e < -54 (f<0.5)

25% have e = -55 (0.25<=f<0.5)
25% have e < -55 (f<0.25)
75% have e >=-55 (f>=0.25)

12.5% have e = -56 (0.125<=f<0.25)
12.5% have e < -56 (f<0.125)
87.5% have e >=-56 (f>=0.125)

6.25%  have e = -57 (0.0625<=f<0.125)
6.25%  have e < -57 (f<0.0625)
92.75% have e >=-57 (f>=0.0625)

3.125%  have e = -58 (0.03125<=f<0.0625)
3.125%  have e < -58 (f<0.03125)
96.875% have e >=-58 (f>=0.03125)

So when you multiply f by (2 raisedTo: 57),
that means (m bitShift: e+57),
87.5% have (e+57>0),
thus are shifted at least one bit to the left,
and you get 87.5% of even numbers.
since algorithm truncate then add 1, you then get 87.5% of odd results

I suspect that due to division by 16r7FFFFFFF, last bit of mantissa m is often
zero. In which case, you get 92.75% of odd results.

Or maybe last two bits, in which case you have 96.875% of odd results.

   ((1 to: 10000) count: [:i | (2 raisedTo: 57) atRandom odd])  / 10000.0 .
-> 0.9692
Yes, probably last two bits of these floats are set to 0...

The limit is by no way (2 raisedTo: 57).

The true limit at which you won't get any even number ?
well, smallest generated pseudo random float is 1/(2^31-1)

A little higher than 1.0 timesTwoPoser: -31.
   smallest := ((1.0 timesTwoPower: 31) - 1) reciprocal.
   smallest significandAsInteger printStringBase: 2.
-> '10000000000000000000000000000001000000000000000000000'

The biased exponent is then (-31-52=-83).
   (smallest significandAsInteger asFloat timesTwoPower: smallest exponent-52)
    = smallest
-> true

So, (2 raisedTo: 84) atRandom is guaranteed to be odd.

We conjectured that last two bits are zero,
maybe (2 raisedTo: 82) atRandom is already 100% odd?
   (1 to: 10000000 by: 11) detect: [:e | ((e * smallest) significandAsInteger
     bitAnd: 2r11) > 0] ifNone: [nil]
->1048587
Hmm the conjecture is false...

Floats are tricky, but once you understand they are just rational with
denominator being a power of two and inexact operations truncating the mantissa
at 53 bits, the mystery vanishes a bit.

Nicolas


_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: What's wrong with this statement?

Nicolas Cellier-3
In reply to this post by Randal L. Schwartz
Randal L. Schwartz <merlyn <at> stonehenge.com> writes:

>
> >>>>> "johnps11" == johnps11  <johnps11 <at> bigpond.com> writes:
>
> johnps11> I'm not clever enough to understand PRNGs, so I'll leave it others
> johnps11> to work out what the answer is, although I suspect that for more
> johnps11> than 56 bits you need a PRNG that uses LargeIntegers and not Floats.
>
> Yes.  There are only 56 bits in an IEEE Float.  You can't get any more
> random bits from that.
>


Nah! IEEE 754 double have 53 bit mantissa.
- 1 bit for sign,
- 11 bits for biased exponent,
- 52 bits for mantissa plus a leading implied bit.

+/- 1.m1m2m3...m52 * (2 raisedTo: e1e2...e11 - 1024)

1.0 hex-> '3FF0000000000000' mantissa=16r1.0000000000000 exponent=16r3FF-1024=0
1.5 hex-> '3FF8000000000000' mantissa=16r1.8000000000000 exponent=16r3FF-1024=0
2.5 hex-> '4002000000000000' mantissa=16r1.2000000000000 exponent=16r400-1024=1
this number is thus represented 1.125 * 2

Nicolas

_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: What's wrong with this statement?

Jerome Peace
In reply to this post by Miller, Timothy J.
[Newbies] Re: What's wrong with this statement?


HI Tim, Hi Randal

Randal, I'm with Tim on this one.
 
Random has no way of knowing how many bits to supply.
LargeInteger knows how many it needs
 and can inquire of random how many it can get.
So LargeInteger>>atRandom has the responsibility to get the job done.

Asking for enough random numbers for all the bits in question.  
One random can decide on the high order bit range.
The next on the low order range within that etc
 until you have decide on all the choices.
You can employing more than one random generator.
That would probably insure better distribution of results.


Tim, this is an excelent bug find.
Please plant a seed on mantis.
We need somewhere to put the patch and the tests for the harvesters.

The really important thing is to have a test (sunit)
 that can be run to validate if an image has the fix or not.
 Any (or all) of the ones in this thread look like good starts.

Yours in curiosity and service, --Jerome Peace


***
>Randal L. Schwartz merlyn at stonehenge.com
>Thu Jul 31 20:45:44 UTC 2008
>
>
>>>>>> "Timothy" == Timothy J Miller <tmiller at mitre.org> writes:
>
Timothy>> It certainly is inconsistent to have an Integer method that doesn't
Timothy>> invisibly handle large ints.
>
Randal>The Integer method is fine.  The problem is the lack of bits from the PRNG
Randal>from class Random.
>


     
_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: What's wrong with this statement?

cerebus-4
In reply to this post by Miller, Timothy J.
Bug submitted.

http://bugs.squeak.org/view.php?id=7142

I may even attempt a fix.  :)

-- Tim
_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: Re: What's wrong with this statement?

cerebus-4
Somebody check me out here, 'cause I think I'm close but something
doesn't seem quite right.

Attached is a new Integer>atRandom I'm playing with.  I break the
integer into a ByteArray and call #atRandom for every byte, using 255
as the bounding value for all but the first byte.  The bounding value
for the first byte is itself minus 1.  Concatenated together we get a
byte array representing a random value less than the bounding value.

I think.  Maybe I'm doing this wrong.  I feel like the first byte
treatment is wrong but I don't know how to achieve the same
result--without it the random value often exceeds the bounding value.

I went this route as it avoids doing a bunch of bit shifts I'd
probably get wrong anyway.

10 timesRepeat: [Transcript show: ((2 raisedTo: 128) atRandom hex); cr]

gives:

CFE72E55A8C35387C9932D3235937FE6
A1760C4F27B161A4D2431C94751A2E65
D9EC4066D2DBD7A00B17B6FC16D985DC
D3140D928C14D8EA7AEB0CC814A2CA8C
EFB37CECB90EFDCF2F93BC34245118C
1B215016A0812E04A5600FA88373BD7B
7B52EB7F11C0DC822B272B70DCE61890
1D165F2BAF12E8596D496745A41A4BF4
E8C62D390B69B26CD9FEAF58BD68C17
3FBEA83CCB11EEDF6993E9BB60171F35

-- Tim

On Sat, Aug 2, 2008 at 11:00 AM, Cerebus <[hidden email]> wrote:
> Bug submitted.
>
> http://bugs.squeak.org/view.php?id=7142
>
> I may even attempt a fix.  :)
>
> -- Tim
>

_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners

Integer-atRandom.st (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Re: What's wrong with this statement?

Randal L. Schwartz
>>>>> "Cerebus" == Cerebus  <[hidden email]> writes:

Cerebus> Attached is a new Integer>atRandom I'm playing with.  I break the
Cerebus> integer into a ByteArray and call #atRandom for every byte, using 255
Cerebus> as the bounding value for all but the first byte.  The bounding value
Cerebus> for the first byte is itself minus 1.  Concatenated together we get a
Cerebus> byte array representing a random value less than the bounding value.

You're going to get a correlation at some point.  You're merely spreading the
same amount of entropy around in different places.  It's not "more random".

Look at it this way.  The cycle will repeat in 2**53 calls to the random
number generator.  By calling it 10 times for each "result", you're using
causing it to loop 10 times as fast.  You can't get 10 pounds of entropy out
of a 5 pound sack.

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<[hidden email]> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion
_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners