Seeding instances of Random


Seeding instances of Random

Levente Uzonyi-2
Hi All,

With the recent UUID changes, we're facing the problem of properly seeding
a Random instance. The current method provides only 2^32 different initial
states, so the chance of two images starting up with the same values is
way too high (~9300 startups give a 1% chance of a collision).
To avoid collisions we need a reliable source of random bytes.
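
The ~9300 figure can be checked with the usual birthday-bound approximation. The sketch below is illustrative Python, not Squeak code, and assumes that seeds are drawn uniformly and independently from the 2^32 possible states; the function names are made up for this example.

```python
import math


def startups_for_collision(states: int, probability: float) -> float:
    """Approximate number of uniform draws from `states` values needed
    before some pair collides with the given probability."""
    return math.sqrt(2.0 * states * math.log(1.0 / (1.0 - probability)))


def collision_probability(states: int, draws: int) -> float:
    """Approximate probability that `draws` uniform seeds contain a repeat."""
    return 1.0 - math.exp(-draws * (draws - 1) / (2.0 * states))


if __name__ == "__main__":
    # With 2^32 states, a 1% collision chance is reached after roughly 9300 startups.
    print(startups_for_collision(2**32, 0.01))
    print(collision_probability(2**32, 9300))
```

Under the same assumption, a 128-bit seed pushes the 1% point far beyond any realistic number of startups, which is the motivation for a wider entropy source.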
Currently we have the following sources:
1. /dev/urandom: fast, reliable, but not available on all platforms (e.g.
windows).
2. UUIDPlugin: fast, but this is what we wanted to get rid of in the first
place, so it may not be available.
3. Audio recorded through SoundSystem: slow, unreliable, and it may not be
available.
4. Time primUTCMicrosecondClock. On its own it has way too little entropy.
We can still use it to initialize the PRNG by using additional sources of
entropy (image name, path, vm version, whatever). We can use SHA1 to get
"more random" bits from our entropy sources. But this is more like a last
resort than a solution to rely on.

So I suggest we should create a new primitive, which can fill a given
indexable object (bytes and optionally words) with random values - the
same way Random >> #nextBytes:into:startingAt: works. It could use
CryptGenRandom on Windows, and /dev/urandom on other unix-like platforms.

As fallback mechanisms, I'd implement 1., 2. and optionally 4., and use
them in the given order. The drawback of these mechanisms is that they
create unwanted package dependencies, because Random is in Kernel, while
most potential sources of entropy, along with SHA1, are in other packages.
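
The "fill a buffer, fall back in order" idea might look roughly like this. It is a Python sketch rather than the proposed VM primitive: `os.urandom` stands in for the OS source (/dev/urandom or the Windows crypto API), and `fill_random` and `fallback_seed_bytes` are hypothetical names. The fallback corresponds to option 4 and is much weaker than the OS source.

```python
import hashlib
import os
import sys
import time


def fallback_seed_bytes(count: int) -> bytes:
    """Last resort: hash the clock together with a few process attributes.
    The result carries no more entropy than these inputs do."""
    material = "|".join([
        str(time.time_ns()),
        str(os.getpid()),
        str(sys.executable),
        sys.version,
        os.getcwd(),
    ]).encode("utf-8")
    out = b""
    counter = 0
    while len(out) < count:
        # Stretch the hashed material with a counter until enough bytes exist.
        out += hashlib.sha1(material + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:count]


def fill_random(buffer: bytearray) -> str:
    """Fill `buffer` in place and return the name of the source used."""
    try:
        buffer[:] = os.urandom(len(buffer))
        return "os"
    except NotImplementedError:
        # os.urandom raises this when no OS randomness source is found.
        buffer[:] = fallback_seed_bytes(len(buffer))
        return "fallback"


if __name__ == "__main__":
    seed = bytearray(16)
    print(fill_random(seed), seed.hex())
```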

Opinions, ideas?

Levente


Re: Seeding instances of Random

Ben Coman
On Tue, Nov 3, 2015 at 8:32 AM, Levente Uzonyi <[hidden email]> wrote:

> So I suggest we should create a new primitive, which can fill a given
> indexable object (bytes and optionally words) with random values - the same
> way Random >> #nextBytes:into:startingAt: works. It could use CryptGenRandom
> on Windows, and /dev/urandom on other unix-like platforms.

This snagged my interest so I had a poke around. getrandom() may be
a better choice [1] [2], if available. Linux kernels older than 3.17
(October 2014) do not have it, and on OpenBSD and (maybe) FreeBSD the
equivalent is getentropy() [3].

[1] http://djalil.chafai.net/blog/2014/10/13/linux-kernel-3-17/
[2] http://man7.org/linux/man-pages/man2/getrandom.2.html
[3] http://www.2uo.de/myths-about-urandom/
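
A rough illustration of "use getrandom() if available" in Python, which exposes the system call as `os.getrandom` only on platforms that have it. The helper name `read_random_bytes` is made up for this sketch; a VM primitive would make the equivalent check in C.

```python
import os


def read_random_bytes(count: int) -> bytes:
    """Prefer getrandom(2) where Python exposes it, else use os.urandom."""
    getrandom = getattr(os, "getrandom", None)
    if getrandom is not None:
        try:
            data = getrandom(count)
            # getrandom may return fewer bytes than requested.
            if len(data) == count:
                return data
        except OSError:
            pass  # for example, a kernel without the system call
    return os.urandom(count)


if __name__ == "__main__":
    print(read_random_bytes(16).hex())
```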

I also found discussion [4], about mixing additional entropy with
OS-supplied randomness, interesting, and [5] describes a very
interesting method that might provide that. But is it worth the
additional effort?

[4] http://security.stackexchange.com/questions/43344/open-source-alternative-for-cryptgenrandom
[5] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.html

cheers -ben


Re: Seeding instances of Random

Colin Putney-3
In reply to this post by Levente Uzonyi-2


On Mon, Nov 2, 2015 at 4:32 PM, Levente Uzonyi <[hidden email]> wrote:
> 2. UUIDPlugin: fast, but this is what we wanted to get rid of in the first place, so it may not be available.
>
> So I suggest we should create a new primitive, which can fill a given indexable object (bytes and optionally words) with random values - the same way Random >> #nextBytes:into:startingAt: works. It could use CryptGenRandom on Windows, and /dev/urandom on other unix-like platforms.

Why do we want to get rid of UUIDPlugin? Is it just because fewer plugins is better, or is there a problem with this one in particular?

Also, if we're going to have a primitive that supplies fast, high-quality, random bytes, shouldn't Random just call that for random values, rather than seeding a PRNG? 

That said, I think I'd do seeding using

1. the primitive, if available
2. UUIDPlugin, if available
3. The current method

Most VMs will have either the primitive or the plugin, so the current method will be used rarely, and thus collisions will be very rare. Keep it simple. 

Colin



Re: Seeding instances of Random

Levente Uzonyi-2
On Tue, 3 Nov 2015, Colin Putney wrote:

> Why do we want to get rid of UUIDPlugin? Is it just because fewer plugins is better, or is there a problem with this one in particular?
UUIDPlugin has both compilation and dynamic linking issues on some
platforms. I'd still use it if it's available, unless we have a primitive
to provide random bytes, because 16 random bytes make a V4 UUID (minus
the 6 bits that are fixed for version and variant).
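
To make the "16 random bytes minus 6 bits" remark concrete, here is a small Python sketch that turns 16 random bytes into a version 4 UUID by overwriting the four version bits and the two variant bits. The function name `uuid4_from_bytes` is made up; the bit layout follows RFC 4122.

```python
import os
import uuid


def uuid4_from_bytes(random_bytes: bytes) -> uuid.UUID:
    """Build a version 4 UUID from exactly 16 random bytes."""
    if len(random_bytes) != 16:
        raise ValueError("need exactly 16 bytes")
    b = bytearray(random_bytes)
    b[6] = (b[6] & 0x0F) | 0x40  # version 4 in the high nibble of byte 6
    b[8] = (b[8] & 0x3F) | 0x80  # RFC 4122 variant in the top two bits of byte 8
    return uuid.UUID(bytes=bytes(b))


if __name__ == "__main__":
    print(uuid4_from_bytes(os.urandom(16)))
```

The remaining 122 bits come straight from the random source, so the quality of the UUID is the quality of those bytes.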

>
> Also, if we're going to have a primitive that supplies fast, high-quality, random bytes, shouldn't Random just call that for random values, rather than seeding a PRNG? 

You still need a way to convert those bytes to numbers.
And stateful PRNGs are better for testing. When there's a test failure,
you can generate the same "random" input to reproduce it.
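
The testing argument in a few lines of Python (standing in for Squeak's Random; `shuffled_input` is a made-up helper): a PRNG created from a known seed replays the same sequence, so a failing randomized test can be rerun with identical input.

```python
import random


def shuffled_input(seed: int, size: int) -> list:
    """Return a shuffled list that depends only on `seed` and `size`."""
    rng = random.Random(seed)  # private generator, independent of global state
    data = list(range(size))
    rng.shuffle(data)
    return data


if __name__ == "__main__":
    first = shuffled_input(12345, 10)
    second = shuffled_input(12345, 10)
    print(first == second)  # same seed, same "random" input
```

Bytes read directly from an OS source cannot be replayed this way unless they are recorded first.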

>
> That said, I think I'd do seeding using
>
> 1. the primitive, if available
> 2. UUIDPlugin, if available
> 3. The current method
>
> Most VMs will have either the primitive or the plugin, so the current method will be used rarely, and thus collisions will be very rare. Keep it simple. 

What do you mean by "current method"?

Levente

>
> Colin
>
>


Re: Seeding instances of Random

Chris Muller-3
In reply to this post by Levente Uzonyi-2
> We can still use it to initialize the PRNG by using additional sources of
> entropy (image name, path, vm version, whatever). We can use SHA1 to get
> "more random" bits from our entropy sources. But this is more like a last
> resort than a solution to rely on.

I always thought a good list of hard-to-guess attributes injected in
sequence with SHA1 feedback should be sufficiently hard to guess.

millisecondClockValue, primUTCMicrosecondClock, timezone, Locale,
available memory, consumed memory, vmpath, localpath, Display extent,
Display imageForm, Sensor mouseX / mouseY, OS string,
millisecondsToRun this
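
One way to read "injected in sequence with SHA1 feedback" is a hash chain in which each attribute is hashed together with the previous digest. The Python sketch below assumes that reading; `mix_attributes` and the sample attributes are illustrative, not the Squeak selectors listed above.

```python
import hashlib
import os
import sys
import time


def mix_attributes(attributes) -> bytes:
    """Fold attributes into a 20-byte state, feeding each digest back in."""
    state = bytes(20)
    for attribute in attributes:
        state = hashlib.sha1(state + repr(attribute).encode("utf-8")).digest()
    return state


if __name__ == "__main__":
    sample = [time.time_ns(), time.timezone, os.getpid(), sys.platform, os.getcwd()]
    print(mix_attributes(sample).hex())
```

Hashing spreads the input bits over the output but cannot add entropy, so the result is only as hard to guess as the attributes themselves.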

I'm not against the new primitive idea, just have always been curious
about digital security.


Re: Seeding instances of Random

marcel.taeumel
...and add the current/latest key presses or mouse movements. :-)

Best,
Marcel

Re: Seeding instances of Random

Tobias Pape
…all of which _might_ be pretty similar at image start up


On 03.11.2015, at 20:49, marcel.taeumel <[hidden email]> wrote:

> ...and add the current/latest key presses or mouse movements. :-)
>
> Best,
> Marcel





Re: Seeding instances of Random

timrowledge
In reply to this post by marcel.taeumel

> On 03-11-2015, at 11:49 AM, marcel.taeumel <[hidden email]> wrote:
>
> ...and add the current/latest key presses or mouse movements. :-)

And a really hot cup of tea


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
May the bugs of many programs nest on your hard drive.




Re: Seeding instances of Random

Bert Freudenberg

> On 03.11.2015, at 21:33, tim Rowledge <[hidden email]> wrote:
>
>
>> On 03-11-2015, at 11:49 AM, marcel.taeumel <[hidden email]> wrote:
>>
>> ...and add the current/latest key presses or mouse movements. :-)
>
> And a really hot cup of tea

That might add too much improbability.

- Bert -







Re: Seeding instances of Random

Ben Coman
In reply to this post by Colin Putney-3
On Wed, Nov 4, 2015 at 1:38 AM, Colin Putney <[hidden email]> wrote:

> Why do we want to get rid of UUIDPlugin? Is it just because fewer plugins is
> better, or is there a problem with this one in particular?
>
> Also, if we're going to have a primitive that supplies fast, high-quality,
> random bytes, shouldn't Random just call that for random values, rather than
> seeding a PRNG?

If I properly understood my recent reading on this, every time you
take a number from a True-RNG, you reduce its entropy pool, which may
only refill slowly on a remote server.

From http://linux.die.net/man/4/urandom ...
"The kernel random-number generator is designed to produce a small
amount of high-quality seed material to seed a cryptographic
pseudo-random number generator (CPRNG). It is designed for security,
not speed, and is poorly suited to generating large amounts of random
data. Users should be very economical in the amount of seed material
that they read from /dev/urandom (and /dev/random); unnecessarily
reading large quantities of data from this device will have a negative
impact on other users of the device"

cheers -ben


Re: Seeding instances of Random

Tobias Pape

On 04.11.2015, at 17:55, Ben Coman <[hidden email]> wrote:

> If I properly understood my recent reading on this, every time you
> take a number from a True-RNG, you reduce its entropy pool, which may
> only refill slowly on a remote server.
>
> From http://linux.die.net/man/4/urandom ...
> "The kernel random-number generator is designed to produce a small
> amount of high-quality seed material to seed a cryptographic
> pseudo-random number generator (CPRNG). It is designed for security,
> not speed, and is poorly suited to generating large amounts of random
> data. Users should be very economical in the amount of seed material
> that they read from /dev/urandom (and /dev/random); unnecessarily
> reading large quantities of data from this device will have a negative
> impact on other users of the device"

Yes, but there are more platforms than Linux to care about :)
How'd we do it on, say, RiscOS or FreeBSD?

Best regards
        -Tobias

PS: FreeBSD is not hypothetical, we've got a student working with Squeak on FreeBSD…



Re: Seeding instances of Random

Ben Coman
On Thu, Nov 5, 2015 at 1:11 AM, Tobias Pape <[hidden email]> wrote:

> Yes. but there's more platforms than Linux to care about :)
> How'd we do it on, say, RiscOS or FreeBSD?
>
> Best regards
>         -Tobias
>
> PS: FreeBSD is not hypothetical, we've got a student working with Squeak on FreeBSD…
>

FreeBSD has /dev/urandom as shown in the table a few pages down here...
http://insanecoding.blogspot.com.au/2014/05/a-good-idea-with-bad-usage-devurandom.html

I don't know about RiscOS.
cheers -ben


Re: Seeding instances of Random

Eliot Miranda-2
In reply to this post by Chris Muller-3
Seems to me that the relevant values are

primUTCMicrosecondClock (varies frequently)
MAC address of a network interface ("unique" to a machine)
Process Id/handle of VM (changes frequently, unique between simultaneous launches)

Why would a combination of these three be insufficient?
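
For illustration, the three values could be combined as below. This is a Python sketch: `uuid.getnode()` and `os.getpid()` stand in for whatever the VM would expose, `seed_from_clock_mac_pid` is a made-up name, and hashing with SHA1 is borrowed from earlier in the thread rather than from this post.

```python
import hashlib
import os
import time
import uuid


def seed_from_clock_mac_pid() -> int:
    """Combine microsecond clock, MAC address and process id into one seed."""
    clock_us = time.time_ns() // 1000
    mac = uuid.getnode()  # 48-bit value; Python substitutes a random one if no MAC is found
    pid = os.getpid()
    material = b"".join([
        clock_us.to_bytes(8, "big"),
        mac.to_bytes(6, "big"),
        pid.to_bytes(8, "big"),
    ])
    return int.from_bytes(hashlib.sha1(material).digest(), "big")


if __name__ == "__main__":
    print(hex(seed_from_clock_mac_pid()))
```

This aims at uniqueness across machines and simultaneous launches. Anyone who knows the machine and the approximate start time can narrow the inputs considerably, so it is not a substitute for an OS randomness source where unpredictability matters.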

_,,,^..^,,,_ (phone)

On Nov 3, 2015, at 11:05 AM, Chris Muller <[hidden email]> wrote:

>> We can still use it to initialize the PRNG by using additional sources of
>> entropy (image name, path, vm version, whatever). We can use SHA1 to get
>> "more random" bits from our entropy sources. But this is more like a last
>> resort than a solution to rely on.
>
> I always thought a good list of hard-to-guess attributes injected in
> sequence with SHA1 feedback should be sufficiently hard to guess.
>
> millisecondClockValue, primUTCMicrosecondClock, timezone, Locale,
> available memory, consumed memory, vmpath, localpath, Display extent,
> Display imageForm, Sensor mouseX / mouseY, OS string,
> millisecondsToRun this
>
> I'm not against the new primitive idea, just have always been curious
> about digital security..
>


Re: Seeding instances of Random

Tobias Pape

On 05.11.2015, at 17:57, Eliot Miranda <[hidden email]> wrote:

> Seems to me that the relevant values are
>
> primUTCMicrosecondClock (varies frequently)
> MAC address of a network interface ("unique" to a machine)
> Process Id/handle of VM (changes frequently, unique between simultaneous launches)
>
> Why would a combination of these three be insufficient?

Seems plausible and also contains the UUIDv1 [1] ingredients…
Where do I get the handle/id of the system? Likewise, the MAC address?

Best regards
        -Tobias


[1]: https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_1_.28MAC_address_.26_date-time.29

Re: Seeding instances of Random

timrowledge
In reply to this post by Ben Coman

> On 05-11-2015, at 5:33 AM, Ben Coman <[hidden email]> wrote:
>
> I don't know about RiscOS.

There’s almost certainly an RM buried somewhere that at least claims to provide an equivalent. There always is.


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Base 8 is just like base 10, if you are missing two fingers.