Contributing to VoyageMongo (improving insertion/updating speed)

Holger Freyther
Hi,

I wanted to benchmark some look-ups with VoyageMongo and tried to insert
some dummy objects, and it was quite slow (Pharo takes 99% CPU while mongo
is idle). Using the nice profiler I noticed that VOMongoRepository>>#newVersion
calls UUIDGenerator new makeSeed, which takes quite long.


  [UUIDGenerator new makeSeed] bench. "'96.9 per second.'"

I am running this on OS X, so it ends up in >>#makeUnixSeed, which opens
/dev/urandom. I needed cryptographic random numbers in another project and
used NativeBoost to call into OpenSSL's libcrypto and use its RAND_bytes
function to get them.


  [ (RAND rand: 4) asInteger ] bench. "'588,000 per second.'"


The quality of the random numbers should be comparable, and now I wonder how
I can help integrate such a change.

From my point of view I can either patch Voyage-Mongo-Core so that it can
use other random number sources, or add another “seed” source to
the UUIDGenerator. What would be the preferred approach, and how should I move
forward? Is NativeBoost something one can rely on in upcoming Pharo versions?

With this change, inserting 1000 instances of a simple class with a couple (~5)
of fields goes from ~10s to ~250ms.

How can I move this forward?

        holger

Re: Contributing to VoyageMongo (improving insertion/updating speed)

stepharo
Hi holger

Esteban is on vacation and I do not know who else manages the mongo
repo.

Stef


Re: Contributing to VoyageMongo (improving insertion/updating speed)

Sven Van Caekenberghe-2
Good catch, that seems wrong to me, and as you showed it is way too slow. UUIDGenerator should certainly not be reinitialised each time a UUID is needed. There have been other discussions about UUID in the past.

/dev/urandom is not available on all platforms anyway, so alternatives might be useful.

Like Stef said, the MongoDB part is out of our scope.

Please make an issue (about the #makeSeed itself).
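
To illustrate the point about not reinitialising the generator for every value, here is a minimal sketch in Ruby (not Pharo code). The class name `ReusableGenerator` and the injectable `seed_source` are made up for this example and say nothing about how UUIDGenerator is actually implemented; Ruby's `Random` only stands in for a PRNG and is not cryptographically secure. The only idea shown is that the expensive seed read happens once and the generator is then reused.

```ruby
# Hypothetical sketch: seed once, then keep reusing the same generator.
class ReusableGenerator
  attr_reader :seed_reads

  # seed_source: any callable returning an Integer seed (assumed to be
  # expensive, e.g. a read from the operating system's random device).
  def initialize(seed_source)
    @seed_source = seed_source
    @seed_reads = 0
    @prng = nil
  end

  # Returns a random 32-bit integer; the seed source is consulted only
  # on the first call.
  def next_value
    @prng ||= seeded_prng
    @prng.rand(2**32)
  end

  private

  # Counts how often the seed source is actually read.
  def seeded_prng
    @seed_reads += 1
    Random.new(@seed_source.call)
  end
end

generator = ReusableGenerator.new(-> { Random.new_seed })
values = Array.new(1000) { generator.next_value }
puts "#{values.size} values from #{generator.seed_reads} seed read(s)"
```

Creating a new generator per value, by contrast, would pay the seeding cost on every call, which is the pattern measured in the first post.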


Re: Contributing to VoyageMongo (improving insertion/updating speed)

Peter Uhnak
Why doesn't it use ==UUIDGenerator default== like UUID new does?

(I am running on Linux, not OS X, so the times are not directly comparable with Holger's.)
~~~~~~~~~~~~~~~~~
[ UUIDGenerator new makeSeed ] bench. "'1,344 per second'"
[ UUID new ] bench. "'15,794 per second'"
~~~~~~~~~~~~~~~~~

On another note, the performance of UUID is still terrible compared to languages like Ruby:

~~~~~~~~~~~~
require 'benchmark'
require 'securerandom'

n = 1_000_000
# Wall-clock time for generating n random UUIDs.
time = Benchmark.realtime do
  n.times { SecureRandom.uuid }
end

puts "#{n / time} times per second"
~~~~~~~~~~~~~~

"142164.93834009714 times per second" (roughly a tenfold improvement over UUID new)

Pharo's implementation is, for whatever reason, really complex: it generates the bytes several times, assigns the final bits one by one, etc.

Peter
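
For comparison, a random (version 4) UUID as defined in RFC 4122 needs only 16 random bytes plus two bit-mask operations to set the version and variant fields. The sketch below, in Ruby like the benchmark above, illustrates that layout under those assumptions. It does not describe what Pharo's UUIDGenerator or Ruby's `SecureRandom.uuid` do internally, and the function name `uuid_v4_from` is made up for the example.

```ruby
require 'securerandom'

# Hypothetical helper: build an RFC 4122 version 4 UUID string from
# exactly 16 random bytes.
def uuid_v4_from(random_bytes)
  bytes = random_bytes.bytes
  raise ArgumentError, 'need exactly 16 bytes' unless bytes.size == 16

  bytes[6] = (bytes[6] & 0x0f) | 0x40 # high nibble of byte 6: version 4
  bytes[8] = (bytes[8] & 0x3f) | 0x80 # top two bits of byte 8: variant 10
  hex = bytes.map { |b| format('%02x', b) }.join
  # Group the 32 hex digits as 8-4-4-4-12.
  [hex[0, 8], hex[8, 4], hex[12, 4], hex[16, 4], hex[20, 12]].join('-')
end

puts uuid_v4_from(SecureRandom.random_bytes(16))
```

Which random source feeds the 16 bytes is a separate decision; here `SecureRandom.random_bytes` is used only because it ships with Ruby.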



Re: Contributing to VoyageMongo (improving insertion/updating speed)

Holger Freyther

> On 23 Jul 2015, at 16:13, Peter Uhnák <[hidden email]> wrote:
>
> Why doesn't it use ==UUIDGenerator default== like UUID new does?

VOMongoRepository>>#newVersion generates a timestamp/version for the serialized object.
I think #makeSeed is being abused here because on Unix platforms it can provide a
somewhat cryptographically secure random number.


> [ UUIDGenerator new makeSeed ] bench. "'1,344 per second'"

But you can even have /dev/urandom block if you read too much. So trying
to use RAND_bytes appears to be a good idea.





Re: Contributing to VoyageMongo (improving insertion/updating speed)

Sven Van Caekenberghe-2

> On 23 Jul 2015, at 16:13, Peter Uhnák <[hidden email]> wrote:
>
> Pharo's implementation is, for whatever reason, really complex: it generates the bytes several times, assigns the final bits one by one, etc.

Indeed, following a previous discussion, I once wrote my own version (in the Neo repository):

| generator |
generator := NeoUUIDGenerator new.
[ generator next ] bench. "'446,217 per second'"

But it all depends on the algorithm and which random generator/source you need/trust/want.
