Fwd: Adding FasterSets to Pharo: update

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

Fwd: Adding FasterSets to Pharo: update

Stéphane Ducasse


Begin forwarded message:

> From: Levente Uzonyi <[hidden email]>
> Date: October 30, 2009 3:29:33 AM GMT+01:00
> To: Ralph Boland <[hidden email]>
> Cc: Stéphane Ducasse <[hidden email]>
> Subject: Re: Adding FasterSets to Pharo: update
>
> Hi!
>
> On Thu, 29 Oct 2009, Ralph Boland wrote:
>
>>    In Levente's code class WeakKeyToCollectionDictionary->rehash  
>> seems to be
>>    missing.   This appears to be a mistate to me but I will live it
>> to Levente to explain
>>    it.  The fix, if needed, is simple.
>
> Only Set and MethodDictionary implements #rehash in the trunk  
> version. #rehash sends #growTo: which sends #noCheckNoGrowFillFrom:  
> which is implemented by WeakKeyToCollectionDictionary.
>
>>     b)  Levente's version does not use method  scanForNilFrom:  (He
>> would probably call
>>          the method scanForEmptySlotFrom:).
>>          I don't know why he didn't do this: it clearly cleaner.
>
> It's called #scanForEmptySlotFor:, because WeakSet has a flag for  
> empty slots, not nil.
>
>>      b) The improvement in performance for method add:  in  
>> Levente's version
>>          is insignificant   (appx 1%);  this difference could easily
>> be made up in minor
>>          changes to my code.  Thus here I consider there to be no  
>> difference.
>
> Don't let the garbage collector fool you.
> See http://leves.web.elte.hu/collections/DictionaryBenchmarkResults.txt 
>  . The values in the table are microseconds/element without gc time.
> The benchmark code is here http://leves.web.elte.hu/collections/DictionaryBenchmark.st
> A similar benchmark's result for Set:
>
> Trunk:
> Size 1000 2000 5000 10000 20000 50000 100000 200000 500000
> add: (included) 0.5 0.2 0.36 0.46 0.365 0.392 0.394 0.406 0.4112
> add: (not included) 1.5 1.6 1.38 1.35 1.365 1.25 1.261 1.252 1.5192
> includes: (included) 0.3 0.5 0.42 0.41 0.395 0.432 0.457 0.4695 0.4442
> includes: (not included) 0.2 0.5 0.56 0.51 0.51 0.632 0.66 0.6525
> 0.5994
> like: (included) 0.5 0.4 0.4 0.36 0.345 0.37 0.393 0.4105 0.3854
> like: (not included) 0.4 0.45 0.52 0.44 0.445 0.566 0.593 0.5895 0.536
> rehash 0.5 0.4 0.32 0.33 0.33 0.338 0.342 0.326 0.3386
> remove:ifAbsent: (included) 0.7 0.95 1.02 0.99 1.015 1.228 1.238
> 1.2485 0.9282
> remove:ifAbsent: (not included) 0.4 0.5 0.58 0.54 0.575 0.686 0.699
> 0.6985 0.6586
>
> Pharo:
> Size 1000 2000 5000 10000 20000 50000 100000 200000 500000
> add: (included) 0.4 0.45 0.5 0.45 0.44 0.462 0.465 0.4705 0.499
> add: (not included) 2.0 2.0 1.72 1.67 1.715 1.57 1.582 1.57 2.0874
> includes: (included) 0.5 0.5 0.56 0.6 0.52 0.52 0.547 0.5335 0.5814
> includes: (not included) 0.7 0.85 0.8 1.03 0.865 0.804 0.804 0.8265
> 0.8734
> like: (included) 0.7 0.45 0.4 0.5 0.395 0.382 0.399 0.3935 0.4208
> like: (not included) 0.3 0.65 0.66 0.9 0.705 0.66 0.659 0.679 0.783
> rehash 0.9 0.6 0.64 0.69 0.63 0.604 0.615 0.6285 0.6964
> remove:ifAbsent: (included) 2.0 1.95 2.16 6.59 2.165 2.052 1.981
> 2.092 2.0544
> remove:ifAbsent: (not included) 0.9 0.85 0.92 1.05 0.9 0.84 0.878
> 0.869 1.0232
>
> Pharo with Andres's hacks:
> Size 1000 2000 5000 10000 20000 50000 100000 200000 500000
> add: (included) 0.5 0.5 0.46 0.49 0.475 0.466 0.458 0.4935 0.5306
> add: (not included) 1.7 2.35 2.26 2.02 2.15 2.12 2.298 2.021 2.0332
> includes: (included) 0.6 0.5 0.56 0.53 0.62 0.528 0.539 0.533 0.588
> includes: (not included) 0.8 0.8 0.74 0.77 0.745 0.7 0.694 0.706
> 0.7814
> like: (included) 0.1 0.2 0.36 0.4 0.485 0.386 0.39 0.382 0.4312
> like: (not included) 0.4 0.8 0.6 0.66 0.605 0.576 0.566 0.5835 0.6554
> rehash 0.6 0.55 0.62 0.58 0.575 0.586 0.599 0.5825 0.6044
> remove:ifAbsent: (included) 1.5 1.95 1.86 1.97 1.905 1.786 1.763
> 1.8005 1.9346
> remove:ifAbsent: (not included) 0.7 0.75 0.82 0.82 0.84 0.774 0.771
> 0.7755 0.8418
>
> If you're interrested in the benchmark code, let me know.
>
>>     b)  Levente's version does not apply the FasterSets idea to
>> MethodDictionary.
>>          I am not sure why he did not do this but it may have
>> something to do with the fact that
>>          mistakes can easily corrupt your image.  The version in
>> FasterSets works fine though
>>          it took three tries for me to get it right (mostly because
>> of carelessness).
>
> It would be easy to modify MethodDictionary, but it has a different  
> implementation, so most methods would have to be overridden and the  
> performance benefits would be insignificant, because #become: is  
> much (~80x) slower than the actual rehash for an average sized  
> MethodDictionary.
> Btw, MethodDictionaries are rarely grown.
>
>>     c)  I left methods  intersection and nastyAddNew: out of classes
>> Set and Dictionary for
>>           the initial version of FasterSets to keep the number of
>> changes to a minimum.
>>           At your request I added them to the current version.
>> These methods are not in
>>           Levente's version.  They are public methods that provide
>> performance improvements.
>>           If desired they could easily be added to Levente's version.
>
> #nastyAddNew: is not really useful IMO, if you want to be nasty, you  
> can use "private" methods:
> aSet atNewIndex: (set scanForNil: anObject) put: anObject.
> It's even faster and if you do this, you probably know what you're  
> doing.
>
> I'm not really interrested in #intersecion:, but I found it  
> controversal. What's the intersecion of a Set and an IdentitySet? Do  
> you check identity or equality?
>
>> I recommend that
>>
>>    1)  You use Levente's version.
>
> Feel free to grab it.
>
>>    2)  Sort out the issue of no rehash method for Class
>> WeakKeyToCollectionDictionary.
>
> See above.
>
>>    3)  Modify his code to use method scanForNilFrom:  (calling it
>> scanForEmptySlotFrom:).
>
> See above.
>
>>    4)  Decide if you want to add methods  intersect: and  
>> nastyAddNew:.
>>         I can add these if you want.
>
> See my opinion above.
>
>>    5) Release a version of Pharo with these changes.
>>    6)  In the following release of Pharo add the changes to
>> MethodDictionary to use The
>>         FasterSets Idea.  I sugggest doing this separately from
>> everything else just because it is a
>>         bit hairy.  I would not leave this change out because
>> MethodDictionary is too important
>>         a class not to have these improvements.
>>
>
> See above.
>
>> You can post this on the Pharo newsgroup if you like but perhaps you
>> should get Levente's
>> opinion first.
>
> Go ahead, I'm already subscribed to the pharo list because I  
> couldn't stop myself to respond to a false statement.
>
> Cheers,
> Levente


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project