Hi all,
I just noticed that attempting to optimize the creation of a Dictionary or OrderedDictionary by specifying a pre-allocated size, actually slows the system down more than not using this optimization. In trunk, see? [Dictionary new] bench. '26,300,000 per second. 38.1 nanoseconds per run. 9.21816 % GC time.' [Dictionary new: 4] bench. '3,560,000 per second. 281 nanoseconds per run. 2.41952 % GC time.' So then I tried to debug it to see why, and I couldn't without my Tools-cmm.926 patch currently in the Inbox for review. Dictionary new: 4 "try to debug-it and step into" The whole purpose of #new: over #new is to increase the performance of allocation when the minimum size is known in advance, but we seem to have killed this goal by the cost of #goodPrimeAtLeast:. Levente? Best, Chris |
Hi Chris, The purpose of good prime sizes is to accelerate random access at: and at:put:. So more ellaborate bench should be used to measure if worth. Le mar. 17 déc. 2019 à 23:02, Chris Muller <[hidden email]> a écrit :
|
I know that, but I'm talking about allocation speed, not access speed. This #new: violates the expectations of your typical Smalltalk programmer, they'll write code they think is optimized but, actually, is much much slower. Surely there's a faster way to come up with a next-higher prime? Certainly, that primes table seems unnecessarily large for this purpose -- anyone creating a Dictionary with > 1M preallocated slots is a special-case scenario we should not slow down the "normal" cases for... - Chris On Tue, Dec 17, 2019 at 4:33 PM Nicolas Cellier <[hidden email]> wrote:
|
In reply to this post by Chris Muller-3
Hi Chris,
On Tue, 17 Dec 2019, Chris Muller wrote: > Hi all, > I just noticed that attempting to optimize the creation of a Dictionary or OrderedDictionary by specifying a pre-allocated size, actually slows the system down more than not using this optimization. In trunk, see? > > [Dictionary new] bench. '26,300,000 per second. 38.1 nanoseconds per run. 9.21816 % GC time.' > > [Dictionary new: 4] bench. '3,560,000 per second. 281 nanoseconds per run. 2.41952 % GC time.' > > So then I tried to debug it to see why, and I couldn't without my Tools-cmm.926 patch currently in the Inbox for review. You can't beat #new with #new: if you need to store <= 3 elements, because #new has the capacity of 5 hard-coded, while #new: has to figure out the correct capacity. If you intend to store 4 pairs in your dictionary, then it's still worth to use #new: instead of #new, because growing costs a lot more than allocating well-sized collections[1]. As you found out, the performance difference between #new and #new: boils down to #goodPrimesAtLeast:. But #sizeFor:, and allocation (7 slots instead of 5) make a difference as well. I just pushed Collections-ul.867 to the inbox, which should help with your problem, but this still won't make [ Dictionary new: 4 ] as fast as [ Dictionary new ]. Why? Even though these changes make #goodPrimesAtLeast: ~8x faster for your case, #new: will still take about two times as long as #new. Levente [1] [ Dictionary new at: 1 put: 1; at: 2 put: 2; at: 3 put: 3; at: 4 put: 4 ] bench. '2,110,000 per second. 474 nanoseconds per run. 4.87805 % GC time.'. [ (Dictionary new: 4) at: 1 put: 1; at: 2 put: 2; at: 3 put: 3; at: 4 put: 4 ] bench. '3,580,000 per second. 279 nanoseconds per run. 5.19896 % GC time.' > > Dictionary new: 4 "try to debug-it and step into" > > The whole purpose of #new: over #new is to increase the performance of allocation when the minimum size is known in advance, but we seem to have killed this goal by the cost of #goodPrimeAtLeast:. Levente? > > Best, > Chris > > |
You're right. I was actually dealing with an OrderedDictionary and got too comfortable seeing it through a "SequenceableCollection" lens, forgot about the hashing side and the need for growTo to calculate a goodPrime anyway. That clears up my confusion, thanks. :) So, goodPrime is a trade-off between allocation and access performance. I definitely like your optimization to favor access. goodPrime is an unvaoidable cost, and your Collections-ul.867 looks like a 3X improvement (for my Dictionary example), and low risk. Worth including in 5.3, IMO.. Thanks, Chris On Tue, Dec 17, 2019 at 7:39 PM Levente Uzonyi <[hidden email]> wrote: Hi Chris, |
Free forum by Nabble | Edit this page |