I now have completed a first version of a reimplementation of
WeakKeyDictionary, with the objective of improving performance (both computation and memory consumption). The entire change with a discussion can be seen at http://bugs.squeak.org/view.php?id=6348 In essence, this introduces two changes: - when values are finalized or keys removed from the dictionary, it is not always rehashed anymore. Instead, the dictionary just releases the value, and marks the association as expired. Expired associations can then be reused if a new value is to be inserted under the same has. Rehashing is only done when so many entries have expired. Expired entries don't count towards the size of the dictionary, and aren't reported when iterating over the dictionary. - WeakKeyAssociation was rewritten to be a weak class, with a fixed slot for the value, and a weak slot for the key (currently, it has two fixed slots, with the key slot referring to a WeakArray of length 1). In some measurements involving large (10000 keys) or many (10000) small weak key dictionaries, this new implementation improves performance by a factor of 4..5, when measuring finalization time. The second change saves 8 bytes per association (on a 32-bit machine). Regards, Martin |
2007/3/18, "Martin v. Löwis" <[hidden email]>:
> I now have completed a first version of a reimplementation of > WeakKeyDictionary, with the objective of improving performance > (both computation and memory consumption). The entire change > with a discussion can be seen at > > http://bugs.squeak.org/view.php?id=6348 > > In essence, this introduces two changes: > - when values are finalized or keys removed from the dictionary, > it is not always rehashed anymore. Instead, the dictionary just > releases the value, and marks the association as expired. Expired > associations can then be reused if a new value is to be inserted > under the same has. Rehashing is only done when so many entries > have expired. Expired entries don't count towards the size of the > dictionary, and aren't reported when iterating over the dictionary. > - WeakKeyAssociation was rewritten to be a weak class, with a fixed > slot for the value, and a weak slot for the key (currently, it > has two fixed slots, with the key slot referring to a WeakArray > of length 1). > > In some measurements involving large (10000 keys) or many (10000) > small weak key dictionaries, this new implementation improves > performance by a factor of 4..5, when measuring finalization time. > The second change saves 8 bytes per association (on a 32-bit > machine). Cheers Philippe > Regards, > Martin > > > > > |
Philippe Marschall schrieb:
> Could you attach the code for these stress tests to the bug? They are in there (weakdictall.1.cs): WeakKeyDictionaryTest class>>timingLarge WeakKeyDictionaryTest class>>timingMany HTH, Martin |
In reply to this post by "Martin v. Löwis"
On Sun, Mar 18, 2007 at 03:51:34PM +0100, "Martin v. L?wis" wrote:
> I now have completed a first version of a reimplementation of > WeakKeyDictionary, with the objective of improving performance > (both computation and memory consumption). The entire change > with a discussion can be seen at > > http://bugs.squeak.org/view.php?id=6348 Martin, This would be a welcome improvement! Note also http://bugs.squeak.org/view.php?id=2910 which is also related to this problem. (How do you add a "relationship" in Mantis?) Dave |
David T. Lewis schrieb:
> This would be a welcome improvement! > > Note also http://bugs.squeak.org/view.php?id=2910 which is also > related to this problem. (How do you add a "relationship" in > Mantis?) I had indeed seen this before. On a first quick review, I agreed with your analysis that the patch is right; on a closer look, I agree with Andreas: it seems that the patch drops associations with nil keys on the floor, which it shouldn't do. In any case, my patch would also address this problem: on removal, no rehashing is done at all. Instead, it just clears the association of the to-be-removed key, and then stops; the association will get only discarded when the entire dictionary gets reorganized. Regards, Martin |
In reply to this post by "Martin v. Löwis"
Martin v. Löwis wrote:
> In essence, this introduces two changes: > - when values are finalized or keys removed from the dictionary, > it is not always rehashed anymore. Instead, the dictionary just > releases the value, and marks the association as expired. Expired > associations can then be reused if a new value is to be inserted > under the same has. Rehashing is only done when so many entries > have expired. Expired entries don't count towards the size of the > dictionary, and aren't reported when iterating over the dictionary. That's an interesting approach. I would expect this to impact the time it takes to insert/find elements in the dict, no? Like: wd := WeakKeyDictionary new. 1 to: 100 do:[:i| wd at: i put: i]. 1 to: 99 do:[:i| wd removeKey: i]. After this point I'd expect the operations to be slower since we have about a hundred expired elements in the dict. Nothing that a good #rehash can't fix of course, but I'm wondering if at some point the WKD shouldn't recognize the ratio of expired/free/used slots and act accordingly. In any case, it's definitely an improvement. > - WeakKeyAssociation was rewritten to be a weak class, with a fixed > slot for the value, and a weak slot for the key (currently, it > has two fixed slots, with the key slot referring to a WeakArray > of length 1). Hah! Finally ;-) When I wrote the weak dicts I was concerned about the usage of associations for literal bindings in code and whether any such weak associations might be used for this purpose (the problem is that the VM accesses those directly and not having the right values in the right slots will give you "interesting" results). However, that concern never even remotely materialized so it's basically a waste of perfectly good resources. Thanks for cleaning that part up. Cheers, - Andreas |
In reply to this post by "Martin v. Löwis"
This appears to be something that did not make the transition to the new server, the ability for reporters to add relationships. I spent a few minutes on this but it looks like this is going to take a bit more time to resolve and I don't have the time today. I'll try to take care of it tomorrow and let you know.
Ken > From: "David T. Lewis" <[hidden email]> > Date: Sun, March 18, 2007 11:36 am ...snip... > (How do you add a "relationship" in Mantis?) > > Dave |
In reply to this post by Andreas.Raab
> That's an interesting approach. I would expect this to impact the time
> it takes to insert/find elements in the dict, no? Like: > wd := WeakKeyDictionary new. > 1 to: 100 do:[:i| wd at: i put: i]. > 1 to: 99 do:[:i| wd removeKey: i]. > After this point I'd expect the operations to be slower since we have > about a hundred expired elements in the dict. It strongly depends on the operations now. The lookup for the only remaining key (100) will not be slower, since that key is not colliding. The next invocation of fullCheck will see that there are way too many expired entries in the dictionary, and rehash it. So the very next insertion may be slightly slower, but all subsequent insertions and other operations will have the same performance - in this specific use case, the new code just saved a hundred rehash operations. > Nothing that a good > #rehash can't fix of course, but I'm wondering if at some point the WKD > shouldn't recognize the ratio of expired/free/used slots and act > accordingly. Ah, it does: if 4*expired > array size, it rehashes (in fullCheck). The precise percentage may be debatable, and it may be an option to also consider the relationship of expired and tally, but it will definitely rehash from time to time (and growing will drop expired entries, too). It might also be useful to add a fullCheck into removeKey:, so that in above example it won't end up with with 99 expired associations, but only one. With rehashing as implemented in the patch, this would cause the dictionary to shrink (as it uses essentially the implementation of Dictionary>>rehash); in turn, the loop would do 7 rehashes, at 41, 21, 14, 9, 6, 4, 3 expired elements. With rehashing similar to the original one (i.e. one that keeps the size), a fullCheck on remove, in above example, would rehash every 40 removals (given that the dictionary will have 160 slots), so you would end up with 20 expired elements after the loop. Regards, Martin |
Martin v. Löwis wrote:
> The next invocation of fullCheck will see that there are way too many > expired entries in the dictionary, and rehash it. Oh, it does! I didn't notice that. In this case, my point is of course moot. >> Nothing that a good #rehash can't fix of course, but I'm wondering if >> at some point the WKD shouldn't recognize the ratio of >> expired/free/used slots and act accordingly. > > Ah, it does: if 4*expired > array size, it rehashes (in fullCheck). The > precise percentage may be debatable, and it may be an option to also > consider the relationship of expired and tally, but it will definitely > rehash from time to time (and growing will drop expired entries, too). Right. Wasn't careful enough reading the code. Nevermind ;-) Brief Q: You didn't explicitly specify what license those changes are under. Would you be amendable to using MIT-style license for it? (it would allow me to use the code trivially in Croquet) Cheers, - Andreas |
On Mar 19, 2007, at 7:50 , Andreas Raab wrote:
> Brief Q: You didn't explicitly specify what license those changes > are under. Would you be amendable to using MIT-style license for > it? (it would allow me to use the code trivially in Croquet) In case this hasn't been clear to everybody yet - we require MIT licensing for anything that people might want to go into official Squeak nowadays. - Bert - |
On Sun, 18 Mar 2007 23:53:21 -0800, Bert Freudenberg
<[hidden email]> wrote: > On Mar 19, 2007, at 7:50 , Andreas Raab wrote: > >> Brief Q: You didn't explicitly specify what license those changes are >> under. Would you be amendable to using MIT-style license for it? (it >> would allow me to use the code trivially in Croquet) > > In case this hasn't been clear to everybody yet - we require MIT > licensing for anything that people might want to go into official Squeak > nowadays. A cursory Google suggests that "MIT licensing" might mean a number of things. If it's this: http://www.opensource.org/licenses/mit-license.php Then it would seem that MIT licensing means you can do (or not do) just about anything with the code, as long as the MIT license accompanies whatever you do? ===Blake=== |
In reply to this post by Bert Freudenberg
Bert Freudenberg schrieb:
> On Mar 19, 2007, at 7:50 , Andreas Raab wrote: > >> Brief Q: You didn't explicitly specify what license those changes are >> under. Would you be amendable to using MIT-style license for it? (it >> would allow me to use the code trivially in Croquet) > > In case this hasn't been clear to everybody yet - we require MIT > licensing for anything that people might want to go into official Squeak > nowadays. So how do you deal with the requirement that the license is put in all copies? Regards, Martin |
In reply to this post by Andreas.Raab
Andreas Raab schrieb:
> Brief Q: You didn't explicitly specify what license those changes are > under. Would you be amendable to using MIT-style license for it? (it > would allow me to use the code trivially in Croquet) Sure - I just added a license file. See also my other response; I wonder where all the license files are kept in Squeak. I would be willing to give the Squeak Foundation (as well as Apple Computer, Inc., if necessary) the irrevocable and perpetual right to make and distribute copies of the Software, as well as to create and distribute collective works and derivative works of the Software, under any other OSI-approved open source license (thus eliminating my own MIT-style license from the license stack). Regards, Martin |
Re: Reimplementing WeakKeyDictionary
«
Return to Squeak - Dev
|
1 view|%1 views
|