I now have completed a first version of a reimplementation of
WeakKeyDictionary, with the objective of improving performance (both computation and memory consumption). The entire change with a discussion can be seen at http://bugs.squeak.org/view.php?id=6348 In essence, this introduces two changes: - when values are finalized or keys removed from the dictionary, it is not always rehashed anymore. Instead, the dictionary just releases the value, and marks the association as expired. Expired associations can then be reused if a new value is to be inserted under the same has. Rehashing is only done when so many entries have expired. Expired entries don't count towards the size of the dictionary, and aren't reported when iterating over the dictionary. - WeakKeyAssociation was rewritten to be a weak class, with a fixed slot for the value, and a weak slot for the key (currently, it has two fixed slots, with the key slot referring to a WeakArray of length 1). In some measurements involving large (10000 keys) or many (10000) small weak key dictionaries, this new implementation improves performance by a factor of 4..5, when measuring finalization time. The second change saves 8 bytes per association (on a 32-bit machine). Regards, Martin |
2007/3/18, "Martin v. Löwis" <[hidden email]>:
> I now have completed a first version of a reimplementation of > WeakKeyDictionary, with the objective of improving performance > (both computation and memory consumption). The entire change > with a discussion can be seen at > > http://bugs.squeak.org/view.php?id=6348 > > In essence, this introduces two changes: > - when values are finalized or keys removed from the dictionary, > it is not always rehashed anymore. Instead, the dictionary just > releases the value, and marks the association as expired. Expired > associations can then be reused if a new value is to be inserted > under the same has. Rehashing is only done when so many entries > have expired. Expired entries don't count towards the size of the > dictionary, and aren't reported when iterating over the dictionary. > - WeakKeyAssociation was rewritten to be a weak class, with a fixed > slot for the value, and a weak slot for the key (currently, it > has two fixed slots, with the key slot referring to a WeakArray > of length 1). > > In some measurements involving large (10000 keys) or many (10000) > small weak key dictionaries, this new implementation improves > performance by a factor of 4..5, when measuring finalization time. > The second change saves 8 bytes per association (on a 32-bit > machine). Cheers Philippe > Regards, > Martin > > > > > |
Philippe Marschall schrieb:
> Could you attach the code for these stress tests to the bug? They are in there (weakdictall.1.cs): WeakKeyDictionaryTest class>>timingLarge WeakKeyDictionaryTest class>>timingMany HTH, Martin |
In reply to this post by "Martin v. Löwis"
On Sun, Mar 18, 2007 at 03:51:34PM +0100, "Martin v. L?wis" wrote:
> I now have completed a first version of a reimplementation of > WeakKeyDictionary, with the objective of improving performance > (both computation and memory consumption). The entire change > with a discussion can be seen at > > http://bugs.squeak.org/view.php?id=6348 Martin, This would be a welcome improvement! Note also http://bugs.squeak.org/view.php?id=2910 which is also related to this problem. (How do you add a "relationship" in Mantis?) Dave |
David T. Lewis schrieb:
> This would be a welcome improvement! > > Note also http://bugs.squeak.org/view.php?id=2910 which is also > related to this problem. (How do you add a "relationship" in > Mantis?) I had indeed seen this before. On a first quick review, I agreed with your analysis that the patch is right; on a closer look, I agree with Andreas: it seems that the patch drops associations with nil keys on the floor, which it shouldn't do. In any case, my patch would also address this problem: on removal, no rehashing is done at all. Instead, it just clears the association of the to-be-removed key, and then stops; the association will get only discarded when the entire dictionary gets reorganized. Regards, Martin |
In reply to this post by "Martin v. Löwis"
Martin v. Löwis wrote:
> In essence, this introduces two changes: > - when values are finalized or keys removed from the dictionary, > it is not always rehashed anymore. Instead, the dictionary just > releases the value, and marks the association as expired. Expired > associations can then be reused if a new value is to be inserted > under the same has. Rehashing is only done when so many entries > have expired. Expired entries don't count towards the size of the > dictionary, and aren't reported when iterating over the dictionary. That's an interesting approach. I would expect this to impact the time it takes to insert/find elements in the dict, no? Like: wd := WeakKeyDictionary new. 1 to: 100 do:[:i| wd at: i put: i]. 1 to: 99 do:[:i| wd removeKey: i]. After this point I'd expect the operations to be slower since we have about a hundred expired elements in the dict. Nothing that a good #rehash can't fix of course, but I'm wondering if at some point the WKD shouldn't recognize the ratio of expired/free/used slots and act accordingly. In any case, it's definitely an improvement. > - WeakKeyAssociation was rewritten to be a weak class, with a fixed > slot for the value, and a weak slot for the key (currently, it > has two fixed slots, with the key slot referring to a WeakArray > of length 1). Hah! Finally ;-) When I wrote the weak dicts I was concerned about the usage of associations for literal bindings in code and whether any such weak associations might be used for this purpose (the problem is that the VM accesses those directly and not having the right values in the right slots will give you "interesting" results). However, that concern never even remotely materialized so it's basically a waste of perfectly good resources. Thanks for cleaning that part up. Cheers, - Andreas |
In reply to this post by "Martin v. Löwis"
This appears to be something that did not make the transition to the new server, the ability for reporters to add relationships. I spent a few minutes on this but it looks like this is going to take a bit more time to resolve and I don't have the time today. I'll try to take care of it tomorrow and let you know.
Ken > From: "David T. Lewis" <[hidden email]> > Date: Sun, March 18, 2007 11:36 am ...snip... > (How do you add a "relationship" in Mantis?) > > Dave |
In reply to this post by Andreas.Raab
> That's an interesting approach. I would expect this to impact the time
> it takes to insert/find elements in the dict, no? Like: > wd := WeakKeyDictionary new. > 1 to: 100 do:[:i| wd at: i put: i]. > 1 to: 99 do:[:i| wd removeKey: i]. > After this point I'd expect the operations to be slower since we have > about a hundred expired elements in the dict. It strongly depends on the operations now. The lookup for the only remaining key (100) will not be slower, since that key is not colliding. The next invocation of fullCheck will see that there are way too many expired entries in the dictionary, and rehash it. So the very next insertion may be slightly slower, but all subsequent insertions and other operations will have the same performance - in this specific use case, the new code just saved a hundred rehash operations. > Nothing that a good > #rehash can't fix of course, but I'm wondering if at some point the WKD > shouldn't recognize the ratio of expired/free/used slots and act > accordingly. Ah, it does: if 4*expired > array size, it rehashes (in fullCheck). The precise percentage may be debatable, and it may be an option to also consider the relationship of expired and tally, but it will definitely rehash from time to time (and growing will drop expired entries, too). It might also be useful to add a fullCheck into removeKey:, so that in above example it won't end up with with 99 expired associations, but only one. With rehashing as implemented in the patch, this would cause the dictionary to shrink (as it uses essentially the implementation of Dictionary>>rehash); in turn, the loop would do 7 rehashes, at 41, 21, 14, 9, 6, 4, 3 expired elements. With rehashing similar to the original one (i.e. one that keeps the size), a fullCheck on remove, in above example, would rehash every 40 removals (given that the dictionary will have 160 slots), so you would end up with 20 expired elements after the loop. Regards, Martin |
Martin v. Löwis wrote:
> The next invocation of fullCheck will see that there are way too many > expired entries in the dictionary, and rehash it. Oh, it does! I didn't notice that. In this case, my point is of course moot. >> Nothing that a good #rehash can't fix of course, but I'm wondering if >> at some point the WKD shouldn't recognize the ratio of >> expired/free/used slots and act accordingly. > > Ah, it does: if 4*expired > array size, it rehashes (in fullCheck). The > precise percentage may be debatable, and it may be an option to also > consider the relationship of expired and tally, but it will definitely > rehash from time to time (and growing will drop expired entries, too). Right. Wasn't careful enough reading the code. Nevermind ;-) Brief Q: You didn't explicitly specify what license those changes are under. Would you be amendable to using MIT-style license for it? (it would allow me to use the code trivially in Croquet) Cheers, - Andreas |
On Mar 19, 2007, at 7:50 , Andreas Raab wrote:
> Brief Q: You didn't explicitly specify what license those changes > are under. Would you be amendable to using MIT-style license for > it? (it would allow me to use the code trivially in Croquet) In case this hasn't been clear to everybody yet - we require MIT licensing for anything that people might want to go into official Squeak nowadays. - Bert - |
On Sun, 18 Mar 2007 23:53:21 -0800, Bert Freudenberg
<[hidden email]> wrote: > On Mar 19, 2007, at 7:50 , Andreas Raab wrote: > >> Brief Q: You didn't explicitly specify what license those changes are >> under. Would you be amendable to using MIT-style license for it? (it >> would allow me to use the code trivially in Croquet) > > In case this hasn't been clear to everybody yet - we require MIT > licensing for anything that people might want to go into official Squeak > nowadays. A cursory Google suggests that "MIT licensing" might mean a number of things. If it's this: http://www.opensource.org/licenses/mit-license.php Then it would seem that MIT licensing means you can do (or not do) just about anything with the code, as long as the MIT license accompanies whatever you do? ===Blake=== |
In reply to this post by Bert Freudenberg
Bert Freudenberg schrieb:
> On Mar 19, 2007, at 7:50 , Andreas Raab wrote: > >> Brief Q: You didn't explicitly specify what license those changes are >> under. Would you be amendable to using MIT-style license for it? (it >> would allow me to use the code trivially in Croquet) > > In case this hasn't been clear to everybody yet - we require MIT > licensing for anything that people might want to go into official Squeak > nowadays. So how do you deal with the requirement that the license is put in all copies? Regards, Martin |
In reply to this post by Andreas.Raab
Andreas Raab schrieb:
> Brief Q: You didn't explicitly specify what license those changes are > under. Would you be amendable to using MIT-style license for it? (it > would allow me to use the code trivially in Croquet) Sure - I just added a license file. See also my other response; I wonder where all the license files are kept in Squeak. I would be willing to give the Squeak Foundation (as well as Apple Computer, Inc., if necessary) the irrevocable and perpetual right to make and distribute copies of the Software, as well as to create and distribute collective works and derivative works of the Software, under any other OSI-approved open source license (thus eliminating my own MIT-style license from the license stack). Regards, Martin |
In reply to this post by "Martin v. Löwis"
On Mar 19, 2007, at 10:58 , Martin v. Löwis wrote:
> Bert Freudenberg schrieb: >> On Mar 19, 2007, at 7:50 , Andreas Raab wrote: >>> Brief Q: You didn't explicitly specify what license those changes >>> are under. Would you be amendable to using MIT-style license for >>> it? (it would allow me to use the code trivially in Croquet) >> In case this hasn't been clear to everybody yet - we require MIT >> licensing for anything that people might want to go into official >> Squeak nowadays. > > So how do you deal with the requirement that the license is put in all > copies? We require a written note that allows us to use the contributions under MIT. Since this is rather new we need to figure out the exact details still, but there is a form you need to print out and sign and send in. I think Craig might know best. MIT was chosen because it is compatible to basically everything. Squeak 1.1 was relicensed by Apple under the Apache License 2.0 just last year. With this Apache base and all the MIT contributions we have a completely free system. And if there was a replacement for the Squeak kernel down the road (Spoon, Pepsi, whatever) the MIT parts can be moved over with ease. Also, the major new systems built on Squeak (like Croquet, Tweak, Seaside, etc.) are MIT licensed. - Bert - |
In reply to this post by Blake-5
On Mar 19, 2007, at 11:22 , Blake wrote:
> On Sun, 18 Mar 2007 23:53:21 -0800, Bert Freudenberg > <[hidden email]> wrote: > >> On Mar 19, 2007, at 7:50 , Andreas Raab wrote: >> >>> Brief Q: You didn't explicitly specify what license those changes >>> are under. Would you be amendable to using MIT-style license for >>> it? (it would allow me to use the code trivially in Croquet) >> >> In case this hasn't been clear to everybody yet - we require MIT >> licensing for anything that people might want to go into official >> Squeak nowadays. > > A cursory Google suggests that "MIT licensing" might mean a number > of things. If it's this: > > http://www.opensource.org/licenses/mit-license.php > > Then it would seem that MIT licensing means you can do (or not do) > just about anything with the code, as long as the MIT license > accompanies whatever you do? As I understand it, yes. - Bert - |
In reply to this post by Bert Freudenberg
Bert Freudenberg schrieb:
> MIT was chosen because it is compatible to basically everything. Squeak > 1.1 was relicensed by Apple under the Apache License 2.0 just last year. > With this Apache base and all the MIT contributions we have a completely > free system. And if there was a replacement for the Squeak kernel down > the road (Spoon, Pepsi, whatever) the MIT parts can be moved over with > ease. Also, the major new systems built on Squeak (like Croquet, Tweak, > Seaside, etc.) are MIT licensed. As a contributor, I don't mind that choice. Still, the MIT license requires that "The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.", so I'm curious how this requirement will be executed. Will the licenses all be collected in a class comment of some class? Regards, Martin |
Martin v. Löwis skrev:
> Bert Freudenberg schrieb: >> MIT was chosen because it is compatible to basically everything. >> Squeak 1.1 was relicensed by Apple under the Apache License 2.0 just >> last year. With this Apache base and all the MIT contributions we >> have a completely free system. And if there was a replacement for the >> Squeak kernel down the road (Spoon, Pepsi, whatever) the MIT parts >> can be moved over with ease. Also, the major new systems built on >> Squeak (like Croquet, Tweak, Seaside, etc.) are MIT licensed. > > As a contributor, I don't mind that choice. Still, the MIT license > requires that "The above copyright notice and this permission notice > shall be included in all copies or substantial portions of the > Software.", so I'm curious how this requirement will be executed. > > Will the licenses all be collected in a class comment of some class? > > Regards, > Martin > > I think we have been sloppy with distributing the license with Squeak in the past. I can't recall any download wich include the Squeak License or any other for that sake. Karl |
karl skrev:
> Martin v. Löwis skrev: >> Bert Freudenberg schrieb: >>> MIT was chosen because it is compatible to basically everything. >>> Squeak 1.1 was relicensed by Apple under the Apache License 2.0 just >>> last year. With this Apache base and all the MIT contributions we >>> have a completely free system. And if there was a replacement for >>> the Squeak kernel down the road (Spoon, Pepsi, whatever) the MIT >>> parts can be moved over with ease. Also, the major new systems built >>> on Squeak (like Croquet, Tweak, Seaside, etc.) are MIT licensed. >> >> As a contributor, I don't mind that choice. Still, the MIT license >> requires that "The above copyright notice and this permission notice >> shall be included in all copies or substantial portions of the >> Software.", so I'm curious how this requirement will be executed. >> >> Will the licenses all be collected in a class comment of some class? >> >> Regards, >> Martin >> >> > Should we not have a class License and method licenseMIT ? > I think we have been sloppy with distributing the license with Squeak > in the past. I can't recall any download wich include the Squeak > License or any other for that sake. the methods which refer to the MIT license. I'm not sure how to do that.... Karl |
In reply to this post by Karl-19
> Should we not have a class License and method licenseMIT ?
> I think we have been sloppy with distributing the license with Squeak in > the past. I can't recall any download wich include the Squeak License or > any other for that sake. Reviewing the Squeak license, it appears that it includes no requirement to actually include the license in the software, so not including it wasn't a license violation (although "sloppy" may still apply). This is different with the MIT license: it explicitly requires inclusion with the software. Given that the MIT license is a template, so each contributor will provide her own version of it, this gives a long list of licenses to include. Regards, Martin |
Martin v. Löwis skrev:
>> Should we not have a class License and method licenseMIT ? >> I think we have been sloppy with distributing the license with Squeak >> in the past. I can't recall any download wich include the Squeak >> License or any other for that sake. > > Reviewing the Squeak license, it appears that it includes no requirement > to actually include the license in the software, so not including it > wasn't a license violation (although "sloppy" may still apply). This is > different with the MIT license: it explicitly requires inclusion with > the software. Given that the MIT license is a template, so each > contributor will provide her own version of it, this gives a long list > of licenses to include. 'The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.' So it should be enough to have one license in the image ? Maybe all packages/modules should be dependent on a License module ? Patches and bugfixes, I don't know, depends on how substantial they are ? Licenses are a source of endless discussions :-) Karl |
Free forum by Nabble | Edit this page |