Hi All,
I am building a persistent object-to-relational storage engine and am using a WeakValueDictionary as an object cache. This works great for a small number of objects (< 1000), but once the cache holds more than a couple thousand objects the WeakArray finalization process begins to take over and grinds the image to a halt.

I am interested to know why WeakArray finalization eats up the CPU when standard gc does not. The only gc methods I have implemented have been standard reference-counting ones, so I know little of the details of other garbage collectors.

My second question is: is there a simple way to implement a WeakArray (WeakValueDictionary) that handles larger value sets better? Maybe a WeakValueDictionary that only ever looks at 1000 elements at a time for finalization and would run less often.

Any ideas, explanations, or input would be appreciated.

Thanks
Will
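For context, the cache pattern described above might look like this minimal sketch (WeakValueDictionary and at:ifAbsent: are stock Squeak; the cache instance variable, the oid key, and the loadObject: selector are hypothetical stand-ins for the storage engine's own API):

    objectAt: anOid
        "Answer the cached object for anOid, loading it on a cache miss.
         The dictionary holds its values weakly, so objects no longer
         referenced elsewhere can be reclaimed by the garbage collector."
        ^ cache at: anOid ifAbsent: [
            | obj |
            obj := self loadObject: anOid.  "hypothetical database loader"
            cache at: anOid put: obj.
            obj]

Here cache would be initialized once with: cache := WeakValueDictionary new.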
Hi William -
William Harford wrote:
> I am interested to know why WeakArray finalization eats up the CPU when
> standard gc does not. The only gc methods I have implemented have been
> standard reference counting ones so I know little of the details of
> other garbage collectors.

The main issue is that the garbage collector does not provide the image with sufficient information about which element of which array has been finalized. All the garbage collector does is signal an event that such a finalization *has* occurred (but not where). Therefore, if you want to react to it you'll have to scan the elements of "your" weak array, and the amount of time spent in that process depends both on the number of objects finalized and on the number of slots scanned.

> My second question is.....
> Is there a simple way to implement a WeakArray (WeakValueDictionary)
> that handles larger value sets better. Maybe a WeakValueDictionary that
> only ever looks at 1000 elements at a time for finalization and would
> run less often.

There is no "simple" way for a particular meaning of "simple" - the current mechanism (which I wrote years and years back) was by far the simplest thing that could possibly work, and it has served its intended purposes (mainly avoiding dangling resource handles like files or sockets) very well. It was never intended to scale up to thousands of objects. Any other mechanism will necessarily be more complex than the current mechanism.

Cheers,
  - Andreas
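In code, the scan Andreas describes amounts to something like the following sketch (not the actual finalization machinery; the noteFinalized: handler is a hypothetical hook):

    scanForFinalized: aWeakArray
        "The VM only signals that *some* weakly-held object was reclaimed,
         so the image must walk every slot to find the ones that are now
         nil. The cost is O(array size) per signal, no matter how few
         slots were actually cleared."
        1 to: aWeakArray size do: [:index |
            (aWeakArray at: index) isNil
                ifTrue: [self noteFinalized: index]]

With a few thousand cached objects and frequent GC cycles, these repeated full scans are what eats the CPU.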
> There is no "simple" way for a particular meaning of "simple" - the
> current mechanism (which I wrote years and years back) was by far the > simplest thing that could possibly work and it has served its intended > purposes (mainly avoiding dangling resource handles like files or > sockets) very well. It was never intended to scale up to thousands of > objects. Any other mechanism will necessarily be more complex than the > current mechanism. This is a pity, since Seaside uses WeakArrays to cache session state. We had some applications that started to spend a lot (all) of the CPU time within the weak-finalizer process. The workaround was to disable the use of WeakArrays, with the drawback that memory was released much later and the application required much more memory. For frameworks like Seaside it is essential to have a performant implementation of WeakArrays up to an unlimited size. Also Dictionaries should have a much better performance, i.e. access-time should be O(1) as its name suggests and not O(n) for more than a few thousand elements. My question: How can we improve the current implementation, that worked well for years, but that obviously doesn't scale well enough anymore? It should be certainly a goal for Squeak 4 to have performant WeakArrays and Dictionaries of arbitrary size. Lukas -- Lukas Renggli http://www.lukas-renggli.ch |
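Background on the Dictionary complaint (not stated in the post itself, so take it as a likely rather than certain explanation): Squeak's identityHash at the time was only 12 bits wide, so an identity-based table with more than 4096 entries necessarily piles up hash collisions, and lookups degrade toward linear scans. A workspace sketch to observe the effect (timings illustrative and image-dependent):

    | keys dict |
    keys := (1 to: 10000) collect: [:i | Object new].
    dict := IdentityDictionary new.
    keys do: [:each | dict at: each put: each].
    "With at most 4096 distinct identity hashes, the 10000 keys share
     buckets heavily, so each lookup walks a long collision chain."
    Transcript show: ([keys do: [:each | dict at: each]] timeToRun) printString; cr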
Maybe, Lukas, one way for the Seaside community to get something done would be to group yourselves and do it. I do not expect that the community right now has the manpower to do that. The SqueakFoundation still has trouble finding a way to get even simple enhancements done. Enabling changes to be made is a real challenge for the SqueakFoundation.

Stef

On 11 févr. 06, at 10:53, Lukas Renggli wrote:
> This is a pity, since Seaside uses WeakArrays to cache session state.
> [...]
> My question: how can we improve the current implementation, which
> worked well for years but obviously doesn't scale well enough anymore?
By the way, I think that if someone or a group of people would step in to fix this problem, the foundation would really like to support the initiative with some money. Otherwise I do not understand what our role is. We could act as a consortium and create a money pot for fixing this problem.

Stef

On 11 févr. 06, at 10:53, Lukas Renggli wrote:
> My question: how can we improve the current implementation, which
> worked well for years but obviously doesn't scale well enough anymore?
On Sat, 11 Feb 2006 at 11:47 +0100, stéphane ducasse wrote:
> By the way, I think that if someone or a group of people would step
> in to fix this problem, the foundation would really like to support
> the initiative with some money. Otherwise I do not understand what
> our role is. We could act as a consortium and create a money pot for
> fixing this problem.

Would bounties be a good idea?

Also, if Google is going to hold another Summer of Code ( http://code.google.com/summerofcode.html ), the foundation could participate as a mentoring organization.

Giovanni
On 11 févr. 06, at 16:16, Giovanni Corriga wrote:
> Would bounties be a good idea?

Explain to me how this would work? I really think that we should bootstrap soon. Avi already said that he was interested in participating (bounties for speeding up MC).

> Also, if Google is going to hold another Summer of Code
> ( http://code.google.com/summerofcode.html ), the foundation could
> participate as a mentoring organization.

Sure. This is why we need a formal organization: to support cryptography package registration and other actions.

Stef
On Sat, 11 Feb 2006 at 16:45 +0100, stéphane ducasse wrote:
> Explain to me how this would work?

The Foundation could offer a bounty on needed features or outstanding problems. The first developer to provide a working implementation of a feature, or a solution to a problem, gets the corresponding bounty. Examples of things for which the foundation could offer a bounty:
- speedup of MC
- speedup of OmniBrowser
- a test suite for the Collections category
etc.

Giovanni
> The Foundation could offer a bounty on needed features or outstanding
> problems. The first developer to provide a working implementation of
> a feature, or a solution to a problem, gets the corresponding bounty.

Yes, this is an excellent idea. I think that we should go that way:
- fixing the WeakArray
- fixing the refresh/debug problem

I will propose that to the foundation board.

Stef
I entered Mantis #2910 with an enhancement that greatly improves performance of explicit deletion from a WeakKeyDictionary. This is the bottleneck for e.g. a large WeakRegistry with objects being added and removed frequently. It does *not* address finalization performance, for the reason explained by Andreas below.

If your application is able to explicitly remove objects most of the time, as opposed to letting the finalization process take care of it, you may find that WeakKeyDictionarySpeedup-dtl.cs helps quite a bit.

Please do not use this on a critical image such as a Seaside server until some qualified person (aka Andreas) says it's OK.

Dave

On Fri, Feb 10, 2006 at 09:54:54PM -0500, Andreas Raab wrote:
> The main issue is that the garbage collector does not provide the image
> with sufficient information about which element of which array has been
> finalized. All the garbage collector does is signal an event that such
> a finalization *has* occurred (but not where). [...]
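The changeset itself is not reproduced in the thread. Assuming it mirrors what the non-weak Dictionary>>removeKey:ifAbsent: of that era already did (array, tally, findElementOrNil: and fixCollisionsFrom: are the stock HashedCollection internals), the idea is roughly:

    removeKey: key ifAbsent: aBlock
        "Sketch of local repair: free the slot for key, then let
         fixCollisionsFrom: re-seat only the entries in the probe chain
         that follows it - O(chain length) instead of the O(table size)
         full #rehash."
        | index assoc |
        index := self findElementOrNil: key.
        assoc := array at: index.
        assoc == nil ifTrue: [^ aBlock value].
        array at: index put: nil.
        tally := tally - 1.
        self fixCollisionsFrom: index.
        ^ assoc value

As Andreas explains next, this kind of local repair is exactly what breaks once the garbage collector has nilled keys inside a chain.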
Unfortunately, this doesn't work. The reason for using #rehash was that when a key in a weak key dictionary gets eliminated, that key's hash changes. Your implementation doesn't account for that and will break in the face of some actual finalized keys in the dictionary.

Cheers,
  - Andreas

David T. Lewis wrote:
> I entered Mantis #2910 with an enhancement that greatly improves
> performance of explicit deletion from a WeakKeyDictionary. [...]
> If your application is able to explicitly remove objects most of the
> time, as opposed to letting the finalization process take care of it,
> you may find that WeakKeyDictionarySpeedup-dtl.cs helps quite a bit.
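Concretely (my reading of Andreas's point, not text from the thread): fixCollisionsFrom: re-seats each entry in the chain at the slot computed from its key's hash. Once the collector has reclaimed a weak key, that key reads as nil, so its original hash - and therefore its correct slot - is unrecoverable:

    "Inside the repair loop, each surviving entry is re-inserted at the
     index derived from its key:"
    newIndex := self findElementOrNil: assoc key.
    "For a finalized entry, assoc key is now nil: the probe starts from
     nil's hash rather than from the dead key's hash, the entry lands in
     the wrong chain, and later lookups can miss live entries. A full
     #rehash, which also discards nil-keyed entries, restores the
     invariants."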
On Sat, Feb 18, 2006 at 01:56:50PM -0800, Andreas Raab wrote:
> Unfortunately, this doesn't work. The reason for using #rehash was that
> when a key in a weak key dictionary gets eliminated, that key's hash
> changes. Your implementation doesn't account for that and will break in
> the face of some actual finalized keys in the dictionary.

Darn. Thanks for looking at this.

But just so I understand: I'm deleting any entries with nil keys within the range of entries being rehashed (up to the next nil entry), so I thought that this would take care of the case of entries within that range that had been eliminated by the garbage collector. The idea was that the finalization process, which still does the full rehash, would take care of the rest. What did I miss?

Thanks, and sorry if I'm being dense.

Dave

p.s. To the extent that I understand this at all, your ObjectMemory>>aFinalizationComment was very helpful, thanks for documenting this.