A Better (different) WeakArray

William Harford
Hi All,

I am building a persistent object-to-relational storage engine and am
using a WeakValueDictionary as an object cache. This works great for a
small number of objects (< 1000), but once the cache holds more than a
couple thousand objects the WeakArray finalization process begins to
take over and grinds the image to a halt.

I am interested to know why WeakArray finalization eats up the CPU
when standard gc does not. The only garbage collectors I have ever
implemented were standard reference-counting ones, so I know little
of the details of other collectors.

My second question is: is there a simple way to implement a WeakArray
(WeakValueDictionary) that handles larger value sets better? Maybe a
WeakValueDictionary that only ever looks at 1000 elements at a time
during finalization, and runs less often.
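Something along these lines, just as a rough sketch (the class shape
and all the names here are made up, not existing Squeak code):

    ChunkedWeakValueDictionary >> finalizeSomeValues
        "Scan at most 1000 slots per pass, remembering where the
        last pass stopped, instead of walking the whole array every
        time finalization is signalled."
        | last |
        last := (scanIndex + 999) min: array size.
        scanIndex to: last do: [:i | | assoc |
            assoc := array at: i.
            (assoc notNil and: [assoc value isNil])
                ifTrue: [self removeSlotAt: i]].
        scanIndex := last >= array size
            ifTrue: [1]
            ifFalse: [last + 1]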

Any ideas, explanation, or input would be appreciated.
Thanks
Will




Re: A Better (different) WeakArray

Andreas.Raab
Hi William -

William Harford wrote:
> I am interested to know why WeakArray finalization eats up the CPU
> when standard gc does not. The only garbage collectors I have ever
> implemented were standard reference-counting ones, so I know little
> of the details of other collectors.

The main issue is that the garbage collector does not tell the image
which element of which array has been finalized. All the garbage
collector does is signal an event that such a finalization *has*
occurred (but not where). Therefore, if you want to react to it, you
have to scan the elements of "your" weak array, and the time spent in
that process depends both on the number of objects finalized and on
the number of slots scanned.
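
Schematically, the image-side work amounts to something like this (a
simplified sketch with made-up names, not the actual WeakRegistry
code):

    finalizationLoop
        "The VM only signals that *some* weak slot was nilled,
        never which one, so every signal costs a scan of every
        slot in the array."
        [true] whileTrue: [
            finalizationSemaphore wait.
            weakArray doWithIndex: [:slot :index |
                slot ifNil: [self finalizeSlotAt: index]]]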

> My second question is: is there a simple way to implement a WeakArray
> (WeakValueDictionary) that handles larger value sets better? Maybe a
> WeakValueDictionary that only ever looks at 1000 elements at a time
> during finalization, and runs less often.

There is no "simple" way, for a particular meaning of "simple": the
current mechanism (which I wrote years and years back) was by far the
simplest thing that could possibly work, and it has served its
intended purpose (mainly avoiding dangling resource handles like
files or sockets) very well. It was never intended to scale up to
thousands of objects. Any other mechanism will necessarily be more
complex than the current one.

Cheers,
   - Andreas


Re: A Better (different) WeakArray

Lukas Renggli
> There is no "simple" way, for a particular meaning of "simple": the
> current mechanism (which I wrote years and years back) was by far the
> simplest thing that could possibly work, and it has served its
> intended purpose (mainly avoiding dangling resource handles like
> files or sockets) very well. It was never intended to scale up to
> thousands of objects. Any other mechanism will necessarily be more
> complex than the current one.

This is a pity, since Seaside uses WeakArrays to cache session state.
We had some applications that started to spend a lot (all) of the CPU
time within the weak-finalizer process. The workaround was to disable
the use of WeakArrays, with the drawback that memory was released
much later and the application required much more memory.

For frameworks like Seaside it is essential to have a performant
implementation of WeakArrays of arbitrary size. Dictionaries should
also perform much better: access time should be O(1), as the name
suggests, and not O(n) once there are more than a few thousand
elements.
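
The effect is easy to measure; a rough timing sketch (a workspace
doIt; absolute numbers will of course vary by image and VM). Squeak's
identity hashes have only 4096 distinct values (12 bits), so probe
chains in a large IdentityDictionary grow with its size:

    | keys dict |
    keys := (1 to: 50000) collect: [:i | Object new].
    dict := IdentityDictionary new.
    keys do: [:each | dict at: each put: each].
    Transcript show: (Time millisecondsToRun:
        [keys do: [:each | dict at: each]]) printString , ' ms'; cr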

My question: how can we improve the current implementation, which
worked well for years but obviously no longer scales well enough? It
should certainly be a goal for Squeak 4 to have performant WeakArrays
and Dictionaries of arbitrary size.

Lukas

--
Lukas Renggli
http://www.lukas-renggli.ch


Re: A Better (different) WeakArray

stéphane ducasse-2
Maybe, Lukas, one way for the Seaside community to get something done
is to group together and do it yourselves. I do not expect that the
community at large has the manpower to do that right now.

The Squeak Foundation still has problems finding a way to get even
simple enhancements done. Enabling changes to happen is a real
challenge for the foundation.

Stef

On 11 Feb 06, at 10:53, Lukas Renggli wrote:

> [...]



Re: A Better (different) WeakArray

stéphane ducasse-2
In reply to this post by Lukas Renggli
By the way, I think that if someone, or a group of people, were to
step in to fix this problem, the foundation would really like to
support the initiative with some money. Otherwise I do not understand
what our role is. We could act as a consortium and create a money pot
for fixing this problem.

Stef

On 11 Feb 06, at 10:53, Lukas Renggli wrote:

> [...]



Re: A Better (different) WeakArray

Giovanni Corriga
On Sat, 11/02/2006 at 11.47 +0100, stéphane ducasse wrote:

> [...]

Would bounties be a good idea?

Also, if Google is going to hold another Summer of Code
( http://code.google.com/summerofcode.html ), the foundation could
participate as a mentoring organization.

        Giovanni



Re: A Better (different) WeakArray

stéphane ducasse-2

On 11 Feb 06, at 16:16, Giovanni Corriga wrote:

> [...]
>
> Would bounties be a good idea?

Can you explain how this would work?
I really think that we should bootstrap soon.

Avi already said that he was interested in participating (bounties
for speeding up MC).

> Also, if Google is going to hold another Summer of Code
> ( http://code.google.com/summerofcode.html ), the foundation could
> participate as a mentoring organization.

Sure. This is why we need a formal organization: to support the
cryptography package registration and other actions.

Stef

Re: A Better (different) WeakArray

Giovanni Corriga
On Sat, 11/02/2006 at 16.45 +0100, stéphane ducasse wrote:

> [...]
>
> Can you explain how this would work?
> I really think that we should bootstrap soon.
>
> Avi already said that he was interested in participating (bounties
> for speeding up MC).

The Foundation could offer a bounty on features or fixes that are
needed. The first developer to provide a working implementation of a
feature, or a solution to a problem, gets the corresponding bounty.
Examples of things for which the Foundation could offer a bounty:
- speedup of MC
- speedup of OmniBrowser
- a test suite for the Collections category
etc.

        Giovanni



Re: A Better (different) WeakArray

stéphane ducasse-2
>
> The Foundation could offer a bounty on features or fixes that are
> needed. The first developer to provide a working implementation of a
> feature, or a solution to a problem, gets the corresponding bounty.
> Examples of things for which the Foundation could offer a bounty:
> - speedup of MC
> - speedup of OmniBrowser
> - a test suite for the Collections category
> etc.


Yes, this is an excellent idea. I think that we should go that way:

- fixing the WeakArray
- fixing the refresh debug problem

I will propose that to the foundation board.

Stef


Re: A Better (different) WeakArray

David T. Lewis
In reply to this post by Andreas.Raab
I entered Mantis #2910 with an enhancement that greatly improves
performance of explicit deletion from a WeakKeyDictionary. This is
the bottleneck for e.g. a large WeakRegistry with objects being
added and removed frequently. It does *not* address finalization
performance for the reason explained by Andreas below.

If your application is able to explicitly remove objects most of
the time, as opposed to letting the finalization process take
care of it, you may find that WeakKeyDictionarySpeedup-dtl.cs
helps quite a bit.
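
For example (a hypothetical cache; the names are placeholders, the
explicit #removeKey:ifAbsent: is the point):

    "Release entries yourself when the application knows they are
    dead, rather than waiting for finalization to discover them."
    cache := WeakValueDictionary new.
    cache at: sessionId put: aSession.
    "... later, when the session is explicitly expired ..."
    cache removeKey: sessionId ifAbsent: []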

Please do not use this on a critical image such as a Seaside
server until some qualified person (aka Andreas) says it's OK.

Dave

On Fri, Feb 10, 2006 at 09:54:54PM -0500, Andreas Raab wrote:

> [...]


Re: A Better (different) WeakArray

Andreas.Raab
Unfortunately, this doesn't work. The reason for using #rehash was that
when a key in a weak key dictionary gets eliminated that key's hash
changes. Your implementation doesn't account for that and will break in
the face of some actual finalized keys in the dictionary.
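
To illustrate the failure mode (a sketch, not the actual dictionary
code):

    | dict key |
    dict := WeakKeyDictionary new.
    key := 'session' copy.
    dict at: key put: #state.
    "The association was stored at a slot chosen from key's hash.
    Once the GC nils the weak key, hashing the (now nil) key no
    longer leads to that slot, so probe sequences passing through
    it can no longer be trusted; only a full #rehash (which also
    discards nil-keyed entries) returns the table to a consistent
    state."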

Cheers,
   - Andreas

David T. Lewis wrote:

> [...]



Re: A Better (different) WeakArray

David T. Lewis
On Sat, Feb 18, 2006 at 01:56:50PM -0800, Andreas Raab wrote:
> Unfortunately, this doesn't work. The reason for using #rehash was that
> when a key in a weak key dictionary gets eliminated that key's hash
> changes. Your implementation doesn't account for that and will break in
> the face of some actual finalized keys in the dictionary.
>
> Cheers,
>    - Andreas

Darn. Thanks for looking at this.

But just so I understand: I am deleting any entries with nil keys
within the range of entries being rehashed (up to the next nil
entry), so I thought this would take care of the entries in that
range that had been eliminated by the garbage collector. The idea
was that the finalization process, which still does the full rehash,
would take care of the rest. What did I miss?
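
In code form, roughly what I was attempting (a sketch of the idea as
described, not the actual changeset; #scanFor: and #noCheckAdd: in
the HashedCollection/Set sense):

    removeKey: key ifAbsent: aBlock
        "Instead of a full #rehash after an explicit removal,
        re-insert only the entries between the removed slot and the
        next nil slot, dropping entries whose weak keys have already
        been nilled by the GC (#noCheckAdd: re-increments tally, as
        in Set)."
        | index |
        index := self scanFor: key.
        (array at: index) ifNil: [^aBlock value].
        array at: index put: nil.
        tally := tally - 1.
        index := index \\ array size + 1.
        [(array at: index) isNil] whileFalse: [
            | assoc |
            assoc := array at: index.
            array at: index put: nil.
            tally := tally - 1.
            assoc key ifNotNil: [self noCheckAdd: assoc].
            index := index \\ array size + 1]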

Thanks and sorry if I'm being dense.

Dave

P.S. To the extent that I understand this at all, your
ObjectMemory>>aFinalizationComment was very helpful; thanks for
documenting this.