A Better (different) WeakArray

William Harford
Hi All,

I am building a persistent object-to-relational storage engine and am
using a WeakValueDictionary as an object cache. This works great for a
small number of objects (< 1000), but once the cache holds more than a
couple thousand objects the WeakArray finalization process begins to
take over and grinds the image to a halt.

I am interested to know why WeakArray finalization eats up the CPU
when standard gc does not. The only garbage collectors I have ever
implemented were standard reference-counting ones, so I know little
of the details of other collectors.

My second question is: is there a simple way to implement a WeakArray
(WeakValueDictionary) that handles larger value sets better? Maybe a
WeakValueDictionary that only ever looks at 1000 elements at a time
during finalization, and runs less often.
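Something along these lines, just as a rough sketch (the class shape
and all the names here are made up, not existing Squeak code):

    ChunkedWeakValueDictionary >> finalizeSomeValues
        "Scan at most 1000 slots per pass, remembering where the
        last pass stopped, instead of walking the whole array every
        time finalization is signalled."
        | last |
        last := (scanIndex + 999) min: array size.
        scanIndex to: last do: [:i | | assoc |
            assoc := array at: i.
            (assoc notNil and: [assoc value isNil])
                ifTrue: [self removeSlotAt: i]].
        scanIndex := last >= array size
            ifTrue: [1]
            ifFalse: [last + 1]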

Any ideas, explanation, or input would be appreciated.
Thanks
Will




Re: A Better (different) WeakArray

Andreas.Raab
Hi William -

William Harford wrote:
> I am interested to know why WeakArray finalization eats up the CPU
> when standard gc does not. The only garbage collectors I have ever
> implemented were standard reference-counting ones, so I know little
> of the details of other collectors.

The main issue is that the garbage collector does not tell the image
which element of which array has been finalized. All the garbage
collector does is signal an event that such a finalization *has*
occurred (but not where). Therefore, if you want to react to it, you
have to scan the elements of "your" weak array, and the time spent in
that process depends both on the number of objects finalized and on
the number of slots scanned.
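
Schematically, the image-side work amounts to something like this (a
simplified sketch with made-up names, not the actual WeakRegistry
code):

    finalizationLoop
        "The VM only signals that *some* weak slot was nilled,
        never which one, so every signal costs a scan of every
        slot in the array."
        [true] whileTrue: [
            finalizationSemaphore wait.
            weakArray doWithIndex: [:slot :index |
                slot ifNil: [self finalizeSlotAt: index]]]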

> My second question is: is there a simple way to implement a WeakArray
> (WeakValueDictionary) that handles larger value sets better? Maybe a
> WeakValueDictionary that only ever looks at 1000 elements at a time
> during finalization, and runs less often.

There is no "simple" way, for a particular meaning of "simple": the
current mechanism (which I wrote years and years back) was by far the
simplest thing that could possibly work, and it has served its
intended purpose (mainly avoiding dangling resource handles like
files or sockets) very well. It was never intended to scale up to
thousands of objects. Any other mechanism will necessarily be more
complex than the current one.

Cheers,
   - Andreas


Re: A Better (different) WeakArray

Lukas Renggli
> There is no "simple" way, for a particular meaning of "simple": the
> current mechanism (which I wrote years and years back) was by far the
> simplest thing that could possibly work, and it has served its
> intended purpose (mainly avoiding dangling resource handles like
> files or sockets) very well. It was never intended to scale up to
> thousands of objects. Any other mechanism will necessarily be more
> complex than the current one.

This is a pity, since Seaside uses WeakArrays to cache session state.
We had some applications that started to spend a lot (all) of the CPU
time within the weak-finalizer process. The workaround was to disable
the use of WeakArrays, with the drawback that memory was released
much later and the application required much more memory.

For frameworks like Seaside it is essential to have a performant
implementation of WeakArrays of arbitrary size. Dictionaries should
also perform much better: access time should be O(1), as the name
suggests, and not O(n) once there are more than a few thousand
elements.
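
The effect is easy to measure; a rough timing sketch (a workspace
doIt; absolute numbers will of course vary by image and VM). Squeak's
identity hashes have only 4096 distinct values (12 bits), so probe
chains in a large IdentityDictionary grow with its size:

    | keys dict |
    keys := (1 to: 50000) collect: [:i | Object new].
    dict := IdentityDictionary new.
    keys do: [:each | dict at: each put: each].
    Transcript show: (Time millisecondsToRun:
        [keys do: [:each | dict at: each]]) printString , ' ms'; cr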

My question: how can we improve the current implementation, which
worked well for years but obviously no longer scales well enough? It
should certainly be a goal for Squeak 4 to have performant WeakArrays
and Dictionaries of arbitrary size.

Lukas

--
Lukas Renggli
http://www.lukas-renggli.ch


Re: A Better (different) WeakArray

stéphane ducasse-2
Maybe, Lukas, one way for the Seaside community to get something done
is to group together and do it yourselves. I do not expect that the
community at large has the manpower to do that right now.

The Squeak Foundation still has problems finding a way to get even
simple enhancements done. Enabling changes to happen is a real
challenge for the foundation.

Stef

On 11 Feb 06, at 10:53, Lukas Renggli wrote:

> [...]



Re: A Better (different) WeakArray

stéphane ducasse-2
In reply to this post by Lukas Renggli
By the way, I think that if someone, or a group of people, were to
step in to fix this problem, the foundation would really like to
support the initiative with some money. Otherwise I do not understand
what our role is. We could act as a consortium and create a money pot
for fixing this problem.

Stef

On 11 Feb 06, at 10:53, Lukas Renggli wrote:

> [...]



Re: A Better (different) WeakArray

Giovanni Corriga
On Sat, 11/02/2006 at 11.47 +0100, stéphane ducasse wrote:

> [...]

Would bounties be a good idea?

Also, if Google is going to hold another Summer of Code
( http://code.google.com/summerofcode.html ), the foundation could
participate as a mentoring organization.

        Giovanni



Re: A Better (different) WeakArray

stéphane ducasse-2

On 11 Feb 06, at 16:16, Giovanni Corriga wrote:

> [...]
>
> Would bounties be a good idea?

Can you explain how this would work?
I really think that we should bootstrap soon.

Avi already said that he was interested in participating (bounties
for speeding up MC).

> Also, if Google is going to hold another Summer of Code
> ( http://code.google.com/summerofcode.html ), the foundation could
> participate as a mentoring organization.

Sure. This is why we need a formal organization: to support the
cryptography package registration and other actions.

Stef

Re: A Better (different) WeakArray

Giovanni Corriga
On Sat, 11/02/2006 at 16.45 +0100, stéphane ducasse wrote:

> [...]
>
> Can you explain how this would work?
> I really think that we should bootstrap soon.
>
> Avi already said that he was interested in participating (bounties
> for speeding up MC).

The Foundation could offer a bounty on features or fixes that are
needed. The first developer to provide a working implementation of a
feature, or a solution to a problem, gets the corresponding bounty.
Examples of things for which the Foundation could offer a bounty:
- speedup of MC
- speedup of OmniBrowser
- a test suite for the Collections category
etc.

        Giovanni



Re: A Better (different) WeakArray

stéphane ducasse-2
>
> The Foundation could offer a bounty on features or fixes that are
> needed. The first developer to provide a working implementation of a
> feature, or a solution to a problem, gets the corresponding bounty.
> Examples of things for which the Foundation could offer a bounty:
> - speedup of MC
> - speedup of OmniBrowser
> - a test suite for the Collections category
> etc.


Yes, this is an excellent idea. I think that we should go that way:

- fixing the WeakArray
- fixing the refresh debug problem

I will propose that to the foundation board.

Stef


Re: A Better (different) WeakArray

David T. Lewis
In reply to this post by Andreas.Raab
I entered Mantis #2910 with an enhancement that greatly improves
performance of explicit deletion from a WeakKeyDictionary. This is
the bottleneck for e.g. a large WeakRegistry with objects being
added and removed frequently. It does *not* address finalization
performance for the reason explained by Andreas below.

If your application is able to explicitly remove objects most of
the time, as opposed to letting the finalization process take
care of it, you may find that WeakKeyDictionarySpeedup-dtl.cs
helps quite a bit.
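
For example (a hypothetical cache; the names are placeholders, the
explicit #removeKey:ifAbsent: is the point):

    "Release entries yourself when the application knows they are
    dead, rather than waiting for finalization to discover them."
    cache := WeakValueDictionary new.
    cache at: sessionId put: aSession.
    "... later, when the session is explicitly expired ..."
    cache removeKey: sessionId ifAbsent: []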

Please do not use this on a critical image such as a Seaside
server until some qualified person (aka Andreas) says it's OK.

Dave

On Fri, Feb 10, 2006 at 09:54:54PM -0500, Andreas Raab wrote:

> [...]


Re: A Better (different) WeakArray

Andreas.Raab
Unfortunately, this doesn't work. The reason for using #rehash was that
when a key in a weak key dictionary gets eliminated that key's hash
changes. Your implementation doesn't account for that and will break in
the face of some actual finalized keys in the dictionary.
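
To illustrate the failure mode (a sketch, not the actual dictionary
code):

    | dict key |
    dict := WeakKeyDictionary new.
    key := 'session' copy.
    dict at: key put: #state.
    "The association was stored at a slot chosen from key's hash.
    Once the GC nils the weak key, hashing the (now nil) key no
    longer leads to that slot, so probe sequences passing through
    it can no longer be trusted; only a full #rehash (which also
    discards nil-keyed entries) returns the table to a consistent
    state."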

Cheers,
   - Andreas

David T. Lewis wrote:

> [...]



Re: A Better (different) WeakArray

David T. Lewis
On Sat, Feb 18, 2006 at 01:56:50PM -0800, Andreas Raab wrote:
> Unfortunately, this doesn't work. The reason for using #rehash was that
> when a key in a weak key dictionary gets eliminated that key's hash
> changes. Your implementation doesn't account for that and will break in
> the face of some actual finalized keys in the dictionary.
>
> Cheers,
>    - Andreas

Darn. Thanks for looking at this.

But just so I understand: I am deleting any entries with nil keys
within the range of entries being rehashed (up to the next nil
entry), so I thought this would take care of the entries in that
range that had been eliminated by the garbage collector. The idea
was that the finalization process, which still does the full rehash,
would take care of the rest. What did I miss?
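
In code form, roughly what I was attempting (a sketch of the idea as
described, not the actual changeset; #scanFor: and #noCheckAdd: in
the HashedCollection/Set sense):

    removeKey: key ifAbsent: aBlock
        "Instead of a full #rehash after an explicit removal,
        re-insert only the entries between the removed slot and the
        next nil slot, dropping entries whose weak keys have already
        been nilled by the GC (#noCheckAdd: re-increments tally, as
        in Set)."
        | index |
        index := self scanFor: key.
        (array at: index) ifNil: [^aBlock value].
        array at: index put: nil.
        tally := tally - 1.
        index := index \\ array size + 1.
        [(array at: index) isNil] whileFalse: [
            | assoc |
            assoc := array at: index.
            array at: index put: nil.
            tally := tally - 1.
            assoc key ifNotNil: [self noCheckAdd: assoc].
            index := index \\ array size + 1]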

Thanks and sorry if I'm being dense.

Dave

P.S. To the extent that I understand this at all, your
ObjectMemory>>aFinalizationComment was very helpful; thanks for
documenting this.