William Harford wrote:
> The application uses call/render a lot. This was one of the main draws to using Seaside. It allows us to easily customize the application for our clients. We build our entire application out of smaller reusable components that can be easily customized/replaced. There are ~= 12 (and could go up to 30) components created for each request. In most cases the user moves on. Would limiting the number of pages a user can backtrack help the situation?

I'm cross-posting to Squeak-dev and I think this should migrate to that list (since it has come up there several times). Please follow up to Squeak-dev only. I don't believe this is a problem for VisualWorks, which seems to have a significantly more robust weak reference mechanism.

...as for your question... I don't have a good feeling for the impact of that. It would limit the size of the WALRUCache (?) used to track continuations, but I'm not sure that it would actually impact the lifetime of objects in the registry. Actually I don't have much intuition at all regarding the lifetime of the objects in the registry. I think the idea was that when a continuation "expires" (gets pushed out of the LRU cache) then data in its "snapshot" should be available for GC. Keeping the cache small might limit the number of items in the weak dictionary if you GC often enough. Sorry I can't say more. Maybe Julian or Avi would care to chime in ;-)

\begin{amateurHour}

It seems to me that the notification needs to be changed to actually queue information about the objects which the GC deems un(strongly)reachable. I spent some time staring at ObjectMemory>>sweepPhase, #finalizeReference: and #signalFinalization:, which seem to be the cornerstones of this process. All that #signalFinalization: is currently doing is signaling a semaphore (well, indicating that one should be "signaled" later). Why not keep a list of (oop,i) pairs [i is the offset of the weak reference in the oop] and somehow communicate those back to a Smalltalk object? As a total VM novice it just seems too simple ;-) What I think I would do is associate a queue-like thing with every weak reference container. Then when an object becomes GC-able I'd place the (oop,i) pair in that shared queue. What I need is someone to hold my hand through...

...designing this "queue like thing". How about a circular array which can only be "read" (move the read index) by ST code and only be written by the VM code? This avoids a lot of concurrency issues. Are there any examples like this in the VM?

\end{amateurHour}

Andreas said it wasn't trivial and I believe him, but I think we've got to give it a try or risk having Squeak/Seaside ignored for larger projects. It is also a big problem for anyone using GOODS or GLORP, for example, since those libraries make extensive use of weak references.

So...is there anyone with knowledge of the VM willing to step up, design the solution, and divvy up parts of it for those of us with a good knowledge of C but little of the VM?

David
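To make the circular-array idea a little more concrete, here is a purely illustrative image-side sketch (class and selector names are invented; nothing like this exists in the VM or image today). The VM would fill slots and bump writeIndex during the sweep; Smalltalk code only ever advances readIndex, so neither side steps on the other:

	Object subclass: #VMFinalizationBuffer
		instanceVariableNames: 'slots readIndex writeIndex'
		classVariableNames: ''
		poolDictionaries: ''
		category: 'System-Finalization'.

	VMFinalizationBuffer>>nextOrNil
		"Answer the next (oop,i) entry written by the VM, or nil if none is pending."
		| entry |
		readIndex = writeIndex ifTrue: [^nil].
		entry := slots at: readIndex + 1.		"indices are 0-based, slots is 1-based"
		readIndex := readIndex + 1 \\ slots size.	"only the image side moves the read index"
		^entry

The VM-side half (recognizing such a buffer and writing into it during sweepPhase) is exactly the part that would need a hand to hold, and overflow of the buffer remains an open problem.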
Hi David -
> I'm cross-posting to Squeak-dev and I think this should migrate to that list (since it has come up there several times). Please follow up to Squeak-dev only. I don't believe this is a problem for VisualWorks, which seems to have a significantly more robust weak reference mechanism.

True. From what I've seen over at the Seaside mailing list it seems pretty clear that the main issue is that the finalization process simply kicks in *way* too often. Have you guys (since you're having the problem, I think it's appropriate if you can experiment with it a little ;-) thought about tweaking the finalization process to run at most (say) 20/sec? This is a simple change (see attached CS) which could have a dramatic impact on your performance. The obvious disadvantage, of course, is that this renders any assumptions about "Smalltalk garbageCollect" having an immediate effect meaningless (which is the main reason the finalization process is implemented the way it is).

> \begin{amateurHour} [...] \end{amateurHour}

I'll reply to this separately.

Cheers,
  - Andreas

FinalizationProcess.cs (1K) Download Attachment
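(The attached changeset itself isn't shown in the archive; the following is only a guess at its rough shape, built from the stock WeakArray finalization loop that is quoted later in this thread. The 50 ms Delay is the only new part.)

	finalizationProcess
		[true] whileTrue:
			[FinalizationSemaphore wait.
			(Delay forMilliseconds: 50) wait.	"cap the scan at roughly 20 passes per second"
			FinalizationLock
				critical:
					[FinalizationDependents do:
						[:weakDependent |
						weakDependent ifNotNil: [weakDependent finalizeValues]]]
				ifError: [:msg :rcvr | rcvr error: msg]]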
In reply to this post by cdavidshaffer
> So...is there anyone with knowledge of the VM willing to step up, design the solution, and divvy up parts of it for those of us with a good knowledge of C but little of the VM?

A pox upon you if you do this in handwritten C. :)

(said the simulator enthusiast)

-C

--
Craig Latta
improvisational musical informaticist
www.netjam.org
Smalltalkers do: [:it | All with: Class, (And love: it)]
In reply to this post by cdavidshaffer
David Shaffer wrote:
> \begin{amateurHour}
>
> It seems to me that the notification needs to be changed to actually queue information about the objects which the GC deems un(strongly)reachable. I spent some time staring at ObjectMemory>>sweepPhase, #finalizeReference: and #signalFinalization:, which seem to be the cornerstones of this process. All that #signalFinalization: is currently doing is signaling a semaphore (well, indicating that one should be "signaled" later). Why not keep a list of (oop,i) pairs [i is the offset of the weak reference in the oop] and somehow communicate those back to a Smalltalk object? As a total VM novice it just seems too simple ;-) What I think I would do is associate a queue-like thing with every weak reference container. Then when an object becomes GC-able I'd place the (oop,i) pair in that shared queue. What I need is someone to hold my hand through...
>
> ...designing this "queue like thing". How about a circular array which can only be "read" (move the read index) by ST code and only be written by the VM code? This avoids a lot of concurrency issues. Are there any examples like this in the VM?
>
> \end{amateurHour}

What you've described is not a bad idea in general (and it's probably what VW does) but there are things that I don't like about it. For example, part of why the finalization process takes so much time is that there are so many weak references lost that we don't care about - the whole idea that just because you use a weak array you need to know when its contents go away is just bogus. Secondly, once you start relying on "accurate" finalization information you should really make sure it's accurate (e.g., one signal/entry per finalized object). And once you do that you need to deal with the ugly corner cases of an overflow of the finalization queue (and the effect that you probably can't allocate any larger one because the GC you're currently in was triggered by a low space condition to begin with ;-) Nasty, nasty issues.

Having said that, let me propose a mechanism that (I think) is fundamentally different and fundamentally simpler. Namely, to make the requirement that you only get notifications for the finalization of objects that you explicitly register for by creating a "finalizer" object, e.g., an observer which is allocated before it's ever needed. This simple change avoids both the problem of the GC needing to allocate memory when there is none and the problem of sending notifications about finalizations that nobody cares about, which are both very desirable properties. When the object becomes eligible for garbage collection, the finalizer is put into a list of objects that have indeed been finalized, and the finalization process simply pulls them out of the queue and sends #finalize to them.

In its simplest form, this could mean a finalizer is a structure with (besides the prev and next links for putting it into a structure) two slots: a "weak" slot for the object being guarded and a "strong" slot for the object performing the finalization (its #finalizer). When the garbage collector runs across a Finalizer and notices its observed value is being collected, it can simply put the finalizer into the finalization list and is done. (Btw, this scheme is *vastly* easier to implement than your proposed scheme since everything is pre-allocated and you only move the object from one list to another.)
But while we're at it, we could also shoot a little bit further and get away from post-mortem finalization (which I find a highly overrated concept in practice). The only thing we'd change in the above is that the garbage collector would now also transfer the object from the "weak" into the "strong" slot [*1]. This makes the finalizer the sole last reference to the object. If the finalizer drops it, it's gone. If the finalizer decides to store it, it will survive. Lots of interesting possibilities, and much cleaner since you gain access to the full context of the object and its state.

[*1] The easiest way to do this would be to simply clone the object, but unfortunately this also has the unbounded memory problem, so something a bit more clever might be required. Basically we really want *all* references to the object except from the finalizer to be cleaned up.

Note that weak arrays or other weak classes wouldn't be affected at all by this since only Finalizers get the notifications - all other weak classes would simply drop the references when they get collected and never get notified about anything.

Cheers,
  - Andreas
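To picture the first-stage scheme from the image side, here is one possible shape; every class and selector name below is invented for illustration (nothing like this exists in the image or VM yet), and it follows the layout rule that weak slots are indexed and come after the named, strong slots:

	Object weakSubclass: #Finalizer
		instanceVariableNames: 'prevLink nextLink executor'	"strong, named slots"
		classVariableNames: ''
		poolDictionaries: ''
		category: 'System-Finalization'.

	Finalizer class>>on: anObject executor: anExecutor
		"One weak, indexed slot guards anObject; the executor is held strongly."
		^(self new: 1)
			at: 1 put: anObject;
			executor: anExecutor;
			yourself

	Finalizer>>executor: anObject
		executor := anObject

	Finalizer>>finalize
		"Sent by the image-side finalization process after the GC has moved this finalizer onto the finalization list."
		executor ifNotNil: [executor finalize]

The VM-side piece - recognizing instances of such a class during the sweep and moving them onto a finalization list - is the part sketched further down in this thread.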
In reply to this post by Andreas.Raab
Andreas Raab wrote:
> Hi David -
>
>> I'm cross-posting to Squeak-dev and I think this should migrate to that list (since it has come up there several times). Please follow up to Squeak-dev only. I don't believe this is a problem for VisualWorks, which seems to have a significantly more robust weak reference mechanism.
>
> True. From what I've seen over at the Seaside mailing list it seems pretty clear that the main issue is that the finalization process simply kicks in *way* too often. Have you guys (since you're having the problem, I think it's appropriate if you can experiment with it a little ;-) thought about tweaking the finalization process to run at most (say) 20/sec? This is a simple change (see attached CS) which could have a dramatic impact on your performance. The obvious disadvantage, of course, is that this renders any assumptions about "Smalltalk garbageCollect" having an immediate effect meaningless (which is the main reason the finalization process is implemented the way it is).

well) its impact on my applications.

David
In reply to this post by Andreas.Raab
Andreas Raab wrote:
> True. From what I've seen over at the Seaside mailing list it seems pretty clear that the main issue is that the finalization process simply kicks in *way* too often. Have you guys (since you're having the problem, I think it's appropriate if you can experiment with it a little ;-) thought about tweaking the finalization process to run at most (say) 20/sec? This is a simple change (see attached CS) which could have a dramatic impact on your performance. The obvious disadvantage, of course, is that this renders any assumptions about "Smalltalk garbageCollect" having an immediate effect meaningless (which is the main reason the finalization process is implemented the way it is).

One effect this might have is to cause problems with WeakArray>>isFinalizationSupported. I've modified mine to just answer true.

David
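In other words, roughly this (the stock version tries to detect whether the VM supports finalization at all; as noted further down in the thread, that check is long obsolete):

	WeakArray class>>isFinalizationSupported
		^true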
In reply to this post by Andreas.Raab
Andreas Raab wrote:
> What you've described is not a bad idea in general (and it's probably what VW does) but there are things that I don't like about it. For example, part of why the finalization process takes so much time is that there are so many weak references lost that we don't care about - the whole idea that just because you use a weak array you need to know when its contents go away is just bogus. Secondly, once you start relying on "accurate" finalization information you should really make sure it's accurate (e.g., one signal/entry per finalized object). And once you do that you need to deal with the ugly corner cases of an overflow of the finalization queue (and the effect that you probably can't allocate any larger one because the GC you're currently in was triggered by a low space condition to begin with ;-) Nasty, nasty issues.

Agreed. Presumably it would be one queue per user-specified group of weak objects...similar to the Java implementation in the sense that "weak containers" can specify a queue if they like, otherwise a system-wide one is used. Anyway I think your idea is better...

> Having said that, let me propose a mechanism that (I think) is fundamentally different and fundamentally simpler. Namely, to make the requirement that you only get notifications for the finalization of objects that you explicitly register for by creating a "finalizer" object, e.g., an observer which is allocated before it's ever needed. This simple change avoids both the problem of the GC needing to allocate memory when there is none and the problem of sending notifications about finalizations that nobody cares about, which are both very desirable properties. When the object becomes eligible for garbage collection, the finalizer is put into a list of objects that have indeed been finalized, and the finalization process simply pulls them out of the queue and sends #finalize to them.
>
> In its simplest form, this could mean a finalizer is a structure with (besides the prev and next links for putting it into a structure) two slots: a "weak" slot for the object being guarded and a "strong" slot for the object performing the finalization (its #finalizer). When the garbage collector runs across a Finalizer and notices its observed value is being collected, it can simply put the finalizer into the finalization list and is done. (Btw, this scheme is *vastly* easier to implement than your proposed scheme since everything is pre-allocated and you only move the object from one list to another.)

This sounds great. Some implementation hints? I assume I'd need to make the weak entry recognizable by #nonWeakFieldsOf: but I don't understand the class format voodoo in that method. Also it seems that the weak slots have to come after the non-weak slots...just a detail, but is this correct?

> But while we're at it, we could also shoot a little bit further and get away from post-mortem finalization (which I find a highly overrated concept in practice). The only thing we'd change in the above is that the garbage collector would now also transfer the object from the "weak" into the "strong" slot [*1]. This makes the finalizer the sole last reference to the object. If the finalizer drops it, it's gone. If the finalizer decides to store it, it will survive. Lots of interesting possibilities and much cleaner since you gain access to the full context of the object and its state.
> [*1] The easiest way to do this would be to simply clone the object but unfortunately this also has the unbounded memory problem so something a bit more clever might be required. Basically we really want *all* references to the object except from the finalizer to be cleaned up.

Yes, this would be nicer than the "executors", which don't really have much to go on. Why would we copy? Why not some other color mark for an object reachable through the weak reference (but not through others, of course)? Then the sweep phase could identify those that are only weakly reachable and perform your switcheroo.

> Note that weak arrays or other weak classes wouldn't be affected at all by this since only Finalizers get the notifications - all other weak classes would simply drop the references when they get collected and never get notified about anything.

Yes, I guess that WeakRegistry would be the only class significantly impacted by this. I see a few other senders of addWeakDependent: around but it looks like the effort to move those to this scheme would be relatively minimal.

David
Hi David,
David Shaffer wrote:
> This sounds great. Some implementation hints? I assume I'd need to make the weak entry recognizable by #nonWeakFieldsOf: but I don't understand the class format voodoo in that method. Also it seems that the weak slots have to come after the non-weak slots...just a detail, but is this correct?

Yes, weak slots always come after the fixed slots. On the image side this is represented by the weak slots always being indexed (Dan and I designed this so that a class could have both "strong" iVars and "weak" indexed fields).

A good starting point is actually ObjectMemory>>finalizeReference:, which cleans up all the weak fields in an object. If we just replace #signalFinalization: with a type check for whether the object is a finalizer or not, and if so, add it to the finalization queue, we're basically done with the first stage :-) Checking the class would be done by adding an extra entry (the class) to the specialObjectsArray, and then we just need something that looks like this (the finalizerList is stored in the splObjects, too):

	signalFinalization: oop
		(self fetchClassOf: oop) == self classFinalizer ifTrue:[
			self addLastLink: oop toList: self finalizerList.
			self forceInterruptCheck.
			pendingFinalizationSignals _ pendingFinalizationSignals + 1.
		].

Oh, and the above is inefficient - we really want the type check to happen early in #finalizeReference: since the above is called for each slot that got nil-ed in the object (which may be more than one) - but it's a great starting point.

>> [*1] The easiest way to do this would be to simply clone the object but unfortunately this also has the unbounded memory problem so something a bit more clever might be required. Basically we really want *all* references to the object except from the finalizer to be cleaned up.
>
> Yes, this would be nicer than the "executors", which don't really have much to go on. Why would we copy? Why not some other color mark for an object reachable through the weak reference (but not through others, of course)? Then the sweep phase could identify those that are only weakly reachable and perform your switcheroo.

Actually (and I only realized that after sending off the previous message) there is a significant problem here, since you need an extra pass to trace the "inside" of the object when you recognize that it should be preserved after all. That seems more work than I originally thought it would be, and tricky work at that.

>> Note that weak arrays or other weak classes wouldn't be affected at all by this since only Finalizers get the notifications - all other weak classes would simply drop the references when they get collected and never get notified about anything.
>
> Yes, I guess that WeakRegistry would be the only class significantly impacted by this. I see a few other senders of addWeakDependent: around but it looks like the effort to move those to this scheme would be relatively minimal.

WeakRegistry and other users would be fairly straightforward to deal with - they'd just store (strong references to) Finalizers instead of (weak) object references, and the finalizer would remove itself from the registry. No big deal, really.

Cheers,
  - Andreas
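A rough sketch of what that registration side could look like, extending the illustrative Finalizer from earlier in the thread (accessProtect, finalizers, registry:/registry and removeFinalizer: are all placeholders, not current WeakRegistry internals; "anObject executor" is the usual executor copy):

	WeakRegistry>>add: anObject
		"Wrap the object in a (strongly held) Finalizer instead of holding it weakly."
		| finalizer |
		finalizer := Finalizer on: anObject executor: anObject executor.
		finalizer registry: self.
		accessProtect critical: [finalizers add: finalizer].
		^anObject

	Finalizer>>finalize
		executor ifNotNil: [executor finalize].
		registry ifNotNil: [registry removeFinalizer: self]	"unregister ourselves, as described above"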
In reply to this post by cdavidshaffer
David Shaffer wrote:
> One effect this might have is to cause problems with WeakArray>>isFinalizationSupported. I've modified mine to just answer true.

Yeah, that's code from the old days - there haven't been any VMs without finalization for, uh, five years? six?

Cheers,
  - Andreas
In reply to this post by cdavidshaffer
Ooo, finalization, my favourite nasty problem. In past experience on other Smalltalks, watching production systems burn, finalization is a process that happens in the future at the wrong time and at the wrong rate. Usually this is discovered after the system goes into production.

That said, Andreas's comment about doing finalization over a time period is a good idea; keep in mind that you might need to apply some adaptive feedback to ensure the finalization work doesn't cause performance concerns, by varying how many you look at in a pass based on the array size or something. It's possible finalization could take 20 seconds if things are busy; however, such logic was the downfall of earlier versions of VisualWorks, which would finalize only one object per new-space GC collection.

Lastly, don't forget:

	Smalltalk setGCBiasToGrowGCLimit: 16*1024*1024.
	Smalltalk setGCBiasToGrow: 1.

On 24-Mar-06, at 8:37 PM, David Shaffer wrote:
> Andreas Raab wrote:
>> True. From what I've seen over at the Seaside mailing list it seems pretty clear that the main issue is that the finalization process simply kicks in *way* too often.

--
========================================================================
John M. McIntosh <[hidden email]> 1-800-477-2659
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
========================================================================
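One possible shape for that adaptive feedback, as a fragment that could slot into the throttled finalization loop sketched earlier in the thread (the factor and floor are arbitrary, and passStart/elapsed/delayMs are new temporaries):

	passStart := Time millisecondClockValue.
	"... run the scan over FinalizationDependents as before ..."
	elapsed := Time millisecondClockValue - passStart.
	delayMs := (elapsed * 4) max: 50.	"spend at most roughly 20% of wall time finalizing"
	(Delay forMilliseconds: delayMs) wait.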
In reply to this post by Andreas.Raab
Andreas,
It sounds to me like you are talking about Ephemerons, am I right? VW has support for it; what about Squeak? I have easily implemented a Finalizer like the one you mentioned, relying on the Ephemeron mechanism. I could give you more details if you want.

Best regards,
  Cani

----- Original Message -----
From: Andreas Raab <[hidden email]>
To: The general-purpose Squeak developers list <[hidden email]>
Date: Saturday, March 25, 2006, 12:23:12 AM
Subject: Finalization (was: Re: [Seaside] WeakArray (again))

> David Shaffer wrote:
>> \begin{amateurHour}
>>
>> It seems to me that the notification needs to be changed to actually queue information about the objects which the GC deems un(strongly)reachable. I spent some time staring at ObjectMemory>>sweepPhase, #finalizeReference: and #signalFinalization:, which seem to be the cornerstones of this process. All that #signalFinalization: is currently doing is signaling a semaphore (well, indicating that one should be "signaled" later). Why not keep a list of (oop,i) pairs [i is the offset of the weak reference in the oop] and somehow communicate those back to a Smalltalk object? As a total VM novice it just seems too simple ;-) What I think I would do is associate a queue-like thing with every weak reference container. Then when an object becomes GC-able I'd place the (oop,i) pair in that shared queue. What I need is someone to hold my hand through...
>>
>> ...designing this "queue like thing". How about a circular array which can only be "read" (move the read index) by ST code and only be written by the VM code? This avoids a lot of concurrency issues. Are there any examples like this in the VM?
>>
>> \end{amateurHour}
>
> What you've described is not a bad idea in general (and it's probably what VW does) but there are things that I don't like about it. For example, part of why the finalization process takes so much time is that there are so many weak references lost that we don't care about - the whole idea that just because you use a weak array you need to know when its contents go away is just bogus. Secondly, once you start relying on "accurate" finalization information you should really make sure it's accurate (e.g., one signal/entry per finalized object). And once you do that you need to deal with the ugly corner cases of an overflow of the finalization queue (and the effect that you probably can't allocate any larger one because the GC you're currently in was triggered by a low space condition to begin with ;-) Nasty, nasty issues.
>
> Having said that, let me propose a mechanism that (I think) is fundamentally different and fundamentally simpler. Namely, to make the requirement that you only get notifications for the finalization of objects that you explicitly register for by creating a "finalizer" object, e.g., an observer which is allocated before it's ever needed. This simple change avoids both the problem of the GC needing to allocate memory when there is none and the problem of sending notifications about finalizations that nobody cares about, which are both very desirable properties. When the object becomes eligible for garbage collection, the finalizer is put into a list of objects that have indeed been finalized, and the finalization process simply pulls them out of the queue and sends #finalize to them.
> In its simplest form, this could mean a finalizer is a structure with > (besides the prev and next links for putting it into a structore) two > slots a "weak" slot for the object being guarded and a "strong" slot for > the object performing the finalization (its #finalizer). When the > garbage collector runs across a Finalizer and notices its observed value > is being collected, it can simply put the finalizer into the > finalization list and is done. (btw, this scheme is *vastly* easier to > implement than your proposed scheme since everything is pre-allocated > and you only move the object from one list to another). > But while we're at it, we could also shoot a little bit further and get > away from post-mortem finalization (which I find a highly overrated > concept in practice). The only thing we'd change in the above is that > the garbage collector would now also transfer the object from the "weak" > into the "strong" slot[*1]. This makes the finalizer the sole last > reference to the object. If the finalizer drops it, it's gone. If the > finalizer decides to store it, it will survive. Lots of interesting > possibilities and much cleaner since you gain access to the full context > of the object and its state. > [*1] The easiest way to do this would be to simply clone the object but > unfortunately this also has the unbounded memory problem so something a > bit more clever might be required. Basically we really want *all* > references to the object except from the finalizer to be cleaned up. > Note that weak arrays or other weak classes wouldn't be affected at all > by this since only Finalizers get the notifications - all other weak > classes would simply drop the references when they get collected and > never get notified about anything. > Cheers, > - Andreas X -------------------------- |
In reply to this post by Andreas.Raab
On Fri, 24 Mar 2006 19:23:12 -0800, Andreas Raab <[hidden email]>
wrote: > And once you do > that you need to deal with the ugly corner cases of an overflow of the > finalization queue (and the effect that you probably can't allocate any > larger one because the GC you're currently in was triggered by a low > space condition to begin with ;-) Nasty, nasty issues. I ran into this recently with VisualAge, and its probably the nastiest bug of my career. I still haven't completely fixed it. The context of the bug is entries in an identity map (cache) are getting finalized, and if there are too many at once, it overflows the queue, so only some of the objects in the isolated subgraph are getting finalized. If, before the finalization starts up again, someone makes a hard reference to one of the objects in the subgraph that was slated for finalization but didn't (because the weak ref in the cache is still there), we end up with the whole GC of the subgraph getting cancelled, and thus objects now referenced within the subgraph that are no longer in the cache. Very nasty, almost impossible to identity without totally screwing performance, and the only real way to recover is to just dump the whole cache. Later, Jon -------------------------------------------------------------- Jon Hylands [hidden email] http://www.huv.com/jon Project: Micro Seeker (Micro Autonomous Underwater Vehicle) http://www.huv.com |
In reply to this post by Andreas.Raab
On Mar 24, 2006, at 9:50 PM, Andreas Raab wrote:
> True. From what I've seen over at the Seaside mailing list it seems pretty clear that the main issue is that the finalization process simply kicks in *way* too often. Have you guys (since you're having the problem, I think it's appropriate if you can experiment with it a little ;-) thought about tweaking the finalization process to run at most (say) 20/sec? This is a simple change (see attached CS) which could have a dramatic impact on your performance. The obvious disadvantage, of course, is that this renders any assumptions about "Smalltalk garbageCollect" having an immediate effect meaningless (which is the main reason the finalization process is implemented the way it is).

I made some changes like this soon after I posted a message on the Seaside mailing list.

I also started running the finalization process at a lower priority level (systemBackgroundPriority). I was not sure if this would cause any problems (is there some sort of race condition I should be worried about?) but the changes have evened out my problems. It has only been a couple of days, but it appears that it may have helped mitigate some of the WeakArray finalization problems.

Thanks
Will
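For anyone wanting to try the same thing, the change amounts to restarting the loop at a lower priority - roughly this, assuming the stock WeakArray class>>finalizationProcess and that you still have a handle on the currently running process (how you find and terminate it depends on your image version):

	oldFinalizationProcess ifNotNil: [oldFinalizationProcess terminate].
	[WeakArray finalizationProcess]
		forkAt: Processor systemBackgroundPriority.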
The only concern I would have is access to the valueDictionary in WeakRegistry when processing the array at a low background priority, since it may block adding new elements. However, a fix may be to add a shared queue that add: uses to place elements on; later on, finalizeValues sticks the elements into the value dictionary. I'm not sure, though, whether other accesses to the value dictionary might impact things.

On 25-Mar-06, at 11:12 AM, William Harford wrote:
> On Mar 24, 2006, at 9:50 PM, Andreas Raab wrote:
>> True. From what I've seen over at the Seaside mailing list it seems pretty clear that the main issue is that the finalization process simply kicks in *way* too often. Have you guys (since you're having the problem, I think it's appropriate if you can experiment with it a little ;-) thought about tweaking the finalization process to run at most (say) 20/sec? This is a simple change (see attached CS) which could have a dramatic impact on your performance. The obvious disadvantage, of course, is that this renders any assumptions about "Smalltalk garbageCollect" having an immediate effect meaningless (which is the main reason the finalization process is implemented the way it is).
>
> I made some changes like this soon after I posted a message on the Seaside mailing list.
>
> I also started running the finalization process at a lower priority level (systemBackgroundPriority). I was not sure if this would cause any problems (is there some sort of race condition I should be worried about?) but the changes have evened out my problems. It has only been a couple of days, but it appears that it may have helped mitigate some of the WeakArray finalization problems.
>
> Thanks
> Will

--
========================================================================
John M. McIntosh <[hidden email]> 1-800-477-2659
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
========================================================================
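As a sketch of that idea (pendingAdds - a SharedQueue - and privateAddToDictionary: are assumed additions, not existing WeakRegistry code, and the existing method bodies are only hinted at):

	WeakRegistry>>add: anObject
		"Defer the real insert so callers never contend for the value dictionary."
		pendingAdds nextPut: anObject.
		^anObject

	WeakRegistry>>finalizeValues
		"Drain deferred adds into valueDictionary first, then scan as before."
		[pendingAdds isEmpty] whileFalse:
			[self privateAddToDictionary: pendingAdds next].
		"... existing finalization scan ..."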
In reply to this post by William Harford
William Harford wrote:
> On Mar 24, 2006, at 9:50 PM, Andreas Raab wrote:
>> True. From what I've seen over at the Seaside mailing list it seems pretty clear that the main issue is that the finalization process simply kicks in *way* too often. Have you guys (since you're having the problem, I think it's appropriate if you can experiment with it a little ;-) thought about tweaking the finalization process to run at most (say) 20/sec? This is a simple change (see attached CS) which could have a dramatic impact on your performance. The obvious disadvantage, of course, is that this renders any assumptions about "Smalltalk garbageCollect" having an immediate effect meaningless (which is the main reason the finalization process is implemented the way it is).
>
> I made some changes like this soon after I posted a message on the Seaside mailing list.
>
> I also started running the finalization process at a lower priority level (systemBackgroundPriority). I was not sure if this would cause any problems (is there some sort of race condition I should be worried about?) but the changes have evened out my problems. It has only been a couple of days, but it appears that it may have helped mitigate some of the WeakArray finalization problems.
>
> Thanks
> Will

The FinalizationDependents array is ~100k BTW. On my server image (which has been stable for a long time now) the array is 1k. My understanding of the notification mechanism is that each weak object collection signals the semaphore once, so if you get a bunch of collections you are iterating over this array quite a few times. So, even with the delay, you're going to run through the loop over the weak dependents once for each collected weak object ref. That's a lot of wasted cycles, since after the first pass you probably already #finalizeValues-ed everyone that needed it.

Growth only occurs in addWeakDependent: when there are no empty slots, so I'm surprised that it ever got this large. Anyway, after a short time it is quite sparsely populated, so it wouldn't be hard to shrink it down. One could instrument #finalizationProcess to count the nil entries and, when they cross a low-water mark, compact the array. Continuing along my "amateurs can do it too" theme I went ahead and coded this...I'm not convinced that it is the right thing to do but it seems worth a try.

David

'From Squeak3.7 of ''4 September 2004'' [latest update: #5989] on 25 March 2006 at 3:02:16 pm'!

!WeakArray class methodsFor: 'private' stamp: 'cds 3/25/2006 15:01'!
compactFinalizationDependents: minimumSize
	| tmp index |
	tmp := WeakArray new: minimumSize + 10.
	index := 1.
	FinalizationLock
		critical: [
			FinalizationDependents do: [:dependent |
				dependent ifNotNil: [
					tmp at: index put: dependent.
					index := index + 1]].
			FinalizationDependents := tmp]
		ifError: [:msg :rcvr | rcvr error: msg]! !

!WeakArray class methodsFor: 'private' stamp: 'cds 3/25/2006 15:01'!
finalizationProcess
	| nilEntries |
	nilEntries := 0.
	[true] whileTrue: [
		FinalizationSemaphore wait.
		FinalizationLock
			critical: [
				FinalizationDependents do: [:weakDependent |
					weakDependent
						ifNotNil: [
							weakDependent finalizeValues.
							"***Following statement is required to keep weakDependent
							from holding onto its value as garbage.***"
							weakDependent _ nil]
						ifNil: [nilEntries := nilEntries + 1]]]
			ifError: [:msg :rcvr | rcvr error: msg].
		"Check if we should compact the array"
		(nilEntries > (FinalizationDependents size quo: 4)
			and: [FinalizationDependents size > 10])
				ifTrue: [self compactFinalizationDependents: (FinalizationDependents size - nilEntries)]]! !
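If you want to check how large and how sparse your own FinalizationDependents array is before trying this, something like the following works (the classPool access is only for poking around; it answers {total size. nil entries}):

	| deps |
	deps := WeakArray classPool at: #FinalizationDependents.
	{ deps size.
	  deps inject: 0 into: [:sum :d | d isNil ifTrue: [sum + 1] ifFalse: [sum]] }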
David Shaffer wrote:
> My understanding of the notification mechanism is that each weak object collection signals the semaphore once, so if you get a bunch of collections you are iterating over this array quite a few times. So, even with the delay, you're going to run through the loop over the weak dependents once for each collected weak object ref.

Oops..."my understanding" was wrong :-) I found the relevant code:

	"signal any pending finalizations"
	pendingFinalizationSignals > 0 ifTrue:
		[sema _ self splObj: TheFinalizationSemaphore.
		(self fetchClassOf: sema) = (self splObj: ClassSemaphore)
			ifTrue: [self synchronousSignal: sema].
		pendingFinalizationSignals _ 0].

So, no matter how large pendingFinalizationSignals is, the semaphore is only signaled once. Why use an int, then?

David
In reply to this post by johnmci
John M McIntosh wrote:
> Ooo, finalization, my favourite nasty problem. In past experience on other Smalltalks, watching production systems burn, finalization is a process that happens in the future at the wrong time and at the wrong rate. Usually this is discovered after the system goes into production.
>
> That said, Andreas's comment about doing finalization over a time period is a good idea; keep in mind that you might need to apply some adaptive feedback to ensure the finalization work doesn't cause performance concerns, by varying how many you look at in a pass based on the array size or something. It's possible finalization could take 20 seconds if things are busy; however, such logic was the downfall of earlier versions of VisualWorks, which would finalize only one object per new-space GC collection.

This sounds like another useful improvement. What do you think of the compaction code I posted? Seem like a waste of time? I'm working on a version which gives some feedback. I've got a pre-production project which will be a good testbed for this stuff. The number of active users is small but their sessions should last all day.

Another (naive, sorry) question: With the "delayed" version there may be several pending finalization signals, but we only need to run through the loop once. What we need is something like:

	conditionVariable ifSetClearAndDo: [some block]

Given that the VM's GC can't block on the "condition variable", though, its use here wouldn't really work. But we can do a poor man's job of it:

	FinalizationSemaphore wait.
	pending := FinalizationSemaphore excessSignals.	"you'd have to add that accessor"
	[pending > 0] whileTrue:
		[FinalizationSemaphore wait.
		pending := pending - 1]

This counts on absolutely no one else "waiting" on the finalization semaphore. I think that's reasonable given it is a class var in WeakArray. Comments? Leave out the part about a semaphore's signal count being private and I'm a bad boy for even thinking about this :-)

> lastly don't forget
>
> Smalltalk setGCBiasToGrowGCLimit: 16*1024*1024.
> Smalltalk setGCBiasToGrow: 1.

Unfortunately I'm working in 3.7 exclusively and this seems to be missing. I've bookmarked this e-mail for when I move to a newer VM/image. :-)

David
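For reference, the accessor mentioned above would just expose the instance variable (assuming the stock Semaphore keeps its signal count in excessSignals; it is not in the shipped image, which is exactly the "bad boy" part):

	Semaphore>>excessSignals
		^excessSignals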
In reply to this post by Nicolás Cañibano
Nicolás Cañibano wrote:
> Andreas,
>   It sounds to me like you are talking about Ephemerons, am I right? VW has support for it; what about Squeak? I have easily implemented a Finalizer like the one you mentioned, relying on the Ephemeron mechanism. I could give you more details if you want.
>
> Best regards,
>   Cani

Please share with the whole group :-)

David
In reply to this post by cdavidshaffer
On 25-Mar-06, at 12:26 PM, David Shaffer wrote:
> So, no matter how large pendingFinalizationSignals is, the semaphore is only signaled once. Why use an int, then?

Because that's VM code (i.e., C in sheep's clothing) and C is too dumb to really understand booleans.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: RDR: Rotate Disk Right
In reply to this post by Nicolás Cañibano
Hi -
No I'm not talking about Ephemerons - having done an implementation for fun in the past I'm quite aware about the differences ;-) What I'm proposing here is simply a (precise) notification mechanism for object finalization. Cheers, - Andreas Nicolás Cañibano wrote: > Andreas, > It sounds to me like you are talking about Ephemerons, am I > right? VW has support to it, what about Squeak? I have easily > implemented a Finalizer like the one you mentioned, relying in the > Ephemeron mechanism. I could give you more details if you want. > > Best regards, > Cani > > ----- Original Message ----- > From: Andreas Raab <[hidden email]> > To: The general-purpose Squeak developers list <[hidden email]> > Date: Saturday, March 25, 2006, 12:23:12 AM > Subject: Finalization (was: Re: [Seaside] WeakArray (again)) > >> David Shaffer wrote: >>> \begin{amateurHour} >>> >>> It seems to me that the notification needs to be changed to actually >>> queueing information about the objects which the GC deams >>> un(strongly)reachable. I spent some time staring at >>> ObjectMemory>>sweepPhase, #finalizeReference: and #signalFinalization: >>> which seem to be the cornerstones of this process. All that >>> #signalFinalization: is currently doing is signaling a semaphore (well, >>> indicating that one should be "signaled" later). Why not keep a list of >>> (oop,i) [i is the offset of the weak reference in the oop] pairs and >>> somehow communicate those back to a Smalltalk object? As a total VM >>> novice it just seems too simple ;-) What I think I would do is >>> associate a queue like thing with every weak reference container. Then >>> when an object becomes GC-able I'd place the (oop,i) pair in that shared >>> queue. What I need is someone to hold my hand through... >>> >>> ...designing this "queue like thing". How about a circular array which >>> can only be "read" (move the read index) by ST code and only be written >>> by the VM code? This avoids a lot of concurrency issues. Are there any >>> examples like this in the VM? >>> >>> \end{amateurHour} > >> What you've described is not a bad idea in general (and it's probably >> what VW does) but there are things that I don't like about it. For >> example, part of why the finalization process takes so much time is that >> there are so many weak references lost that we don't care about - the >> whole idea that just because you use a weak array you need to know when >> its contents goes away is just bogus. Secondly, once you start relying >> on "accurate" finalization information you should really make sure it's >> accurate (e.g., one signal/entry per finalized object). And once you do >> that you need to deal with the ugly corner cases of an overflow of the >> finalization queue (and the effect that you probably can't allocate any >> larger one because the GC you're currently in was triggered by a low >> space condition to begin with ;-) Nasty, nasty issues. > >> Having said that, let me propose a mechanism that (I think) is >> fundamentally different and fundamentally simpler. Namely, to make the >> requirement that you only get notifications for the finalization of >> objects that you explicitly register for by creating a "finalizer" >> object, e.g., an observer which is allocated before it's ever needed. >> This simple change avoids both the problem of GC needing to allocate >> memory when there is none as well as sending notifications about >> finalizations that nobody cares about, which are both very desirable >> properties. 
When the object becomes eligible for garbage collection, the >> finalizer is then put into a list of objects that have indeed been >> finalized and the finalization process simply pulls them out of the >> queue and sends #finalize to them. > >> In its simplest form, this could mean a finalizer is a structure with >> (besides the prev and next links for putting it into a structore) two >> slots a "weak" slot for the object being guarded and a "strong" slot for >> the object performing the finalization (its #finalizer). When the >> garbage collector runs across a Finalizer and notices its observed value >> is being collected, it can simply put the finalizer into the >> finalization list and is done. (btw, this scheme is *vastly* easier to >> implement than your proposed scheme since everything is pre-allocated >> and you only move the object from one list to another). > >> But while we're at it, we could also shoot a little bit further and get >> away from post-mortem finalization (which I find a highly overrated >> concept in practice). The only thing we'd change in the above is that >> the garbage collector would now also transfer the object from the "weak" >> into the "strong" slot[*1]. This makes the finalizer the sole last >> reference to the object. If the finalizer drops it, it's gone. If the >> finalizer decides to store it, it will survive. Lots of interesting >> possibilities and much cleaner since you gain access to the full context >> of the object and its state. > >> [*1] The easiest way to do this would be to simply clone the object but >> unfortunately this also has the unbounded memory problem so something a >> bit more clever might be required. Basically we really want *all* >> references to the object except from the finalizer to be cleaned up. > >> Note that weak arrays or other weak classes wouldn't be affected at all >> by this since only Finalizers get the notifications - all other weak >> classes would simply drop the references when they get collected and >> never get notified about anything. > >> Cheers, >> - Andreas > > X -------------------------- > > > |