William Harford wrote:
> The application uses call/render a lot. This was one of the main draws to using Seaside. It allows us to easily customize the application for our clients. We build our entire application out of smaller reusable components that can be easily customized/replaced. There are ~= 12 (and could go up to 30) components created for each request. In most cases the user moves on. Would limiting the number of pages a user can backtrack help the situation?

I'm cross-posting to Squeak-dev and I think this should migrate to that list (since it has come up there several times). Please follow up to Squeak-dev only. I don't believe this is a problem for VisualWorks, which seems to have a significantly more robust weak reference mechanism.

...as for your question... I don't have a good feeling for the impact of that. It would limit the size of the WALRUCache (?) used to track continuations, but I'm not sure that it would actually impact the lifetime of objects in the registry. Actually I don't have much intuition at all regarding the lifetime of the objects in the registry. I think the idea was that when a continuation "expires" (gets pushed out of the LRU cache) then data in its "snapshot" should be available for GC. Keeping the cache small might limit the number of items in the weak dictionary if you GC often enough. Sorry I can't say more. Maybe Julian or Avi would care to chime in ;-)

\begin{amateurHour}

It seems to me that the notification needs to be changed to actually queue information about the objects which the GC deems un(strongly)reachable. I spent some time staring at ObjectMemory>>sweepPhase, #finalizeReference: and #signalFinalization:, which seem to be the cornerstones of this process. All that #signalFinalization: is currently doing is signaling a semaphore (well, indicating that one should be "signaled" later). Why not keep a list of (oop,i) pairs [i is the offset of the weak reference in the oop] and somehow communicate those back to a Smalltalk object? As a total VM novice it just seems too simple ;-) What I think I would do is associate a queue-like thing with every weak reference container. Then when an object becomes GC-able I'd place the (oop,i) pair in that shared queue. What I need is someone to hold my hand through...

...designing this "queue like thing". How about a circular array which can only be "read" (move the read index) by ST code and only be written by the VM code? This avoids a lot of concurrency issues. Are there any examples like this in the VM?

\end{amateurHour}

Andreas said it wasn't trivial and I believe him, but I think we've got to give it a try or risk having Squeak/Seaside ignored for larger projects. It is also a big problem for anyone using GOODS or GLORP, for example, since those libraries make extensive use of weak references.

So...is there anyone with knowledge of the VM willing to step up, design the solution, and divvy up parts of it for those of us with a good knowledge of C but little of the VM?

David
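To make the circular-array idea a little more concrete, here is a purely illustrative image-side sketch (class and selector names are invented; nothing like this exists in the VM or image today). The VM would fill slots and bump writeIndex during the sweep; Smalltalk code only ever advances readIndex, so neither side steps on the other:

	Object subclass: #VMFinalizationBuffer
		instanceVariableNames: 'slots readIndex writeIndex'
		classVariableNames: ''
		poolDictionaries: ''
		category: 'System-Finalization'.

	VMFinalizationBuffer>>nextOrNil
		"Answer the next (oop,i) entry written by the VM, or nil if none is pending."
		| entry |
		readIndex = writeIndex ifTrue: [^nil].
		entry := slots at: readIndex + 1.		"indices are 0-based, slots is 1-based"
		readIndex := readIndex + 1 \\ slots size.	"only the image side moves the read index"
		^entry

The VM-side half (recognizing such a buffer and writing into it during sweepPhase) is exactly the part that would need a hand to hold, and overflow of the buffer remains an open problem.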
Hi David -
> I'm cross-posting to Squeak-dev and I think this should migrate to that list (since it has come up there several times). Please follow up to Squeak-dev only. I don't believe this is a problem for VisualWorks, which seems to have a significantly more robust weak reference mechanism.

True. From what I've seen over at the Seaside mailing list it seems pretty clear that the main issue is that the finalization process simply kicks in *way* too often. Have you guys (since you're having the problem, I think it's appropriate if you can experiment with it a little ;-) thought about tweaking the finalization process to run at most (say) 20/sec? This is a simple change (see attached CS) which could have a dramatic impact on your performance. The obvious disadvantage, of course, is that this renders any assumptions about "Smalltalk garbageCollect" having an immediate effect meaningless (which is the main reason the finalization process is implemented the way it is).

> \begin{amateurHour} [...] \end{amateurHour}

I'll reply to this separately.

Cheers,
  - Andreas

FinalizationProcess.cs (1K) Download Attachment
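(The attached changeset itself isn't shown in the archive; the following is only a guess at its rough shape, built from the stock WeakArray finalization loop that is quoted later in this thread. The 50 ms Delay is the only new part.)

	finalizationProcess
		[true] whileTrue:
			[FinalizationSemaphore wait.
			(Delay forMilliseconds: 50) wait.	"cap the scan at roughly 20 passes per second"
			FinalizationLock
				critical:
					[FinalizationDependents do:
						[:weakDependent |
						weakDependent ifNotNil: [weakDependent finalizeValues]]]
				ifError: [:msg :rcvr | rcvr error: msg]]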
In reply to this post by cdavidshaffer
> So...is there anyone with knowledge of the VM willing to step up, design the solution, and divvy up parts of it for those of us with a good knowledge of C but little of the VM?

A pox upon you if you do this in handwritten C. :)

(said the simulator enthusiast)

-C

--
Craig Latta
improvisational musical informaticist
www.netjam.org
Smalltalkers do: [:it | All with: Class, (And love: it)]
In reply to this post by cdavidshaffer
David Shaffer wrote:
> \begin{amateurHour}
>
> It seems to me that the notification needs to be changed to actually queue information about the objects which the GC deems un(strongly)reachable. I spent some time staring at ObjectMemory>>sweepPhase, #finalizeReference: and #signalFinalization:, which seem to be the cornerstones of this process. All that #signalFinalization: is currently doing is signaling a semaphore (well, indicating that one should be "signaled" later). Why not keep a list of (oop,i) pairs [i is the offset of the weak reference in the oop] and somehow communicate those back to a Smalltalk object? As a total VM novice it just seems too simple ;-) What I think I would do is associate a queue-like thing with every weak reference container. Then when an object becomes GC-able I'd place the (oop,i) pair in that shared queue. What I need is someone to hold my hand through...
>
> ...designing this "queue like thing". How about a circular array which can only be "read" (move the read index) by ST code and only be written by the VM code? This avoids a lot of concurrency issues. Are there any examples like this in the VM?
>
> \end{amateurHour}

What you've described is not a bad idea in general (and it's probably what VW does) but there are things that I don't like about it. For example, part of why the finalization process takes so much time is that there are so many weak references lost that we don't care about - the whole idea that just because you use a weak array you need to know when its contents go away is just bogus. Secondly, once you start relying on "accurate" finalization information you should really make sure it's accurate (e.g., one signal/entry per finalized object). And once you do that you need to deal with the ugly corner cases of an overflow of the finalization queue (and the effect that you probably can't allocate any larger one because the GC you're currently in was triggered by a low space condition to begin with ;-) Nasty, nasty issues.

Having said that, let me propose a mechanism that (I think) is fundamentally different and fundamentally simpler. Namely, to make the requirement that you only get notifications for the finalization of objects that you explicitly register for by creating a "finalizer" object, e.g., an observer which is allocated before it's ever needed. This simple change avoids both the problem of the GC needing to allocate memory when there is none and the problem of sending notifications about finalizations that nobody cares about, which are both very desirable properties. When the object becomes eligible for garbage collection, the finalizer is put into a list of objects that have indeed been finalized, and the finalization process simply pulls them out of the queue and sends #finalize to them.

In its simplest form, this could mean a finalizer is a structure with (besides the prev and next links for putting it into a structure) two slots: a "weak" slot for the object being guarded and a "strong" slot for the object performing the finalization (its #finalizer). When the garbage collector runs across a Finalizer and notices its observed value is being collected, it can simply put the finalizer into the finalization list and is done. (Btw, this scheme is *vastly* easier to implement than your proposed scheme since everything is pre-allocated and you only move the object from one list to another.)
But while we're at it, we could also shoot a little bit further and get away from post-mortem finalization (which I find a highly overrated concept in practice). The only thing we'd change in the above is that the garbage collector would now also transfer the object from the "weak" into the "strong" slot [*1]. This makes the finalizer the sole last reference to the object. If the finalizer drops it, it's gone. If the finalizer decides to store it, it will survive. Lots of interesting possibilities, and much cleaner since you gain access to the full context of the object and its state.

[*1] The easiest way to do this would be to simply clone the object, but unfortunately this also has the unbounded memory problem, so something a bit more clever might be required. Basically we really want *all* references to the object except from the finalizer to be cleaned up.

Note that weak arrays or other weak classes wouldn't be affected at all by this since only Finalizers get the notifications - all other weak classes would simply drop the references when they get collected and never get notified about anything.

Cheers,
  - Andreas
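To picture the first-stage scheme from the image side, here is one possible shape; every class and selector name below is invented for illustration (nothing like this exists in the image or VM yet), and it follows the layout rule that weak slots are indexed and come after the named, strong slots:

	Object weakSubclass: #Finalizer
		instanceVariableNames: 'prevLink nextLink executor'	"strong, named slots"
		classVariableNames: ''
		poolDictionaries: ''
		category: 'System-Finalization'.

	Finalizer class>>on: anObject executor: anExecutor
		"One weak, indexed slot guards anObject; the executor is held strongly."
		^(self new: 1)
			at: 1 put: anObject;
			executor: anExecutor;
			yourself

	Finalizer>>executor: anObject
		executor := anObject

	Finalizer>>finalize
		"Sent by the image-side finalization process after the GC has moved this finalizer onto the finalization list."
		executor ifNotNil: [executor finalize]

The VM-side piece - recognizing instances of such a class during the sweep and moving them onto a finalization list - is the part sketched further down in this thread.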
In reply to this post by Andreas.Raab
Andreas Raab wrote:
> Hi David -
>
>> I'm cross-posting to Squeak-dev and I think this should migrate to that list (since it has come up there several times). Please follow up to Squeak-dev only. I don't believe this is a problem for VisualWorks, which seems to have a significantly more robust weak reference mechanism.
>
> True. From what I've seen over at the Seaside mailing list it seems pretty clear that the main issue is that the finalization process simply kicks in *way* too often. Have you guys (since you're having the problem, I think it's appropriate if you can experiment with it a little ;-) thought about tweaking the finalization process to run at most (say) 20/sec? This is a simple change (see attached CS) which could have a dramatic impact on your performance. The obvious disadvantage, of course, is that this renders any assumptions about "Smalltalk garbageCollect" having an immediate effect meaningless (which is the main reason the finalization process is implemented the way it is).

well) its impact on my applications.

David
In reply to this post by Andreas.Raab
Andreas Raab wrote:
> True. From what I've seen over at the Seaside mailing list it seems pretty clear that the main issue is that the finalization process simply kicks in *way* too often. Have you guys (since you're having the problem, I think it's appropriate if you can experiment with it a little ;-) thought about tweaking the finalization process to run at most (say) 20/sec? This is a simple change (see attached CS) which could have a dramatic impact on your performance. The obvious disadvantage, of course, is that this renders any assumptions about "Smalltalk garbageCollect" having an immediate effect meaningless (which is the main reason the finalization process is implemented the way it is).

One effect this might have is to cause problems with WeakArray>>isFinalizationSupported. I've modified mine to just answer true.

David
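In other words, roughly this (the stock version tries to detect whether the VM supports finalization at all; as noted further down in the thread, that check is long obsolete):

	WeakArray class>>isFinalizationSupported
		^true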
In reply to this post by Andreas.Raab
Andreas Raab wrote:
> What you've described is not a bad idea in general (and it's probably what VW does) but there are things that I don't like about it. For example, part of why the finalization process takes so much time is that there are so many weak references lost that we don't care about - the whole idea that just because you use a weak array you need to know when its contents go away is just bogus. Secondly, once you start relying on "accurate" finalization information you should really make sure it's accurate (e.g., one signal/entry per finalized object). And once you do that you need to deal with the ugly corner cases of an overflow of the finalization queue (and the effect that you probably can't allocate any larger one because the GC you're currently in was triggered by a low space condition to begin with ;-) Nasty, nasty issues.

Agreed. Presumably it would be one queue per user-specified group of weak objects...similar to the Java implementation in the sense that "weak containers" can specify a queue if they like, otherwise a system-wide one is used. Anyway I think your idea is better...

> Having said that, let me propose a mechanism that (I think) is fundamentally different and fundamentally simpler. Namely, to make the requirement that you only get notifications for the finalization of objects that you explicitly register for by creating a "finalizer" object, e.g., an observer which is allocated before it's ever needed. This simple change avoids both the problem of the GC needing to allocate memory when there is none and the problem of sending notifications about finalizations that nobody cares about, which are both very desirable properties. When the object becomes eligible for garbage collection, the finalizer is put into a list of objects that have indeed been finalized, and the finalization process simply pulls them out of the queue and sends #finalize to them.
>
> In its simplest form, this could mean a finalizer is a structure with (besides the prev and next links for putting it into a structure) two slots: a "weak" slot for the object being guarded and a "strong" slot for the object performing the finalization (its #finalizer). When the garbage collector runs across a Finalizer and notices its observed value is being collected, it can simply put the finalizer into the finalization list and is done. (Btw, this scheme is *vastly* easier to implement than your proposed scheme since everything is pre-allocated and you only move the object from one list to another.)

This sounds great. Some implementation hints? I assume I'd need to make the weak entry recognizable by #nonWeakFieldsOf: but I don't understand the class format voodoo in that method. Also it seems that the weak slots have to come after the non-weak slots...just a detail, but is this correct?

> But while we're at it, we could also shoot a little bit further and get away from post-mortem finalization (which I find a highly overrated concept in practice). The only thing we'd change in the above is that the garbage collector would now also transfer the object from the "weak" into the "strong" slot [*1]. This makes the finalizer the sole last reference to the object. If the finalizer drops it, it's gone. If the finalizer decides to store it, it will survive. Lots of interesting possibilities and much cleaner since you gain access to the full context of the object and its state.
> [*1] The easiest way to do this would be to simply clone the object but unfortunately this also has the unbounded memory problem so something a bit more clever might be required. Basically we really want *all* references to the object except from the finalizer to be cleaned up.

Yes, this would be nicer than the "executors", which don't really have much to go on. Why would we copy? Why not some other color mark for an object reachable through the weak reference (but not through others, of course)? Then the sweep phase could identify those that are only weakly reachable and perform your switcheroo.

> Note that weak arrays or other weak classes wouldn't be affected at all by this since only Finalizers get the notifications - all other weak classes would simply drop the references when they get collected and never get notified about anything.

Yes, I guess that WeakRegistry would be the only class significantly impacted by this. I see a few other senders of addWeakDependent: around but it looks like the effort to move those to this scheme would be relatively minimal.

David
Hi David,
David Shaffer wrote:
> This sounds great. Some implementation hints? I assume I'd need to make the weak entry recognizable by #nonWeakFieldsOf: but I don't understand the class format voodoo in that method. Also it seems that the weak slots have to come after the non-weak slots...just a detail, but is this correct?

Yes, weak slots always come after the fixed slots. On the image side this is represented by the weak slots always being indexed (Dan and I designed this so that a class could have both "strong" iVars and "weak" indexed fields).

A good starting point is actually ObjectMemory>>finalizeReference:, which cleans up all the weak fields in an object. If we just replace #signalFinalization: with a type check for whether the object is a finalizer or not, and if so, add it to the finalization queue, we're basically done with the first stage :-) Checking the class would be done by adding an extra entry (the class) to the specialObjectsArray, and then we just need something that looks like this (the finalizerList is stored in the splObjects, too):

	signalFinalization: oop
		(self fetchClassOf: oop) == self classFinalizer ifTrue:[
			self addLastLink: oop toList: self finalizerList.
			self forceInterruptCheck.
			pendingFinalizationSignals _ pendingFinalizationSignals + 1.
		].

Oh, and the above is inefficient - we really want the type check to happen early in #finalizeReference: since the above is called for each slot that got nil-ed in the object (which may be more than one) - but it's a great starting point.

>> [*1] The easiest way to do this would be to simply clone the object but unfortunately this also has the unbounded memory problem so something a bit more clever might be required. Basically we really want *all* references to the object except from the finalizer to be cleaned up.
>
> Yes, this would be nicer than the "executors", which don't really have much to go on. Why would we copy? Why not some other color mark for an object reachable through the weak reference (but not through others, of course)? Then the sweep phase could identify those that are only weakly reachable and perform your switcheroo.

Actually (and I only realized that after sending off the previous message) there is a significant problem here, since you need an extra pass to trace the "inside" of the object when you recognize that it should be preserved after all. That seems more work than I originally thought it would be, and tricky work at that.

>> Note that weak arrays or other weak classes wouldn't be affected at all by this since only Finalizers get the notifications - all other weak classes would simply drop the references when they get collected and never get notified about anything.
>
> Yes, I guess that WeakRegistry would be the only class significantly impacted by this. I see a few other senders of addWeakDependent: around but it looks like the effort to move those to this scheme would be relatively minimal.

WeakRegistry and other users would be fairly straightforward to deal with - they'd just store (strong references to) Finalizers instead of (weak) object references, and the finalizer would remove itself from the registry. No big deal, really.

Cheers,
  - Andreas
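A rough sketch of what that registration side could look like, extending the illustrative Finalizer from earlier in the thread (accessProtect, finalizers, registry:/registry and removeFinalizer: are all placeholders, not current WeakRegistry internals; "anObject executor" is the usual executor copy):

	WeakRegistry>>add: anObject
		"Wrap the object in a (strongly held) Finalizer instead of holding it weakly."
		| finalizer |
		finalizer := Finalizer on: anObject executor: anObject executor.
		finalizer registry: self.
		accessProtect critical: [finalizers add: finalizer].
		^anObject

	Finalizer>>finalize
		executor ifNotNil: [executor finalize].
		registry ifNotNil: [registry removeFinalizer: self]	"unregister ourselves, as described above"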
In reply to this post by cdavidshaffer
David Shaffer wrote:
> One effect this might have is to cause problems with WeakArray>>isFinalizationSupported. I've modified mine to just answer true.

Yeah, that's code from the old days - there haven't been any VMs without finalization for, uh, five years? six?

Cheers,
  - Andreas
In reply to this post by cdavidshaffer
Ooo, finalization, my favourite nasty problem. In past experience on other Smalltalks, watching production systems burn, finalization is a process that happens in the future at the wrong time and at the wrong rate. Usually this is discovered after the system goes into production.

That said, Andreas's comment about doing finalization over a time period is a good idea; keep in mind that you might need to apply some adaptive feedback to ensure the finalization work doesn't cause performance concerns, by varying how many you look at in a pass based on the array size or something. It's possible finalization could take 20 seconds if things are busy; however, such logic was the downfall of earlier versions of VisualWorks, which would finalize only one object per new-space GC collection.

Lastly, don't forget:

	Smalltalk setGCBiasToGrowGCLimit: 16*1024*1024.
	Smalltalk setGCBiasToGrow: 1.

On 24-Mar-06, at 8:37 PM, David Shaffer wrote:
> Andreas Raab wrote:
>> True. From what I've seen over at the Seaside mailing list it seems pretty clear that the main issue is that the finalization process simply kicks in *way* too often.

--
========================================================================
John M. McIntosh <[hidden email]> 1-800-477-2659
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
========================================================================
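One possible shape for that adaptive feedback, as a fragment that could slot into the throttled finalization loop sketched earlier in the thread (the factor and floor are arbitrary, and passStart/elapsed/delayMs are new temporaries):

	passStart := Time millisecondClockValue.
	"... run the scan over FinalizationDependents as before ..."
	elapsed := Time millisecondClockValue - passStart.
	delayMs := (elapsed * 4) max: 50.	"spend at most roughly 20% of wall time finalizing"
	(Delay forMilliseconds: delayMs) wait.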
In reply to this post by Andreas.Raab
Andreas,
It sounds to me like you are talking about Ephemerons, am I right? VW has support for it; what about Squeak? I have easily implemented a Finalizer like the one you mentioned, relying on the Ephemeron mechanism. I could give you more details if you want.

Best regards,
  Cani

----- Original Message -----
From: Andreas Raab <[hidden email]>
To: The general-purpose Squeak developers list <[hidden email]>
Date: Saturday, March 25, 2006, 12:23:12 AM
Subject: Finalization (was: Re: [Seaside] WeakArray (again))

> David Shaffer wrote:
>> \begin{amateurHour}
>>
>> It seems to me that the notification needs to be changed to actually queue information about the objects which the GC deems un(strongly)reachable. I spent some time staring at ObjectMemory>>sweepPhase, #finalizeReference: and #signalFinalization:, which seem to be the cornerstones of this process. All that #signalFinalization: is currently doing is signaling a semaphore (well, indicating that one should be "signaled" later). Why not keep a list of (oop,i) pairs [i is the offset of the weak reference in the oop] and somehow communicate those back to a Smalltalk object? As a total VM novice it just seems too simple ;-) What I think I would do is associate a queue-like thing with every weak reference container. Then when an object becomes GC-able I'd place the (oop,i) pair in that shared queue. What I need is someone to hold my hand through...
>>
>> ...designing this "queue like thing". How about a circular array which can only be "read" (move the read index) by ST code and only be written by the VM code? This avoids a lot of concurrency issues. Are there any examples like this in the VM?
>>
>> \end{amateurHour}
>
> What you've described is not a bad idea in general (and it's probably what VW does) but there are things that I don't like about it. For example, part of why the finalization process takes so much time is that there are so many weak references lost that we don't care about - the whole idea that just because you use a weak array you need to know when its contents go away is just bogus. Secondly, once you start relying on "accurate" finalization information you should really make sure it's accurate (e.g., one signal/entry per finalized object). And once you do that you need to deal with the ugly corner cases of an overflow of the finalization queue (and the effect that you probably can't allocate any larger one because the GC you're currently in was triggered by a low space condition to begin with ;-) Nasty, nasty issues.
>
> Having said that, let me propose a mechanism that (I think) is fundamentally different and fundamentally simpler. Namely, to make the requirement that you only get notifications for the finalization of objects that you explicitly register for by creating a "finalizer" object, e.g., an observer which is allocated before it's ever needed. This simple change avoids both the problem of the GC needing to allocate memory when there is none and the problem of sending notifications about finalizations that nobody cares about, which are both very desirable properties. When the object becomes eligible for garbage collection, the finalizer is put into a list of objects that have indeed been finalized, and the finalization process simply pulls them out of the queue and sends #finalize to them.
> In its simplest form, this could mean a finalizer is a structure with > (besides the prev and next links for putting it into a structore) two > slots a "weak" slot for the object being guarded and a "strong" slot for > the object performing the finalization (its #finalizer). When the > garbage collector runs across a Finalizer and notices its observed value > is being collected, it can simply put the finalizer into the > finalization list and is done. (btw, this scheme is *vastly* easier to > implement than your proposed scheme since everything is pre-allocated > and you only move the object from one list to another). > But while we're at it, we could also shoot a little bit further and get > away from post-mortem finalization (which I find a highly overrated > concept in practice). The only thing we'd change in the above is that > the garbage collector would now also transfer the object from the "weak" > into the "strong" slot[*1]. This makes the finalizer the sole last > reference to the object. If the finalizer drops it, it's gone. If the > finalizer decides to store it, it will survive. Lots of interesting > possibilities and much cleaner since you gain access to the full context > of the object and its state. > [*1] The easiest way to do this would be to simply clone the object but > unfortunately this also has the unbounded memory problem so something a > bit more clever might be required. Basically we really want *all* > references to the object except from the finalizer to be cleaned up. > Note that weak arrays or other weak classes wouldn't be affected at all > by this since only Finalizers get the notifications - all other weak > classes would simply drop the references when they get collected and > never get notified about anything. > Cheers, > - Andreas X -------------------------- |
In reply to this post by Andreas.Raab
On Fri, 24 Mar 2006 19:23:12 -0800, Andreas Raab <[hidden email]>
wrote: > And once you do > that you need to deal with the ugly corner cases of an overflow of the > finalization queue (and the effect that you probably can't allocate any > larger one because the GC you're currently in was triggered by a low > space condition to begin with ;-) Nasty, nasty issues. I ran into this recently with VisualAge, and its probably the nastiest bug of my career. I still haven't completely fixed it. The context of the bug is entries in an identity map (cache) are getting finalized, and if there are too many at once, it overflows the queue, so only some of the objects in the isolated subgraph are getting finalized. If, before the finalization starts up again, someone makes a hard reference to one of the objects in the subgraph that was slated for finalization but didn't (because the weak ref in the cache is still there), we end up with the whole GC of the subgraph getting cancelled, and thus objects now referenced within the subgraph that are no longer in the cache. Very nasty, almost impossible to identity without totally screwing performance, and the only real way to recover is to just dump the whole cache. Later, Jon -------------------------------------------------------------- Jon Hylands [hidden email] http://www.huv.com/jon Project: Micro Seeker (Micro Autonomous Underwater Vehicle) http://www.huv.com |
In reply to this post by Andreas.Raab
On Mar 24, 2006, at 9:50 PM, Andreas Raab wrote:
> True. From what I've seen over at the Seaside mailing list it seems pretty clear that the main issue is that the finalization process simply kicks in *way* too often. Have you guys (since you're having the problem, I think it's appropriate if you can experiment with it a little ;-) thought about tweaking the finalization process to run at most (say) 20/sec? This is a simple change (see attached CS) which could have a dramatic impact on your performance. The obvious disadvantage, of course, is that this renders any assumptions about "Smalltalk garbageCollect" having an immediate effect meaningless (which is the main reason the finalization process is implemented the way it is).

I made some changes like this soon after I posted a message on the Seaside mailing list.

I also started running the finalization process at a lower priority level (systemBackgroundPriority). I was not sure if this would cause any problems (is there some sort of race condition I should be worried about?) but the changes have evened out my problems. It has only been a couple of days, but it appears that it may have helped mitigate some of the WeakArray finalization problems.

Thanks
Will
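For anyone wanting to try the same thing, the change amounts to restarting the loop at a lower priority - roughly this, assuming the stock WeakArray class>>finalizationProcess and that you still have a handle on the currently running process (how you find and terminate it depends on your image version):

	oldFinalizationProcess ifNotNil: [oldFinalizationProcess terminate].
	[WeakArray finalizationProcess]
		forkAt: Processor systemBackgroundPriority.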
The only concern I would have is access to the valueDictionary in WeakRegistry when processing the array at a low background priority, since it may block adding new elements. However, a fix may be to add a shared queue that add: uses to place elements on; later on, finalizeValues sticks the elements into the value dictionary. I'm not sure, though, whether other accesses to the value dictionary might impact things.

On 25-Mar-06, at 11:12 AM, William Harford wrote:
> On Mar 24, 2006, at 9:50 PM, Andreas Raab wrote:
>> True. From what I've seen over at the Seaside mailing list it seems pretty clear that the main issue is that the finalization process simply kicks in *way* too often. Have you guys (since you're having the problem, I think it's appropriate if you can experiment with it a little ;-) thought about tweaking the finalization process to run at most (say) 20/sec? This is a simple change (see attached CS) which could have a dramatic impact on your performance. The obvious disadvantage, of course, is that this renders any assumptions about "Smalltalk garbageCollect" having an immediate effect meaningless (which is the main reason the finalization process is implemented the way it is).
>
> I made some changes like this soon after I posted a message on the Seaside mailing list.
>
> I also started running the finalization process at a lower priority level (systemBackgroundPriority). I was not sure if this would cause any problems (is there some sort of race condition I should be worried about?) but the changes have evened out my problems. It has only been a couple of days, but it appears that it may have helped mitigate some of the WeakArray finalization problems.
>
> Thanks
> Will

--
========================================================================
John M. McIntosh <[hidden email]> 1-800-477-2659
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
========================================================================
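As a sketch of that idea (pendingAdds - a SharedQueue - and privateAddToDictionary: are assumed additions, not existing WeakRegistry code, and the existing method bodies are only hinted at):

	WeakRegistry>>add: anObject
		"Defer the real insert so callers never contend for the value dictionary."
		pendingAdds nextPut: anObject.
		^anObject

	WeakRegistry>>finalizeValues
		"Drain deferred adds into valueDictionary first, then scan as before."
		[pendingAdds isEmpty] whileFalse:
			[self privateAddToDictionary: pendingAdds next].
		"... existing finalization scan ..."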
In reply to this post by William Harford
William Harford wrote:
> On Mar 24, 2006, at 9:50 PM, Andreas Raab wrote:
>> True. From what I've seen over at the Seaside mailing list it seems pretty clear that the main issue is that the finalization process simply kicks in *way* too often. Have you guys (since you're having the problem, I think it's appropriate if you can experiment with it a little ;-) thought about tweaking the finalization process to run at most (say) 20/sec? This is a simple change (see attached CS) which could have a dramatic impact on your performance. The obvious disadvantage, of course, is that this renders any assumptions about "Smalltalk garbageCollect" having an immediate effect meaningless (which is the main reason the finalization process is implemented the way it is).
>
> I made some changes like this soon after I posted a message on the Seaside mailing list.
>
> I also started running the finalization process at a lower priority level (systemBackgroundPriority). I was not sure if this would cause any problems (is there some sort of race condition I should be worried about?) but the changes have evened out my problems. It has only been a couple of days, but it appears that it may have helped mitigate some of the WeakArray finalization problems.
>
> Thanks
> Will

The FinalizationDependents array is ~100k BTW. On my server image (which has been stable for a long time now) the array is 1k. My understanding of the notification mechanism is that each weak object collection signals the semaphore once, so if you get a bunch of collections you are iterating over this array quite a few times. So, even with the delay, you're going to run through the loop over the weak dependents once for each collected weak object ref. That's a lot of wasted cycles, since after the first pass you probably already #finalizeValues-ed everyone that needed it.

Growth only occurs in addWeakDependent: when there are no empty slots, so I'm surprised that it ever got this large. Anyway, after a short time it is quite sparsely populated, so it wouldn't be hard to shrink it down. One could instrument #finalizationProcess to count the nil entries and, when they cross a low-water mark, compact the array. Continuing along my "amateurs can do it too" theme I went ahead and coded this...I'm not convinced that it is the right thing to do but it seems worth a try.

David

'From Squeak3.7 of ''4 September 2004'' [latest update: #5989] on 25 March 2006 at 3:02:16 pm'!

!WeakArray class methodsFor: 'private' stamp: 'cds 3/25/2006 15:01'!
compactFinalizationDependents: minimumSize
	| tmp index |
	tmp := WeakArray new: minimumSize + 10.
	index := 1.
	FinalizationLock
		critical: [
			FinalizationDependents do: [:dependent |
				dependent ifNotNil: [
					tmp at: index put: dependent.
					index := index + 1]].
			FinalizationDependents := tmp]
		ifError: [:msg :rcvr | rcvr error: msg]! !

!WeakArray class methodsFor: 'private' stamp: 'cds 3/25/2006 15:01'!
finalizationProcess
	| nilEntries |
	nilEntries := 0.
	[true] whileTrue: [
		FinalizationSemaphore wait.
		FinalizationLock
			critical: [
				FinalizationDependents do: [:weakDependent |
					weakDependent
						ifNotNil: [
							weakDependent finalizeValues.
							"***Following statement is required to keep weakDependent
							from holding onto its value as garbage.***"
							weakDependent _ nil]
						ifNil: [nilEntries := nilEntries + 1]]]
			ifError: [:msg :rcvr | rcvr error: msg].
		"Check if we should compact the array"
		(nilEntries > (FinalizationDependents size quo: 4)
			and: [FinalizationDependents size > 10])
				ifTrue: [self compactFinalizationDependents: (FinalizationDependents size - nilEntries)]]! !
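If you want to check how large and how sparse your own FinalizationDependents array is before trying this, something like the following works (the classPool access is only for poking around; it answers {total size. nil entries}):

	| deps |
	deps := WeakArray classPool at: #FinalizationDependents.
	{ deps size.
	  deps inject: 0 into: [:sum :d | d isNil ifTrue: [sum + 1] ifFalse: [sum]] }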
David Shaffer wrote:
> My understanding of the notification mechanism is that each weak object collection signals the semaphore once, so if you get a bunch of collections you are iterating over this array quite a few times. So, even with the delay, you're going to run through the loop over the weak dependents once for each collected weak object ref.

Oops..."my understanding" was wrong :-) I found the relevant code:

	"signal any pending finalizations"
	pendingFinalizationSignals > 0 ifTrue:
		[sema _ self splObj: TheFinalizationSemaphore.
		(self fetchClassOf: sema) = (self splObj: ClassSemaphore)
			ifTrue: [self synchronousSignal: sema].
		pendingFinalizationSignals _ 0].

So, no matter how large pendingFinalizationSignals is, the semaphore is only signaled once. Why use an int, then?

David
In reply to this post by johnmci
John M McIntosh wrote:
> Ooo, finalization, my favourite nasty problem. In past experience on other Smalltalks, watching production systems burn, finalization is a process that happens in the future at the wrong time and at the wrong rate. Usually this is discovered after the system goes into production.
>
> That said, Andreas's comment about doing finalization over a time period is a good idea; keep in mind that you might need to apply some adaptive feedback to ensure the finalization work doesn't cause performance concerns, by varying how many you look at in a pass based on the array size or something. It's possible finalization could take 20 seconds if things are busy; however, such logic was the downfall of earlier versions of VisualWorks, which would finalize only one object per new-space GC collection.

This sounds like another useful improvement. What do you think of the compaction code I posted? Seem like a waste of time? I'm working on a version which gives some feedback. I've got a pre-production project which will be a good testbed for this stuff. The number of active users is small but their sessions should last all day.

Another (naive, sorry) question: With the "delayed" version there may be several pending finalization signals, but we only need to run through the loop once. What we need is something like:

	conditionVariable ifSetClearAndDo: [some block]

Given that the VM's GC can't block on the "condition variable", though, its use here wouldn't really work. But we can do a poor man's job of it:

	FinalizationSemaphore wait.
	pending := FinalizationSemaphore excessSignals.	"you'd have to add that accessor"
	[pending > 0] whileTrue:
		[FinalizationSemaphore wait.
		pending := pending - 1]

This counts on absolutely no one else "waiting" on the finalization semaphore. I think that's reasonable given it is a class var in WeakArray. Comments? Leave out the part about a semaphore's signal count being private and I'm a bad boy for even thinking about this :-)

> lastly don't forget
>
> Smalltalk setGCBiasToGrowGCLimit: 16*1024*1024.
> Smalltalk setGCBiasToGrow: 1.

Unfortunately I'm working in 3.7 exclusively and this seems to be missing. I've bookmarked this e-mail for when I move to a newer VM/image. :-)

David
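For reference, the accessor mentioned above would just expose the instance variable (assuming the stock Semaphore keeps its signal count in excessSignals; it is not in the shipped image, which is exactly the "bad boy" part):

	Semaphore>>excessSignals
		^excessSignals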
In reply to this post by Nicolás Cañibano
Nicolás Cañibano wrote:
> Andreas,
>   It sounds to me like you are talking about Ephemerons, am I right? VW has support for it; what about Squeak? I have easily implemented a Finalizer like the one you mentioned, relying on the Ephemeron mechanism. I could give you more details if you want.
>
> Best regards,
>   Cani

Please share with the whole group :-)

David
In reply to this post by cdavidshaffer
On 25-Mar-06, at 12:26 PM, David Shaffer wrote:
> So, no matter how large pendingFinalizationSignals is, the semaphore is only signaled once. Why use an int, then?

Because that's VM code (i.e., C in sheep's clothing) and C is too dumb to really understand booleans.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: RDR: Rotate Disk Right
In reply to this post by Nicolás Cañibano
Hi -
No I'm not talking about Ephemerons - having done an implementation for fun in the past I'm quite aware about the differences ;-) What I'm proposing here is simply a (precise) notification mechanism for object finalization. Cheers, - Andreas Nicolás Cañibano wrote: > Andreas, > It sounds to me like you are talking about Ephemerons, am I > right? VW has support to it, what about Squeak? I have easily > implemented a Finalizer like the one you mentioned, relying in the > Ephemeron mechanism. I could give you more details if you want. > > Best regards, > Cani > > ----- Original Message ----- > From: Andreas Raab <[hidden email]> > To: The general-purpose Squeak developers list <[hidden email]> > Date: Saturday, March 25, 2006, 12:23:12 AM > Subject: Finalization (was: Re: [Seaside] WeakArray (again)) > >> David Shaffer wrote: >>> \begin{amateurHour} >>> >>> It seems to me that the notification needs to be changed to actually >>> queueing information about the objects which the GC deams >>> un(strongly)reachable. I spent some time staring at >>> ObjectMemory>>sweepPhase, #finalizeReference: and #signalFinalization: >>> which seem to be the cornerstones of this process. All that >>> #signalFinalization: is currently doing is signaling a semaphore (well, >>> indicating that one should be "signaled" later). Why not keep a list of >>> (oop,i) [i is the offset of the weak reference in the oop] pairs and >>> somehow communicate those back to a Smalltalk object? As a total VM >>> novice it just seems too simple ;-) What I think I would do is >>> associate a queue like thing with every weak reference container. Then >>> when an object becomes GC-able I'd place the (oop,i) pair in that shared >>> queue. What I need is someone to hold my hand through... >>> >>> ...designing this "queue like thing". How about a circular array which >>> can only be "read" (move the read index) by ST code and only be written >>> by the VM code? This avoids a lot of concurrency issues. Are there any >>> examples like this in the VM? >>> >>> \end{amateurHour} > >> What you've described is not a bad idea in general (and it's probably >> what VW does) but there are things that I don't like about it. For >> example, part of why the finalization process takes so much time is that >> there are so many weak references lost that we don't care about - the >> whole idea that just because you use a weak array you need to know when >> its contents goes away is just bogus. Secondly, once you start relying >> on "accurate" finalization information you should really make sure it's >> accurate (e.g., one signal/entry per finalized object). And once you do >> that you need to deal with the ugly corner cases of an overflow of the >> finalization queue (and the effect that you probably can't allocate any >> larger one because the GC you're currently in was triggered by a low >> space condition to begin with ;-) Nasty, nasty issues. > >> Having said that, let me propose a mechanism that (I think) is >> fundamentally different and fundamentally simpler. Namely, to make the >> requirement that you only get notifications for the finalization of >> objects that you explicitly register for by creating a "finalizer" >> object, e.g., an observer which is allocated before it's ever needed. >> This simple change avoids both the problem of GC needing to allocate >> memory when there is none as well as sending notifications about >> finalizations that nobody cares about, which are both very desirable >> properties. 
When the object becomes eligible for garbage collection, the >> finalizer is then put into a list of objects that have indeed been >> finalized and the finalization process simply pulls them out of the >> queue and sends #finalize to them. > >> In its simplest form, this could mean a finalizer is a structure with >> (besides the prev and next links for putting it into a structore) two >> slots a "weak" slot for the object being guarded and a "strong" slot for >> the object performing the finalization (its #finalizer). When the >> garbage collector runs across a Finalizer and notices its observed value >> is being collected, it can simply put the finalizer into the >> finalization list and is done. (btw, this scheme is *vastly* easier to >> implement than your proposed scheme since everything is pre-allocated >> and you only move the object from one list to another). > >> But while we're at it, we could also shoot a little bit further and get >> away from post-mortem finalization (which I find a highly overrated >> concept in practice). The only thing we'd change in the above is that >> the garbage collector would now also transfer the object from the "weak" >> into the "strong" slot[*1]. This makes the finalizer the sole last >> reference to the object. If the finalizer drops it, it's gone. If the >> finalizer decides to store it, it will survive. Lots of interesting >> possibilities and much cleaner since you gain access to the full context >> of the object and its state. > >> [*1] The easiest way to do this would be to simply clone the object but >> unfortunately this also has the unbounded memory problem so something a >> bit more clever might be required. Basically we really want *all* >> references to the object except from the finalizer to be cleaned up. > >> Note that weak arrays or other weak classes wouldn't be affected at all >> by this since only Finalizers get the notifications - all other weak >> classes would simply drop the references when they get collected and >> never get notified about anything. > >> Cheers, >> - Andreas > > X -------------------------- > > > |