Lots of seaside objects not being GCed (need gemstone advise)

Re: Lots of seaside objects not being GCed (need gemstone advise)

GLASS mailing list
Ok, crystal clear! Thanks, Dale, for the explanation.

On Mon, Jul 13, 2015 at 1:55 PM, Dale Henrichs <[hidden email]> wrote:


On 07/11/2015 02:28 PM, Mariano Martinez Peck wrote:


On Tue, Jul 7, 2015 at 3:56 PM, Dale Henrichs <[hidden email]> wrote:


On 07/07/2015 05:49 AM, Mariano Martinez Peck wrote:
Dale,

I have continued analyzing this in other stones, and after some testing it is clear that some sessions (how many depends on system usage) are NOT GCed unless I shut down or cycle all the Seaside gems. Originally I had GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE at 90% and I was cycling the Seaside gems once a day as part of GC. Then I changed it to 100% and stopped restarting the gems. Now... it COULD be that I did not restart all the gems after modifying GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE, so the system was still running with 90% while I was no longer restarting the Seaside gems.
Yes. The meaning of GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE=100 is that all pomgen spaces are dropped ... this does not mean that all references to persistent objects in the vm are dropped ....

Indeed. That's why, to be 100% sure of dropping all references to persistent objects, you likely need to recycle the Seaside gems (even with GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE=100).
Right ... the odds of dead object references drop when using this approach, but to reach 100% drastic measures are needed ... Frankly, this is why I made the initial comment about it being only 32 sessions ....
 
That could explain why I hold onto some instances, right?  Another possibility is the "stale reference" you mention below. I continue answering below:
 
Good point. Thanks. I will remember it for next time: each time I am dealing with this kind of stuff: cycle all seaside gems first! 
Thanks. BTW, my GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE is 100% now to avoid having to cycle gems. 
I will continue with the tests with cycling/killing the gems... but.... continue reading below...
Do you also have the marksweep guy running?

The guy that performs "System _generationScavenge_vmMarkSweep." every 30 minutes? Then yes. Why do you ask? How could this guy have an effect? He does not hold any Seaside session as far as I know... he simply sends "System _generationScavenge_vmMarkSweep.". Could it be that the #wait freezes the gem, so that it does not answer the vote?
No, if a gem is busy, the stone patiently waits for the gem to hit a transaction boundary - the vote happens on a transaction boundary.

Dale, given this comment, I do not understand the statement in the sys admin guide that I pasted below: "Gems do not vote until they complete their current transaction. If a Gem is sleeping or otherwise engaged in a long transaction, the vote cannot be finalized and garbage collection pauses at this point."
 
I'm not sure how the "the stone waits for the gem to hit a transaction boundary" is inconsistent with "gems do not vote until they complete their current transaction"...
This is one of the factors that causes reclaimAll to be non-deterministic (our goal is for reclaimAll to be deterministic, but the system _is_ a complex state machine). Gems can be busy doing a long-running transaction, or a gem can be idle sitting in transaction - like an idle topaz or GemTools - and unless the system triggers an event to cause the gem to wake up, like hitting the commit record limit thresholds, the system patiently waits for the Gem to finish its "work".

Ok... so it will wait. Ok, I got that.
Ah, good:)
 

Mmmmm now I read in the sysadmin guide: "Gems do not vote until they complete their current transaction. If a Gem is sleeping or otherwise engaged in a long transaction, the vote cannot be finalized and garbage collection pauses at this point. Commit records accumulate, garbage accumulates, and a variety of problems can ensue."

Uffff, maybe since this guy practically sleeps all the time and does not do a commit or abort in each iteration of the loop... maybe this guy is preventing the vote?
Recall the little process in which you installed the vm marksweep code? This particular process is there so that a Seaside gem is guaranteed to have a Smalltalk process ready and available to respond to the SigAbort ... The SigAbort is sent by the stone when commit records accumulate ...

Well. Here is where I have the last question. That little process we are talking about does this code:

 [
   | count minutesToForceGemGC |
   count := 0.
   minutesToForceGemGC := 30.
   [ true ] whileTrue: [
     (Delay forSeconds: 30) wait.
     count := count + 1.
     "Two wake-ups per minute, so this fires every minutesToForceGemGC minutes."
     (count \\ (minutesToForceGemGC * 2)) = 0 ifTrue: [
       System _generationScavenge_vmMarkSweep.
       count := 0 ] ]
 ] forkAt: Processor lowestPriority.

So my question is... in that code you can see that I NEVER do a commit or abort. So I don't see how this code can reach what you describe as "the vote happens on a transaction boundary". I mean... that code spends 99.9% of its time in a #wait, doing no commit or abort. So... wouldn't that make the voting process wait for it forever? Or is the SigAbort what prevents that?

Good question ... Immediately before the code you've shown, you will find the following code:

 Exception
  installStaticException:
    [:ex :cat :num :args |
      "Run the abort in a lowPriority process, since we must acquire the
       transactionMutex."
      [
        GRPlatform current transactionMutex
          critical: [
            GRPlatform current doAbortTransaction ].
        System enableSignaledAbortError.
      ] forkAt: Processor lowestPriority.
    ]
  category: GemStoneError
  number: 6009
  subtype: nil.
 System enableSignaledAbortError.
 

The above code installs a static exception handler for the SigAbort exception (error number 6009). The SigAbort is an asynchronous signal that is signaled upon notification from the stone. The vm signals the SigAbort in the context of the currently active GsProcess. If there are no explicit handlers on the stack, the list of static handlers is searched. If a static handler is found, the handler is run by the currently active GsProcess.... If there are no active processes (i.e., all of the processes are blocked on a semaphore or a socket call), then the vm waits for the first process to go active ... if no process wakes up before the stone hits the STN_GEM_ABORT_TIMEOUT, the stone will signal a lost OT, effectively killing the session ... Since Seaside gems could very well be blocked sitting on an accept() call, the "extra" process was created to wake up every 30 seconds (half of the default STN_GEM_ABORT_TIMEOUT) to try to guarantee that there will always be an active GsProcess available to abort when a Seaside gem is idle and waiting for requests ...
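
For reference, the two settings discussed in this thread would look roughly like this in the configuration files. This is only a sketch: it assumes the usual "NAME = value;" syntax of GemStone configuration files, and that STN_GEM_ABORT_TIMEOUT is expressed in minutes with a default of 1 (which is what "30 seconds is half of the default" implies). Check the System Administration Guide for your version before relying on it.

```
# Gem configuration file used by the Seaside gems (file name is site-specific)
GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE = 100;

# Stone configuration file; value assumed to be in minutes
STN_GEM_ABORT_TIMEOUT = 1;
```

If the abort timeout were changed, the 30-second delay in the wake-up loop would presumably need to stay comfortably below it, for the reason given above.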

Dale



--

_______________________________________________
Glass mailing list
[hidden email]
http://lists.gemtalksystems.com/mailman/listinfo/glass