Hi Dale,
I'm continuing to experience some voodoo regarding repository growth. Because I'm not a voodoo practitioner, I hope I will finally get to understand a bit more of it. Just in case you might have some voodoo experience, I'm throwing this question onto the list ;-)

I have an extent that grew to over 2 GB of data (the file size was bigger still). The object log was empty, no stale session instances were hanging around, and the MFC told me there was plenty of space to be reclaimed. For over a week, the MFC reported roughly 1.8 GB of possible dead objects on every run. However, that space was never entirely reclaimed. The amount of garbage that was never reclaimed slowly grew and, as I discovered afterwards, was worth 1 GB (instead of the approximated 1.8 GB -- which would have been the entire repo -- now, I hope GS does not see Yesplan as garbage ;-)))

'System hasMissingGcGems' returned false. The logs of all gems were fine… except that the MFC kept reporting a large amount of space to be reclaimed, which just stayed in place.

Because the MFC/reclaim cycles were putting the system under more and more stress due to io-wait, I decided to restart the stone and all gems. And… yes… the reclaim happened just minutes after the restart. I'm happily running the repository at a comfortable size again. And I see similar things in other stones.

So… would a weekly reboot of the stone be a good thing to do? Is this behavior something that can happen?

I hope I'm not asking away too much ;-)
Johan

A sample log entry:

AbortingError 3020: Successful completion of markForCollection.
<3806475> live objects found. <20494515> possible dead objects,
occupying approximately <1844506350> bytes, may be reclaimed.
Repository: 3320.00M, Free: 1345.70M, Used: 1974.30M
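For reference, the checks described above can be run from a topaz or GemTools session along these lines. This is only a minimal sketch: 'System hasMissingGcGems' is the expression quoted above, and 'SystemRepository fileSizeReport' is the usual way to see extent size versus free space; exact output depends on your GemStone/S version.

  "Should answer false on a healthy system (no missing GC/reclaim gems)."
  System hasMissingGcGems.

  "Answers a report of extent file sizes and free vs. used space in the repository."
  SystemRepository fileSizeReport.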
Johan,
I'm going to guess that you are suffering from "Issue 136: maintenance vm running only #unregisterExpiredHandlers may hang onto dead objects"[1]. If so, then you only need to recycle the maintenance vm before doing your gc. Norbert, have you found it necessary to recycle the seaside gems as well?

If that doesn't work we'll investigate further.

Dale

[1] http://code.google.com/p/glassdb/issues/detail?id=136
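The "gc" referred to here is the markForCollection run that produced the log entry above. As a rough sketch only, kicking it off from a freshly recycled gem looks something like the following, using the standard Repository API; completion shows up in the gem log as the AbortingError 3020 entry quoted earlier.

  "Start from a clean view so this session holds no stale references,
   then run a full mark-for-collection over the repository."
  System abortTransaction.
  SystemRepository markForCollection.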
Hi Dale,
I am starting the maintenance vm periodically via a cronjob, so it stops immediately after the MFC. Recycling all seaside gems might be the answer. I stopped the entire stone to kill the reclaim gems as well; they were running fine according to their logs.

Johan (sent from my mobile)
You might try the following in your system.conf:
GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE=90

#=========================================================================
# GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE: Percent of pom generation area
# to be cleared when voting on possible dead objects.
# If value is > 0 and < 100, subspaces of pom generation older
# than 5 minutes are cleared; the number of subspaces cleared is
# the specified percentage of spaces in use rounded down to an
# integral number of spaces.
# If value == 100, all subspaces of pom generation are cleared without
# regard to their age.
#
# If this value is not specified, or the specified value is out of range,
# the default is used.

You won't get a complete flush of POM objects on vote, but eventually you'll flush out the older references... if you are experiencing this problem.

Dale
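To check that a running gem actually picked up the new value, something along these lines should work. This is a sketch: System gemConfigurationAt: is the standard accessor for gem configuration parameters, but the exact parameter symbol below is an assumption -- verify it against the configuration parameter names for your GemStone/S version.

  "Read the gem's runtime value of the pruning parameter (symbol name assumed)."
  System gemConfigurationAt: #GemTempObjPomgenPruneOnVote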