[Glass] Maintenance Gem exits unexpectedly

Alejandro Zuzek
Hi all,

I have a problem with the maintenance gem and need help solving it: the gem exits after about one day of operation. This is on a server with very little activity. One data point that may be relevant is that I added a maintenance task to WAGemStoneMaintenanceTask about a month ago, and I am not sure whether the problem existed before I made that change. Another fact that may be of interest is that the extent size has not changed for at least a month, remaining constant at 494927872 bytes, while the tran log has grown from 509220864 bytes to 554483200 bytes over the same period. I am running GemStone64Bit3.1.0.1-x86_64.Linux. Below are the last lines of the logs that I think may contain useful information:

tail -10 maintenance_gem.log
...Expired: 0 sessions.
Unregistering...2013-09-01T10:45:23.85392093658447-07:00
...Expired: 0 sessions.
Unregistering...2013-09-01T10:46:23.86733102798462-07:00
...Expired: 0 sessions.
Unregistering...2013-09-01T10:47:23.87195110321045-07:00
...Expired: 0 sessions.
Unregistering...2013-09-01T10:48:23.88839292526245-07:00
...Expired: 0 sessions.
--transcript--'Starting markForCollect.: 2013-09-01T10:48:23.89286208152771-07:00'

tail -13 seaside_30319pcmon.log
[09/01/13 14:48:25.718 ART]: Client died: Slot   12, PID   17784, LostOtFlags    0, Name mfc-0
[09/01/13 14:48:25.731 ART]: Starting crashed client recovery: Slot 12, PID 17784, Name mfc-0
    Cleaned up locked/pinned frame 562 for crashed process 17784
[09/01/13 14:48:26.002 ART]: Finished crashed client recovery: Slot 12, PID 17784, Name mfc-0
[09/01/13 14:48:26.002 ART]: Starting crashed client recovery: Slot 13, PID 17784, Name mfc-1
    Cleaned up locked/pinned frame 1087 for crashed process 17784
[09/01/13 14:48:26.063 ART]: Finished crashed client recovery: Slot 13, PID 17784, Name mfc-1
[09/01/13 14:48:26.063 ART]: Starting crashed client recovery: Slot 11, PID 17784, Name Maintenance-140-DataCurator
    Cleaned up locked/pinned frame 3521 for crashed process 17784
    Cleaned up locked/pinned frame 2828 for crashed process 17784
    Cleaned up locked/pinned frame 56 for crashed process 17784
    Disposing free frame cache from slot 11:  8 of 8 frames recovered.
[09/01/13 14:48:26.123 ART]: Finished crashed client recovery: Slot 11, PID 17784, Name Maintenance-140-DataCurator 

Any help with this will be appreciated.

Thanks,

Alejandro Zuzek

_______________________________________________
Glass mailing list
[hidden email]
http://lists.gemtalksystems.com/mailman/listinfo/glass
Re: [Glass] Maintenance Gem exits unexpectedly

Ken Treis
Hi Alejandro,

Is there any chance that the kernel is killing it? Check `dmesg` output. These gems volunteer themselves for sacrifice in an out-of-memory situation, and the maintenance gem usually has a bigger memory footprint than the Seaside gems.
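
One way to run the `dmesg` check is to filter the kernel log for OOM-killer entries. This is only a minimal sketch: the helper name `oom_lines` is made up for illustration, and the patterns are phrases that Linux kernels commonly print when the OOM killer fires, so they may need adjusting for a particular kernel version.

```shell
# Hypothetical helper (not part of GemStone): keep only kernel log lines
# that look like OOM-killer activity. Reads log text on stdin.
oom_lines() {
  grep -i -E 'out of memory|oom-killer|killed process'
}

# Typical use on the server (reading dmesg may require root):
#   dmesg | oom_lines
```

If a matching line names the maintenance gem's PID (17784 in the log above), the kernel most likely killed the gem.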

3.1.0.1 has a bug in the MFC process that can cause it to stall indefinitely:


I'd definitely recommend upgrading to 3.1.0.4 if you are able to. Otherwise, running `SystemRepository fastMarkForCollection` from topaz will sometimes let you complete MFC without a hang, but it's pretty resource-intensive if you run it that way.
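
Running that expression from topaz could look roughly like the sketch below. The helper name `mfc_topaz_input` is hypothetical, and the stone name, user, and password are assumptions that have to match the local installation; only the `SystemRepository fastMarkForCollection` expression comes from the advice above.

```shell
# Hypothetical helper: print topaz input that logs in, runs the MFC
# expression mentioned above, and logs out again.
# Arguments (all assumptions, adjust to your installation):
#   $1 stone name, $2 user name, $3 password
mfc_topaz_input() {
  cat <<EOF
set gemstone $1
set username $2
set password $3
login
run
SystemRepository fastMarkForCollection
%
logout
exit
EOF
}

# Typical use, with GemStone's bin directory on PATH:
#   mfc_topaz_input seaside DataCurator 'your-password' | topaz -l
```

Generating the input separately from invoking topaz makes it easy to inspect the commands before they touch the repository.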

--
Ken Treis
Miriam Technologies, Inc.
(866) 652-2040 x221

On Sep 2, 2013, at 4:47 PM, Alejandro Zuzek wrote:

> [...]

Re: [Glass] Maintenance Gem exits unexpectedly

Alejandro Zuzek
Hi Ken,

Thanks for your reply. I will take the upgrade path and switch to 3.1.0.4. I have one question, though. I've been reviewing the release notes, and they refer to the install guide for upgrading. In the install guide I didn't find anything specific about backing up the extents or transaction logs before running the script. Does the install script handle the upgrade case, or will it overwrite my current extent and tranlog with the default ones?
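
Whatever the install script turns out to do, a precautionary copy of the repository files could be sketched like this. The helper name `backup_dbf_files` is hypothetical, and it assumes that the stone has been shut down first and that the extents and tranlogs sit in one directory and follow the `extent*.dbf` / `tranlog*.dbf` naming; none of this comes from the install guide.

```shell
# Hypothetical helper: copy extent and transaction log files to a backup
# directory. Assumes the stone is shut down, so the files are not changing.
#   $1 directory holding the .dbf files, $2 backup destination
backup_dbf_files() {
  src=$1
  dest=$2
  mkdir -p "$dest" || return 1
  for f in "$src"/extent*.dbf "$src"/tranlog*.dbf; do
    # Skip patterns that matched nothing.
    [ -e "$f" ] || continue
    cp -p "$f" "$dest"/ || return 1
  done
}

# Typical use (both paths are placeholders):
#   backup_dbf_files /path/to/data /path/to/backup-before-upgrade
```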

Thanks,

Alejandro


On Tue, Sep 3, 2013 at 2:13 AM, Ken Treis <[hidden email]> wrote:

> [...]