Otto,
The UTL_GUARANTEE is hit when the connection to the ShrPCMonitor is interrupted. When a Linux system is about to run out of memory and swap space, the kernel picks some processes to kill, and in the case of GemStone this process can be the ShrPCMonitor. When the ShrPCMonitor is killed, the stone comes tumbling down.
So as you suggest, you should make sure that you've got plenty of swap space allocated.
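If it helps to keep an eye on this, here is a minimal Python sketch for checking memory and swap headroom on the box. It only reads the standard Linux /proc/meminfo file; the function names (parse_meminfo, low_on_memory, report) and the 1 GB threshold are made up for illustration and are not part of GemStone.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: number} dict.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts and are stored as-is.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def low_on_memory(info, min_free_kb=1024 * 1024):
    """Return True when available RAM plus free swap is below min_free_kb.

    The 1 GB default is an arbitrary illustrative threshold.
    """
    # MemAvailable is missing on older kernels, so fall back to MemFree.
    available = info.get("MemAvailable", info.get("MemFree", 0))
    return available + info.get("SwapFree", 0) < min_free_kb


def report(path="/proc/meminfo"):
    """Print swap totals and the headroom check for the local machine."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    print("SwapTotal kB:", info.get("SwapTotal"))
    print("SwapFree kB:", info.get("SwapFree"))
    print("low on memory:", low_on_memory(info))
```

Calling report() from cron or a monitoring script would give an early warning before the kernel starts killing processes. Pick a threshold that suits your SPC sizes.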
If you want to send us the logs (stone and other processes) we could take a closer look and see if there's something else that might be going on ...
Dale
----- Original Message -----
| From: "Otto Behrens" <[hidden email]>
| To: "GemStone Seaside beta discussion" <[hidden email]>
| Cc: "Pieter Jacobs" <[hidden email]>
| Sent: Thursday, October 4, 2012 3:19:24 AM
| Subject: [GS/SS Beta] standby
|
| Hi,
|
| We are running a warm standby for some of our production systems. We
| run GS 2.4.4.4 on Ubuntu 10.04 LTS.
|
| Today our database crashed when we replayed a tranlog, with the
| following in the stone log:
|
| UTL_GUARANTEE failed, File
| /export/toronto3/users/buildgss/244x-1/src/shrpcclient.c line 72
|
| We pushed the RAM usage on the machine over its limits, I think. We
| created large RAM disks (tmpfs) which totaled 15.7GB, on a machine
| with 16GB of RAM.
|
| We ran 2 GS databases + the standby on the machine, each with SPCs of
| around 1 GB.
|
| The tmpfs mounts are not full, so some of the RAM is available for the
| OS to allocate to SPCs. But when the filesystem fills up those RAM
| disks, we get it to break. I guess it has something to do with the OS
| trying to swap.
|
| Cheers
| Otto
|