Fwd: [Vm-dev] VM stability issue on unix

Adrian Lienhard

Fwd: [Vm-dev] VM stability issue on unix

I've recently brought up the following issue on the VM mailing list,
but got no reply so far for whatever reason...

In short, the problem is that the unix VM blocks after the memory
consumption exceeds about 120MB. I think it's a critical bug, likely
affecting many Seaside users who deploy on unix systems.

I've filed the following Mantis report:
http://bugs.impara.de/view.php?id=4709

Adrian

Begin forwarded message:

> From: Adrian Lienhard <[hidden email]>
> Date: August 31, 2006 8:23:20 AM GMT+02:00
> To: [hidden email]
> Subject: [Vm-dev] VM stability issue on unix
> Reply-To: Squeak Virtual Machine Development Discussion <vm-
> [hidden email]>
>
> Hi VM maintainers,
>
> We have run into the following problem with 3.7/3.9 unix VMs (but
> not with version 3.6). The VM hogs the CPU and stops responding
> after consuming more than about 120MB of memory. The problem is
> reproducible independently of the image version (simply by
> instantiating enough objects).
>
> This is the VM version we are using:
>
> 3.7-7 #1 Don Okt 20 11:25:27 CEST 2005 gcc-Version (Debian
> Squeak3.7 of '4 September 2004' [latest update: #5989] Linux 2.6.10
> #1 Tue Dec 28 21:16:21 CET 2004 i686 GNU/Linux
>
> Inspecting the Squeak process stacks with gdb does not show
> anything unusual; however, one process does not seem to return
> from calling the new: primitive.
>
> The call stack of the VM looks like this:
>
> #0 updatePointersInRangeFromto (memStart=231866373, memEnd=2138759076) at gnu-interp.c:21562
> #1 0x0805d500 in incCompBody () at gnu-interp.c:4650
> #2 0x0805d2ed in fullGC () at gnu-interp.c:4500
> #3 0x0806d576 in sufficientSpaceAfterGC (minFree=202536) at gnu-interp.c:21275
> #4 0x08067d5e in primitiveNewWithArg () at gnu-interp.c:16045
> #5 0x080614fa in interpret () at gnu-interp.c:7249
> #6 0x0805a5fb in main (argc=0, argv=0xbfeff8a4, envp=0x0) at /usr/src/Squeak-3.7-7/platforms/unix/vm/sqUnixMain.c:1367
>
> It looks like the same issue has already been discussed in the
> following thread:
> http://lists.squeakfoundation.org/pipermail/seaside/2005-October/005897.html
> The proposed workaround of explicitly setting -memory to a high
> enough value works, i.e., the VM does not stop working when the
> memory consumption exceeds 120MB.
>
> Since images in Seaside applications often grow to 200MB or even
> more, this is a real show stopper...
>
> Cheers,
> Adrian

johnmci

Re: [Vm-dev] VM stability issue on unix

Ok, if you can forward this to the Seaside list that would be good.

It's possible that you ran into the GC bug that I talk about in

http://minnow.cc.gatech.edu/squeak/3710

Basically, when the GC logic approaches a decision to grow memory, it
first does a full GC. If that GC returns just enough memory, the VM
does not grow the memory it uses. However, the space recovered is not
enough to do anything useful, so after a few message sends we again
try to grow memory, do a full GC, and recover just enough bytes not
to force the grow. Repeat a few million times: the CPU goes to 100%,
always in GC logic, and no real work happens. Normally it only
iterates a few thousand, 10 thousand or 100 thousand times, sucking
CPU uselessly...
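
To make that loop concrete, here is a toy model for a workspace. It is
only a sketch under invented assumptions, not VM code: the sizes (10
units needed per burst of work, 12 units recovered by a full GC, growth
steps of 500, a growth limit of 2000) and the names free, fullGCs,
grownSinceGC and run are all hypothetical. It counts full GCs for the
same amount of work with and without a bias towards growing.

```smalltalk
"Toy model only: every number and name here is made up for illustration."
| free fullGCs grownSinceGC run |
run := [:biasToGrow |
    free := 0.
    fullGCs := 0.
    grownSinceGC := 0.
    1000 timesRepeat: [
        "Each burst of work needs 10 units of free space."
        free < 10 ifTrue: [
            biasToGrow
                ifTrue: [
                    "Grow first; do a full GC only once we have grown by the limit."
                    free := free + 500.
                    grownSinceGC := grownSinceGC + 500.
                    grownSinceGC >= 2000 ifTrue: [
                        fullGCs := fullGCs + 1.
                        grownSinceGC := 0]]
                ifFalse: [
                    "Full GC first; it recovers just enough, so memory never grows."
                    fullGCs := fullGCs + 1.
                    free := free + 12]].
        free := free - 10].
    fullGCs].
Transcript show: 'full GCs without bias: ', (run value: false) printString; cr.
Transcript show: 'full GCs with bias: ', (run value: true) printString; cr.
```

In this model the count without the bias is far larger than the count
with it, which is the same pattern as the stuck VM at a much smaller
scale.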

First of all, you need a new Unix VM built with the latest VMMaker,
which has the needed code and primitive API.

Then look at my change sets.

Somewhere at startup time you need to invoke:

Smalltalk setGCBiasToGrowGCLimit: 16*1024*1024.
Smalltalk setGCBiasToGrow: 1.
GCMonitor runActive.

setGCBiasToGrow: alters the VM to grow memory rather than doing a
full GC first and then deciding whether to grow.

setGCBiasToGrowGCLimit: sets the limit that forces a full GC after we
have grown by this much, to ensure growth is not unbounded.

The GCMonitor class allows you to collect statistical data from the
GC, either at the end of each GC cycle or on a timer.
This data can then be saved to a file for review, to allow one to
intelligently adjust the GC parameters.
The supplied GCMonitor is more a template than a finished product.
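
For a quick look without GCMonitor, the VM's cumulative GC counters can
be sampled directly around a piece of work. This is a minimal sketch and
not how GCMonitor itself is implemented. It assumes the parameter
indexes documented in the comment of SystemDictionary>>vmParameterAt:
(7 and 8 for full GC count and milliseconds, 9 and 10 for incremental GC
count and milliseconds), so check that comment in your own image. The
workload and the variable names are hypothetical.

```smalltalk
| fullGCs fullMs incrGCs incrMs junk |
"Snapshot the cumulative counters before the workload."
fullGCs := Smalltalk vmParameterAt: 7.
fullMs := Smalltalk vmParameterAt: 8.
incrGCs := Smalltalk vmParameterAt: 9.
incrMs := Smalltalk vmParameterAt: 10.
"Hypothetical workload: allocate a batch of arrays, then drop them."
junk := (1 to: 1000) collect: [:i | Array new: 1000].
junk := nil.
"Report how much GC activity the workload caused."
Transcript
    show: 'full GCs: ', ((Smalltalk vmParameterAt: 7) - fullGCs) printString;
    show: ' in ', ((Smalltalk vmParameterAt: 8) - fullMs) printString, ' ms'; cr;
    show: 'incremental GCs: ', ((Smalltalk vmParameterAt: 9) - incrGCs) printString;
    show: ' in ', ((Smalltalk vmParameterAt: 10) - incrMs) printString, ' ms'; cr.
```

A full GC count that keeps climbing while little work gets done would
be consistent with the grow-versus-GC loop described above.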

More importantly, much like VisualWorks, it looks at some of the data
and makes runtime decisions like:

a) force a tenure if we find we are doing too much root table
scanning. This happens if you allocate a large collection and it gets
put into the root table: when looking for intergenerational
references, we scan the entire million-entry object, since the GC
knows the object contains a pointer to an object that is a root, but
which one? Historically, people would force a GC after allocating a
large collection to avoid this problem. However, if I say

"Tenure if we think too much root table marking is going on"
(statMarkCount) > (statAllocationCount * 2)
    ifTrue: [Smalltalk forceTenure].

where we compare the total mark count against the total allocation
count, we can decide whether we need to force a tenure, and so solve
this problem automatically.

The code also has some example code (not used) to alter the size of
the allocation and tenure targets to adjust a GC cycle to 1 millisecond.
Those values were picked, oh, 10 years back for 25 MHz machines; I'd
guess 3 GHz machines can incrementally GC much more memory.

(statIGCDeltaTime = 0) ifTrue:
    [target := (Smalltalk vmParameterAt: 5) + 21.
    "do an incremental GC after this many allocations"
    Smalltalk vmParameterAt: 5 put: target.
    "tenure when more than this many objects survive the GC"
    Smalltalk vmParameterAt: 6 put: target * 3 // 4].
(statIGCDeltaTime > 0) ifTrue:
    [target := ((Smalltalk vmParameterAt: 5) - 27) max: 2000.
    "do an incremental GC after this many allocations"
    Smalltalk vmParameterAt: 5 put: target.
    "tenure when more than this many objects survive the GC"
    Smalltalk vmParameterAt: 6 put: target * 3 // 4].

(statIGCDeltaTime < 1) ifTrue:
    [target := (Smalltalk vmParameterAt: 5) + 21.
    "do an incremental GC after this many allocations"
    Smalltalk vmParameterAt: 5 put: target.
    "tenure when more than this many objects survive the GC"
    Smalltalk vmParameterAt: 6 put: target * 3 // 4].
(statIGCDeltaTime > 1) ifTrue:
    [target := ((Smalltalk vmParameterAt: 5) - 27) max: 4000.
    "do an incremental GC after this many allocations"
    Smalltalk vmParameterAt: 5 put: target.
    "tenure when more than this many objects survive the GC"
    Smalltalk vmParameterAt: 6 put: target * 3 // 4].

I'll further note that Sophie has a SophieMemoryPolicy now to tune
Sophie GC behavior. Perhaps Seaside requires a SeasideMemoryPolicy to
best tune the GC at runtime, as more sophisticated VMs like
VisualWorks do?

Lastly, I'd welcome a statistical file or two to look at from a large
Seaside image, just to understand what's happening. And as always,
for a fee I can perform a GC memory audit on any large-scale VW or
Squeak application.

On 5-Sep-06, at 3:06 AM, Adrian Lienhard wrote:

> I've recently brought up the following issue on the VM mailing
> list, but got no reply so far for whatever reason...
>
> In short, the problem is that the unix VM blocks after the memory
> consumption exceeds about 120MB. I think it's a critical bug, likely
> affecting many Seaside users who deploy on unix systems.
>
> I've filed the following Mantis report:
> http://bugs.impara.de/view.php?id=4709
>
> Adrian

--
===========================================================================
John M. McIntosh <[hidden email]>
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
===========================================================================