Fwd: [Vm-dev] VM stability issue on unix


Adrian Lienhard
I've recently brought up the following issue on the VM mailing list,  
but got no reply so far for whatever reason...

In short, the problem is that the unix VM blocks once its memory
consumption exceeds about 120MB. I think it's a critical bug, likely
affecting many Seaside users who deploy on unix systems.

I've filed the following Mantis report:
http://bugs.impara.de/view.php?id=4709

Adrian


Begin forwarded message:

> From: Adrian Lienhard <[hidden email]>
> Date: August 31, 2006 8:23:20 AM GMT+02:00
> To: [hidden email]
> Subject: [Vm-dev] VM stability issue on unix
> Reply-To: Squeak Virtual Machine Development Discussion <vm-
> [hidden email]>
>
> Hi VM maintainers,
>
> We have run into the following problem with 3.7/3.9 unix VMs (but
> not with version 3.6). The VM hogs the CPU and stops responding
> after consuming more than about 120MB of memory. The problem is
> reproducible independently of the image version (simply by
> instantiating enough objects).
>
> This is the VM version we are using:
>
> 3.7-7 #1 Don Okt 20 11:25:27 CEST 2005 gcc-Version (Debian  
> Squeak3.7 of '4 September 2004' [latest update: #5989] Linux 2.6.10  
> #1 Tue Dec 28 21:16:21 CET 2004 i686 GNU/Linux
>
> Inspecting the Squeak process stacks with gdb does not show
> anything unusual; however, one process does not seem to return
> from calling the new: primitive.
>
> The call stack of the VM looks like this:
>
> #0  updatePointersInRangeFromto (memStart=231866373, memEnd=2138759076)
>     at gnu-interp.c:21562
> #1  0x0805d500 in incCompBody () at gnu-interp.c:4650
> #2  0x0805d2ed in fullGC () at gnu-interp.c:4500
> #3  0x0806d576 in sufficientSpaceAfterGC (minFree=202536)
>     at gnu-interp.c:21275
> #4  0x08067d5e in primitiveNewWithArg () at gnu-interp.c:16045
> #5  0x080614fa in interpret () at gnu-interp.c:7249
> #6  0x0805a5fb in main (argc=0, argv=0xbfeff8a4, envp=0x0)
>     at /usr/src/Squeak-3.7-7/platforms/unix/vm/sqUnixMain.c:1367
>
> It looks like the same issue has already been discussed in the
> following thread:
> http://lists.squeakfoundation.org/pipermail/seaside/2005-October/005897.html
> The proposed workaround of explicitly setting -memory to a high
> enough value works, i.e., the VM does not stop working when the
> memory consumption exceeds 120MB.
>
> Since in Seaside applications images often grow up to 200MB or even  
> more, this is a real show stopper...
>
> Cheers,
> Adrian


Re: [Vm-dev] VM stability issue on unix

johnmci
Ok, if you can forward this to the Seaside list that would be good.

It's possible that you are running into the GC bug that I describe at

http://minnow.cc.gatech.edu/squeak/3710

Basically, when the GC logic approaches a decision to grow memory, it
first does a full GC. If that full GC returns just enough memory, the
VM does not grow the memory it uses. However, the amount of space
recovered is insufficient to do any real work, so after a few message
sends we again try to grow memory, do a full GC, and recover just
enough bytes not to force the grow. Repeat a few million times... CPU
goes to 100%, always in GC logic, and no real work happens. Normally
it only iterates a few thousand, or 10 or 100 thousand times, sucking
CPU uselessly...
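
To make that loop concrete, here is a toy workspace model. It is not VM
code: the byte counts (1000 bytes needed per burst of sends, 1200 bytes
free after each full GC) are invented for illustration, and the variable
names are hypothetical.

```smalltalk
"Toy model only: every burst of message sends needs 1000 bytes, and a
 full GC leaves 1200 bytes free, which is just enough to skip the grow."
| free fullGCs |
free := 0.
fullGCs := 0.
1 to: 100000 do: [:burst |
    free < 1000 ifTrue: [
        "Full GC instead of growing; recovers barely enough space."
        free := 1200.
        fullGCs := fullGCs + 1].
    "The burst consumes the space, leaving too little for the next one."
    free := free - 1000].
Transcript show: 'full GCs: ', fullGCs printString; cr.
```

In this model every single burst triggers a full GC, which is the
pattern described above: the heap never grows and the CPU is spent in
the collector.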

First of all, you need a new Unix VM built with the latest VMMaker,
which has the needed code and primitive API.

Then look at my change sets.

Somewhere at startup time you need to invoke:

Smalltalk setGCBiasToGrowGCLimit: 16*1024*1024.
Smalltalk setGCBiasToGrow: 1.
GCMonitor runActive.

setGCBiasToGrow: tells the VM to grow memory rather than doing a full
GC first and only then deciding whether to grow.

setGCBiasToGrowGCLimit: sets the limit that forces a full GC after we
have grown by this much, to ensure growth is not unbounded.
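
If the same startup code has to run in images that may not have the
change sets loaded, one option is to guard the calls. This is only a
sketch: it checks the image side with respondsTo: and assumes nothing
about whether the VM itself has the primitive, so a new VM is still
required as noted above. The Transcript message is made up for the
example.

```smalltalk
"Apply the bias-to-grow settings only when the image has the selectors
 from the change sets; otherwise just report it."
((Smalltalk respondsTo: #setGCBiasToGrow:)
    and: [Smalltalk respondsTo: #setGCBiasToGrowGCLimit:])
    ifTrue: [
        "Force a full GC after every 16MB of growth."
        Smalltalk setGCBiasToGrowGCLimit: 16 * 1024 * 1024.
        "Prefer growing memory over a full GC."
        Smalltalk setGCBiasToGrow: 1]
    ifFalse: [
        Transcript show: 'GC bias-to-grow selectors not present'; cr].
```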

The GCMonitor class allows you to collect statistical data from the
GC, either at the end of each GC cycle or on a timer.
This data can then be saved to a file for review, to allow one to
intelligently adjust the GC parameters.
The supplied GCMonitor is more a template than a finished product.


More importantly, much like VisualWorks, it looks at some of the data
and makes runtime decisions like:

a) Force a tenure if we find we are doing too much root table
scanning. This happens if you allocate a large collection and it gets
put into the root table: when looking for intergenerational
references we scan the entire million-entry object, since the GC knows
the object contains a pointer to an object that is a root, but which
one? Historically people would force a GC after allocating a large
collection to avoid this problem. However, if I say

"Tenure if we think too much root table marking is going on"
(statMarkCount) > (statAllocationCount*2)
                ifTrue: [Smalltalk forceTenure].

where we look at the mark count total versus the allocation count
total, we can decide whether we need to force a tenure to solve this
problem automatically.

The code also has some example code (not used) to alter the size of
the allocation and tenure targets to adjust a GC cycle to 1 millisecond.
Those values were picked, oh, 10 years back for 25 MHz machines; I'd
guess 3 GHz machines can incrementally GC much more memory.

        (statIGCDeltaTime = 0) ifTrue:
                [target := (Smalltalk vmParameterAt: 5) + 21.
                "do an incremental GC after this many allocations"
                Smalltalk vmParameterAt: 5 put: target.
                "tenure when more than this many objects survive the GC"
                Smalltalk vmParameterAt: 6 put: target * 3 // 4].
        (statIGCDeltaTime > 0) ifTrue:
                [target := ((Smalltalk vmParameterAt: 5) - 27) max: 2000.
                "do an incremental GC after this many allocations"
                Smalltalk vmParameterAt: 5 put: target.
                "tenure when more than this many objects survive the GC"
                Smalltalk vmParameterAt: 6 put: target * 3 // 4].


        (statIGCDeltaTime < 1) ifTrue:
                [target := (Smalltalk vmParameterAt: 5) + 21.
                "do an incremental GC after this many allocations"
                Smalltalk vmParameterAt: 5 put: target.
                "tenure when more than this many objects survive the GC"
                Smalltalk vmParameterAt: 6 put: target * 3 // 4].
        (statIGCDeltaTime > 1) ifTrue:
                [target := ((Smalltalk vmParameterAt: 5) - 27) max: 4000.
                "do an incremental GC after this many allocations"
                Smalltalk vmParameterAt: 5 put: target.
                "tenure when more than this many objects survive the GC"
                Smalltalk vmParameterAt: 6 put: target * 3 // 4].
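
Before adjusting those two targets it can help to see what the image
is currently using. A minimal workspace sketch, relying only on the
meaning of VM parameters 5 and 6 as given in the comments above (the
label strings are just for the example):

```smalltalk
"Print the current incremental GC and tenuring targets."
Transcript
    show: 'allocations between incremental GCs: ',
        (Smalltalk vmParameterAt: 5) printString; cr;
    show: 'survivor count that triggers tenuring: ',
        (Smalltalk vmParameterAt: 6) printString; cr.
```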


I'll further note that Sophie now has a SophieMemoryPolicy to tune
Sophie's GC behavior. Perhaps Seaside needs a SeasideMemoryPolicy to
best tune the GC at runtime, as more sophisticated VMs such as
VisualWorks do?

Lastly, I'd welcome a statistical file or two to look at from a large
Seaside image, just to understand what's happening. And as always,
for a fee I can perform a GC memory audit on any large-scale VW or
Squeak application.


On 5-Sep-06, at 3:06 AM, Adrian Lienhard wrote:

> [...]

--
===========================================================================
John M. McIntosh <[hidden email]>
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
===========================================================================