Trying to understand tenuring


Trying to understand tenuring

NorbertHartl
 
I'm inspecting memory behaviour at the moment. I have an image that receives roughly one Zinc request per second. Each request is processed without creating a notable number of objects along the way. Every 5 seconds the image opens a Zinc client connection to another host. That's basically all that happens.

Looking at the memory consumption I get:

[graph: memory usage over time; old space keeps growing until a full GC]

What puzzles me is the growth of old space. The requests take a few milliseconds. The maximum time the objects are in the image is the few milliseconds of processing time plus the 5 seconds of buffering. Looking at the garbage collections I get:

[graph: garbage collection statistics]

So doing the math gives me: we have about 0.16 garbage collections per second, which is roughly one garbage collection every 6 seconds, and a tenure roughly every 10 GCs. So I would assume that an object needs to stay referenced for about 60 seconds before it gets tenured. But none of the objects in the request live that long. How can I interpret the graph above, where the old space keeps growing until a full GC?
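In workspace form, the same arithmetic (the 0.16/s and 10-GCs-per-tenure figures are read off the graphs above):

    | gcsPerSecond secondsPerGC secondsPerTenure |
    gcsPerSecond := 0.16.                    "≈ 160 milli-GCs per second from the stats"
    secondsPerGC := 1 / gcsPerSecond.        "≈ 6.25 seconds between scavenges"
    secondsPerTenure := secondsPerGC * 10.   "one tenure per ~10 GCs ≈ 62.5 seconds"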

Is there an easy explanation for this?

thanks,

Norbert

Re: Trying to understand tenuring

Bert Freudenberg
 
On 08.05.2015, at 08:43, Norbert Hartl <[hidden email]> wrote:
> I would assume that an object needs to stay referenced for about 60 seconds before it gets tenured. But none of the objects in the request live that long.

The object memory does not track how long an object was alive, or even how many incremental GCs it survived. Once the tenuring threshold is reached (after X allocations), all objects alive in new space at that point get tenured.

This is how it worked before Spur, anyway. Not sure what the tenuring policy is in Spur?
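In workspace form, a toy version of that pre-Spur rule (the threshold value and all names are made up; only the mechanism — tenure every survivor once an allocation count crosses a threshold — comes from the description above):

    | allocationCount tenuringThreshold newSpace oldSpace |
    tenuringThreshold := 4000.                        "the 'X allocations'; value invented"
    allocationCount := 4200.
    newSpace := OrderedCollection withAll: #(a b c).  "survivors of this scavenge"
    oldSpace := OrderedCollection new.
    allocationCount >= tenuringThreshold ifTrue: [
        oldSpace addAll: newSpace.                    "age is irrelevant: all get tenured"
        newSpace := OrderedCollection new.
        allocationCount := 0 ].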

- Bert -



Re: Trying to understand tenuring

Clément Béra
 


2015-05-08 11:31 GMT+02:00 Bert Freudenberg <[hidden email]>:

> On 08.05.2015, at 08:43, Norbert Hartl <[hidden email]> wrote:
>> I would assume that an object needs to stay referenced for about 60 seconds before it gets tenured. But none of the objects in the request live that long.
>
> The object memory does not track how long an object was alive, or even how many incremental GCs it survived. Once the tenuring threshold is reached (after X allocations), all objects alive in new space at that point get tenured.
>
> This is how it worked before Spur, anyway. Not sure what the tenuring policy is in Spur?

As far as I understand it (Eliot, correct me if I am wrong):

In Spur, when scavenging, if an object survives and has already survived a certain number of scavenges (I think currently 5), it is tenured instead of being moved to the future survivor space.

In addition, when scavenging, if the future survivor space gets almost full (I think currently at least 90% full), all objects surviving the current scavenge, starting from the point where the future survivor space reached the limit, are tenured.

So the policy depends on the number of scavenges survived by the object and the overall number of objects surviving the scavenges.
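In workspace form, a minimal simulation of that decision (the thresholds of 5 scavenges and 90% full come from above; everything else is made up):

    | ageThreshold survivorAlmostFull tenured survivors |
    ageThreshold := 5.                 "scavenges survived before forced tenure"
    survivorAlmostFull := false.       "becomes true once the space is >= 90% full"
    tenured := OrderedCollection new.
    survivors := OrderedCollection new.
    #(1 2 5 6 3) do: [ :age |          "ages of the objects surviving this scavenge"
        (age >= ageThreshold or: [ survivorAlmostFull ])
            ifTrue: [ tenured add: age ]
            ifFalse: [ survivors add: age + 1 ] ].
    tenured                            "=> the objects aged 5 and 6 get tenured"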

I don't know, though, how many of the objects in your requests will make it to old space, but most probably fewer than with the current pre-Spur policy.

Clement
 

Re: Trying to understand tenuring

Eliot Miranda-2
In reply to this post by Bert Freudenberg
 
Hi Norbert,

    RRDTool looks nice.  Is it available for Mac OS X?  Was there much configuration?  Are you willing to make your set-up generally available?

On Fri, May 8, 2015 at 2:31 AM, Bert Freudenberg <[hidden email]> wrote:

> On 08.05.2015, at 08:43, Norbert Hartl <[hidden email]> wrote:
>> I would assume that an object needs to stay referenced for about 60 seconds before it gets tenured. But none of the objects in the request live that long.
>
> The object memory does not track how long an object was alive, or even how many incremental GCs it survived. Once the tenuring threshold is reached (after X allocations), all objects alive in new space at that point get tenured.
>
> This is how it worked before Spur, anyway. Not sure what the tenuring policy is in Spur?

There are two policies.  The main one is as described in 

    An adaptive tenuring policy for generation scavengers
    David Ungar, Frank Jackson ParcPlace Systems
    ACM Transactions on Programming Languages and Systems
    Volume 14 Issue 1, Jan. 1992

This is very simple.  Based on how many objects survived the previous scavenge (how full the current survivor space is), an "age" above which objects will be tenured is determined.  If lots of objects have survived the previous scavenge (survivor space >= 90% full), a proportion of the oldest objects in the survivor space will be tenured.  Because scavenging uses a breadth-first traversal, the order of objects in the survivor and eden spaces reflects their age.  The oldest are at the start of the spaces, the youngest at the end.  Hence the age is simply a pointer into the previous survivor space.
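In workspace form, the age-as-address idea (all names are illustrative):

    | survivorSpace boundary toTenure toKeep |
    survivorSpace := #(a b c d e).         "oldest object first, youngest last"
    boundary := 2.                          "derived from how full the space got"
    toTenure := survivorSpace first: boundary.      "everything 'below' the age pointer"
    toKeep := survivorSpace allButFirst: boundary.  "copied to the other survivor space"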

The proportion is read and written via vm parameter 6.  In good times (less than 90% of the survivor space is full) the proportion is zero, so that objects are only tenured if the survivor space overflows.  One can set the size of new space (default 4Mb) but the ratios of the spaces are fixed, 5/7 for eden, and 1/7 for each survivor space, as per David's original paper.
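For example, in Pharo (assuming the VirtualMachine parameter API; in Squeak the equivalent is Smalltalk vmParameterAt:):

    Smalltalk vm parameterAt: 6.          "read the current tenuring proportion"
    Smalltalk vm parameterAt: 6 put: 0.   "only tenure when the survivor space overflows"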

The second policy kicks in when the remembered set is very large.  When the remembered set is greater than a limit dependent on the size of new space (a 4Mb default eden sets a limit of about 750 entries in the remembered set), or when it is over 3/4 full (whichever is the larger), the scavenger uses a policy that attempts to shrink the remembered set by half.  The scavenger identifies those objects in new space that are referenced from the remembered set using a 3-bit reference count.  It then chooses a reference count that includes half of that population of new space objects, and then tenures all objects with at least that reference count.
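In workspace form, a sketch of picking that threshold (the histogram values are invented; only the "cover half the population, tenure everything at or above" rule comes from the description above):

    | counts total running threshold |
    counts := #(0 12 9 5 3 2 1 1).         "population per 3-bit ref count 0..7"
    total := counts sum.                    "new-space objects referenced from the remembered set"
    running := 0.
    threshold := 8.
    [ running < (total / 2) and: [ threshold > 1 ] ] whileTrue: [
        threshold := threshold - 1.
        running := running + (counts at: threshold + 1) ].
    threshold    "=> 2: tenure all new-space objects with refCount >= 2"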

This policy finds those new space objects that are referenced from many remembered set entries, and tenures those, hence statistically freeing those remembered table entries that reference the most new space objects.  This policy may seem a little elaborate, but
- the naïve policy of merely tenuring everything when the remembered set is full usually ends up tenuring lots of objects that themselves contain references to new objects, which merely refills the remembered set with fresh entries and forces yet more tenuring
- I invented this policy to fix GC behaviour in a real world network monitoring application running on VisualWorks, so I know it works ;-), and I designed the Spur object header format to make its implementation a little simpler than VW's


Right now there /isn't/ a good policy for invoking the global mark-sweep garbage collector, and its compaction algorithm is slow.  The system merely remembers how many objects there are in old space, and does a full GC whenever tenuring causes the number of live objects in old space to grow by 50%.  Of course, the image can decide to run the full GC, and does so if a new: fails (which schedules a scavenge) and fails again after the immediate scavenge.  But we can do better.  Spur is young and there is lots of scope for adding intelligent (but please, simpler than VisualWorks') memory policy.
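From the image side that looks like (standard Squeak/Pharo messages):

    Smalltalk garbageCollectMost.   "run a scavenge (new-space collection only)"
    Smalltalk garbageCollect.       "run the full mark-sweep GC plus compaction"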

The thing that will really help is an incremental old space mark-sweep collector.  I'm looking both at

    Very concurrent mark-&-sweep garbage collection without fine-grain synchronization
    Lorenz Huelsbergen, Phil Winterbottom
    ISMM '98 Proceedings of the 1st international symposium on Memory management
    Pages 166 - 175 (ACM SIGPLAN Notices,  Volume 34 Issue 3, March 1999)

and the four-colour incremental mark-sweep in the LuaJIT VM, see http://wiki.luajit.org/New-Garbage-Collector#Quad-Color-Optimized-Incremental-Mark-&-Sweep.

The incremental GC would likely run in increments after each scavenge and, if there was work to do, in the idle loop.  It could conceivably run in its own thread, but there are good arguments against that (basically a good GC costs very little, so making it concurrent doesn't gain much performance, but introduces complexity).  However, I've not got many cycles to address this and would love a collaborator who is motivated and knowledgeable to have a go at either of these, preferably a combination of the two.
--
best,
Eliot

VM/Image monitoring (was: Re: Trying to understand tenuring)

Paul DeBruicker
Eliot Miranda-2 wrote:

> Hi Norbert,
>
>     RRDTool looks nice.  Is it available for Mac OS X?  Was there much configuration?  Are you willing to make your set-up generally available?
>
> best,
> Eliot

He's using Munin (http://munin-monitoring.org/) for monitoring, which uses RRDTool to draw the plots.  Munin is available on Mac OS X (http://munin-monitoring.org/wiki/MuninInstallationDarwin).  Once installed you'll want to create a plugin (http://munin-monitoring.org/wiki/plugins) to start monitoring the VM or image stats.  One plugin per plot, and each plot can have multiple lines.
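For example, a hypothetical setup could expose the stats over Zinc and have the Munin plugin scrape them.  A sketch of the image side (the port, field names, and vm parameter indices are illustrative, not Norbert's actual setup; Munin expects plain "name.value N" lines):

    (ZnServer startDefaultOn: 8181)
        onRequestRespond: [ :request |
            ZnResponse ok: (ZnEntity text: (String streamContents: [ :out |
                out << 'oldspace.value '.
                out print: (Smalltalk vm parameterAt: 1); lf.  "assumed: old space bytes"
                out << 'fullgcs.value '.
                out print: (Smalltalk vm parameterAt: 7); lf ])) ].  "assumed: full GC count"

The plugin itself then just fetches that URL when run, and prints the graph title and field labels when run with the "config" argument.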

I've used it but his setup is better.  


Hope this helps,


Paul

Re: Trying to understand tenuring

NorbertHartl
In reply to this post by Eliot Miranda-2
 
Hi Eliot,

Sorry for the late response, but I was busy the last few days.

On 08.05.2015 at 19:12, Eliot Miranda <[hidden email]> wrote:

> Hi Norbert,
>
>     RRDTool looks nice.  Is it available for Mac OS X?  Was there much configuration?  Are you willing to make your set-up generally available?

Thanks. As I wrote this post I was already in the process of ripping the stuff out and polishing it a little for release. This weekend I found the time to make it at least ready to be used. It is all pretty easy to do, but the mix is important. I don't know what an OS X setup would look like, but I can try. For now I've released the code here


and written an introductory blog post for it. You can read it here


Feedback welcome, as always,

Norbert
