GemStone GC, SSD

GemStone GC, SSD

Thelliez
Hello,

I am wondering if I can optimize some disk usage patterns, in
particular around garbage collection.  How do I know how much data is
written to disk during GC? What impact does #mfcGcPageBufSize have on
the number of writes?


Yesterday the SSD I had been using for about a year died.  I had been
accustomed to decent speed and did not think too hard about it.  But
today I am painfully running a development/staging/beta system off a
regular hard drive. That is a lot slower (an MFC that took 20 minutes
with the SSD has not yet completed after 2h30m ...).   While a
replacement SSD is on the way, I wonder how many writes were done in a
year.  Since SSDs have a limited number of writes, I am just curious
how hard the drive is being pushed.

The application has only a small web traffic volume, but a large
amount of garbage is created every week when large reports are
refreshed. In other words, over 2M objects live for about 1 week.  How
would I efficiently dispose of them? A backup file is currently about
45GB.


Thanks
Thierry Thelliez

Re: GemStone GC, SSD

Jon Paynter-2
Thierry,

If you still have access to the SSD, take a look at the SMART data to see if it was worn out (100% usage) or if it failed for some other reason.  Having one die after 1 year due to writing too much data to it seems improbable.  A quick Google search for your drive should give you ways to gather the data.

Jon.



Re: GemStone GC, SSD

Thelliez
Ok, I got more data.

About the SSD: the old SSD is still alive somehow (it disconnects
after a while).  The SMART data did not show anything interesting. The
'erase' counter that matters is low and would indicate only about a 1%
loss of drive lifetime in one year. Its death does not seem related to
any abuse ;-)

SMART ATTRIBUTES:
 ID  Description               Status  Value  Worst  Threshold  Raw Value  TEC
 ------------------------------------------------------------------------------
 202 Data Address Mark Errors  OK      99     99     1          1          N.A.

(Interestingly enough, I switched from a Crucial to an Intel SSD and
the GC process that used to take ~35 minutes now takes 25 minutes.
Both are SATA 3 (6Gb/s).)


About the GC: I have been surprised by the difference between a
regular drive and the SSD. The same GC takes 25 minutes on an SSD
while it takes 4 hours on a regular (7200 rpm) drive. According to
iostat, the GC phase reads 76GB and writes 24MB. I played with
#mfcGcPageBufSize (setting it to 1000), but that did not change the
iostat results much.
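
For reference, this is roughly what I ran to change it. I am assuming,
from my reading of the System Administration Guide, that the GcGem
picks up its tuning parameters from GcUser's UserGlobals; please
correct me if the mechanism is different in this version:

  "Run while logged in as GcUser, so that UserGlobals below is GcUser's
   symbol dictionary (assumption: the GcGem reads #mfcGcPageBufSize
   from there)."
  UserGlobals at: #mfcGcPageBufSize put: 1000.
  System commitTransaction.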

Unless I misinterpret the results, the extent is 45GB and the GC
reads 76GB. Is that to be expected? Maybe I should try more memory?

The documentation points to an FDC/MGC two-step process for managing
GCs.  But is there a way to control GC for some objects?  I mean that
I have a large tree of objects refreshed weekly and/or on demand.  I
do not need the daily GC to re-scan this tree over and over unless I
tell it to. Could I limit GC to run only just after the massive
report refresh?
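
Concretely, I was picturing the weekly refresh job ending with
something like the sketch below (as I understand it, the session must
be outside a transaction and no other GC may be in progress when the
markForCollection starts):

  "Tail end of the weekly report refresh job."
  System commitTransaction.             "persist the refreshed report tree"
  System abortTransaction.              "markForCollection runs outside a transaction"
  SystemRepository markForCollection.   "sweep the garbage the refresh just created"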


Thanks,
Thierry

Re: GemStone GC, SSD

Thelliez
Ah...  I found the issue with the SSD.  It had nothing to do with
data usage.  Crucial had a bug in the M4 series, fixed with a firmware
update: the drive becomes unresponsive past 5,000 hours of use.
http://forum.crucial.com/t5/Solid-State-Drives-SSD/M4-firmware-0309-is-now-available/td-p/80286

I am still interested in tuning our GC (see previous message)...


Thierry

Re: GemStone GC, SSD

otto
In reply to this post by Thelliez

Hi,

I'm no guru on this, but have played a bit. 

> Unless I misinterpret the results, the extent is 45GB, and it reads
> 76GB. Is that to be expected? Maybe I should try more memory?

The major difference that the SSD makes is that random reads are so much faster because seek time is effectively zero.

How big is your SPC?

If you have an SPC of 45GB plus a bit more for overhead, GS will cache all objects in RAM. The time it takes to do this is shorter on the SSD because the traversal is not sequential.

As your SPC gets smaller, more objects have to be swapped into and out of the SPC. I'm not 100% sure, but I think the MFC process may have to touch some objects more than once; if such an object happens to be in your SPC that's fine, but if not, it will be read from disk.
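
To see what you are running with now, something like this should show
it (a sketch from memory; I think the selector is
stoneConfigurationReport, and the underlying setting is
SHR_PAGE_CACHE_SIZE_KB in the stone's config file, but do verify both):

  "Dump the stone's configuration and look for the shared page cache
   size (SHR_PAGE_CACHE_SIZE_KB in the .conf file)."
  System stoneConfigurationReport.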

> The documentation points to a FDC/MGC two step process for managing
> GCs.  But is there a way to control the GC for some objects?  I mean
> that I have a large tree of objects refreshed weekly and/or on demand.
> I do not need the daily GC to re-parse this tree over and over,
> unless I tell it to. I could try to limit GC only just after the
> massive reports refresh?

I think GS does a decent job of managing garbage in large collections. The tree consists of quite a number of objects and GS should be reasonably efficient here.

I'm thinking in terms of using the epoch GC here. If you switch that on, only changed objects will be considered for GC, so it may be OK to use mostly the epoch GC and once in a while a full MFC.

Also, AFAIK, take into account that reclaiming packs live objects onto pages so that space can be reclaimed. This may cause a number of random reads if dead and live objects are on the same pages.

Otto

Re: GemStone GC, SSD

Thelliez
Thanks Otto,


> As your spc gets smaller, more objects have to be swapped into the spc and
> out. I'm not 100% sure, but I think the mfc process may have to touch some
> objects more than once, and if that object happens to be in your spc, ok,
> but if not, it will read from disk.

Yes, I think that you are right.  I need to see how much memory I can
add under the current license.


> I think GS does a decent job of managing garbage in large collections. The
> tree is consists of quite a number of objects and GS should be reasonably
> efficient here.

For now my web traffic is really small.  I am using GLASS pretty much
out of the box, so I can reduce the frequency of the GC for now. It
feels unnecessary to run this 25-minute process every hour (the last
one found 15K dead objects out of 293M).  But soon the site will be
opened to more people, and I guess I will see the pattern then.

Out of curiosity, though, what should the GC strategy be for a large
database deployed as a high-volume GLASS site? Scanning a large
quantity of domain objects just to find dead objects from Seaside
sessions does not seem optimal.


> I'm thinking in terms of using the epoch gc here. If you switch that on,
> only changed objects will be considered for gc, so it may be ok to use
> mostly the epoch gc and once a while a full mfc.

Yes, this was next on my list to experiment with.

> Also, afaik, take into account that reclaiming packs live objects onto pages
> so space can be reclaimed. This may cause a number of random reads if dead &
> alive objects are on the same pages.

You are right.  I did not think of that.


Thierry

Re: GemStone GC, SSD

Dale Henrichs
In reply to this post by otto
Otto,

This is a good analysis. An embedded comment or two...

----- Original Message -----
| From: "Otto Behrens" <[hidden email]>
| To: "GemStone Seaside beta discussion" <[hidden email]>
| Sent: Saturday, June 30, 2012 8:12:32 AM
| Subject: Re: [GS/SS Beta] GemStone GC, SSD
|
|
|
| Hi,
|
|
| I'm no guru on this, but have played a bit.
|
|
|
|
| Unless I misinterpret the results, the extent is 45GB, and it reads
| 76GB. Is that to be expected? Maybe I should try more memory?
|
|
|
| The major difference that the ssd makes is that random reads are so
| much faster because seek time is 0.
|
|
| How big is your SPC?
|
|
| If you have an spc of 45GB + a bit more for overhead, GS will cache
| all objects in ram. The time it takes to do this is faster on the
| ssd because the traversing is not sequential.
|
|
| As your spc gets smaller, more objects have to be swapped into the
| spc and out. I'm not 100% sure, but I think the mfc process may have
| to touch some objects more than once, and if that object happens to
| be in your spc, ok, but if not, it will read from disk.

The gc doesn't necessarily hit the same object more than once, but it definitely ends up hitting some of the pages in the repository more than once ... with an SPC smaller than the repository size that means that the same page will be read from disk more than once ...

|
|
|
| The documentation points to a FDC/MGC two step process for managing
| GCs. But is there a way to control the GC for some objects? I mean
| that I have a large tree of objects refreshed weekly and/or on
| demand.
| I do not need the daily GC to re-parse this tree over and over,
| unless I tell it to. I could try to limit GC only just after the
| massive reports refresh?
|
|
|
| I think GS does a decent job of managing garbage in large
| collections. The tree is consists of quite a number of objects and
| GS should be reasonably efficient here.
|
|
| I'm thinking in terms of using the epoch gc here. If you switch that
| on, only changed objects will be considered for gc, so it may be ok
| to use mostly the epoch gc and once a while a full mfc.

This is a good recommendation, especially when your SPC is smaller than the repository size. An epoch will reap the short-lived Seaside session objects without scanning the entire repository. The yield of dead objects from an epoch GC is not 100%, as some dead objects survive across epoch boundaries, so you will see a bit more repository growth than you would with frequent MFCs. You can stretch out your MFC frequency to control this overall repository growth ...
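
To check whether epochs are already enabled on the stone, something
along these lines should do it (the runtime parameter name is my
assumption here, mirroring STN_EPOCH_GC_ENABLED from the stone config
file; check the names against the System Administration Guide for your
version):

  "Epoch gc is switched on in the stone config file with
     STN_EPOCH_GC_ENABLED = TRUE;
   the runtime name below is assumed to follow the usual mapping."
  System configurationAt: #StnEpochGcEnabled.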
 
|
|
| Also, afaik, take into account that reclaiming packs live objects
| onto pages so space can be reclaimed. This may cause a number of
| random reads if dead & alive objects are on the same pages.
|
|
| Otto
|
|
|

Re: GemStone GC, SSD

Dale Henrichs
In reply to this post by Thelliez
Embedded comments below

----- Original Message -----
| From: "Thierry Thelliez" <[hidden email]>
| To: "GemStone Seaside beta discussion" <[hidden email]>
| Sent: Saturday, June 30, 2012 9:23:49 AM
| Subject: Re: [GS/SS Beta] GemStone GC, SSD
|
| Thanks Otto,
|
|
| > As your spc gets smaller, more objects have to be swapped into the
| > spc and
| > out. I'm not 100% sure, but I think the mfc process may have to
| > touch some
| > objects more than once, and if that object happens to be in your
| > spc, ok,
| > but if not, it will read from disk.
|
| Yes, I think that you are right.  I need to see how much memory I can
| add under the current license.
|
|
| > I think GS does a decent job of managing garbage in large
| > collections. The
| > tree is consists of quite a number of objects and GS should be
| > reasonably
| > efficient here.
|
| For now my web traffic is really small.  I am using GLASS pretty much
| out of the box, I can reduce the frequency of the GC for now. It
| feels
| unneeded to have this 25 minute process every hour. (the last one
| found 15K dead objects out of 293M).  But soon the site will be
| opened
| to more people.  I guess that I will see the pattern then.
|
| But by curiosity, what should be GC strategy in case of large
| database
| size deployed under a large volume GLASS site? The parsing of a large
| quantity of domain objects to find dead objects from Seaside sessions
| does not seem optimum.

Increasing the size of the SPC is one of the things you will want to explore as your repository size and traffic volume grow. Talk to Monty and/or Norm for costs...

Epoch GC will become a bit more important to avoid excessive repository growth.

It is also worth reviewing your Seaside app and using RESTful calls for the places where you are strictly doing reads ... if you can ... that way you avoid generating "meaningless session state" ... this is what we do for SS3 when folks are reading the repository - only browser-based access generates session state.
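
For example, with the Seaside-REST package loaded the registration
looks roughly like this (a sketch; ReportHandler is a made-up name for
your own WARestfulHandler subclass, and the registration selector is
from memory, so double-check it against your Seaside version):

  "ReportHandler would be your own WARestfulHandler subclass that
   renders the reports without creating any session state."
  WAAdmin register: ReportHandler at: 'reports'.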

You basically schedule the frequency of MFCs to control the growth of the repository, balanced against the amount of time the MFC takes ...

|
|
| > I'm thinking in terms of using the epoch gc here. If you switch
| > that on,
| > only changed objects will be considered for gc, so it may be ok to
| > use
| > mostly the epoch gc and once a while a full mfc.
|
| Yes, this was next on my list to experiment with.
|
| > Also, afaik, take into account that reclaiming packs live objects
| > onto pages
| > so space can be reclaimed. This may cause a number of random reads
| > if dead &
| > alive objects are on the same pages.
|
| You are right.  I did not think of that.

If you are seeing high i/o read rates and suspect fragmentation, you can `cluster` the objects in the repository to pack objects that are used together onto the same pages ...
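
A rough sketch of what that could look like for the big report tree
(clusterDepthFirst is the selector I have in mind; #reportRoot is just
a stand-in for wherever your tree is anchored, and for a really large
tree you may want to commit in chunks):

  "Repack the report tree so related objects end up on the same pages."
  | root |
  root := UserGlobals at: #reportRoot.   "stand-in for your real root object"
  root clusterDepthFirst.                "clusters root and everything reachable from it"
  System commitTransaction.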


|
|
| Thierry
|