Adding a large number of objects to a Set leads to a


Thelliez
Hello,

I need to fill a Set with a large number of objects (2M+).  But I
encounter 'VM temporary object memory is full, almost out of memory,
too many markSweeps since last successful scavenge' after processing
about ~1M objects. I could try to increase the memory for that session,
but since I commit often, I wonder what is staying in memory.  And
what does 'too many markSweeps' mean?

These objects come from various smaller sets.  The translogs are
growing surprisingly fast (~5GB) for adding 0.9M objects.

The code itself is simple but the 'add:' is done in a block; for each
object a block is executed.

Suggestions?

Thanks,
Thierry
Re: Adding a large number of objects to a Set leads to a

Dale Henrichs
Yeah, that's a good question ... I've got a post[1] that covers some of the basics of debugging the out-of-memory situation, including generating a report about the makeup of the objects that are in temp-obj-space ...

If you are adding to indexed collections, you might need to commit even more often than you think, because the indexed collections consume temp obj space until committed ...

Dale

[1] http://gemstonesoup.wordpress.com/2008/11/19/gemstone-101-managing-out-of-memory-situations/
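
In code form, the commit-more-often advice might look something like this sketch ('allSourceSets' and 'bigSet' are illustrative names, not from Thierry's actual code):

  | count |
  count := 0.
  allSourceSets do: [:each |
      each do: [:obj |
          bigSet add: obj.
          count := count + 1.
          "commit periodically so temp obj space can be released"
          count \\ 10000 = 0 ifTrue: [System commitTransaction]]].
  System commitTransaction.

The modulus is just a starting point to tune; the point is to commit before temp obj space fills up.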


Re: Adding a large number of objects to a Set leads to a

Thelliez
Dale,

Thanks for your answer.  I also found this old thread:
http://answerpot.com/showthread.php?151729-%27too+many+markSweeps+since+last+successful%27+error
Looking at the log (see below), I see that GemStone is trying
to rebuild a collision table.

I did try to commit every 100 objects, but that did not change the situation.

If the table rebuilding is large, committing often is not going to
help, correct?

Thierry

DUMP_OPTIONS = TRUE;
GEM_GCI_LOG_ENABLED = FALSE;
GEM_FREE_FRAME_CACHE_SIZE = -1;
GEM_FREE_FRAME_LIMIT = -1;
GEM_HALT_ON_ERROR = 0;
GEM_IO_LIMIT = 5000;
GEM_KEEP_MIN_SOFTREFS = 0;
GEM_MAX_SMALLTALK_STACK_DEPTH = 1000;
GEM_PRIVATE_PAGE_CACHE_KB = 1000;
GEM_PGSVR_FREE_FRAME_CACHE_SIZE = -1;
GEM_PGSVR_FREE_FRAME_LIMIT = -1;
GEM_PGSVR_UPDATE_CACHE_ON_READ = FALSE;
GEM_RPCGCI_TIMEOUT = 0;
GEM_SOFTREF_CLEANUP_PERCENT_MEM = 50;
GEM_TEMPOBJ_AGGRESSIVE_STUBBING = TRUE;
GEM_TEMPOBJ_CACHE_SIZE = 100000;
GEM_TEMPOBJ_INITIAL_SIZE not used on this platform
GEM_TEMPOBJ_MESPACE_SIZE = 0;
GEM_TEMPOBJ_OOPMAP_SIZE = 0;
GEM_TEMPOBJ_POMGEN_SIZE = 0;
GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE = 50;
GEM_TEMPOBJ_POMGEN_SCAVENGE_INTERVAL = 1800;
LOG_WARNINGS = TRUE;
SHR_NUM_FREE_FRAME_SERVERS = 1;
SHR_PAGE_CACHE_NUM_SHARED_COUNTERS = 1900;
SHR_PAGE_CACHE_SIZE_KB = 800000;
SHR_PAGE_CACHE_NUM_PROCS = 1017;
SHR_TARGET_FREE_FRAME_COUNT = -1;
(vmGc spaceSizes: eden init 2000K max 18744K , survivor init 400K max 3128K,
 vmGc    old max 75000K, code max 20000K, perm max 10000K, pom 10 *
8336K = 83360K,
 vmGc    remSet 1944K, meSpace max 95592K oopMapSize 524288 )
cmdLine=

[Info]: RPC client/gem/minimum GCI levels = 844/844/844
[Info]: Client PID: 5306
[Info]: User ID: DataCurator
[Info]: Repository: seaside
[Info]: Session ID: 9
[Info]: GCI Client Host: 127.0.0.1
[Info]: Page server PID: -1
[Info]: Login Time: 06/01/2012 02:36:35 PM.504 MDT
 vmGc MKSW softRefs: clear100% keepGoal 0, cleared 0 , live 0 nonNil 0
srThresh 0 srUseCnt 1
 vmGc   pom9:8331K pom0:8335K pom1:8335K pom2:8335K pom3:8336K
pom4:8335K pom5:8335K pom6:8335K pom7:8336K pom8:3508K
 vmGc MKSW           0/4248Knew 71167/72360Kold 78527Kpom 1205Kperm
293Kcode 18939Kme agrStubbed 0 127073us  dirtyBytes 0 pomT
hresh 0 almOOM 0 DalmOOM 0 ---)
 vmGc approaching OutOfMemory, requesting interpreter stack printout
 vmGc MKSW softRefs: clear100% keepGoal 0, cleared 0 , live 0 nonNil 0
srThresh 0 srUseCnt 1
 vmGc   pom9:8331K pom0:8335K pom1:8335K pom2:8335K pom3:8336K
pom4:8335K pom5:8335K pom6:8335K pom7:8336K pom8:3508K
 vmGc MKSW         467/4248Knew 73558/75000Kold 78527Kpom 1205Kperm
293Kcode 18939Kme agrStubbed 0 122078us  dirtyBytes 0 pomT
hresh 0 almOOM 0 DalmOOM 0 ---)
 vmGc approaching OutOfMemory, requesting interpreter stack printout
 vmGc MKSW softRefs: clear100% keepGoal 0, cleared 0 , live 0 nonNil 0
srThresh 0 srUseCnt 1
 vmGc   pom9:8331K pom0:8335K pom1:8335K pom2:8335K pom3:8336K
pom4:8335K pom5:8335K pom6:8335K pom7:8336K pom8:3508K
 vmGc MKSW         501/4248Knew 74930/75000Kold 78527Kpom 1205Kperm
293Kcode 18939Kme agrStubbed 0 87701us  dirtyBytes 0 pomTh
resh 0 almOOM 0 DalmOOM 0 ---)
 vmGc approaching OutOfMemory, requesting interpreter stack printout

Printing Smalltalk stack for memory usage diagnosis:
Smalltalk stack:
Smalltalk stack at [06/01/2012 03:11:35 PM.865 MDT]:
    iS->ARStackPtr = 0x2ab6d8fc3220, offset from base = 68
1 = TOP OF STACK,   stackDepth = 24

1  KeyValueDictionary >> rebuildTable: @IP 71  [GsMethod 2267393]
   11: 268 (OOP_TRUE)
   10: 10 (SmallInteger 1)
   9: 14234914 (SmallInteger 1779364)
   8: 18 (SmallInteger 2)
   7: 10 (SmallInteger 1) <--framePtr=0x2ab6d8fc3200 AR[64]
  VC at 0x2ab6d3905120   VC.unwindBlock= 20 (OOP_NIL)  VC.serialNum=
189205671545454666 (SmallInteger 23650708943181833)
   6: 189205671545454666 (SmallInteger 23650708943181833)
   5: 20 (OOP_NIL)
   4: 8163034 (SmallInteger 1020379)
   3: 14234914 (SmallInteger 1779364)
   2: 0x2ab6d49437c0 (cls:66817 Array) size:1779364)
   1: 8163034 (SmallInteger 1020379)
rcvr: 0x2ab6cd4c2000 oid:12494281217 (cls:79361 KeyValueDictionary)
size:2040762)  [framePtr=0x2ab6d8fc3200 AR[64]]

2  KeyValueDictionary >> at:put: @IP 129  [GsMethod 2270721]
   8: 20 (OOP_NIL)
   7: 10 (SmallInteger 1)
   6: 0x2ab6d3904ea8 (cls:114433 CollisionBucket) size:2)
   5: 0x2ab6d3905090 oid:16937245697 (cls:14858866689 CCMSModel) size:15)
   4: 0x2ab6d3905090 oid:16937245697 (cls:14858866689 CCMSModel) size:15)
   3: 3308674 (SmallInteger 413584)
   2: 0x2ab6d3904f88 oid:42245726721 (cls:14858866689 CCMSModel) size:15)
   1: 0x2ab6d3904f88 oid:42245726721 (cls:14858866689 CCMSModel) size:15)
rcvr: 0x2ab6cd4c2000 oid:12494281217 (cls:79361 KeyValueDictionary)
size:2040762) <--framePtr=0x2ab6d8fc31b8 AR[55]

3  Set >> add: @IP 54  [GsMethod 9650689]
   3: 268 (OOP_TRUE)
   2: 0x2ab6d7276438 oid:72193 (cls:206081 Object (C) ) size:19)
   1: 0x2ab6d3904f88 oid:42245726721 (cls:14858866689 CCMSModel) size:15)
rcvr: 0x2ab6d3551cd0 oid:12494280961 (cls:5680361985 CCMSModelSet)
size:5) <--framePtr=0x2ab6d8fc3198 AR[51]

Re: Adding a large number of objects to a Set leads to a

Dale Henrichs
If the dictionary that is being rebuilt is persistent, then you could add an AlmostOutOfMemory handler that commits whenever you start running out of memory (MCPlatformSupport class>>commitOnAlmostOutOfMemoryDuring:) ... this is better than doing commits every so often ... you get a commit before you run out of memory ... if you look at the method you can see that the thresholds are tunable ...

With all of that said, I think you can pre-size the dictionary (with rebuildTable: and an AlmostOutOfMemory handler) to the final size ... then you'll take the one hit to make it big enough to fit, and the rest of the commits should go smoothly ...
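
A sketch of that approach (assuming rebuildTable: takes the desired table size, as the stack trace suggests; 'bigDictionary' and 'allSourceSets' are illustrative names):

  MCPlatformSupport commitOnAlmostOutOfMemoryDuring: [
      "take the one big rebuild hit up front, with the handler armed"
      bigDictionary rebuildTable: 3000000.
      allSourceSets do: [:each |
          each do: [:obj | bigDictionary at: obj put: obj]]].
  System commitTransaction.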

Dale


Re: Adding a large number of objects to a Set leads to a

Thelliez
How do you use large sets in practice?

If I predefine a large set (Set new: 3000000), it seems to work
better. However I see the following problems:
1- The extent grows very fast. I just saw it grow 10GB for 1M
objects. That's 10KB per object, and I am not sure why it's so big.
The objects already exist; they are just added to the set.
2- If I reach the initial size (for example 3M), will the problem be
even worse?  I will need a huge amount of memory to grow it to 4M for
example.

What am I missing?
Thierry

Re: Adding a large number of objects to a Set leads to a

Dale Henrichs
Oh sorry I didn't see that you were actually using a Set ... yeah not sure if there's a way to pre-grow that dictionary ...

However, if you are using large sets you should consider using an IdentitySet instead of Set ... the IdentitySet will give you significant performance advantages for doing things like #includes: ... the IdentitySets are not built on top of KeyValueDictionary so you can bypass the whole dictionary rebuild problem ... IdentitySets use internal tree structures indexed by oop and they're smokin' fast...

Dale


Re: Adding a large number of objects to a Set leads to a

Jon Paynter-2
Thierry,

This sounds familiar....
If the objects you are adding to the set already exist, then GemStone is probably reading each object into the gem's memory as it gets added to the set, and eventually the gem runs out of memory.

How are you getting at the objects to add to the set?  Are they being copied from some other collection?

You may want to try a different kind of Set.  We use instances of IdentitySet here to hold large collections of domain objects (3M to 5M).  Although if you need the equality test (=) of a Set vs. the identity test (==) of an IdentitySet, then it may not work for you.

And like Dale said - they are MUCH faster
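
For example, a minimal sketch of the switch ('allSourceSets' and 'aModel' are illustrative names):

  | models |
  models := IdentitySet new: 3000000.
  allSourceSets do: [:each | models addAll: each].
  System commitTransaction.
  "membership is an identity (==) test, so no CollisionBucket rebuild is involved"
  models includes: aModel.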


Re: Adding a large number of objects to a Set leads to a

James Foster-8

Thierry,

As others have mentioned, an IdentitySet is more efficient than a Set (though different behavior), but I believe that what you are seeing is a result of garbage being created. If you log all the sessions out and then log in a single session and do a full mark for collection (MFC), then when the reclaim finishes I expect you will have a lot of free space in your repository. A typical cause of repository growth is having an idle session logged in and generating a commit record backlog (CRB). While you should avoid very large transactions, having transactions that are too small also has overhead since there is a minimum amount of record-keeping that is required for each one.

To get an idea of how much space a Set (or IdentitySet) requires, predefine your large set, set each key and value to an immediate object (e.g., nil, a Boolean, a SmallInteger, a SmallFloat, a Character), and then send it #'physicalSize'.
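
A rough sketch of that measurement (SmallIntegers are immediate objects, so this should approximate the storage cost of the set's own structure rather than its elements):

  | s |
  s := IdentitySet new: 3000000.
  1 to: 3000000 do: [:i | s add: i].
  s physicalSize.   "storage used by the set itself"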

James


Re: Adding a large number of objects to a Set leads to a

Thelliez
Just to follow up on this thread.  I switched to using an IdentitySet
(plus an equality index) and that did indeed work better.  Thanks.
Thierry