Hi,

My app is a parser/filter for binary files that produces a bunch of ASCII files.

At the beginning of the parsing, the filtering step involves storing the positions of 32 objects, each second, for a whole day. So that's 32 Arrays with 86400 elements each.

During this step, the memory used by my image grows from 50 MB to ~500 MB. I find that far too large, since I'm pretty sure my arrays are the largest objects I create and they only weigh something like 300 KB.

Profiling the app shows that the footprint of the "old memory" went up by 350 MB, which I'm pretty sure is super bad. Maybe as a consequence, after the parsing is finished, the memory footprint of the image stays at ~500 MB.

What tools do I have to find where precisely the memory usage explodes? For example, is it possible to browse the "old memory" objects to see which ones fail to get GC'ed?

Thanks in advance,

Thomas.
Your calculation seems to be off.

32 * 86400 objects = 2.8 million objects. A shortint = 4 bytes, making 10.6 MB. Everything else (except value objects) is larger.

Stephan
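As a quick check of the arithmetic above, the figures can be evaluated in a workspace. This is only a sketch: it assumes, as Stephan does, 4 bytes per Array slot (a 32-bit image) and ignores object headers.

```smalltalk
| slots bytes |
"32 Arrays of 86400 elements each"
slots := 32 * 86400.              "2764800 slots"
"Assumption: 4 bytes per slot, as on a 32-bit image"
bytes := slots * 4.               "11059200 bytes"
(bytes / (1024 * 1024)) asFloat   "about 10.5 MB for the slots alone"
```

By the same reasoning, one Array of 86400 slots takes 86400 * 4 = 345600 bytes, which is in line with the "something like 300 KB" estimate for a single array. Neither figure includes the objects the slots point to.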
I meant that a single array weighs something like 300 KB, with the 32 of them weighing around 10 MB.

I tried to look closely at the way the memory (as reported by VirtualMachine>>#memoryEnd) was growing, and it follows this pattern:

Is this normal behaviour?

On a side note, the computation of a single epoch (which is done 32*24*3600 times) uses 19 local variables. Not sure it is relevant.
In reply to this post by Thomas Bany
Hi Thomas,
Fixing memory consumption problems is hard, but important: memory-efficient code is automatically faster in the long run as well.

Your issue sounds serious. However, I would start by trying to figure out what is happening at your coding level: somehow you (or something you use) must be holding on to too much memory. Questioning low-level memory management functionality should be the last resort, not the first.

There is SpaceTally, which you could use before and after running part of your code. Once something unexpected survives GC, there is the PointerFinder functionality (Inspector > Explore Pointers) to find what holds onto objects.

But no matter what, it is hard.

If you have some public code that you could share to demonstrate your problem, then we could try to help.

Sven

On 09 Apr 2014, at 12:54, Thomas Bany <[hidden email]> wrote:

> Hi,
>
> My app is a parser/filter for binary files that produces a bunch of ASCII files.
>
> At the beginning of the parsing, the filtering step involves storing the positions of 32 objects, each second, for a whole day. So that's 32 Arrays with 86400 elements each.
>
> During this step, the memory used by my image grows from 50 MB to ~500 MB. I find that far too large, since I'm pretty sure my arrays are the largest objects I create and they only weigh something like 300 KB.
>
> Profiling the app shows that the footprint of the "old memory" went up by 350 MB, which I'm pretty sure is super bad. Maybe as a consequence, after the parsing is finished, the memory footprint of the image stays at ~500 MB.
>
> What tools do I have to find where precisely the memory usage explodes? For example, is it possible to browse the "old memory" objects to see which ones fail to get GC'ed?
>
> Thanks in advance,
>
> Thomas.
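One low-tech way to follow the advice above to measure before and after is to count the surviving instances of a suspect class around the step in question. The sketch below uses Point purely as a stand-in for the application's own position class (which is not shown in this thread), and the comment in the middle marks where the parsing step would run.

```smalltalk
| before after |
"Force a full GC so only reachable objects are counted"
Smalltalk garbageCollect.
before := Point allInstances size.

"... run the parsing/filtering step here ..."

"Collect again: whatever is still counted is being held by something"
Smalltalk garbageCollect.
after := Point allInstances size.
Transcript
    show: 'Instances surviving the step: ', (after - before) printString;
    cr.
```

If the difference stays large after the results are no longer needed, inspecting one of the survivors and exploring its pointers, as suggested above, is the natural next step.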
Thanks!

That's exactly what I was looking for. There is a compare method I don't quite understand, but I think I found what is going on.

I failed to grasp that an Array replies to #sizeInMemory with its own size, without the sizes of the objects it references. A single position object weighs 96 bytes, which makes a whole Array weigh around 8 MB, and the 32 of them around 250 MB.

I'm not sure I can get around that aspect, since the computation is costly and I need its output multiple times.

I will do further testing to see why the memory is not released at the end of the execution.

Thanks again!
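The difference between the shallow size of an Array and the size of what it references can be seen with a small experiment. This is a sketch under assumptions: the three-slot Arrays are hypothetical stand-ins for the real position objects, and the sum only goes one level deep (it does not follow references inside each element, nor detect shared elements).

```smalltalk
| positions shallow deep |
"Stand-in data: one small object per second of the day"
positions := (1 to: 86400) collect: [:second | Array new: 3].

"Shallow size: only the header and the 86400 reference slots"
shallow := positions sizeInMemory.

"One level deeper: add the size of every element the Array points to"
deep := positions
    inject: shallow
    into: [:sum :each | sum + each sizeInMemory].

Transcript
    show: 'shallow: ', shallow printString;
    show: ' one level deep: ', deep printString;
    cr.
```

With the 96 bytes per position reported above, 86400 * 96 = 8294400 bytes per Array, and 32 * 8294400 = 265420800 bytes in total, which matches the estimates of roughly 8 MB and roughly 250 MB.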
On 09 Apr 2014, at 18:01, Thomas Bany <[hidden email]> wrote:

> Thanks!
>
> That's exactly what I was looking for. There is a compare method I don't quite understand, but I think I found what is going on.
>
> I failed to grasp that an Array replies to #sizeInMemory with its own size, without the sizes of the objects it references. A single position object weighs 96 bytes, which makes a whole Array weigh around 8 MB, and the 32 of them around 250 MB.

Yes, #sizeInMemory is confusing. A recursive version would be nice, but also dangerous, as it could get into a loop. Here is something related: http://www.humane-assessment.com/blog/traversal-enabled-pharo-objects/

If you want to use a huge data structure, you have to think carefully about your representations. There are tricks you can use to conserve memory: use more primitive types (SmallIntegers, bit flags, Symbols), use shared instances, use alternatives like ZTimestamp, which is half the size of DateAndTime, or use your own integer time, sparse data structures, and so on - and you can hide these optimisations behind your standard API.

> I'm not sure I can get around that aspect, since the computation is costly and I need its output multiple times.
>
> I will do further testing to see why the memory is not released at the end of the execution.

Good luck.

> Thanks again!

--
Sven Van Caekenberghe
Proudly supporting Pharo
http://pharo.org
http://association.pharo.org
http://consortium.pharo.org
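To illustrate the "more primitive types" suggestion, here is a minimal sketch of one possible compact layout. Everything in it is an assumption made for illustration: that a position is three coordinates, that the coordinates can be scaled to whole numbers small enough to remain SmallIntegers, and the flat x/y/z ordering. It is not taken from the application discussed in this thread.

```smalltalk
| seconds coords index |
seconds := 86400.

"One flat Array: x, y, z for second 1, then x, y, z for second 2, ..."
coords := Array new: seconds * 3.

"Hypothetical helper: slot index for a given second and axis (1 = x, 2 = y, 3 = z)"
index := [:second :axis | (second - 1) * 3 + axis].

"Made-up values, stored as scaled whole numbers"
coords at: (index value: 1 value: 1) put: 1500.
coords at: (index value: 1 value: 2) put: -2250.
coords at: (index value: 1 value: 3) put: 0.

"SmallIntegers are held directly in the slots, so the shallow size is the whole cost"
coords sizeInMemory
```

Because no separate object is allocated per position, the cost is one Array of 3 * 86400 slots per tracked object instead of 86400 individual position objects. Whether the values actually fit in a SmallInteger depends on the units and range chosen, and accessors on a wrapper class could hide this layout behind the existing API, as suggested above.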
Okay, so I have found the Class instance that was holding onto the data. I can now save my image after a parse without tripling its size.

But the VM's memory footprint (according to the Windows Task Manager) does not fall back to normal after a parse. My guess is that once the VM has been allocated more space, it just doesn't give it back. It still bugs me a little, since I can see the memory occupied by the VM drop when it is minimised to the taskbar.

> If you want to use a huge data structure, you have to think carefully about your representations. There are tricks you can use to conserve memory: use more primitive types (SmallIntegers, bit flags, Symbols), use shared instances, use alternatives like ZTimestamp, which is half the size of DateAndTime, or use your own integer time, sparse data structures, and so on - and you can hide these optimisations behind your standard API.

I have a quick question regarding primitive types:

1 sizeInMemory -> 0
1.0 sizeInMemory -> 12

I do use an integer as my timestamp, encapsulated in a class, which also weighs 12 bytes. This makes me think that a Float weighing 12 bytes is an encapsulation of a primitive type I don't know of. If so, is it possible to use not a Float but something like a double?

Thomas.