Memory profiling


abergel
Hi!

I have been working on a memory profiler for Pharo. People who attended Smalltalks'10 may remember Mariano Coca's presentation.

Unfortunately, I am not satisfied with what we can obtain from the execution of a piece of code in Pharo. The amount of memory consumed by a method _varies significantly_ across executions. I have the impression that memory consumption is a less stable metric than execution time.

For example, I used this as a benchmark:

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
        7 timesRepeat:   [
                5 timesRepeat: [  Smalltalk garbageCollect].
                MemoryProfiler
                        viewProfiling: [
                                20 timesRepeat:  [| view |
                                        5 timesRepeat: [  Smalltalk garbageCollect].
                                        Object new. Object new.
                                        5 timesRepeat: [  Smalltalk garbageCollect].
                                       
                                        view := MOViewRenderer new.
                                        view nodes: (1 to: 5000).
                                        view root applyLayout] ]
                        inPackagesMatching: 'Mondrian-*'
                        ]
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
It opens a simple visualization in Mondrian. I tried to clean up as much as possible before executing the code by running the garbage collector several times.

This piece of code returns 7 profiles.
In one profile, for the method MOBoundedShape>>heightFor:, I have:
Number of Young Allocated Bytes: 42,879,008 bytes
Number of Old Allocated Bytes: 6,000,660 bytes
Total Allocated Bytes: 48,879,668 bytes
Number of Increased Memory Bytes: 0 bytes
Number of Incremental GC's: 1,415
Number of Full GC's: 0
Number of Tenures of Surviving Objects: 164

In another profile I have:
Number of Young Allocated Bytes: -7,857,156 bytes
Number of Old Allocated Bytes: 6,008,872 bytes
Total Allocated Bytes: -1,848,284 bytes
Number of Increased Memory Bytes: 0 bytes
Number of Incremental GC's: 1,373
Number of Full GC's: 0
Number of Tenures of Surviving Objects: 165

In another:
Number of Young Allocated Bytes: -11,956,720 bytes
Number of Old Allocated Bytes: 5,315,216 bytes
Total Allocated Bytes: -6,641,504 bytes
Number of Increased Memory Bytes: 0 bytes
Number of Incremental GC's: 1,354
Number of Full GC's: 0
Number of Tenures of Surviving Objects: 145

And another:
Number of Young Allocated Bytes: 59,132,472 bytes
Number of Old Allocated Bytes: 6,177,772 bytes
Total Allocated Bytes: 65,310,244 bytes
Number of Increased Memory Bytes: 0 bytes
Number of Incremental GC's: 1,361
Number of Full GC's: 0
Number of Tenures of Surviving Objects: 167

The metric "Number of Young Allocated Bytes", "Number of Old Allocated Bytes", "Total Allocated Bytes" are highly unpredictable: i.e., two profiles are likely to be completely different. I am not sure why is this so. I use the standard Pharo1.2 One click, without additional parameter to the VM. Do you think that other platforms and languages behave the same?

However, on the positive side, it seems that the number of incremental GCs and the number of tenured objects may play an important role in estimating memory consumption. These metrics are much more stable.
What do you think about this? Does it make sense to say that the more a method triggers incremental GCs, the more memory it consumes?

What exactly triggers an incremental GC?

Cheers,
Alexandre
--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.

Re: Memory profiling

Adrian Lienhard
Hi Alex,

I assume that your test run always creates the same number of objects with the same sizes. Given that, there doesn't appear to be any reason why the amount of allocated memory should vary between runs. So it is rather the way you measure that makes the difference. I don't know how you obtain your profiles, but if you use the VM GC stats, they are global, and hence other processes may influence the outcome.

Incremental GC is triggered depending on the allocation count. There is a read/write VM parameter to configure this. So the incremental GC count is a rather crude way to measure memory usage, as it does not take object size into account (and it is a global counter)...
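
For reference, something like the following reads that counter and the trigger threshold from the image. This is an untested sketch; I'm assuming the standard Squeak/Pharo vmParameterAt: indices (5 = allocations between incremental GCs, read/write; 9 = incremental GCs since startup) and the SmalltalkImage current accessor:

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
"Untested sketch; assumes the standard Squeak/Pharo VM parameter indices:
 5 = allocations between incremental GCs (read/write),
 9 = incremental GCs since startup (read-only)."
| threshold igcCount |
threshold := SmalltalkImage current vmParameterAt: 5.
igcCount := SmalltalkImage current vmParameterAt: 9.
Transcript
        show: 'allocations between incremental GCs: ', threshold printString; cr;
        show: 'incremental GCs since startup: ', igcCount printString; cr.

"Raising the threshold makes the incremental GC run less often (tune with care):"
SmalltalkImage current vmParameterAt: 5 put: threshold * 2.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=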

As a solution, I suggest taking a look at my DTrace provider for the Squeak/Pharo VM [1]. It lets you gather statistics on questions like: how many instances, of which size and of which class, were created by which process in which methods? In the blog post I give an example that illustrates the use of DTrace for exactly this kind of question.

HTH,
Adrian

[1] http://www.adrian-lienhard.ch/blog?dialog=smalltak-meets-dtrace

 
Re: Memory profiling

Henrik Sperre Johansen
On 24.04.2011 05:32, Alexandre Bergel wrote:

> The metric "Number of Young Allocated Bytes", "Number of Old Allocated Bytes", "Total Allocated Bytes" are highly unpredictable: i.e., two profiles are likely to be completely different. I am not sure why is this so. I use the standard Pharo1.2 One click, without additional parameter to the VM. Do you think that other platforms and languages behave the same?
It's not really surprising: since the young space allocated size is shrunk by GC, a simple "value after block" minus "value before block" computation of how much was actually allocated will give you rather random results.
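
To illustrate, here is an untested sketch of that naive measurement, assuming the standard vmParameterAt: indices (2 = end of young space, 9 = incremental GCs since startup):

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
"Untested sketch of the naive 'after minus before' measurement.
 Assumes the standard VM parameter indices:
 2 = end of young space (grows with allocation, drops at each GC),
 9 = incremental GCs since startup."
| youngBefore youngAfter igcBefore igcAfter |
Smalltalk garbageCollect.
youngBefore := SmalltalkImage current vmParameterAt: 2.
igcBefore := SmalltalkImage current vmParameterAt: 9.
1 to: 200000 do: [:i | Array new: 8].  "allocate lots of transient objects"
youngAfter := SmalltalkImage current vmParameterAt: 2.
igcAfter := SmalltalkImage current vmParameterAt: 9.
Transcript
        show: 'young space delta: ', (youngAfter - youngBefore) printString; cr;
        show: 'incremental GCs in between: ', (igcAfter - igcBefore) printString; cr.
"Each incremental GC in between shrinks young space again, so the delta
 is more or less random and can even be negative despite heavy allocation."
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=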

The number of surviving objects (i.e., those tenured into old space) and the number of tenures, though, is pretty much the same over the different runs; the remaining differences can probably be explained by higher-priority process activity.

> However, on the positive side, it seems that the number of incremental GCs and the number of tenured objects may play an important role in estimating memory consumption. These metrics are much more stable.
> What do you think about this? Does it make sense to say that the more a method triggers incremental GCs, the more memory it consumes?
>
> What exactly triggers an incremental GC?
http://wiki.squeak.org/squeak/1469
"Eden collections are triggered by either running out of eden space, or
allocating more than a certain number of objects (currently 4000) since
the last collection "
Not sure if it's up to date, but I'd guess so.

It's hard to build a profiler for the creation of non-transient objects during a run, since the creation and the tenuring of created objects usually happen at entirely different places (around 200 incremental GCs per tenure in my image).
In other words, you can't really say that a method is "consuming" memory simply because a tenure happened during its execution and old space allocation increased, as some or many of the surviving objects could have been created 200 incremental GCs earlier.

The number of incremental GCs is usually a good indication of how many and how large objects are created during a method's execution though, due to the trigger conditions mentioned above.
In the context of determining overhead due to GC, those are usually the interesting numbers anyway.
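
Concretely, deltas of those counters around a block are much more repeatable than the byte deltas. Again an untested sketch with the same assumed parameter indices (9 = incremental GCs, 11 = tenures of surviving objects):

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
"Untested sketch: take deltas of the incremental GC and tenure counters
 around the code to profile; assumes index 9 = incremental GCs since
 startup and index 11 = tenures of surviving objects since startup."
| igcBefore tenuresBefore igcAfter tenuresAfter |
Smalltalk garbageCollect.
igcBefore := SmalltalkImage current vmParameterAt: 9.
tenuresBefore := SmalltalkImage current vmParameterAt: 11.

"... the code to profile goes here ..."

igcAfter := SmalltalkImage current vmParameterAt: 9.
tenuresAfter := SmalltalkImage current vmParameterAt: 11.
Transcript
        show: 'incremental GCs: ', (igcAfter - igcBefore) printString; cr;
        show: 'tenures: ', (tenuresAfter - tenuresBefore) printString; cr.
"These are still global counters, so keep other activity to a minimum."
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=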

Cheers,
Henry


Re: Memory profiling

abergel
Thanks Adrian and Henrik.
I will check this...

Cheers,
Alexandre


--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.