Strategy to finding memleaks

Holger Freyther
Hi Paolo,

my Postgres code is now working as it should. To test it I let it run in a
loop, and I see that memory usage is increasing. My first try was to use
ObjectMemory to force a global garbage collection, but that did not change
anything; I will try compacting the heap as well.

Do you have any strategy for finding memory leaks? How does Eden work? If an
object is moved there, will it never be GCed?

Any hints?
        holger

_______________________________________________
help-smalltalk mailing list
[hidden email]
http://lists.gnu.org/mailman/listinfo/help-smalltalk
Re: Strategy to finding memleaks

Paolo Bonzini-2
On 04/09/2011 09:55 AM, Holger Hans Peter Freyther wrote:
> Hi Paolo,
>
> my Postgres code is now working as it should. To test it I let it run in a
> loop, and I see that memory usage is increasing. My first try was to use
> ObjectMemory to force a global garbage collection, but that did not change
> anything; I will try compacting the heap as well.
>
> Do you have any strategy for finding memory leaks? How does Eden work? If an
> object is moved there, will it never be GCed?

No, absolutely not!  Objects are always GCed.  Eden is where objects are
_born_, and as long as they are still in the eden, reclaiming them is
particularly efficient.

To find memory leaks, I suggest you use instanceCount to find classes
with a very large number of instances.  From there, finding the owners
(which may not have a very large number of instances, but are the roots
of the leak) is usually easy.  Possibly you can do something like

     x := SomeClass someInstance.
     [o := x allOwners. o isEmpty] whileTrue: [ x := x nextInstance ].
     (o collect: [ :each |each class]) asBag

and recurse on the classes shown in the bag.

Usually, saving the image and doing the above in the REPL will also be
useful to find leaks.

I believe the archives have some information on finding leaks related to
Iliad (from July 2009, I think).

Paolo

Re: Strategy to finding memleaks

Holger Freyther
On 04/09/2011 10:24 AM, Paolo Bonzini wrote:
> On 04/09/2011 09:55 AM, Holger Hans Peter Freyther wrote:
>> Hi Paolo,

>
>     x := SomeClass someInstance.
>     [o := x allOwners. o isEmpty] whileTrue: [ x := x nextInstance ].
>     (o collect: [ :each |each class]) asBag
>


Hi,
thanks for your fast reply. I was running this test code:


    [
        [top select: 'SELECT * FROM version()'] repeat
    ] ensure: [
        ObjectMemory compact.
        DBI.Statement allSubinstancesDo: [:each | each class printNl].
        DBI.ResultSet allSubinstancesDo: [:each | each class printNl].
    ].

I then interrupt it with CTRL+C, and it appears that after the GC many
ResultSets are still in the image. I am now going to use your trick to find
the owners.

thanks
        holger

Re: Strategy to finding memleaks

Holger Freyther
On 04/09/2011 10:27 AM, Holger Hans Peter Freyther wrote:

> I then interrupt it with CTRL+C, and it appears that after the GC many
> ResultSets are still in the image. I am now going to use your trick to find
> the owners.

Hmm. Do you think it would be possible to add a primitive that creates a dot
file showing the whole object graph?

E.g. I see many objects that have no owners (only the local variable that I
just assigned them to) but they are not recycled. Some still seem to be in
the finalization list of something.

Should ObjectMemory compact clean up all of these objects?

Re: Strategy to finding memleaks

Holger Freyther
On 04/09/2011 11:05 AM, Holger Hans Peter Freyther wrote:
> On 04/09/2011 10:27 AM, Holger Hans Peter Freyther wrote:
>

>
> Should ObjectMemory compact clean up all of these objects?

Okay, with "ObjectMemory globalGarbageCollect; compact" all these objects are
removed. But without these calls the VM looks like it will go OOM, as it is
finalizing these objects but not removing them from memory.

Does this sound plausible? In my naive thinking, the VM should be able to
recycle the allocations it made for the PGResultSet (same size), and the
memory size should never increase once it has hit its 'working set'.


Re: Strategy to finding memleaks

Paolo Bonzini-2
On Sat, Apr 9, 2011 at 12:14, Holger Hans Peter Freyther
<[hidden email]> wrote:
> Okay, with "ObjectMemory globalGarbageCollect; compact" all these objects are
> removed. But without these calls the VM looks like it will go OOM, as it is
> finalizing these objects but not removing them from memory.

That's strange but possible. Do you do anything except "run a lot of queries"?

Paolo

Re: Strategy to finding memleaks

Holger Freyther
On 04/09/2011 01:37 PM, Paolo Bonzini wrote:
> On Sat, Apr 9, 2011 at 12:14, Holger Hans Peter Freyther
> <[hidden email]> wrote:
>> Okay, with "ObjectMemory globalGarbageCollect; compact" all these objects are
>> removed. But without these calls the VM looks like it will go OOM, as it is
>> finalizing these objects but not removing them from memory.
>
> That's strange but possible. Do you do anything except "run a lot of queries"?

I just execute the script below. In the case where I found this I was using
the result to determine how many rows were affected (to see if the update was
successful or failed).

Do you have a hint of where I could look in oop.c? In your mental model what
should happen?



Eval [
    | top |

    PackageLoader fileInPackage: 'DBD-PostgreSQL'.

    top := DBI.Connection
            connect: 'dbi:PostgreSQL:dbname=DB;hostname=localhost'
            user: 'USER' password: 'PW'.

    [
        "Run the same query until interrupted."
        [top select: 'SELECT * FROM version()'] repeat
    ] ensure: [
        | ops o |
        ObjectMemory globalGarbageCollect.
        ObjectMemory compact.

        "Walk the PGResultSet instances that survived the GC and print
         the owners of those that have at least two."
        ops := DBI.PostgreSQL.PGResultSet someInstance.
        [ops isNil] whileFalse: [
            o := ops allOwners.
            o isEmpty printNl.
            o size < 2 ifFalse: [
                [o printNl] on: Error do: [:e | 'Already finalized?' printNl]].
            ops := ops nextInstance].
    ].
]


Re: Strategy to finding memleaks

Paolo Bonzini-2
On Sat, Apr 9, 2011 at 13:49, Holger Hans Peter Freyther
<[hidden email]> wrote:

> On 04/09/2011 01:37 PM, Paolo Bonzini wrote:
>> On Sat, Apr 9, 2011 at 12:14, Holger Hans Peter Freyther
>> <[hidden email]> wrote:
>>> Okay, with "ObjectMemory globalGarbageCollect; compact" all these objects are
>>> removed. But without these calls the VM looks like it will go OOM, as it is
>>> finalizing these objects but not removing them from memory.
>>
>> That's strange but possible. Do you do anything except "run a lot of queries"?
>
> I just execute the script below. In the case where I found this I was using
> the result to determine how many rows were affected (to see if the update was
> successful or failed).
>
> Do you have a hint of where I could look in oop.c? In your mental model what
> should happen?

After some time the first GC would happen and the result sets should
be gathered into an array and finalized. Then the second GC would
happen and the result sets would be collected.

You can check: 1) if the finalizers are run; 2) who the owners are
after #finalize; 3) who the owners are for the array of objects to be
finalized, after its processing has ended.
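
The sequence described above can be sketched with a toy model in C. This is
not libgst code: toy_obj, toy_gc, toy_run_finalizers and the state names are
hypothetical and exist only for illustration. The sketch shows why an
unreachable object with a finalizer survives the first collection (it is
queued for finalization, which keeps it alive) and is only reclaimed by a
collection that happens after its finalizer has run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; none of these names exist in libgst. */
enum toy_state { LIVE, QUEUED, FINALIZED, RECLAIMED };

struct toy_obj {
    bool reachable;       /* still referenced by the program? */
    bool has_finalizer;   /* registered for finalization? */
    enum toy_state state;
};

/* One GC pass.  An unreachable object with a pending finalizer is not
   freed: it is moved to the "to be finalized" queue, which keeps it
   alive.  Other unreachable objects are reclaimed.  Returns the number
   of objects reclaimed by this pass. */
size_t toy_gc(struct toy_obj *objs, size_t n)
{
    size_t reclaimed = 0;
    assert(objs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        struct toy_obj *o = &objs[i];
        if (o->reachable || o->state == RECLAIMED || o->state == QUEUED)
            continue;
        if (o->has_finalizer && o->state == LIVE) {
            o->state = QUEUED;
        } else {
            o->state = RECLAIMED;
            reclaimed++;
        }
    }
    return reclaimed;
}

/* Runs the queued finalizers and returns how many ran.  In this model
   that is a separate step that happens some time after the GC pass. */
size_t toy_run_finalizers(struct toy_obj *objs, size_t n)
{
    size_t ran = 0;
    for (size_t i = 0; i < n; i++)
        if (objs[i].state == QUEUED) {
            objs[i].state = FINALIZED;
            ran++;
        }
    return ran;
}

/* Drives the model with three unreachable, finalizable objects. */
void toy_demo(size_t out[4])
{
    struct toy_obj objs[3] = {
        { false, true, LIVE }, { false, true, LIVE }, { false, true, LIVE }
    };
    out[0] = toy_gc(objs, 3);             /* first GC: only queues */
    out[1] = toy_gc(objs, 3);             /* GC before the finalizers ran */
    out[2] = toy_run_finalizers(objs, 3); /* finalizers run */
    out[3] = toy_gc(objs, 3);             /* a later GC reclaims */
}
```

In this model, a collection that happens before the finalizers have run
reclaims nothing. If allocation outpaces finalization, queued objects can
pile up; that would be consistent with the growth reported in this thread,
although nothing here confirms it as the cause.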

You have a talent for writing testcases, anyway!

Paolo

Re: Strategy to finding memleaks

Holger Freyther
On 04/09/2011 03:33 PM, Paolo Bonzini wrote:

> After some time the first GC would happen and the result sets should
> be gathered into an array and finalized. Then the second GC would
> happen and the result sets would be collected.
>
> You can check: 1) if the finalizers are run; 2) who the owners are
> after #finalize; 3) who the owners are for the array of objects to be
> finalized, after its processing has ended.
>
> You have a talent for writing testcases, anyway!

So far I have failed to reproduce this with a standalone testcase. In the
real one we have at most 5000 instances of DBI.PostgreSQL.PGResultSet. I
think that is too many, but it is not a bug. So hypotheses one and two are
memory fragmentation or a leak in the Postgres calls.

Did you ever attempt to build the C code with the Boehm GC to detect leaks?

Re: Strategy to finding memleaks

Paolo Bonzini-2
On Sat, Apr 9, 2011 at 19:04, Holger Hans Peter Freyther
<[hidden email]> wrote:

> On 04/09/2011 03:33 PM, Paolo Bonzini wrote:
>
>> After some time the first GC would happen and the result sets should
>> be gathered into an array and finalized. Then the second GC would
>> happen and the result sets would be collected.
>>
>> You can check: 1) if the finalizers are run; 2) who the owners are
>> after #finalize; 3) who the owners are for the array of objects to be
>> finalized, after its processing has ended.
>>
>> You have a talent for writing testcases, anyway!
>
> So far I have failed to reproduce this with a standalone testcase. In the
> real one we have at most 5000 instances of DBI.PostgreSQL.PGResultSet. I
> think that is too many, but it is not a bug. So hypotheses one and two are
> memory fragmentation or a leak in the Postgres calls.

Leaks in C code should be "obvious" by comparing ObjectMemory values
with those from top(1).

Also they wouldn't be fixed by Smalltalk GC. :)

Memory fragmentation can be fixed by "ObjectMemory compact" without a
previous GC.

Paolo

Re: Strategy to finding memleaks

Holger Freyther
On 04/09/2011 08:15 PM, Paolo Bonzini wrote:

> Leaks in C code should be "obvious" by comparing ObjectMemory values
> with those from top(1).
>
> Also they wouldn't be fixed by Smalltalk GC. :)
>
> Memory fragmentation can be fixed by "ObjectMemory compact" without a
> previous GC.

I will keep digging, but to rule out Postgres memory leaks I created a
minimal stub library and changed the libname in package.xml.


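/* Stub replacement for libpq, built with: gcc -shared -o libzpq.so */

/* Fake connection handle; any non-NULL value will do. */
void *PQconnectdb()
{
        return (void *) 24;
}

/* Fake result handle. */
void *PQexec()
{
        return (void *) 23;
}

void PQclear()
{
        return;
}

/* 0 is CONNECTION_OK in libpq. */
int PQstatus()
{
        return 0;
}

/* 2 is PGRES_TUPLES_OK in libpq. */
int PQresultStatus()
{
        return 2;
}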

the Smalltalk code is this:

top := DBI.Connection
         connect: 'dbi:PostgreSQL:dbname=w;hostname=b'
         user: 'd' password: 'c'.

m := 0.
1 to: 10000000 do: [:each |
    (top select: 'SELECT * FROM topups') release.

    "Every 100 iterations, print the instance count if it is at least
     the maximum seen so far."
    (each \\ 100) = 0 ifTrue: [
        | inst |
        inst := DBI.PostgreSQL.PGResultSet allInstances size.
        inst >= m ifTrue: [m := inst. inst printNl]]
]


So somehow, if I call release on the ResultSet, it removes itself from the
finalizer list and the objects are GCed faster. Do you remember the reason why
a new process is started to finalize objects instead of doing it within the
process that waits for the Semaphore?



Re: Strategy to finding memleaks

Holger Freyther
On 04/09/2011 09:29 PM, Holger Hans Peter Freyther wrote:
>

Hi Paolo,

This seems a bit weird: in one place we compute allocPtr + BYTES_TO.. and in
the other we just do allocPtr + size. I am not sure about the difference, but
it looks wrong. This also raises the question of when, or if, the eden is
shrunk. I will try to find the answer to that now.

@@ -762,13 +762,16 @@ _gst_alloc_obj (size_t size,
      GC, so we use a local var to hold its new value */
   newAllocPtr = _gst_mem.eden.allocPtr + BYTES_TO_SIZE (size);

  if UNCOMMON (size >= _gst_mem.big_object_threshold)


   if UNCOMMON (newAllocPtr >= _gst_mem.eden.maxPtr)
     {
       _gst_scavenge ();
-      newAllocPtr = _gst_mem.eden.allocPtr + size;
+      newAllocPtr = _gst_mem.eden.allocPtr + BYTES_TO_SIZE(size);
     }
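
To see why the two expressions in the hunk above differ, here is a small
standalone sketch in C. It assumes, as the patched line suggests, that
allocPtr points to word-sized slots and that BYTES_TO_SIZE converts a byte
count into a number of such slots. word_t, bytes_to_words and the two
advance_* functions are hypothetical stand-ins, not the libgst definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; libgst's real types and macros may differ. */
typedef void *word_t;

/* Convert a byte count into a count of word-sized slots. */
size_t bytes_to_words(size_t bytes)
{
    assert(bytes % sizeof(word_t) == 0);
    return bytes / sizeof(word_t);
}

/* Bytes covered when a word pointer is advanced by `size` used directly
   as a slot count (the unpatched line).  The caller must pass a buffer
   with at least `size` slots. */
size_t advance_unconverted(word_t *alloc_ptr, size_t size)
{
    /* Pointer arithmetic scales the offset by sizeof(word_t). */
    word_t *next = alloc_ptr + size;
    return (size_t) ((char *) next - (char *) alloc_ptr);
}

/* Bytes covered after converting the byte count to slots first (the
   patched line). */
size_t advance_converted(word_t *alloc_ptr, size_t size)
{
    word_t *next = alloc_ptr + bytes_to_words(size);
    return (size_t) ((char *) next - (char *) alloc_ptr);
}
```

Under these assumptions, a 32-byte request moves the pointer by 32 bytes in
the converted form and by 32 * sizeof(word_t) bytes in the unconverted form.
Whether this matters in _gst_alloc_obj depends on how newAllocPtr is used
afterwards, which the hunk does not show.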


Re: Strategy to finding memleaks

Holger Freyther
On 04/09/2011 10:41 PM, Holger Hans Peter Freyther wrote:
> On 04/09/2011 09:29 PM, Holger Hans Peter Freyther wrote:
>>
>
> Hi Paolo,


This does not seem to be well documented. If I have a
<cCall ... args: #(#self #string)..>

how is the string converted? Will it be deleted after the call? Do we have
types that can specify one or the other?

current test code (with my stub lib)

    | o s|
    PackageLoader fileInPackage: 'DBD-PostgreSQL'.

    o := DBI.PostgreSQL.PQConnection address: 24.
    s := 'SELECT * FROM version()'.

    1 to: 10000000 do: [:each |
        o exec: s.]

Re: Strategy to finding memleaks

Holger Freyther
On 04/10/2011 12:09 AM, Holger Hans Peter Freyther wrote:

Hi,
I think this is what we want to do. There are some C APIs that rely on
owning the string passed to them; would we need a new keyword for these?
gst-browser still starts up after this change.


diff --git a/libgst/cint.c b/libgst/cint.c
index 061a829..9e0363a 100644
--- a/libgst/cint.c
+++ b/libgst/cint.c
@@ -823,6 +823,10 @@ _gst_invoke_croutine (OOP cFuncOOP,
         case CDATA_WSTRING_OUT:
         case CDATA_STRING_OUT:
         case CDATA_BYTEARRAY_OUT:
+        case CDATA_STRING:
+        case CDATA_BYTEARRAY:
+        case CDATA_SYMBOL:
+        case CDATA_WSTRING:
          needPostprocessing = true;
          /* fall through */


Re: Strategy to finding memleaks

Paolo Bonzini-2
On 04/09/2011 09:29 PM, Holger Hans Peter Freyther wrote:
> So somehow, if I call release on the ResultSet, it removes itself from the
> finalizer list and the objects are GCed faster. Do you remember the reason why
> a new process is started to finalize objects instead of doing it within the
> process that waits for the Semaphore?

Yes, otherwise it doesn't work in case two GCs happen during a
finalization round.

Do you think the priority should be bumped, but with finalizers done in
batches every say 100 ms?

Paolo

Re: Strategy to finding memleaks

Paolo Bonzini-2
On 04/10/2011 12:09 AM, Holger Hans Peter Freyther wrote:
> This does not seem to be well documented. If I have a
> <cCall ... args: #(#self #string)..>
>
> how is the string converted?

It is NUL-terminated, but the whole data from the object is passed.

> Will it be deleted after the call?

Yes.

> Do we have types that can specify one or the other?

No.  You can use #asCData and #cObject to transfer ownership.
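
The answers above describe a copy-in, call, free pattern. A minimal sketch of
that pattern in plain C follows. It is not the libgst implementation, and
call_with_string_copy, string_fn and count_until_nul are hypothetical names.
The object's bytes are copied into a temporary buffer with a terminating NUL,
the C function is called, and the buffer is freed afterwards, which is why
the callee must not keep the pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callee type: receives a NUL-terminated C string. */
typedef size_t (*string_fn)(const char *s);

/* Copy `len` bytes of object data into a fresh NUL-terminated buffer,
   call `fn` on it, then free the buffer.  Returns fn's result, or
   (size_t) -1 if the allocation fails. */
size_t call_with_string_copy(const char *data, size_t len, string_fn fn)
{
    assert(data != NULL && fn != NULL);

    char *copy = malloc(len + 1);
    if (copy == NULL)
        return (size_t) -1;
    memcpy(copy, data, len);   /* whole object data, embedded NULs included */
    copy[len] = '\0';          /* terminator added for the C side */

    size_t result = fn(copy);

    free(copy);                /* skipping this would leak len + 1 bytes per call */
    return result;
}

/* Example callee: strlen stops at the first NUL byte. */
size_t count_until_nul(const char *s)
{
    return strlen(s);
}
```

Holger's cint.c patch earlier in the thread appears to concern the last step:
it marks the plain string, byte array, symbol and wide-string argument types
as needing postprocessing, presumably so that their temporary copies are
released as well.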

Paolo

Re: Strategy to finding memleaks

Paolo Bonzini-2
On 04/10/2011 12:28 AM, Holger Hans Peter Freyther wrote:
> On 04/10/2011 12:09 AM, Holger Hans Peter Freyther wrote:
>
> Hi,
> I think this is what we want to do. There are some C APIs that rely on
> owning the string passed to them; would we need a new keyword for
> these?

No, the patch is correct.  Good catch.

Paolo
