Bug with long recursion

MrGwen
Hi,

I've investigated the long recursion bug a bit. To debug it I chose a
simple recursive method:

ObjectMemory growTo: 100 * 1024 * 1024. " This is not needed "
Object compile: 'foo: i [ i > 1 ifTrue: [ self foo: i - 1 ] ifFalse: [
#ici printNl ] ]'.
Object new foo: 1000000

If you run it, it crashes. First you need to apply the mark patch, but
it still crashes. Fortunately, it reports that the OOP was free when
trying to copy it. What was strange is that the copy was called from
_gst_copy_registered_oops, so it seems that one or more registered OOPs
(in fact more :s) are being freed.

#0  0x00007ffff6ea8d05 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007ffff6eacab6 in abort () at abort.c:92
#2  0x00007ffff7aea9ed in _gst_copy_an_oop (oop=0x2b5be31a6030) at oop.c:2099
#3  0x00007ffff7b27642 in _gst_copy_registered_oops () at callin.c:1124
#4  0x00007ffff7aea1c6 in copy_oops () at oop.c:1800
#5  0x00007ffff7ae9325 in _gst_scavenge () at oop.c:1265
...

To find out where it happens, I added a simple function, called from
_gst_sweep_oop, that aborts when the OOP is registered:

void
_gst_check_reg (OOP oop)
{
   rb_node_t *node;
   rb_traverse_t t;

   /* Walk the OOP registry.  This is incomplete: registered OOP ranges
      should be scanned as well.  */
   for (node = rb_first(&(oop_registry_root->rb), &t);
        node; node = rb_next(&t))
     {
       oop_registry *k = (oop_registry *) node;
       if (k->oop == oop)
         {
           fprintf (stderr, "%O\n", oop);
           abort ();
         }
     }
}

The problem occurs in reset_incremental_gc, when _gst_sweep_oop is
called (line 1451).

#2  0x00007ffff7b275f6 in _gst_check_reg (oop=0x2aac614ec890) at callin.c:1108
#3  0x00007ffff7ae9943 in _gst_sweep_oop (oop=0x2aac614ec890) at oop.c:1480
#4  0x00007ffff7ae9827 in reset_incremental_gc (firstOOP=0x2aac614e0030) at oop.c:1451
#5  0x00007ffff7ae8c69 in _gst_global_gc (next_allocation=0) at oop.c:1150
#6  0x00007ffff7ae9246 in _gst_scavenge () at oop.c:1235
...

More interesting still: I deactivated the incremental GC and the test
works. I don't have a solution, but I hope this little investigation
helps us remove that "annoying" bug :-)

Cheers,
Gwen


_______________________________________________
help-smalltalk mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-smalltalk

Re: Bug with long recursion

Paolo Bonzini-2
On 07/26/2011 12:58 AM, Gwenael Casaccio wrote:
> Hi,
>
> I've investigated the long recursion bug a bit. To debug it I chose a
> simple recursive method:
>
> ObjectMemory growTo: 100 * 1024 * 1024. " This is not needed "
> Object compile: 'foo: i [ i > 1 ifTrue: [ self foo: i - 1 ] ifFalse: [
> #ici printNl ] ]'.
> Object new foo: 1000000

3.2 works:

st> Object compile: 'foo: i [ i > 1 ifTrue: [ self foo: i - 1 ] ifFalse:
[ #ici printNl ] ]'.
Object>>foo:
st> Object new foo: 1000000
"Global garbage collection... done, heap grown"
"Global garbage collection... done, heap grown"
...
"Global garbage collection... done, heap grown"
"Global garbage collection... done, heap grown"
#ici
an Object
st> Object new foo: 10000000
"Global garbage collection... done, heap compacted"
"Global garbage collection... done, heap grown"
"Global garbage collection... done, heap grown"
...
"Global garbage collection... done, heap grown"
"Global garbage collection... done, heap grown"
gst: out of memory allocating 262144 bytes


Paolo

Re: Bug with long recursion

Paolo Bonzini-2
On 07/27/2011 11:42 AM, Paolo Bonzini wrote:

>>
>> I've investigated the long recursion bug a bit. To debug it I chose a
>> simple recursive method:
>>
>> ObjectMemory growTo: 100 * 1024 * 1024. " This is not needed "
>> Object compile: 'foo: i [ i > 1 ifTrue: [ self foo: i - 1 ] ifFalse: [
>> #ici printNl ] ]'.
>> Object new foo: 1000000
>
> 3.2 works:

It's actually latent there.  Like all GC bugs, it's of the "how the heck
can it work at all" kind.

The OOP that is being freed incorrectly is the Smalltalk dictionary,
nothing less.  The incorrect free is due to the OOP being swept twice,
the first time during the search for the first unallocated OOP, the
second while searching for the last allocated OOP.  I haven't really
analyzed how it happens (no time; bisection points at an unrelated
commit), but the fix is "obvious": just use IS_OOP_VALID rather than
IS_OOP_VALID_GC in reset_incremental_gc.
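
To illustrate why a second sweep is fatal, here is a toy model of the two scans. It is not libgst code: the flag bits, the toy_* names, and the behaviour of the sweep (a marked slot survives and loses its mark, an unmarked allocated slot is freed) are all assumptions made for the sketch, and the real macros may test different bits. Under those assumptions, a mark-based validity test in the downward scan no longer recognises slots that the upward scan has already swept, so it sweeps them again and frees them; an allocation-based test stops at the first live slot.

```c
#include <stdbool.h>

/* Hypothetical flag bits; libgst's real flag layout is different.  */
enum { TOY_ALLOCATED = 1, TOY_MARKED = 2 };
enum { TOY_SLOTS = 4 };

typedef struct { unsigned flags; } toy_oop;

/* Allocation-based test, standing in for IS_OOP_VALID.  */
static bool toy_valid (const toy_oop *o)
{
  return (o->flags & TOY_ALLOCATED) != 0;
}

/* Mark-based test, standing in for IS_OOP_VALID_GC (an assumption).  */
static bool toy_valid_gc (const toy_oop *o)
{
  return (o->flags & TOY_MARKED) != 0;
}

/* Sweep one slot: a marked slot survives and loses its mark,
   an unmarked but allocated slot is freed.  */
static void toy_sweep (toy_oop *o)
{
  if (!(o->flags & TOY_ALLOCATED))
    return;
  if (o->flags & TOY_MARKED)
    o->flags &= ~TOY_MARKED;
  else
    o->flags = 0;
}

/* Run both scans over a table of reachable objects and return how
   many slots are still allocated afterwards.  */
int toy_reset (bool use_mark_test)
{
  toy_oop table[TOY_SLOTS];
  int lo, hi, live = 0;

  for (lo = 0; lo < TOY_SLOTS; lo++)
    table[lo].flags = TOY_ALLOCATED | TOY_MARKED;

  /* Upward scan: sweep until the first unallocated slot.  */
  for (lo = 0; lo < TOY_SLOTS && toy_valid (&table[lo]); lo++)
    toy_sweep (&table[lo]);

  /* Downward scan: skip (and sweep) high slots that look invalid.  */
  for (hi = TOY_SLOTS - 1; hi >= 0; hi--)
    {
      bool valid = use_mark_test ? toy_valid_gc (&table[hi])
                                 : toy_valid (&table[hi]);
      if (valid)
        break;
      toy_sweep (&table[hi]);
    }

  for (lo = 0; lo < TOY_SLOTS; lo++)
    if (toy_valid (&table[lo]))
      live++;
  return live;
}
```

In this model, toy_reset (true) leaves no slot allocated even though every object was reachable, while toy_reset (false) keeps all four.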

Gwen suggested offlist assigning _gst_mem.next_oop_to_sweep outside the
#ifdef.  That is not enough, but it put me on the right track.  Thanks!

Here is the patch:

diff --git a/libgst/oop.c b/libgst/oop.c
index baa92e6..567cdb1 100644
--- a/libgst/oop.c
+++ b/libgst/oop.c
@@ -1402,19 +1402,21 @@ reset_incremental_gc (OOP firstOOP)

    _gst_mem.first_allocated_oop = oop;

-#ifdef NO_INCREMENTAL_GC
    _gst_mem.next_oop_to_sweep = _gst_mem.last_allocated_oop;
+  _gst_mem.last_swept_oop = oop - 1;
+
+#ifdef NO_INCREMENTAL_GC
    _gst_finish_incremental_gc ();
  #else
    /* Skip high OOPs that are unallocated.  */
-  for (oop = _gst_mem.last_allocated_oop; !IS_OOP_VALID_GC (oop); oop--)
+  for (oop = _gst_mem.last_allocated_oop; !IS_OOP_VALID (oop); oop--)
      _gst_sweep_oop (oop);

    _gst_mem.last_allocated_oop = oop;
    _gst_mem.next_oop_to_sweep = oop;
  #endif

-  _gst_mem.last_swept_oop = _gst_mem.first_allocated_oop - 1;
    _gst_mem.num_free_oops = _gst_mem.ot_size -
      (_gst_mem.last_allocated_oop - _gst_mem.ot);

Paolo