Debugging Exupery

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Debugging Exupery

Bryce Kampjes

I've just found one of the old bugs that's causing crashes every hour
or so. I'm not sure if it's the causes of all the crashes I can
reproduce with the current slow reproduction or just one of the
crashes. Here's a quick walk through debugging real crashes in
Exupery.

First try running the background compiler to see if it crashes:

      Exupery initialiseExupery.
      Exupery startBackgroundCompilation.

Now just use Squeak as normal until it crashes. It takes about
ten minutes on my machine for it to finish compiling itself.

Load the logs using "explore it":

     ExuperyLogReader new readFile: 'Exupery.log'.

Exupery is a dynamic compiler, it chooses what to compile
based on profiling, so it can be very hard to reproduce
bugs as there's some luck in what's compiled.

Then replay the logs to reproduce the code cache as it
was when it crashed. Execute the following in the explorer
on the logs:

     self compile

Then try to make the crash reproducible by running the
last thing many times over because it looks like a race.
               
      10000 timesRepeat:
        [Exupery
                compileMethod: #visitMov:
                class: InstructionSelector]

Now reduce the amount of code in the code cache to make the bug
more understandable and normally faster to reproduce. The
following expressions are useful on the logs:

     3 to: 4 do: [:each| (self at: each) shouldCompile: false]
     51 to: 100 do: [:each| (self at: each) shouldCompile: false]
     3 to: 4 do: [:each| (self at: each) shouldCompile: true]

     (self at: 3) shouldCompile: false
     (self at: 3) shouldCompile: true

Then refine the reproduction to something easier to debug and
hopefully that can be turned into an SUnit test:

          Exupery initialiseExupery.
          Exupery
                compileMethod: #wait
                class: Delay
               
                10000 timesRepeat:
        [Exupery log: 'Compiling example'. Exupery
                compileMethod: (Exupery
                               definingMethodForSelector: #visitMov:
                               class: InstructionSelector)
                inlining: #()
                receiver: InstructionSelector]

Then debug it using logging and gdb. There's only been a few Exupery
bugs that were debug-gable with the Smalltalk debugger, bugs normally
crash the image.

In this case it's curious that the bug is triggered by compiling
Delay>>wait, the first method that the background compiler normally
compiles as often most of the time is spent waiting in the idle loop.

One bug is Exupery isn't checking if the context it's returning into
has already returned, it's PC is nil, and is crashing in Exupery's
return code when it fails to calculate the address to place it's
result in as the stack pointer is nil when it's expecting a very
small integer.

I'm not sure if that's the only bug that the reproduction above
is producing, only in one case was the crash at a time when it was
definitely due to only trying to return into a dead context. Most
crashes are however in the return code, and the context being returned
to isn't good.

The next step is to produce an SUnit test for returning from a
compiled method into a context with a nil PC and stack pointer.  Then
after fixing it, checking if that fixes the slow bug reproduction as
well.

Bryce
_______________________________________________
Exupery mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery
Reply | Threaded
Open this post in threaded view
|

Re: Debugging Exupery

Jason Johnson-5
This sounds like some pre/post conditions might help, i.e. a returned
to context must always be valid, etc.  Just a thought, given that this
bug has been out there so long and turned out to be kind of a
"contract violation".

But interesting stuff.  Does the compiler run longer then an hour now?

On Mon, Nov 10, 2008 at 12:06 AM,  <[hidden email]> wrote:

>
> I've just found one of the old bugs that's causing crashes every hour
> or so. I'm not sure if it's the causes of all the crashes I can
> reproduce with the current slow reproduction or just one of the
> crashes. Here's a quick walk through debugging real crashes in
> Exupery.
>
> First try running the background compiler to see if it crashes:
>
>      Exupery initialiseExupery.
>      Exupery startBackgroundCompilation.
>
> Now just use Squeak as normal until it crashes. It takes about
> ten minutes on my machine for it to finish compiling itself.
>
> Load the logs using "explore it":
>
>     ExuperyLogReader new readFile: 'Exupery.log'.
>
> Exupery is a dynamic compiler, it chooses what to compile
> based on profiling, so it can be very hard to reproduce
> bugs as there's some luck in what's compiled.
>
> Then replay the logs to reproduce the code cache as it
> was when it crashed. Execute the following in the explorer
> on the logs:
>
>     self compile
>
> Then try to make the crash reproducible by running the
> last thing many times over because it looks like a race.
>
>      10000 timesRepeat:
>        [Exupery
>                compileMethod: #visitMov:
>                class: InstructionSelector]
>
> Now reduce the amount of code in the code cache to make the bug
> more understandable and normally faster to reproduce. The
> following expressions are useful on the logs:
>
>     3 to: 4 do: [:each| (self at: each) shouldCompile: false]
>     51 to: 100 do: [:each| (self at: each) shouldCompile: false]
>     3 to: 4 do: [:each| (self at: each) shouldCompile: true]
>
>     (self at: 3) shouldCompile: false
>     (self at: 3) shouldCompile: true
>
> Then refine the reproduction to something easier to debug and
> hopefully that can be turned into an SUnit test:
>
>          Exupery initialiseExupery.
>          Exupery
>                compileMethod: #wait
>                class: Delay
>
>                10000 timesRepeat:
>        [Exupery log: 'Compiling example'. Exupery
>                compileMethod: (Exupery
>                               definingMethodForSelector: #visitMov:
>                               class: InstructionSelector)
>                inlining: #()
>                receiver: InstructionSelector]
>
> Then debug it using logging and gdb. There's only been a few Exupery
> bugs that were debug-gable with the Smalltalk debugger, bugs normally
> crash the image.
>
> In this case it's curious that the bug is triggered by compiling
> Delay>>wait, the first method that the background compiler normally
> compiles as often most of the time is spent waiting in the idle loop.
>
> One bug is Exupery isn't checking if the context it's returning into
> has already returned, it's PC is nil, and is crashing in Exupery's
> return code when it fails to calculate the address to place it's
> result in as the stack pointer is nil when it's expecting a very
> small integer.
>
> I'm not sure if that's the only bug that the reproduction above
> is producing, only in one case was the crash at a time when it was
> definitely due to only trying to return into a dead context. Most
> crashes are however in the return code, and the context being returned
> to isn't good.
>
> The next step is to produce an SUnit test for returning from a
> compiled method into a context with a nil PC and stack pointer.  Then
> after fixing it, checking if that fixes the slow bug reproduction as
> well.
>
> Bryce
> _______________________________________________
> Exupery mailing list
> [hidden email]
> http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery
>
_______________________________________________
Exupery mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery
Reply | Threaded
Open this post in threaded view
|

Re: Debugging Exupery

Bryce Kampjes
Jason Johnson writes:
 > This sounds like some pre/post conditions might help, i.e. a returned
 > to context must always be valid, etc.  Just a thought, given that this
 > bug has been out there so long and turned out to be kind of a
 > "contract violation".
 >
 > But interesting stuff.  Does the compiler run longer then an hour now?

The problem with pre/post conditions is you'd need to add them
to every return and most VMs execute millions of sends every
second. Normally, I'll just add a test or two that covers the
cases required. I have tried adding debugging checks but it's
very easy to make the VM cripplingly slow.

It runs about an hour still, I haven't fixed the bug yet. Just
isolated one. Though it can last for longer. After about 10
minutes it's finished compiling itself so to compile anything
more requires use so the profiler sees some new code.

Bryce
_______________________________________________
Exupery mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery