I've just found one of the old bugs that's causing crashes every hour or so. I'm not sure if it's the causes of all the crashes I can reproduce with the current slow reproduction or just one of the crashes. Here's a quick walk through debugging real crashes in Exupery. First try running the background compiler to see if it crashes: Exupery initialiseExupery. Exupery startBackgroundCompilation. Now just use Squeak as normal until it crashes. It takes about ten minutes on my machine for it to finish compiling itself. Load the logs using "explore it": ExuperyLogReader new readFile: 'Exupery.log'. Exupery is a dynamic compiler, it chooses what to compile based on profiling, so it can be very hard to reproduce bugs as there's some luck in what's compiled. Then replay the logs to reproduce the code cache as it was when it crashed. Execute the following in the explorer on the logs: self compile Then try to make the crash reproducible by running the last thing many times over because it looks like a race. 10000 timesRepeat: [Exupery compileMethod: #visitMov: class: InstructionSelector] Now reduce the amount of code in the code cache to make the bug more understandable and normally faster to reproduce. The following expressions are useful on the logs: 3 to: 4 do: [:each| (self at: each) shouldCompile: false] 51 to: 100 do: [:each| (self at: each) shouldCompile: false] 3 to: 4 do: [:each| (self at: each) shouldCompile: true] (self at: 3) shouldCompile: false (self at: 3) shouldCompile: true Then refine the reproduction to something easier to debug and hopefully that can be turned into an SUnit test: Exupery initialiseExupery. Exupery compileMethod: #wait class: Delay 10000 timesRepeat: [Exupery log: 'Compiling example'. Exupery compileMethod: (Exupery definingMethodForSelector: #visitMov: class: InstructionSelector) inlining: #() receiver: InstructionSelector] Then debug it using logging and gdb. There's only been a few Exupery bugs that were debug-gable with the Smalltalk debugger, bugs normally crash the image. In this case it's curious that the bug is triggered by compiling Delay>>wait, the first method that the background compiler normally compiles as often most of the time is spent waiting in the idle loop. One bug is Exupery isn't checking if the context it's returning into has already returned, it's PC is nil, and is crashing in Exupery's return code when it fails to calculate the address to place it's result in as the stack pointer is nil when it's expecting a very small integer. I'm not sure if that's the only bug that the reproduction above is producing, only in one case was the crash at a time when it was definitely due to only trying to return into a dead context. Most crashes are however in the return code, and the context being returned to isn't good. The next step is to produce an SUnit test for returning from a compiled method into a context with a nil PC and stack pointer. Then after fixing it, checking if that fixes the slow bug reproduction as well. Bryce _______________________________________________ Exupery mailing list [hidden email] http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery |
This sounds like some pre/post conditions might help, i.e. a returned
to context must always be valid, etc. Just a thought, given that this bug has been out there so long and turned out to be kind of a "contract violation". But interesting stuff. Does the compiler run longer then an hour now? On Mon, Nov 10, 2008 at 12:06 AM, <[hidden email]> wrote: > > I've just found one of the old bugs that's causing crashes every hour > or so. I'm not sure if it's the causes of all the crashes I can > reproduce with the current slow reproduction or just one of the > crashes. Here's a quick walk through debugging real crashes in > Exupery. > > First try running the background compiler to see if it crashes: > > Exupery initialiseExupery. > Exupery startBackgroundCompilation. > > Now just use Squeak as normal until it crashes. It takes about > ten minutes on my machine for it to finish compiling itself. > > Load the logs using "explore it": > > ExuperyLogReader new readFile: 'Exupery.log'. > > Exupery is a dynamic compiler, it chooses what to compile > based on profiling, so it can be very hard to reproduce > bugs as there's some luck in what's compiled. > > Then replay the logs to reproduce the code cache as it > was when it crashed. Execute the following in the explorer > on the logs: > > self compile > > Then try to make the crash reproducible by running the > last thing many times over because it looks like a race. > > 10000 timesRepeat: > [Exupery > compileMethod: #visitMov: > class: InstructionSelector] > > Now reduce the amount of code in the code cache to make the bug > more understandable and normally faster to reproduce. The > following expressions are useful on the logs: > > 3 to: 4 do: [:each| (self at: each) shouldCompile: false] > 51 to: 100 do: [:each| (self at: each) shouldCompile: false] > 3 to: 4 do: [:each| (self at: each) shouldCompile: true] > > (self at: 3) shouldCompile: false > (self at: 3) shouldCompile: true > > Then refine the reproduction to something easier to debug and > hopefully that can be turned into an SUnit test: > > Exupery initialiseExupery. > Exupery > compileMethod: #wait > class: Delay > > 10000 timesRepeat: > [Exupery log: 'Compiling example'. Exupery > compileMethod: (Exupery > definingMethodForSelector: #visitMov: > class: InstructionSelector) > inlining: #() > receiver: InstructionSelector] > > Then debug it using logging and gdb. There's only been a few Exupery > bugs that were debug-gable with the Smalltalk debugger, bugs normally > crash the image. > > In this case it's curious that the bug is triggered by compiling > Delay>>wait, the first method that the background compiler normally > compiles as often most of the time is spent waiting in the idle loop. > > One bug is Exupery isn't checking if the context it's returning into > has already returned, it's PC is nil, and is crashing in Exupery's > return code when it fails to calculate the address to place it's > result in as the stack pointer is nil when it's expecting a very > small integer. > > I'm not sure if that's the only bug that the reproduction above > is producing, only in one case was the crash at a time when it was > definitely due to only trying to return into a dead context. Most > crashes are however in the return code, and the context being returned > to isn't good. > > The next step is to produce an SUnit test for returning from a > compiled method into a context with a nil PC and stack pointer. Then > after fixing it, checking if that fixes the slow bug reproduction as > well. > > Bryce > _______________________________________________ > Exupery mailing list > [hidden email] > http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery > Exupery mailing list [hidden email] http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery |
Jason Johnson writes:
> This sounds like some pre/post conditions might help, i.e. a returned > to context must always be valid, etc. Just a thought, given that this > bug has been out there so long and turned out to be kind of a > "contract violation". > > But interesting stuff. Does the compiler run longer then an hour now? The problem with pre/post conditions is you'd need to add them to every return and most VMs execute millions of sends every second. Normally, I'll just add a test or two that covers the cases required. I have tried adding debugging checks but it's very easy to make the VM cripplingly slow. It runs about an hour still, I haven't fixed the bug yet. Just isolated one. Though it can last for longer. After about 10 minutes it's finished compiling itself so to compile anything more requires use so the profiler sees some new code. Bryce _______________________________________________ Exupery mailing list [hidden email] http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery |
Free forum by Nabble | Edit this page |