Frequent SegFaults in PharoVM with Pharo 3.0

Max Leske
 
Hi

We’ve been encountering frequent SegFaults when running the Fuel tests on the Pharo CI (https://ci.inria.fr/pharo-contribution/job/Fuel/). As of today I can also reproduce the SegFaults on my MacBook Pro (OS X 10.9). We have not been able to determine the cause, but we can produce SegFaults often, although not reliably (more reliably on the CI).

The CI uses the stable VM: http://files.pharo.org/vm/pharo/linux/stable.zip
I use a newer version from October on my machine: http://files.pharo.org/vm/pharo/mac/273.zip

I’ve attached all the dumps of the crashes I was able to produce on my machine, together with Apple’s crash logs.

I’ve been able to deduce the following:
- Garbage collection seems to be a trigger for the SegFault. When either of the methods FLMethodContextSerialization>>testFuelShouldIgnoreFuel and FLMethodContextSerialization>>testMethodContextWithNilPc contains the line “3 timesRepeat: [Smalltalk garbageCollect]”, the SegFault appears nearly always (on the CI). When I remove the line, the builds run through.
- Not all methods containing the garbage collect line trigger a SegFault (I could only identify those two).
- The garbage collect line by itself suffices as a trigger in the mentioned methods.
- The number of tests (amount of used memory?) seems to influence the appearance of the SegFault (e.g. loading “DevelopmentGroup” seems to trigger it more often than loading “Benchmarks”).
- The SegFault always appears after the tests with the garbage collect line have been run, never before.
- The VM does not always manage to write the crash.dmp.
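
For anyone who wants to poke at the garbage collect line in isolation, here is a minimal sketch of a stand-alone test that contains nothing but that line. The class name GarbageCollectProbeTest and the category 'Scratch-GCProbe' are invented for illustration and are not part of Fuel. Since not every method with this line triggers the crash, a stripped-down test like this may well never segfault; it only shows the shape of the pattern.

```smalltalk
"Sketch only. GarbageCollectProbeTest and 'Scratch-GCProbe' are invented
 names for illustration; they are not part of Fuel."
| probe |
probe := TestCase
    subclass: #GarbageCollectProbeTest
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Scratch-GCProbe'.

"The test body is just the line reported above as sufficient in the two Fuel tests."
probe compile: 'testForcedGarbageCollect
    3 timesRepeat: [ Smalltalk garbageCollect ].
    self assert: true'.

"Run the single test and answer its TestResult."
probe run: #testForcedGarbageCollect
```

Evaluate it in a workspace of a scratch image; the class can be removed afterwards with "probe removeFromSystem".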

Since the SegFaults are so random I cannot give you an image to reproduce the problem. I’ve had the best results using a fresh 3.0 image (http://files.pharo.org/image/30/30549.zip) and then evaluating the following in a workspace:

Gofer it
        smalltalkhubUser: 'Pharo' project:  'Fuel';
        package: 'ConfigurationOfFuel';
        load.
       
((Smalltalk at: #ConfigurationOfFuel) project version: #bleedingEdge) load: 'DevelopmentGroup'.

HDTestReport
        runClasses: (TestCase allSubclasses select: [ :class | class name beginsWith: 'FL'])
        named: 'foo'

If it doesn’t work, try using the TestRunner manually. Select the default Fuel tests (alphabetically at F) and the additional Fuel tests (at the bottom of the list) and run them.
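
Because the crash is intermittent, a single run may pass cleanly. The following is a rough sketch of repeating the same HDTestReport run several times in one session. The attempt count of 20 and the 'fuel-attempt-' report names are arbitrary assumptions, and it is not known whether repeating inside one image session is as effective as starting from a fresh image each time.

```smalltalk
"Sketch only: repeat the Fuel test run from above several times.
 The count (20) and the report name prefix are arbitrary choices."
| fuelTests |
fuelTests := TestCase allSubclasses select: [ :class | class name beginsWith: 'FL' ].

1 to: 20 do: [ :attempt |
    "Log progress so the last attempt before a crash is visible."
    Transcript show: 'Attempt ', attempt printString; cr.
    HDTestReport
        runClasses: fuelTests
        named: 'fuel-attempt-', attempt printString ]
```

If the VM crashes, the Transcript output is lost with it, so writing the attempt number to a file instead may be more useful on the CI.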

If anybody has any clue about what could be going on I’d really appreciate any input. I’ll happily provide more information if I can.

Thanks for reading :)
Max


dumps.zip (140K) Download Attachment

Re: Frequent SegFaults in PharoVM with Pharo 3.0

Mariano Martinez Peck
 
Max, did you try with Eliot’s VMs besides the Pharo ones?

Thanks


On Wed, Nov 6, 2013 at 6:04 PM, Max Leske <[hidden email]> wrote:
 
> We’ve been encountering frequent SegFaults when running the Fuel tests on the Pharo CI (https://ci.inria.fr/pharo-contribution/job/Fuel/).

--
Mariano
http://marianopeck.wordpress.com

Re: Frequent SegFaults in PharoVM with Pharo 3.0

Max Leske
In reply to this post by Max Leske
 
On 07.11.2013, at 00:27, [hidden email] wrote:

Date: Wed, 6 Nov 2013 19:23:47 -0200
From: Mariano Martinez Peck <[hidden email]>
Subject: Re: [Vm-dev] Frequent SegFaults in PharoVM with Pharo 3.0
To: Squeak Virtual Machine Development Discussion <[hidden email]>

> Max, did you try with Eliot VMs besides the pharo ones?

No, I haven’t. I can try (though only on my machine), but if I can’t reproduce the SegFault that doesn’t mean much unfortunately. It took me 10 to 20 tries with the PharoVM too…

I’ll try anyway and let you know what I find.

Max



Re: Frequent SegFaults in PharoVM with Pharo 3.0

Max Leske
 

On 07.11.2013, at 08:20, Max Leske <[hidden email]> wrote:

> I’ll try anyway and let you know what I find.

I just gave CogVM and NBCog a shot. Unfortunately, I can’t use the images because they were built with PharoVM and hang after loading. At least on the Pharo CI, there are no jobs that create Cog / NBCog images.

So no luck with different VMs.

Max




Re: Frequent SegFaults in PharoVM with Pharo 3.0

Max Leske
 
Interesting development: with an older Pharo image (version 30321 instead of the latest) and exactly the same code, I get no SegFaults.
This suggests that the problem lies with the image and not necessarily with the VM.

Max

On 10.11.2013, at 20:10, Max Leske <[hidden email]> wrote:


> So no luck with different VMs.