Crashes on snapshot with the new compactor

Eliot Miranda-2
Hi All,

    A number of people are being affected by crashes when snapshotting the image, the worst possible time for a crash.  There is a bug in the new compactor that unfortunately bites when saving.  The compactor is invoked as part of a full garbage collect, after the garbage collector has freed unreachable objects.  Normally the new compactor makes only a single pass through the heap, which may not move every object that could be moved.  (The number of objects that can be moved in a single pass is limited by the available free space.)  But on snapshot the compactor makes as many passes as are necessary to slide all movable objects down as far as possible.  Unfortunately there is a bug that shows up in these additional passes.

Fixing this bug is now my priority.  I have an example image from Esteban Lorenzano to test.  I am asking anyone else that can provide an image that reliably crashes when trying to save it to make the image and changes available to me for testing if possible.

In the meantime one may be able to work around the problem by doing a full garbage collect before the snapshot.  This should do a GC with a single compaction pass, which should not fail, and it makes it much more likely that the GC during snapshot will also need only a single compaction pass, since fewer objects should be movable after the single-pass compaction in the explicit GC.

To do this in Pharo I would put a full GC here:

SessionManager>>snapshot: save andQuit: quit
| isImageStarting snapshotResult |
ChangesLog default logSnapshot: save andQuit: quit.

>> SmalltalkImage current primitiveGarbageCollect.

self currentSession stop: quit. "Image not usable from here until the session is restarted!"
...

In Squeak I would put a full GC here:

snapshot: save andQuit: quit withExitCode: exitCode embedded: embeddedFlag
"Mark the changes file and close all files as part of #processShutdownList.
If save is true, save the current state of this Smalltalk in the image file.
If quit is true, then exit to the outer OS shell.
If exitCode is not nil, then use it as exit code.
The latter part of this method runs when resuming a previously saved image. This resume logic checks for a document file to process when starting up."

| resuming msg |
Object flushDependents.
Object flushEvents.

...
Smalltalk processShutDownList: quit.
>> SmalltalkImage current primitiveGarbageCollect.
Cursor write show.
save ifTrue: [resuming := embeddedFlag 
ifTrue: [self snapshotEmbeddedPrimitive] 
ifFalse: [self snapshotPrimitive]]  "<-- PC frozen here on image file"
ifFalse: [resuming := false].
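
For anyone who would rather not patch the snapshot method, the same idea can be tried by hand from a Playground (Pharo) or Workspace (Squeak).  This is only a sketch of the workaround described above, not a fix.  It assumes that the image understands "Smalltalk garbageCollect" and "Smalltalk snapshot:andQuit:" (both are assumptions about your particular image version), and the snapshot still runs its own GC afterwards, so a crash remains possible.

```smalltalk
"Explicit full GC first. Per the explanation above, this should use
 a single compaction pass."
Smalltalk garbageCollect.

"Then save without quitting. The snapshot still performs its own full GC,
 but fewer objects should be left to move.
 Assumption: snapshot:andQuit: is available on Smalltalk in this image."
Smalltalk snapshot: true andQuit: false.
```

Evaluate both statements together (select them and "Do it") in place of the usual save menu item.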

I do apologise for the bug.  I hope it will be fixed within a few days.

_,,,^..^,,,_
best, Eliot

Re: Crashes on snapshot with the new compactor

Eliot Miranda-2
Hi All,

    I have fixed a bug in the compactor that accounts for the two cases I've analysed and the two fairly repeatable crashes I have at hand (three cases in all).  I hope that all those who have been experiencing crashes can start using the latest build asap.

It is fixed in these commits:

Name: VMMaker.oscog-eem.2187
Author: eem
Time: 27 March 2017, 3:00:06.676146 pm
UUID: 2259d299-65a4-42d0-a01b-4b25f5a89745
Ancestors: VMMaker.oscog-rsf.2186

SpurPlanningCompactor:
Fix a bug in resetting the free chunk used for the firstUnusedFieldsSpace after non-final passes (i.e. on snapshot).  The old code didn't check to see if a free chunk was actually found(!!).

and

Branch: refs/heads/Cog
 Home:   https://github.com/OpenSmalltalk/opensmalltalk-vm
 Commit: 4ceff23323bcd0f2d3d0a4a43c2995f43d09c98a
     https://github.com/OpenSmalltalk/opensmalltalk-vm/commit/4ceff23323bcd0f2d3d0a4a43c2995f43d09c98a
 Author: Eliot Miranda <[hidden email]>
 Date:   2017-03-27 (Mon, 27 Mar 2017)

The bintray files are here:
https://bintray.com/opensmalltalk/vm/cog/201703272314

On Mar 25, 2017, at 1:27 PM, Eliot Miranda <[hidden email]> wrote:

[snip]

Re: Crashes on snapshot with the new compactor

Stanislav Krupoderov
Hi Eliot,

> The bintray files are here:
> https://bintray.com/opensmalltalk/vm/cog/201703272314 

Where can I find pharo binary for x86/x64 linux? I don't see it on bintray.

Re: Crashes on snapshot with the new compactor

Ben Coman
On Wed, Mar 29, 2017 at 5:29 AM, Stanislav Krupoderov
<[hidden email]> wrote:
> Hi Eliot,
>
>> The bintray files are here:
>> https://bintray.com/opensmalltalk/vm/cog/201703272314
>
> Where can I find pharo binary for x86/x64 linux? I don't see it on bintray.

That's strange. I'd expect to see it there.  Actually there is no
32-bit x86 Linux Pharo VM either.
https://bintray.com/opensmalltalk/vm/cog/201703272314#files

I'm not familiar with that CI infrastructure, but maybe the "Xpharo"
here is related...
  https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/tests/smalltalkCI.sh
or lack of listing here...
  https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/.travis.yml



Here is an alternative download...
   http://files.pharo.org/vm/pharo-spur64/linux/
and of course use a "...-64" image from here...
   http://files.pharo.org/image/60/

You might try the "threaded" heartbeat
to see if it now works for non-root users without modifying
/etc/security/limits.d

Otherwise take the "itimer" one if you're not using OSProcess/OSSubprocess.

regards, Ben


Re: Crashes on snapshot with the new compactor

EstebanLM
In reply to this post by Stanislav Krupoderov
We have our official distribution on our file server.
Pharo VMs can be downloaded here:

There is also a Bintray repository, but it is only a mirror/backup: https://dl.bintray.com/pharo-project/pharo-vm/

cheers,
Esteban

ps: I think Pharo is also on osvm bintray but with a weird name (not sure about that) :P


On 28 Mar 2017, at 23:29, Stanislav Krupoderov <[hidden email]> wrote:

Hi Eliot,

The bintray files are here:
https://bintray.com/opensmalltalk/vm/cog/201703272314

Where can I find pharo binary for x86/x64 linux? I don't see it on bintray.



--
View this message in context: http://forum.world.st/Crashes-on-snapshot-with-the-new-compactor-tp4939961p4940268.html
Sent from the Pharo Smalltalk Developers mailing list archive at Nabble.com.



Re: Crashes on snapshot with the new compactor

Stephane Ducasse-3
Esteban 

is there a new vm that we should use?

Stef

On Wed, Mar 29, 2017 at 8:01 AM, Esteban Lorenzano <[hidden email]> wrote:
[snip]

Re: Crashes on snapshot with the new compactor

EstebanLM
latest :)

On 29 Mar 2017, at 08:48, Stephane Ducasse <[hidden email]> wrote:

[snip]

Re: Crashes on snapshot with the new compactor

Stephane Ducasse-3
oki tx


On Wed, Mar 29, 2017 at 8:51 AM, Esteban Lorenzano <[hidden email]> wrote:
[snip]