[OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

David T Lewis

[OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

Executing the following script produces a segmentation fault:

| aJson anArray |
aJson := ZnEasy get: 'https://data.nasa.gov/resource/y77d-th95.json' asZnUrl.
Array streamContents: [ :aStream |
	400 timesRepeat: [ 
		aStream nextPutAll: (STON fromString: aJson contents).
		Smalltalk saveSession ] ].

crash.dmp.txt

Reproduced on Pharo 8 Mac OS:

Image
-----
Pharo8.0.0
Build information: Pharo-8.0.0+build.972.sha.bbb812d1387fd27ef4096e35224cb4425a60ab6c (64 Bit)
Unnamed

Virtual Machine
---------------
/Users/.../Pharo/vms/80-x64/Pharo.app/Contents/MacOS/Pharo
CoInterpreter VMMaker.oscog-eem.2504 uuid: a00b0fad-c04c-47a6-8a11-5dbff110ac11 Jan  5 2019
StackToRegisterMappingCogit VMMaker.oscog-eem.2504 uuid: a00b0fad-c04c-47a6-8a11-5dbff110ac11 Jan  5 2019
VM: 201901051900 https://github.com/OpenSmalltalk/opensmalltalk-vm.git Date: Sat Jan 5 20:00:11 2019 CommitHash: 7a3c6b6 Plugins: 201901051900 https://github.com/OpenSmalltalk/opensmalltalk-vm.git

Mac OS X built on Jan  5 2019 19:11:02 UTC Compiler: 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.31)
VMMaker versionString VM: 201901051900 https://github.com/OpenSmalltalk/opensmalltalk-vm.git Date: Sat Jan 5 20:00:11 2019 CommitHash: 7a3c6b6 Plugins: 201901051900 https://github.com/OpenSmalltalk/opensmalltalk-vm.git
CoInterpreter VMMaker.oscog-eem.2504 uuid: a00b0fad-c04c-47a6-8a11-5dbff110ac11 Jan  5 2019
StackToRegisterMappingCogit VMMaker.oscog-eem.2504 uuid: a00b0fad-c04c-47a6-8a11-5dbff110ac11 Jan  5 2019

Also reproduced on Pharo 8 Mac OS:

Virtual Machine
---------------
/Users/syrel/Documents/Pharo/images/GToolkit/../../vms/80-x64/Pharo.app/Contents/MacOS/Pharo
CoInterpreter VMMaker.oscog-eem.2570 uuid: b61e294a-cb2a-4d9a-9e7e-8cc17676c920 Oct 16 2019
StackToRegisterMappingCogit VMMaker.oscog-eem.2570 uuid: b61e294a-cb2a-4d9a-9e7e-8cc17676c920 Oct 16 2019
VM: 201910161212 https://github.com/OpenSmalltalk/opensmalltalk-vm.git Date: Wed Oct 16 14:12:15 2019 CommitHash: da97dd5 Plugins: 201910161212 https://github.com/OpenSmalltalk/opensmalltalk-vm.git
Mac OS X built on Oct 16 2019 12:39:27 GMT Compiler: 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.42)

Also reproduced on Pharo 8 Linux Ubuntu 18.04:

Pharo VM version: 5.0-201902062351  Wed Feb  6 23:59:26 UTC 2019 gcc 4.8 [Production Spur 64-bit VM]
Built from: CoInterpreter VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb  6 2019
With: StackToRegisterMappingCogit VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb  6 2019
Revision: VM: 201902062351 https://github.com/OpenSmalltalk/opensmalltalk-vm.git Date: Wed Feb 6 15:51:18 2019 CommitHash: a838346b Plugins: 201902062351 https://github.com/OpenSmalltalk/opensmalltalk-vm.git
Build host: Linux travis-job-f2b22483-7f84-414f-b833-69f69518c685 4.4.0-101-generic #124~14.04.1-Ubuntu SMP Fri Nov 10 19:05:36 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
plugin path: /dev/shm/p8/pharo-vm/lib/pharo/5.0-201902062351 [default: /dev/shm/p8/pharo-vm/lib/pharo/5.0-201902062351/]

David T Lewis

Re: [OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

This issue comment, #391 (comment), reports a similar reproducible script that produces a different stack dump on the segmentation fault.

David T Lewis

Re: [OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

In reply to this post by David T Lewis

Also crashes with the latest Pharo Stable VM on Mac OS Mojave

Users/andrei/Documents/Pharo/vms/80-x64/Pharo.app/Contents/MacOS/Pharo
CoInterpreter VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb  7 2019
StackToRegisterMappingCogit VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb  7 2019
VM: 201902062351 https://github.com/OpenSmalltalk/opensmalltalk-vm.git Date: Wed Feb 6 15:51:18 2019 CommitHash: a838346 Plugins: 201902062351 https://github.com/OpenSmalltalk/opensmalltalk-vm.git

Mac OS X built on Feb  7 2019 00:01:47 UTC Compiler: 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.31)
VMMaker versionString VM: 201902062351 https://github.com/OpenSmalltalk/opensmalltalk-vm.git Date: Wed Feb 6 15:51:18 2019 CommitHash: a838346 Plugins: 201902062351 https://github.com/OpenSmalltalk/opensmalltalk-vm.git
CoInterpreter VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb  7 2019
StackToRegisterMappingCogit VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb  7 2019

0   libsystem_kernel.dylib        	0x00007fff6e5bb2c6 __pthread_kill + 10
1   libsystem_pthread.dylib       	0x00007fff6e676bf1 pthread_kill + 284
2   libsystem_c.dylib             	0x00007fff6e5256a6 abort + 127
3   org.pharo.Pharo               	0x000000010913f662 sigsegv + 209
4   libsystem_platform.dylib      	0x00007fff6e66bb5d _sigtramp + 29
5   ???                           	000000000000000000 0 + 0
6   org.pharo.Pharo               	0x00000001090db9d2 markObjects + 463
7   org.pharo.Pharo               	0x00000001090db1aa fullGC + 72
8   org.pharo.Pharo               	0x00000001090facdf snapshot + 206
9   org.pharo.Pharo               	0x00000001090ef837 primitiveSnapshot + 11
10  org.pharo.Pharo               	0x00000001090bb38d interpret + 17947
11  org.pharo.Pharo               	0x00000001090c4c1f enterSmalltalkExecutiveImplementation + 149
12  org.pharo.Pharo               	0x00000001090b6fe6 interpret + 628
13  org.pharo.Pharo               	0x0000000109140a82 -[sqSqueakMainApplication runSqueak] + 393
14  com.apple.Foundation          	0x00007fff4489fc4a __NSFirePerformWithOrder + 362
15  com.apple.CoreFoundation      	0x00007fff42591928 __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__ + 23
16  com.apple.CoreFoundation      	0x00007fff4259185d __CFRunLoopDoObservers + 451
17  com.apple.CoreFoundation      	0x00007fff42533f80 __CFRunLoopRun + 1136
18  com.apple.CoreFoundation      	0x00007fff425338be CFRunLoopRunSpecific + 455
19  com.apple.HIToolbox           	0x00007fff4181f96b RunCurrentEventLoopInMode + 292
20  com.apple.HIToolbox           	0x00007fff4181f5ad ReceiveNextEventCommon + 355
21  com.apple.HIToolbox           	0x00007fff4181f436 _BlockUntilNextEventMatchingListInModeWithFilter + 64
22  com.apple.AppKit              	0x00007fff3fbb9987 _DPSNextEvent + 965
23  com.apple.AppKit              	0x00007fff3fbb871f -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 1361
24  com.apple.AppKit              	0x00007fff3fbb283c -[NSApplication run] + 699
25  com.apple.AppKit              	0x00007fff3fba1d7c NSApplicationMain + 777
26  libdyld.dylib                 	0x00007fff6e4803d5 start + 1

David T Lewis

Re: [OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

In reply to this post by David T Lewis

This corroborates what Pablo told me and what Pavel found.
Thanks.

David T Lewis

Re: [OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

In reply to this post by David T Lewis

If one starts up the latest Pharo 8 64-bit image using an assert VM with leak checking turned on, one finds that the image is already corrupted. Neither the script nor the VM is at fault; the initial image is corrupted before the script ever runs.

Here's what I get when I launch the latest 64-bit 8.0 image using an assert VM with leak checking turned on (commentary after the run):

$ pharo64cavm --lldb --leakcheck 1 gccrash8-64.image gccrash.st

run --leakcheck 1 /Users/eliot/Documents/Pharo/images/Pharo 8.0 - 64bit/gccrash8-64.image gccrash.st

(lldb) target create "/Users/eliot/oscogvm/build.macos64x64/pharo.cog.spur/PharoAssert.app/Contents/MacOS/Pharo"
Current executable set to '/Users/eliot/oscogvm/build.macos64x64/pharo.cog.spur/PharoAssert.app/Contents/MacOS/Pharo' (x86_64).
(lldb) settings set -- target.run-args  "/Users/eliot/Documents/Pharo/images/Pharo 8.0 - 64bit/gccrash8-64.image" "gccrash.st"
(lldb) b Pharo`warning
Breakpoint 1: where = Pharo`warning + 9 at gcc3x-cointerp.c:44, address = 0x0000000100002079
(lldb) run --leakcheck 1 gccrash8-64.image gccrash.st
Process 13469 launched: '/Users/eliot/oscogvm/build.macos64x64/pharo.cog.spur/PharoAssert.app/Contents/MacOS/Pharo' (x86_64)
2019-11-13 21:53:49.806855-0800 Pharo[13469:7168447] MessageTracer: load_domain_whitelist_search_tree:73: Search tree file's format version number (0) is not supported
2019-11-13 21:53:49.806878-0800 Pharo[13469:7168447] MessageTracer: Falling back to default whitelist
object leak in        0x110cc6720 @ 0 =        0x11a542c50
object leak in        0x110d402e0 @ 1 =        0x11a542c50
object leak in        0x110d40418 @ 1 =        0x11a702018
object leak in        0x110d40790 @ 1 =        0x11a53bec8
object leak in        0x110d557f8 @ 4 =        0x11a542c88
object leak in        0x110e1a940 @ 0 =        0x11a8ffb48
object leak in        0x110e1a9a0 @ 0 = 0xfffffffffb3d0000
object leak in        0x110e1a9f8 @ 0 =        0x11a70bdc0
object leak in        0x110e1a9f8 @ 5 =        0x11a702018
object leak in        0x110e1aa30 @ 4 =        0x11a73a320
object leak in        0x110e1aa30 @ 5 =        0x11a702050
object leak in        0x110e1aa68 @ 0 = 0xfffffffffb3d0000
object leak in        0x110e1ab08 @ 1 =        0x11a70bdc0
object leak in        0x110e1ab20 @ 1 =        0x11a70bdf8
object leak in        0x110e1ab38 @ 1 =        0x11a70be68
object leak in        0x110e1ab50 @ 1 =        0x11a73a7a8
object leak in        0x110e1ab68 @ 1 =        0x11a70bea0
object leak in        0x110e1ab80 @ 1 =        0x11a70bed8
object leak in        0x110e1ab98 @ 1 =        0x11a70bf10
object leak in        0x110e1abb0 @ 1 =        0x11a70be30
object leak in        0x110e1ae00 @ 0 =        0x11a73af40
object leak in        0x110e1ae20 @ 1 =        0x11a73af68
object leak in        0x110e1e258 @ 0 =        0x11a73af98
object leak in        0x110e1e290 @ 5 =        0x11a7058b0
object leak in        0x110e1e448 @ 1 =        0x11a73bbe8
object leak in        0x110e1e6a0 @ 1 =        0x11a73d3a0
object leak in        0x110e1e6b8 @ 1 =        0x11a73d478
object leak in        0x110e1ea48 @ 1 =        0x11a73f950
object leak in        0x110e1ebe0 @ 1 =        0x11a740a48
object leak in        0x110e1f348 @ 1 =        0x11a745698
object leak in        0x110e1f648 @ 1 =        0x11a7475a0
object leak in        0x110e1fbb8 @ 1 =        0x11a74ad98
object leak in        0x110e20008 @ 1 =        0x11a74db10
object leak in        0x110e20068 @ 1 =        0x11a74dee8
object leak in        0x110e20128 @ 1 =        0x11a74e650
object leak in        0x110e20188 @ 1 =        0x11a74ea28
object leak in        0x110e20398 @ 1 =        0x11a74feb0
object leak in        0x110e207e8 @ 1 =        0x11a752b08
object leak in        0x110e217d8 @ 1 =        0x11a75cbb8
object leak in        0x110e21958 @ 1 =        0x11a75dad0
object leak in        0x110e21fa0 @ 1 =        0x11a761ac0
object leak in        0x110e22180 @ 1 =        0x11a762dc8
object leak in        0x110e222d0 @ 1 =        0x11a763b00
object leak in        0x110e22420 @ 1 =        0x11a764820
object leak in        0x110e225b8 @ 1 =        0x11a765828
object leak in        0x110e22798 @ 1 =        0x11a766ad0
object leak in        0x110e22948 @ 1 =        0x11a767bf8
object leak in        0x110e22bd0 @ 1 =        0x11a7695f0
object leak in        0x110e22cf0 @ 1 =        0x11a76a1a8
object leak in        0x110e22d80 @ 1 =        0x11a76a760
object leak in        0x110e234a0 @ 1 =        0x11a76f050
object leak in        0x110e23ce0 @ 1 =        0x11a774438
Pharo was compiled with optimization - stepping may behave oddly; variables may not be available.
Process 13469 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x0000000100002079 Pharo`warning(s="checkHeapIntegrityclassIndicesShouldBeValid(0, 1) 56761") at gcc3x-cointerp.c:44 [opt]
   41  	sqInt warnpid, erroronwarn;
   42  	void
   43  	warning(char *s) { /* Print an error message but don't necessarily exit. */
-> 44  		if (erroronwarn) error(s);
   45  		if (warnpid)
   46  			printf("\n%s pid %ld\n", s, (long)warnpid);
   47  		else
Target 0: (Pharo) stopped.
(lldb) thr b
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000100002079 Pharo`warning(s="checkHeapIntegrityclassIndicesShouldBeValid(0, 1) 56761") at gcc3x-cointerp.c:44 [opt]
    frame #1: 0x000000010003872f Pharo`runLeakCheckerFor(gcModes=1) at gcc3x-cointerp.c:56761 [opt]
    frame #2: 0x0000000100012d1e Pharo`loadInitialContext at gcc3x-cointerp.c:65476 [opt]
    frame #3: 0x000000010000231d Pharo`interpret at gcc3x-cointerp.c:2770 [opt]
    frame #4: 0x00000001000bd0c9 Pharo`-[sqSqueakMainApplication runSqueak](self=0x000000010045e840, _cmd=<unavailable>) at sqSqueakMainApplication.m:201 [opt]
    frame #5: 0x00007fff319e809c Foundation`__NSFirePerformWithOrder + 360
    frame #6: 0x00007fff2f85c257 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__ + 23
    frame #7: 0x00007fff2f85c17f CoreFoundation`__CFRunLoopDoObservers + 527
    frame #8: 0x00007fff2f83e6f8 CoreFoundation`__CFRunLoopRun + 1240
    frame #9: 0x00007fff2f83df93 CoreFoundation`CFRunLoopRunSpecific + 483
    frame #10: 0x00007fff2eb28d96 HIToolbox`RunCurrentEventLoopInMode + 286
    frame #11: 0x00007fff2eb28a0f HIToolbox`ReceiveNextEventCommon + 366
    frame #12: 0x00007fff2eb28884 HIToolbox`_BlockUntilNextEventMatchingListInModeWithFilter + 64
    frame #13: 0x00007fff2cdd7a3b AppKit`_DPSNextEvent + 2085
    frame #14: 0x00007fff2d56de34 AppKit`-[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 3044
    frame #15: 0x00007fff2cdcc84d AppKit`-[NSApplication run] + 764
    frame #16: 0x00007fff2cd9ba3a AppKit`NSApplicationMain + 804
    frame #17: 0x00007fff5777c015 libdyld.dylib`start + 1

My script pharo64cavm runs an assert VM, specifically /Users/eliot/oscogvm/build.macos64x64/pharo.cog.spur/PharoAssert.app/Contents/MacOS/Pharo. Supplying the --lldb argument has the script launch lldb (a low-level debugger for native executables) on the VM.

This command places a breakpoint in the error/warning output routine called when the VM wants to report that asserts have failed, leaks in the heap have been found, etc.

(lldb) b Pharo`warning

This command then launches the VM under the control of the debugger.

run --leakcheck 1 gccrash8-64.image gccrash.st

The argument to --leakcheck is a combination (sum) of the following flags:

SpurMemoryManager>>setCheckForLeaks: integerFlags
	" 0 = do nothing.
	  1 = check for leaks on fullGC (GCModeFull).
	  2 = check for leaks on scavenger (GCModeNewSpace).
	  4 = check for leaks on incremental (GCModeIncremental)
	  8 = check for leaks on become (GCModeBecome)
	 16 = check for leaks on image segments (GCModeImageSegment)"
	checkForLeaks := integerFlags
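
Because each flag is a distinct power of two, several checks can be requested with one value. The following is a minimal Python sketch of decoding such a value, assuming only the flag values listed in the comment above; the enum and function names are illustrative and are not part of the VM.

```python
from enum import IntFlag


class LeakCheck(IntFlag):
    """Flag values copied from the setCheckForLeaks: comment (names are illustrative)."""
    FULL_GC = 1         # GCModeFull
    NEW_SPACE = 2       # GCModeNewSpace (scavenger)
    INCREMENTAL = 4     # GCModeIncremental
    BECOME = 8          # GCModeBecome
    IMAGE_SEGMENT = 16  # GCModeImageSegment


def describe_leakcheck(value):
    """Return the names of the GC modes selected by a --leakcheck value."""
    return [flag.name for flag in LeakCheck if value & flag]


print(describe_leakcheck(1))   # the value used in the run above
print(describe_leakcheck(9))   # 1 + 8: full GC and become
```

With this reading, `--leakcheck 1` selects only the full-GC check, which is the one that also runs when the initial image is loaded.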

If GCModeFull is set then the VM performs a leak check on loading the initial image. From the back trace you can see that the VM has not yet started running, loadInitialContext being the routine that sets up the VM to run from the context that performed the snapshot:

Pharo`warning(s="checkHeapIntegrityclassIndicesShouldBeValid(0, 1) 56761") at gcc3x-cointerp.c:44 [opt]
    frame #1: 0x000000010003872f Pharo`runLeakCheckerFor(gcModes=1) at gcc3x-cointerp.c:56761 [opt]
    frame #2: 0x0000000100012d1e Pharo`loadInitialContext at gcc3x-cointerp.c:65476 [opt]
    frame #3: 0x000000010000231d Pharo`interpret at gcc3x-cointerp.c:2770 [opt]
    frame #4: 0x00000001000bd0c9 Pharo`-[sqSqueakMainApplication runSqueak](self=0x000000010045e840, _cmd=<unavailable>) at sqSqueakMainApplication.m:201 [opt]
    frame #5: 0x00007fff319e809c Foundation`__NSFirePerformWithOrder + 360

The following leak report shows that there are many leaks in this image:

object leak in        0x110cc6720 @ 0 =        0x11a542c50
object leak in        0x110d402e0 @ 1 =        0x11a542c50
object leak in        0x110d40418 @ 1 =        0x11a702018
object leak in        0x110d40790 @ 1 =        0x11a53bec8
object leak in        0x110d557f8 @ 4 =        0x11a542c88
object leak in        0x110e1a940 @ 0 =        0x11a8ffb48
object leak in        0x110e1a9a0 @ 0 = 0xfffffffffb3d0000
object leak in        0x110e1a9f8 @ 0 =        0x11a70bdc0
object leak in        0x110e1a9f8 @ 5 =        0x11a702018
object leak in        0x110e1aa30 @ 4 =        0x11a73a320
object leak in        0x110e1aa30 @ 5 =        0x11a702050
object leak in        0x110e1aa68 @ 0 = 0xfffffffffb3d0000
...
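
Each report line appears to name the object holding the bad reference, the index of the offending slot, and the value found in that slot. Here is a small Python sketch for summarising such a report under that reading of the format; the function names are hypothetical, and the sample lines are copied from the report above.

```python
import re
from collections import Counter

# Matches lines such as: object leak in 0x110cc6720 @ 0 = 0x11a542c50
LEAK_LINE = re.compile(
    r"object leak in\s+(0x[0-9a-fA-F]+)\s+@\s+(\d+)\s+=\s+(0x[0-9a-fA-F]+)"
)


def parse_leaks(report):
    """Return (holder, slot_index, target) tuples, one per leak line."""
    leaks = []
    for line in report.splitlines():
        match = LEAK_LINE.search(line)
        if match:
            holder, slot, target = match.groups()
            leaks.append((int(holder, 16), int(slot), int(target, 16)))
    return leaks


def targets_by_frequency(leaks):
    """Count how many slots point at each invalid target address."""
    return Counter(target for _, _, target in leaks).most_common()


SAMPLE = """\
object leak in        0x110cc6720 @ 0 =        0x11a542c50
object leak in        0x110d402e0 @ 1 =        0x11a542c50
object leak in        0x110d40418 @ 1 =        0x11a702018
object leak in        0x110e1a9a0 @ 0 = 0xfffffffffb3d0000
"""

leaks = parse_leaks(SAMPLE)
for target, count in targets_by_frequency(leaks):
    print(hex(target), count)
```

Grouping by target address shows whether many slots share one bad value, which can hint at a single stale resource rather than many independent faults.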

Let's take a look at some of these objects. In lldb we can call the VM's debug printing routines, just as we can in the simulator:

(lldb) call printOop(0x110cc6720)
       0x110cc6720: a(n) FreeTypeCacheEntry
       0x11a542c50        0x110ca0fb8              0x221               0x21 0x81ffae4000000004
       0x110d25ea8
(lldb) call printOop(0x110d402e0)
       0x110d402e0: a(n) Association
              0x21        0x11a542c50
(lldb) call printOop(0x110d402e0)
       0x110d402e0: a(n) Association
              0x21        0x11a542c50
(sqInt) $1 = 0
(lldb) print whereIs(0x11a542c50)
(char *) $2 = 0x0000000100110864 " is no where obvious"
(lldb) call printOop(0x110d40418)
       0x110d40418: a(n) Association
     0x16000000001        0x11a702018
(sqInt) $3 = 0
(lldb) call printOop(0x110d40790)
       0x110d40790: a(n) Association
              0x21        0x11a53bec8
(sqInt) $4 = 0
(lldb) call printOop(0x110d557f8)
       0x110d557f8: a(n) FreeTypeCacheEntry
       0x110d55ad0        0x110ca0fb8              0x181              0x7c1        0x11a542c88
       0x110d55718

So the first suspect (to me) looks like external C memory management in FreeType font management. Let me suggest you add a step in the release process which involves checking the validity of images before they're released. Let me also suggest that you appoint a team to look at FreeType font management using the leak checker, et al, to find and fix these issues which I think have been around for quite a while.

johnmci

Re: [OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

An interesting question to ask here: can you tag the image memory as read-only during an FFI call-out for debugging purposes? If writes to image memory are required, can they be sandboxed? If writes to a display area are required, can that area be protected by no-read/no-write pages before and after the screen buffer, to trap overwrites or stray reads?

....

John M. McIntosh. Corporate Smalltalk Consulting Ltd https://www.linkedin.com/in/smalltalk

On Wed, Nov 13, 2019 at 10:12 PM, Eliot Miranda <[hidden email]> wrote:

Here's what I get when I launch the latest 64-bit 8.0 image using an assert VM with leak checking turned on (commentary after the run):

[...]

David T Lewis

Re: [OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

In reply to this post by David T Lewis

John, one could easily add that facility, but I believe that the problem is more likely to do with dangling pointers than with FreeType writing into the heap. I suspect that what happens is that on a previous save or restart, pointers to C memory that was allocated in the run before the current one are not invalidated and are still used. I believe the problem is that the FFI is not being used properly, and that the FFI itself is not at fault. Instead, stale pointers are being followed and memory corruption is occurring.

As I said above, what is necessary is to check that a valid image is created and that stale pointers are invalidated.

This is an age-old problem with Smalltalk programs that use external memory, external handles, descriptors, etc. There is a style which deals well with this and it should be followed.

When creating references to external C memory that may persist across a snapshot (such as to an opened external font object), the object that references the external memory should be registered in some set.
The Smalltalk system sends a post-snapshot resumption message to any and all classes that register for this service, with an argument that indicates whether the system is continuing after a snapshot or loading the snapshot in a fresh run.
When loading the snapshot in a fresh run all objects that reference external C memory are visited very early in system startup and each ensures that it invalidates any and all pointers to external memory, file descriptors, etc.

Using this style we do not have to close and reopen around a snapshot, but we do have to perform the invalidation early enough so that there is no chance of accessing anything external before all invalidations are complete.
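
The style described above can be sketched in a few lines. This is an illustrative Python model, not Pharo or Squeak code: `ExternalHandle`, `HandleRegistry` and the `fresh_run` argument are hypothetical names standing in for the Smalltalk startup protocol, and holding the handles weakly is an assumption of this sketch rather than something stated in the thread.

```python
import weakref


class ExternalHandle:
    """Stands in for a Smalltalk object that wraps a C pointer or descriptor."""

    def __init__(self, address):
        self.address = address  # pretend address of memory owned by C code

    def is_valid(self):
        return self.address is not None

    def invalidate(self):
        # Forget the pointer; it must be re-acquired before the next use.
        self.address = None


class HandleRegistry:
    """Holds every object that references external memory (weakly, by assumption)."""

    def __init__(self):
        self._handles = weakref.WeakSet()

    def register(self, handle):
        self._handles.add(handle)
        return handle

    def start_up(self, fresh_run):
        """Post-snapshot hook. fresh_run is True when the image was loaded
        into a new process, False when execution continues after a save."""
        if fresh_run:
            for handle in list(self._handles):
                handle.invalidate()


registry = HandleRegistry()
font = registry.register(ExternalHandle(0x11A542C50))

registry.start_up(fresh_run=False)  # continuing after a save: pointer still good
still_valid = font.is_valid()

registry.start_up(fresh_run=True)   # new process: the old C memory is gone
valid_after_restart = font.is_valid()

print(still_valid, valid_after_restart)
```

The point of the sketch is the ordering: invalidation happens inside the startup hook, before any other code has a chance to follow a pointer left over from the previous run.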

Further, using a registry of objects is much better than using, for example, allInstances. Typically only a few objects (tens, hundreds at most, not thousands) reference external resources, and they may be of various classes. The registry can visit them in time roughly linear in the size of the registry, independent of image size, whereas allInstances takes time proportional to the product of the number of classes and the image size. Clearly the allInstances approach does not scale as the system gets more complex and the image grows. Startup time is very important. I led the VisualWorks team through this exercise and we were able to reduce startup times from hundreds of milliseconds to forty milliseconds (IIRC) in the VW 3.0 timeframe.
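
A rough way to see the difference in cost, under simplifying assumptions: the heap is modelled as a Python list, an allInstances call is modelled as one full scan of that list per class, and the class names and object counts are made up for illustration.

```python
class HeapObject:
    def __init__(self, class_name):
        self.class_name = class_name


def visits_via_all_instances(heap, resource_classes):
    """Scan the whole heap once per class; return how many objects were visited."""
    visited = 0
    for class_name in resource_classes:
        for obj in heap:
            visited += 1
            if obj.class_name == class_name:
                pass  # this is where a stale pointer would be cleared
    return visited


def visits_via_registry(registry):
    """Visit only the registered objects; return how many were visited."""
    visited = 0
    for _ in registry:
        visited += 1  # every visited object is one that needs invalidating
    return visited


# Hypothetical classes holding external resources, ten instances of each.
resource_classes = ["FontHandle", "FileHandle", "SocketHandle"]
registry = [HeapObject(name) for name in resource_classes for _ in range(10)]
heap = [HeapObject("Plain") for _ in range(10_000)] + registry

scan_cost = visits_via_all_instances(heap, resource_classes)
registry_cost = visits_via_registry(registry)
print(scan_cost, registry_cost)
```

In this model the scan cost grows with both the heap size and the number of classes, while the registry cost depends only on how many objects were registered.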

—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or unsubscribe.

David T Lewis

Re: [OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

In reply to this post by David T Lewis

This issue also occurs in Pharo 7:

Image
-----
Pharo7.0.4
Build information: Pharo-7.0.4+build.168.sha.ccd1f6489120f58ddeacb2cac77cd3a0f0dcfbe6 (64 Bit)
Unnamed

Virtual Machine
---------------
/Users/jurajkubelka/Pharo/vms/70-x64/Pharo.app/Contents/MacOS/Pharo
CoInterpreter VMMaker.oscog-eem.2504 uuid: a00b0fad-c04c-47a6-8a11-5dbff110ac11 Jan  5 2019
StackToRegisterMappingCogit VMMaker.oscog-eem.2504 uuid: a00b0fad-c04c-47a6-8a11-5dbff110ac11 Jan  5 2019
VM: 201901051900 https://github.com/OpenSmalltalk/opensmalltalk-vm.git Date: Sat Jan 5 20:00:11 2019 CommitHash: 7a3c6b6 Plugins: 201901051900 https://github.com/OpenSmalltalk/opensmalltalk-vm.git

Mac OS X built on Jan  5 2019 19:11:02 UTC Compiler: 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.31)
VMMaker versionString VM: 201901051900 https://github.com/OpenSmalltalk/opensmalltalk-vm.git Date: Sat Jan 5 20:00:11 2019 CommitHash: 7a3c6b6 Plugins: 201901051900 https://github.com/OpenSmalltalk/opensmalltalk-vm.git
CoInterpreter VMMaker.oscog-eem.2504 uuid: a00b0fad-c04c-47a6-8a11-5dbff110ac11 Jan  5 2019
StackToRegisterMappingCogit VMMaker.oscog-eem.2504 uuid: a00b0fad-c04c-47a6-8a11-5dbff110ac11 Jan  5 2019

David T Lewis

Re: [OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

In reply to this post by David T Lewis

This also happens in Pharo 6:

Image
-----
Pharo6.0
Latest update: #60547
Unnamed

Virtual Machine
---------------
/Users/jurajkubelka/Pharo/vms/61-x64/Pharo.app/Contents/MacOS/Pharo
CoInterpreter VMMaker.oscog-eem.2401 uuid: 29232e0e-c9e3-41d8-ae75-519db862e02c Jun 28 2018
StackToRegisterMappingCogit VMMaker.oscog-eem.2401 uuid: 29232e0e-c9e3-41d8-ae75-519db862e02c Jun 28 2018
VM: 201806281256 https://github.com/OpenSmalltalk/opensmalltalk-vm.git Date: Thu Jun 28 14:56:30 2018 CommitHash: a8a1dc1 Plugins: 201806281256 https://github.com/OpenSmalltalk/opensmalltalk-vm.git

Mac OS X built on Jun 28 2018 13:07:33 UTC Compiler: 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.31)
VMMaker versionString VM: 201806281256 https://github.com/OpenSmalltalk/opensmalltalk-vm.git Date: Thu Jun 28 14:56:30 2018 CommitHash: a8a1dc1 Plugins: 201806281256 https://github.com/OpenSmalltalk/opensmalltalk-vm.git
CoInterpreter VMMaker.oscog-eem.2401 uuid: 29232e0e-c9e3-41d8-ae75-519db862e02c Jun 28 2018
StackToRegisterMappingCogit VMMaker.oscog-eem.2401 uuid: 29232e0e-c9e3-41d8-ae75-519db862e02c Jun 28 2018

David T Lewis

Re: [OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

In reply to this post by David T Lewis

Just to confirm that it's probably not a garbageCollect problem: I could not reproduce it in the latest Squeak trunk. I did not use Zinc because it is too difficult to install in Squeak, so I replaced it with WebClient. STON is available (installed through the Squit/Squot git support):

| aJson anArray |
aJson := WebClient httpGet: 'https://data.nasa.gov/resource/y77d-th95.json'.
Array streamContents: [ :aStream |
	400 timesRepeat: [ 
		aStream nextPutAll: (STON fromString: aJson content).
		Smalltalk saveSession ] ].

The resulting image file is 540 MB.

David T Lewis

Re: [OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

In reply to this post by David T Lewis

Hi Eliot and Nicolas,

Nicolas, thanks for checking on Squeak, that is useful to know.

Juraj and I have both built assert VMs and been trying to reproduce Eliot's findings.

If I add a couple of print statements to the code and run with a normal VM, I get a number of:

evicted zombie process from run queue

messages.

If I run gccrash.st with the assert VM I get:

((classIndex >= 1) && (classIndex <= (classTablePageSize()))) 50170

((classIndex >= 1) && (classIndex <= (classTablePageSize()))) 50170

before the process seg faults.

Do these provide any additional information to help track down the issue? (I'll include more complete information below)

I tried running a headless VM and printing instance counts of the FreeType external objects in a clean image:

$ vmh/pharo Pharo.image eval "'ftcount.st' asFileReference fileIn"
FTBBox -> 0
FTBitmap -> 0
FTBitmapSize -> 0
FTCharMapRec -> 0
FTFaceRec -> 0
FTGeneric -> 0
FTGlyphMetrics -> 0
FTGlyphSlotRec -> 0
FTListRec -> 0
FTMatrix -> 0
FTOutline -> 0
FTSizeMetrics -> 0
FTSizeRec -> 0
FTVector -> 0
FT2Handle -> 0
FT2Face -> 0
FT2Library -> 1
a FT2Library(@ 16r00000000)<0x0>
File @ ftcount.st

Once the image has been started normally, the pointer in FT2Library becomes non-zero.

To me this suggests that, rather than the image being delivered in a corrupt state, the problem is something that happens early in session resumption.

David T Lewis

Re: [OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

In reply to this post by David T Lewis

More complete crash dump with normal VM:

$ vmga/pharo-ui Pharo.image gccrash.st 
Start
Loop: 1
Start
Loop: 2
Start
Loop: 3

...

Start
Loop: 13
Start

evicted zombie process from run queue
Loop: 14
Start

evicted zombie process from run queue

evicted zombie process from run queue

evicted zombie process from run queue
Loop: 15

...

Start
Loop: 50
Start

Segmentation fault Fri Nov 15 08:14:26 2019


/home/alistair/pharo8/gccrash/vmga/pharo-vm/lib/pharo/5.0-201902062351/pharo
Pharo VM version: 5.0-201902062351  Wed Feb  6 23:59:26 UTC 2019 gcc 4.8 [Production Spur 64-bit VM]
Built from: CoInterpreter VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb  6 2019
With: StackToRegisterMappingCogit VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb  6 2019
Revision: VM: 201902062351 https://github.com/OpenSmalltalk/opensmalltalk-vm.git Date: Wed Feb 6 15:51:18 2019 CommitHash: a838346b Plugins: 201902062351 https://github.com/OpenSmalltalk/opensmalltalk-vm.git
Build host: Linux travis-job-f2b22483-7f84-414f-b833-69f69518c685 4.4.0-101-generic #124~14.04.1-Ubuntu SMP Fri Nov 10 19:05:36 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
plugin path: /home/alistair/pharo8/gccrash/vmga/pharo-vm/lib/pharo/5.0-201902062351 [default: /home/alistair/pharo8/gccrash/vmga/pharo-vm/lib/pharo/5.0-201902062351/]


C stack backtrace & registers:
	rax 0x00000000 rbx 0x00000011 rcx 0x00000033 rdx 0x00000000
	rdi 0x00000011 rsi 0x00000011 rbp 0x09928f00 rsp 0x7ffe8478f9a0
	r8  0x000d131c r9  0x006898e7 r10 0x7ffe8478fa40 r11 0x5dce5052
	r12 0x099b0a08 r13 0x80000000000000 r14 0x02cefb70 r15 0x0993b688
	rip 0x0043d9f0
*/home/alistair/pharo8/gccrash/vmga/pharo-vm/lib/pharo/5.0-201902062351/pharo[0x43d9f0]
/home/alistair/pharo8/gccrash/vmga/pharo-vm/lib/pharo/5.0-201902062351/pharo[0x41b0b3]
/home/alistair/pharo8/gccrash/vmga/pharo-vm/lib/pharo/5.0-201902062351/pharo[0x41cb0e]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f2b7ee8a890]
/home/alistair/pharo8/gccrash/vmga/pharo-vm/lib/pharo/5.0-201902062351/pharo[0x43d9f0]
/home/alistair/pharo8/gccrash/vmga/pharo-vm/lib/pharo/5.0-201902062351/pharo(markAndTrace+0x141)[0x43f3b1]
/home/alistair/pharo8/gccrash/vmga/pharo-vm/lib/pharo/5.0-201902062351/pharo[0x452fe4]
/home/alistair/pharo8/gccrash/vmga/pharo-vm/lib/pharo/5.0-201902062351/pharo(fullGC+0x46)[0x457366]
/home/alistair/pharo8/gccrash/vmga/pharo-vm/lib/pharo/5.0-201902062351/pharo[0x457a50]
/home/alistair/pharo8/gccrash/vmga/pharo-vm/lib/pharo/5.0-201902062351/pharo[0x457e33]
/home/alistair/pharo8/gccrash/vmga/pharo-vm/lib/pharo/5.0-201902062351/pharo(interpret+0xa1f5)[0x467625]
/home/alistair/pharo8/gccrash/vmga/pharo-vm/lib/pharo/5.0-201902062351/pharo[0x468866]
/home/alistair/pharo8/gccrash/vmga/pharo-vm/lib/pharo/5.0-201902062351/pharo(interpret+0x246)[0x45d676]
/home/alistair/pharo8/gccrash/vmga/pharo-vm/lib/pharo/5.0-201902062351/pharo(main+0x2fa)[0x41a59a]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f2b7eaa8b97]
/home/alistair/pharo8/gccrash/vmga/pharo-vm/lib/pharo/5.0-201902062351/pharo[0x41a8c4]
[0x0]


Smalltalk stack dump:
    0x7ffe847d3b20 I SessionManager>launchSnapshot:andQuit: 0x2d744f8: a(n) SessionManager
    0x7ffe847d3b90 I [] in SessionManager>snapshot:andQuit: 0x2d744f8: a(n) SessionManager
    0x7ffe847d3bd0 I [] in INVALID RECEIVER>newProcess 0x2a39178 is in new space

Most recent primitives
snip...
wait
snapshotPrimitive
**IncrementalGC**
**FullGC**

stack page bytes 8192 available headroom 5576 minimum unused headroom 4672

	(Segmentation fault)
vmga/pharo-ui: line 11:   322 Aborted                 (core dumped) "$DIR"/"pharo-vm/pharo" "$@"

David T Lewis

Re: [OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

In reply to this post by David T Lewis

Running with the assert VM in lldb:

(lldb) run Pharo.image gccrash.st
Process 251 launched: '/home/alistair/pharo8/gccrash/vm/lib/pharo/5.0-201911140223/pharo' (x86_64)

((classIndex >= 1) && (classIndex <= (classTablePageSize()))) 50170

((classIndex >= 1) && (classIndex <= (classTablePageSize()))) 50170
Process 251 stopped
* thread #1, name = 'pharo', stop reason = signal SIGIO
    frame #0: 0x00007ffff6a2403f libc.so.6`__select + 95
libc.so.6`__select:
->  0x7ffff6a2403f <+95>:  cmpq   $-0x1000, %rax            ; imm = 0xF000 
    0x7ffff6a24045 <+101>: ja     0x7ffff6a2407a            ; <+154>
    0x7ffff6a24047 <+103>: movl   %r9d, %edi
    0x7ffff6a2404a <+106>: movl   %eax, 0xc(%rsp)
(lldb) bt
* thread #1, name = 'pharo', stop reason = signal SIGIO
  * frame #0: 0x00007ffff6a2403f libc.so.6`__select + 95
    frame #1: 0x00000000004a91ce pharo`aioPoll(microSeconds=50000) at aio.c:316
    frame #2: 0x00007ffff358d6de vm-display-X11.so`display_ioRelinquishProcessorForMicroseconds(microSeconds=<unavailable>) at sqUnixX11.c:4943
    frame #3: 0x0000000000419d3e pharo`ioRelinquishProcessorForMicroseconds(us=<unavailable>) at sqUnixMain.c:588
    frame #4: 0x0000000000468de6 pharo`primitiveRelinquishProcessor at gcc3x-cointerp.c:34635
    frame #5: 0x00000000004235f9 pharo`interpret at gcc3x-cointerp.c:6203
    frame #6: 0x000000000042b5f1 pharo`enterSmalltalkExecutiveImplementation at gcc3x-cointerp.c:16294
    frame #7: 0x000000000041d6d1 pharo`interpret at gcc3x-cointerp.c:2772
    frame #8: 0x000000000041bae6 pharo`main(argc=3, argv=0x00007fffffffe288, envp=<unavailable>) at sqUnixMain.c:2150
    frame #9: 0x00007ffff692eb97 libc.so.6`__libc_start_main + 231
    frame #10: 0x00000000004197ca pharo`_start + 42

Thanks, Eliot!

David T Lewis

Re: [OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

In reply to this post by David T Lewis

Hi Alistair,

To get lldb to break when it reports an assert failure so that you can investigate, put a breakpoint on warning:

lldb> b Pharo`warning

David T Lewis

Re: [OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

In reply to this post by David T Lewis

Hi All, to simplify checking there is now a generated image checker. This is a cut-down VM that only loads an image and runs the leak checker, answering 0 (unix's OK exit code) if the image is free of leaks, and non-zero if it is leaky. The program takes a -verbose/--verbose argument that will cause it to list the leaks or write a reassuring message if there are none.

This can be built for mac in build.macos64x64/squeak.stack.spur & build.macos32x86/squeak.stack.spur by saying make production (image leak checker) and it produces a program called validImage in squeak.stack.spur/build/vm. I saw that it took 2 seconds to load and check a 1Gb image so it should be fast enough to be used in a CI context.

See f83bde2

HTH

David T. Lewis

Re: [OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

Bravo, this is a really good idea :-)

Dave

On Sat, Nov 16, 2019 at 04:47:01PM -0800, Eliot Miranda wrote:

>
> Hi All, to simplify checking there is now a generated image checker. This is a cut-down VM that only loads an image and runs the leak checker, answering 0 (unix's OK exit code) if the image is free of leaks, and non-zero if it is leaky. The program takes a -verbose/--verbose argument that will cause it to list the leaks or write a reassuring message if there are none.
>
> This can be built for mac in build.macos64x64/squeak.stack.spur & build.macos32x86/squeak.stack.spur by saying make production (image leak checker) and it produces a program called validImage in squeak.stack.spur/build/vm. I saw that it took 2 seconds to load and check a 1Gb image so it should be fast enough to be used in a CI context.
>
> See https://github.com/OpenSmalltalk/opensmalltalk-vm/commit/f83bde2bf5c325ce26f3368bc221578a752a9631
>
> HTH
>
> --
> You are receiving this because you commented.
> Reply to this email directly or view it on GitHub:
> https://github.com/OpenSmalltalk/opensmalltalk-vm/issues/444#issuecomment-554689372

David T Lewis

Re: [OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

In reply to this post by David T Lewis

Hi Eliot,

I've spent some more time trying to track this down...

I've been working with a minimal pharo image, which can be downloaded from:

http://files.pharo.org/image/80/latest-minimal-64.zip

The minimal image doesn't have FreeType loaded yet, so we can rule out FreeType as the cause of this particular issue (which is not to say that FreeType doesn't have problems).

Running the following script with the minimal image and a debug VM:

| aJson anArray |
aJson := ZnEasy get: 'https://data.nasa.gov/resource/y77d-th95.json' asZnUrl.
Array streamContents: [ :aStream |
	1 to: 400 do: [ :i |
		Stdio stdout
			<< 'ExternalAddress: ';
			print: ExternalAddress instanceCount; lf;
			<< 'Start';
			lf; flush.
		aStream nextPutAll: (STON fromString: aJson contents).
		Smalltalk garbageCollect.
		Stdio stdout
			<< 'Loop: ';
			print: i;
			lf; flush ] ].

This shows two things:

There are no instances of ExternalAddress, so the chance of this being FFI related seems quite small.
The script runs to completion.

So there appears to be no memory corruption up to this stage.

Modifying the script once more to save the image instead of just garbage collecting:

| aJson anArray |
aJson := ZnEasy get: 'https://data.nasa.gov/resource/y77d-th95.json' asZnUrl.
Array streamContents: [ :aStream |
	1 to: 400 do: [ :i |
		Stdio stdout
			<< 'ExternalAddress: ';
			print: ExternalAddress instanceCount; lf;
			<< 'Start';
			lf; flush.
		aStream nextPutAll: (STON fromString: aJson contents).
		Smalltalk saveImageInFileNamed: 'Save.', i asString, '.image'.
		Stdio stdout
			<< 'Loop: ';
			print: i;
			lf; flush ] ].

Results in the segmentation fault. In this case it was while saving the 90th image:

$ ./validImage Save.88.image 
$ ./validImage Save.89.image 
Segmentation fault (core dumped)

Save.90.image doesn't exist, but Save.90.changes does.
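
Checking the saved images one by one can be automated. The sketch below relies only on what was stated earlier in the thread, namely that validImage exits with 0 for a leak-free image and with a non-zero status otherwise. The ./validImage path, the function names and the way the checker is passed in as a parameter are assumptions made for illustration.

```python
import subprocess


def run_valid_image(image_path, checker="./validImage"):
    """Return True if the checker exits with status 0 (image free of leaks)."""
    result = subprocess.run(
        [checker, image_path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def first_bad_image(image_paths, is_valid=run_valid_image):
    """Return the first image that fails the check, or None if all pass."""
    for path in image_paths:
        if not is_valid(path):
            return path
    return None
```

A call such as first_bad_image(f"Save.{i}.image" for i in range(1, 90)) would walk the images written by the script above in order. On POSIX systems a checker that is killed by a signal shows up in subprocess as a negative return code, so a crashing checker is also reported as a failing image.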

I'll try attaching a file containing the terminal output (as much as was buffered).

Please let me know if you disagree with any of my reasoning.

The only difference between the two scripts is that the second one writes the image to disk, which seems to suggest that it's the image saving that could be the cause of the issue.

What do you think?

Thanks!
Alistair

David T Lewis

Re: [OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

In reply to this post by David T Lewis

To paraphrase GitHub's error message: We don't support uploading text files, try again with TXT.

A short extract:

(isMarked(oop)) || (obj == (hiddenRootsObject())) 60126

isPostMobile(fwd) 60129

isMarked(oop) 60112

isPostMobile(fwd) 60115

(isMarked(oop)) || (obj == (hiddenRootsObject())) 60126

isPostMobile(fwd) 60129
Loop: 89
ExternalAddress: 0
Start

objCouldBeClassObj(classObj) 51687

objCouldBeClassObj(classObj) 51687

objCouldBeClassObj(classObj) 51687

objCouldBeClassObj(classObj) 51687

objCouldBeClassObj(classObj) 51687

objCouldBeClassObj(classObj) 51687

objCouldBeClassObj(classObj) 51687

objCouldBeClassObj(classObj) 51687

objCouldBeClassObj(classObj) 51687

objCouldBeClassObj(classObj) 51687

objCouldBeClassObj(classObj) 51687

objCouldBeClassObj(classObj) 51687

objCouldBeClassObj(classObj) 51687

objCouldBeClassObj(classObj) 51687

objCouldBeClassObj(classObj) 51687

Segmentation fault Mon Nov 18 13:20:09 2019


/dev/shm/minimal/vmdbg/lib/pharo/5.0-201911160620/pharo
Pharo VM version: 5.0-201911160620  Mon Nov 18 11:31:44 CET 2019 clang [Debug Spur 64-bit VM]
Built from: CoInterpreter VMMaker.oscog-eem.2585 uuid: 5282e96d-1d2e-4039-a905-429834c37da2 Nov 18 2019
With: StackToRegisterMappingCogit VMMaker.oscog-eem.2585 uuid: 5282e96d-1d2e-4039-a905-429834c37da2 Nov 18 2019
Revision: VM: 201911160620 alistair@alistair-Precision-3541:opensmalltalk-vm/opensmalltalk-vm Date: Fri Nov 15 22:20:29 2019 CommitHash: 21112274a Plugins: 201911160620 alistair@alistair-Precision-3541:opensmalltalk-vm/opensmalltalk-vm
Build host: Linux 2273a8715b2d 5.0.0-36-generic #39~18.04.1-Ubuntu SMP Tue Nov 12 11:09:50 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
plugin path: vmdbg/lib/pharo/5.0-201911160620 [default: /dev/shm/minimal/vmdbg/lib/pharo/5.0-201911160620/]


C stack backtrace & registers:
	rax 0x00000030 rbx 0x00000000 rcx 0x00000270 rdx 0x034c8810
	rdi 0x00000033 rsi 0x00000000 rbp 0x7ffc9e69e070 rsp 0x7ffc9e69de80
	r8  0x00000000 r9  0x00000022 r10 0xffffffde r11 0x00000000
	r12 0x004198a0 r13 0x7ffc9e708c80 r14 0x00000000 r15 0x00000000
	rip 0x0048762f
*vmdbg/lib/pharo/5.0-201911160620/pharo(markAndTrace+0x87f)[0x48762f]
vmdbg/lib/pharo/5.0-201911160620/pharo[0x41afa4]
vmdbg/lib/pharo/5.0-201911160620/pharo[0x41e0d8]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f90fa38c890]
vmdbg/lib/pharo/5.0-201911160620/pharo(markAndTrace+0x87f)[0x48762f]
vmdbg/lib/pharo/5.0-201911160620/pharo[0x481651]
vmdbg/lib/pharo/5.0-201911160620/pharo(fullGC+0xea)[0x4805fa]
vmdbg/lib/pharo/5.0-201911160620/pharo[0x4c9f65]
vmdbg/lib/pharo/5.0-201911160620/pharo[0x4add41]
vmdbg/lib/pharo/5.0-201911160620/pharo(interpret+0xa661)[0x42a6d1]
vmdbg/lib/pharo/5.0-201911160620/pharo[0x43e032]
vmdbg/lib/pharo/5.0-201911160620/pharo(interpret+0x140)[0x4201b0]
vmdbg/lib/pharo/5.0-201911160620/pharo(main+0x2d6)[0x41d8d6]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f90f9faab97]
vmdbg/lib/pharo/5.0-201911160620/pharo(_start+0x2a)[0x4198ca]
[0x0]

David T Lewis

Re: [OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

In reply to this post by David T Lewis

Just a bit more:

$ /dev/shm/minimal/validImage --verbose Save.88.image 
Image Save.88.image is free of leaks

$ /dev/shm/minimal/validImage --verbose Save.89.image | wc -l
3574

$ /dev/shm/minimal/validImage --verbose Save.89.image | head
object leak in          0x5363788 @ 5 =          0x53e34e8
object leak in          0x5363788 @ 9 =          0x53e3538
object leak in          0x5363788 @ 10 =          0x53e3560
object leak in          0x5363788 @ 11 =          0x53e3580
object leak in          0x5363788 @ 16 =          0x53e35c0
object leak in          0x5363788 @ 17 =          0x53e35f0
object leak in          0x5363788 @ 21 =          0x53e3610
object leak in          0x5363848 @ 8 =          0x53e3680
object leak in          0x5363848 @ 10 =          0x53e36d0
object leak in          0x5363848 @ 12 =          0x53e3710
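
With several thousand lines of output it helps to group the leaks by the object that holds them. The following is a small sketch that does so. It assumes each line has the shape "object leak in ADDRESS @ INDEX = VALUE" as in the extract above, and it reads the first address as the referencing object, which is an interpretation of the output rather than something the thread states. The function name is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
#   object leak in          0x5363788 @ 5 =          0x53e34e8
LEAK_LINE = re.compile(
    r"object leak in\s+(0x[0-9a-fA-F]+)\s+@\s+(\d+)\s+=\s+(0x[0-9a-fA-F]+)"
)


def leaks_per_object(lines):
    """Count reported leaks per referencing object address."""
    counts = Counter()
    for line in lines:
        match = LEAK_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Applied to the ten lines shown above, this counts 7 leaks for 0x5363788 and 3 for 0x5363848. Lines that do not match the pattern are ignored.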

David T Lewis

Re: [OpenSmalltalk/opensmalltalk-vm] Reproduceable Segmentation fault while saving images (#444)

In reply to this post by David T Lewis

P.P.S. It would be great to be able to call validateImage() from within the debugger.
