[OpenSmalltalk/opensmalltalk-vm] UnixProcess forkSqueak broken since October (#548)

David T Lewis
 

UnixProcess class>>forkSqueak is no longer working. The forked child VM process crashes with a segmentation fault. Testing with VMs from bintray shows that version 5.0-202009300634 works, and any version from 5.0-202010192227 onward fails. The stack dump sometimes (but not always) shows a failure in aioPoll(), for example:

*/usr/local/bin/../lib/squeak/5.0-202101160259/squeak(aioPoll+0x12e)[0x4bc0fe]

I am not able to catch the failure in gdb because it happens in the child process. My initial guess is that it may be related to the epoll enhancements added in this time frame, because forking the VM requires initializing things like this in the new child VM process.


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or unsubscribe.


Re: [OpenSmalltalk/opensmalltalk-vm] UnixProcess forkSqueak broken since October (#548)

Jan Vrany
 
On Sun, 2021-01-24 at 18:59 -0800, David T Lewis wrote:
>
>    */usr/local/bin/../lib/squeak/5.0-202101160259/squeak(aioPoll+0x12e)[0x4bc0fe]
>
> I am not able to catch the failure in gdb because it happens in the
> child process.

GDB can follow fork(), see

(gdb) help set follow-fork-mode
Set debugger response to a program call of fork or vfork.
A fork or vfork creates a new process.  follow-fork-mode can be:
  parent  - the original process is debugged after a fork
  child   - the new process is debugged after a fork
The unfollowed process will continue to run.
By default, the debugger will follow the parent process.

HTH, Jan


Re: [OpenSmalltalk/opensmalltalk-vm] UnixProcess forkSqueak broken since October (#548)

David T Lewis
In reply to this post by David T Lewis
 

Thank you Jan!



Re: [OpenSmalltalk/opensmalltalk-vm] UnixProcess forkSqueak broken since October (#548)

David T Lewis
In reply to this post by David T Lewis
 

The segfault happens in the child process that was forked by the forkSqueak primitive. It occurs in the new epoll code. I don't yet see the cause (there is no obvious null pointer issue), but the gdb backtrace is:

(gdb) bt
#0 0x0000000000dd39b0 in ?? ()
#1 0x00000000004d62dc in aioPoll (microSeconds=0) at /home/lewis/squeak/git/opensmalltalk-vm/platforms/unix/vm/aio.c:405
#2 0x00007fa016315e39 in display_ioProcessEvents () at /home/lewis/squeak/git/opensmalltalk-vm/platforms/unix/vm-display-X11/sqUnixX11.c:4867
#3 0x0000000000417ca3 in ioProcessEvents () at /home/lewis/squeak/git/opensmalltalk-vm/platforms/unix/vm/sqUnixMain.c:726
#4 0x0000000000441f58 in checkForEventsMayContextSwitch (mayContextSwitch=1) at /home/lewis/squeak/git/opensmalltalk-vm/spurstack64src/vm/gcc3x-interp.c:50306
#5 0x00000000004401ca in handleStackOverflowOrEventAllowContextSwitch (mayContextSwitch=1) at /home/lewis/squeak/git/opensmalltalk-vm/spurstack64src/vm/gcc3x-interp.c:53718
#6 0x0000000000426cbd in interpret () at /home/lewis/squeak/git/opensmalltalk-vm/spurstack64src/vm/gcc3x-interp.c:5844
#7 0x000000000043ab1f in enterSmalltalkExecutiveImplementation () at /home/lewis/squeak/git/opensmalltalk-vm/spurstack64src/vm/gcc3x-interp.c:51798
#8 0x000000000041d4ba in interpret () at /home/lewis/squeak/git/opensmalltalk-vm/spurstack64src/vm/gcc3x-interp.c:2493
#9 0x000000000041ad0a in main (argc=2, argv=0x7ffc3ba11f78, envp=0x7ffc3ba11f90) at /home/lewis/squeak/git/opensmalltalk-vm/platforms/unix/vm/sqUnixMain.c:2164



Re: [OpenSmalltalk/opensmalltalk-vm] UnixProcess forkSqueak broken since October (#548)

David T Lewis
In reply to this post by David T Lewis
 

The problem is that the file descriptors and structures are shared between parent and child after fork. However, after the fork, the epoll structures point to data that belongs to the parent. At line 405 the child process tries to access that data, and I think that causes the segfault.
The child should close the inherited epoll file descriptor and recreate it along with the necessary data structures. This can be done by a handler registered with pthread_atfork().
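
A minimal sketch of that suggestion, assuming Linux and epoll. The names epoll_fd, reset_epoll_in_child and install_fork_handler are hypothetical and are not taken from aio.c; this is not the code from any pull request, only an illustration of registering a child-side handler with pthread_atfork().

```c
#include <assert.h>
#include <pthread.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical module state; the VM's aio.c keeps its own equivalents. */
static int epoll_fd = -1;

/* Runs in the child right after fork(). The inherited descriptor refers
 * to the same kernel epoll instance as the parent's, so the child drops
 * it and creates a private instance with an empty interest list. */
static void reset_epoll_in_child(void)
{
    if (epoll_fd >= 0)
        close(epoll_fd);
    epoll_fd = epoll_create1(EPOLL_CLOEXEC);
    /* Sketch only: a real VM would report the error instead of aborting,
     * and would re-register the descriptors the child still needs. */
    assert(epoll_fd >= 0);
}

/* Creates the epoll instance and registers the child-side fork handler.
 * Returns 0 on success, -1 on failure. */
int install_fork_handler(void)
{
    epoll_fd = epoll_create1(EPOLL_CLOEXEC);
    if (epoll_fd < 0)
        return -1;
    /* No prepare or parent handlers are needed for this sketch. */
    if (pthread_atfork(NULL, NULL, reset_epoll_in_child) != 0)
        return -1;
    return 0;
}
```

Calling install_fork_handler() once at startup is enough; every later fork() in the process then runs reset_epoll_in_child() in the child before fork() returns there.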



Re: [OpenSmalltalk/opensmalltalk-vm] UnixProcess forkSqueak broken since October (#548)

David T Lewis
In reply to this post by David T Lewis
 

See it explained at
https://copyconstruct.medium.com/the-method-to-epolls-madness-d9d2d6378642

On Sat, 30 Jan 2021 at 23:38, smalltalking <[hidden email]> wrote:

> The problem is that the file descriptors and structures are shared between
> parent and child after fork. However, after the fork, the epoll structures
> point to data that belongs to the parent. At line 405 the child process
> tries to access that data, and I think that causes the segfault.
> The child should close the inherited epoll file descriptor and recreate it
> along with the necessary data structures. This can be done by a handler
> registered with pthread_atfork().



Re: [OpenSmalltalk/opensmalltalk-vm] UnixProcess forkSqueak broken since October (#548)

David T Lewis
In reply to this post by David T Lewis
 

Update: forkSqueak itself is working fine, but we get failures associated with aio handling for the socket connection to the X11 server. The child closes the socket and calls aioDisable for the socket fd to unregister it. When using epoll rather than generic aio event handling, this apparently affects the Linux kernel epoll registration for the socket fd (I am not sure I understand this correctly, but it appears to be the case). The result seems to be failures in the child VM process, the parent, or both. The problem goes away if I #ifdef out the call to aioDisable() in the forgetXDisplay() function. I am not sure whether this is a proper fix or just a workaround kludge, but it does work.
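
The effect can be reproduced outside the VM. The sketch below is not VM code, and child_delete_affects_parent is a hypothetical name. It assumes that unregistering an fd under epoll amounts to an EPOLL_CTL_DEL, and it relies on the Linux behaviour that a forked child's epoll descriptor refers to the same kernel epoll instance as the parent's.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the number of ready events the parent sees after a forked
 * child has deregistered the watched descriptor, or -1 on setup error. */
int child_delete_affects_parent(void)
{
    int pipefd[2];
    if (pipe(pipefd) != 0)
        return -1;

    int epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    /* Parent watches the read end of the pipe. */
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = pipefd[0] };
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: assumed equivalent of unregistering the fd under epoll. */
        int rc = epoll_ctl(epfd, EPOLL_CTL_DEL, pipefd[0], NULL);
        _exit(rc == 0 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    /* Parent: make the pipe readable, then poll without blocking. */
    if (write(pipefd[1], "x", 1) != 1)
        return -1;
    struct epoll_event out;
    int ready = epoll_wait(epfd, &out, 1, 0);
    assert(ready <= 1); /* maxevents is 1 */

    close(epfd);
    close(pipefd[0]);
    close(pipefd[1]);
    return ready;
}
```

On Linux the function returns 0: the pipe is readable, but the parent is never told, because the child's delete removed the registration from the shared epoll instance.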



Re: [OpenSmalltalk/opensmalltalk-vm] UnixProcess forkSqueak broken since October (#548)

David T Lewis
In reply to this post by David T Lewis
 

The workaround (fix?) for forkSqueak is in pull request #550



Re: [OpenSmalltalk/opensmalltalk-vm] UnixProcess forkSqueak broken since October (#548)

David T Lewis
In reply to this post by David T Lewis
 

I opened a different PR to address the issue as recommended above:
For epoll aio, close and reopen the epoll fd in forked child process #552

