[bug] async signal queue can miss events


Holger Hans Peter Freyther-3
Issue status update for
http://smalltalk.gnu.org/node/651
Post a follow up:
http://smalltalk.gnu.org/project/comments/add/651

 Project:      GNU Smalltalk
 Version:      <none>
 Component:    VM
 Category:     bug reports
 Priority:     normal
 Assigned to:  Unassigned
 Reported by:  zecke
 Updated by:   zecke
 Status:       active

I had this once on a deployed image but can more easily reproduce it
right now.

The symptom:
A Process is stuck reading from a socket while data is available.

$ netstat -np | grep gst
tcp       20      0 127.0.0.1:48944         127.0.0.1:3002          ESTABLISHED 21642/gst

It happens with this code:

Eval [
   1 to: 100 do:[:each | | socket |
       socket := Sockets.StreamSocket
           remote: 'localhost' port: 3002.
       socket next.
       socket close.
   ].
]

It occurs with 3.2.4+ from the stable-3.2 branch.



_______________________________________________
help-smalltalk mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-smalltalk

Re: [bug] async signal queue can miss events

Holger Hans Peter Freyther-3
Issue status update for
http://smalltalk.gnu.org/project/issue/651
Post a follow up:
http://smalltalk.gnu.org/project/comments/add/651

 Project:      GNU Smalltalk
 Version:      <none>
 Component:    VM
 Category:     bug reports
 Priority:     normal
 Assigned to:  Unassigned
 Reported by:  zecke
 Updated by:   zecke
 Status:       active

The 23:00 analysis... and hypothesis...

_gst_async_file_polling
1.) check for polling...
                             2.) new data arrives
3.) re-alloc
4.) set_file_interrupt...

It has been some time, but SIGIO/fasync is just edge-triggered, isn't it?
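
The hypothesis can be sketched as a small model. This is only an illustration of why an edge-triggered notification is lost when the descriptor is polled first and armed afterwards. The struct, the enum and both functions are hypothetical names made up for this sketch; none of them exist in libgst, and the real re-alloc step is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one descriptor; not a libgst structure. */
struct model_fd {
  bool readable;   /* level state: data is waiting to be read */
  bool armed;      /* edge notification (the O_ASYNC analogue) is on */
  int  signals;    /* SIGIO-like notifications delivered so far */
};

/* When the data shows up relative to the poll and the arming. */
enum arrival { BEFORE_POLL, IN_GAP, AFTER_ARM };

/* Data arrives.  An edge-triggered source notifies only if it is armed
   at the moment of the transition; the edge is never reported later. */
void data_arrives(struct model_fd *f)
{
  assert(f != NULL);
  f->readable = true;
  if (f->armed)
    f->signals++;
}

/* Poll first, arm afterwards (the order in the hypothesis).
   Returns true if the waiter ever learns that data is available. */
bool waiter_notices(enum arrival when)
{
  struct model_fd f = { false, false, 0 };
  bool seen_by_poll;

  if (when == BEFORE_POLL)
    data_arrives(&f);
  seen_by_poll = f.readable;      /* 1.) check for polling */
  if (when == IN_GAP)
    data_arrives(&f);             /* 2.) new data arrives, nobody armed */
  f.armed = true;                 /* 4.) arm the notification */
  if (when == AFTER_ARM)
    data_arrives(&f);

  return seen_by_poll || f.signals > 0;
}
```

In this model only the IN_GAP case goes unnoticed: the poll ran too early to see the data and the notification was armed too late to see the edge, which would match a process stuck in a read while netstat shows bytes in the receive queue.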

Strace with success:
socket(PF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_TCP) = 4
fcntl64(4, F_GETFL)                     = 0x2 (flags O_RDWR)
fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK)  = 0
connect(4, {sa_family=AF_INET, sin_port=htons(3002),
sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in
progress)
poll([{fd=4, events=POLLOUT}], 1, 0)    = 1 ([{fd=4, revents=POLLOUT}])
poll([{fd=4, events=POLLOUT}], 1, 0)    = 1 ([{fd=4, revents=POLLOUT}])
getpeername(4, {sa_family=AF_INET, sin_port=htons(3002),
sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
poll([{fd=4, events=POLLIN}], 1, 0)     = 1 ([{fd=4, revents=POLLIN}])
poll([{fd=4, events=POLLIN}], 1, 0)     = 1 ([{fd=4, revents=POLLIN}])
poll([{fd=4, events=POLLIN}], 1, 0)     = 1 ([{fd=4, revents=POLLIN}])
poll([{fd=4, events=POLLIN}], 1, 0)     = 1 ([{fd=4, revents=POLLIN}])
recvfrom(4, "\0\21\376\4\1\10\1\7\1\2\1\3\1\4\1\5\1\1\1\0", 1024, 0,
NULL, NULL) = 20
close(4)

Strace with failure:
socket(PF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_TCP) = 4
fcntl64(4, F_GETFL)                     = 0x2 (flags O_RDWR)
fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK)  = 0
connect(4, {sa_family=AF_INET, sin_port=htons(3002),
sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in
progress)
poll([{fd=4, events=POLLOUT}], 1, 0)    = 1 ([{fd=4, revents=POLLOUT}])
poll([{fd=4, events=POLLOUT}], 1, 0)    = 1 ([{fd=4, revents=POLLOUT}])
getpeername(4, {sa_family=AF_INET, sin_port=htons(3002),
sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
poll([{fd=4, events=POLLIN}], 1, 0)     = 0 (Timeout)
fcntl64(4, F_GETFL)                     = 0x802 (flags
O_RDWR|O_NONBLOCK)
fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK|O_ASYNC) = 0
fcntl64(4, F_SETOWN, 22087)             = 0




Re: [bug] async signal queue can miss events

Paolo Bonzini-2
On 28/08/2012 23:10, Holger Hans Peter Freyther wrote:

> Issue status update for http://smalltalk.gnu.org/project/issue/651
> Post a follow up: http://smalltalk.gnu.org/project/comments/add/651
>
> Project:      GNU Smalltalk
> Version:      <none>
> Component:    VM
> Category:     bug reports
> Priority:     normal
> Assigned to:  Unassigned
> Reported by:  zecke
> Updated by:   zecke
> Status:       active
>
> The 23:00 analysis... and hypothesis...
>
> _gst_async_file_polling
> 1.) check for polling...
>                             2.) new data arrives
> 3.) re-alloc
> 4.) set_file_interrupt...
>
> It has been some time, but SIGIO/fasync is just edge-triggered, isn't it?

Yes.  Does this fix it?

diff --git a/libgst/sysdep/posix/events.c b/libgst/sysdep/posix/events.c
index da3a784..8c864ec 100644
--- a/libgst/sysdep/posix/events.c
+++ b/libgst/sysdep/posix/events.c
@@ -395,6 +395,8 @@ _gst_async_file_polling (int fd,
   polling_queue *new;

   index = num_used_pollfds++;
+  set_file_interrupt (fd, file_polling_handler);
+
   result = _gst_sync_file_polling (fd, cond);
   if (result != 0)
     {
@@ -431,8 +433,6 @@ _gst_async_file_polling (int fd,
     }
   pollfds[index].revents = 0;

-  set_file_interrupt (fd, file_polling_handler);
-
   /* Even if I/O was made possible while setting up our machinery,
      the list will only be walked before the next bytecode, so there
      is no race.  We incremented num_used_pollfds very early so that

Paolo


Re: [bug] async signal queue can miss events

Holger Freyther
On Wed, Aug 29, 2012 at 01:06:30PM +0200, Paolo Bonzini wrote:
> On 28/08/2012 23:10, Holger Hans Peter Freyther wrote:
> > It has been some time, but SIGIO/fasync is just edge-triggered, isn't it?
>
> Yes.  Does this fix it?

It does fix it. Without the patch it will still hang, with the patch
I had thousands of connects without a hang.

So the patch is doing:
        1.) Enable ASYNC handling
        2.) Poll to check if there is already data..
        3.) Possibly register the 'fd'

What can happen:
SIGIO between set_file_interrupt and _gst_sync_file_polling. In that case
we have an extra wakeup and will find no fd, as _gst_sync_file_polling
will already have returned 1.

SIGIO after _gst_sync_file_polling and before the registration. The check
for fds will occur after this code has registered the fd, so nothing bad
will happen.

It all sounds good to me. The potential cost is one extra poll in some
cases.
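
For illustration, the arm-first ordering can be sketched with plain POSIX calls. This is a minimal sketch and not the libgst code: arm_then_poll and on_sigio are hypothetical names, the handler only counts signals, and the registration of the fd in a polling list is left to the caller. It assumes a platform where O_ASYNC and F_SETOWN are available, as in the strace output from Linux earlier in the thread.

```c
#define _GNU_SOURCE          /* exposes O_ASYNC on glibc in strict modes */
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical counter; a real VM would mark the fd as ready instead. */
volatile sig_atomic_t sigio_count;

void on_sigio(int signo)
{
  (void) signo;
  sigio_count++;
}

/* 1.) arm SIGIO delivery, 2.) poll once with a zero timeout.
   Returns 1 if data is already readable, 0 if the caller should
   register the fd and wait for SIGIO, -1 on error. */
int arm_then_poll(int fd)
{
  struct sigaction sa;
  struct pollfd pfd;
  int flags, n;

  assert(fd >= 0);

  memset(&sa, 0, sizeof sa);
  sa.sa_handler = on_sigio;
  sa.sa_flags = SA_RESTART;
  sigemptyset(&sa.sa_mask);
  if (sigaction(SIGIO, &sa, NULL) < 0)
    return -1;

  /* Set the owner before enabling O_ASYNC so no signal goes astray. */
  if (fcntl(fd, F_SETOWN, getpid()) < 0)
    return -1;
  flags = fcntl(fd, F_GETFL);
  if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK | O_ASYNC) < 0)
    return -1;

  /* Data that arrived before arming is caught here (level check);
     data that arrives from now on raises SIGIO (edge). */
  pfd.fd = fd;
  pfd.events = POLLIN;
  pfd.revents = 0;
  n = poll(&pfd, 1, 0);
  if (n < 0)
    return -1;
  return (n > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

Called on one end of a socketpair, this would return 0 while nothing has been written to the other end and 1 once a byte is waiting, at the price of the one extra poll mentioned above.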

