Obscure hang

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Obscure hang

Bert Freudenberg-3
Hi folks,

I've had some rather obscure hangs on my squeak-source server lately.  
The funny thing is, as soon as I connect to the Xvnc server and move  
the mouse, things start rolling again. Otherwise, it takes a couple  
of minutes until serving continues. I have no idea why.

Here are the file descriptors for reference. #3 is the X server  
(Xvnc) connection:

squeaksou 30574 source    0r   CHR        1,3             31583 /dev/
null
squeaksou 30574 source    1w   REG        3,5    32895 16482319 .../
some.log
squeaksou 30574 source    2w   REG        3,5    32895 16482319 .../
some.log
squeaksou 30574 source    3u  unix 0xf2119ac0           3367464 socket
squeaksou 30574 source    4r   REG        3,5 23206828 12402706 .../
some.image
squeaksou 30574 source    5r   DIR        3,5     4096 12402698 .../
somedir
squeaksou 30574 source    6r   REG        3,5 14542313 17317896 .../
SqueakV3.sources
squeaksou 30574 source    7u   REG        3,5 20230368 12402705 .../
some.changes
squeaksou 30574 source    8u  IPv4    4290750               TCP  
*:someport (LISTEN)

Unfortunately I took this only after the hang. Socket #9 must be a  
file that was interrupted while writing, 10 and 11 is an HTTP  
connection pair.

There were already 3581255 bytes of 4457385 bytes written to file #9  
at this point.

=== According to strace the VM was chugging along just fine:

gettimeofday({1136823729, 265939}, NULL) = 0
gettimeofday({1136823729, 265985}, NULL) = 0
gettimeofday({1136823729, 266050}, NULL) = 0
gettimeofday({1136823729, 266092}, NULL) = 0
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn()                             = ? (mask now [])
gettimeofday({1136823729, 277789}, NULL) = 0
gettimeofday({1136823729, 277819}, NULL) = 0
gettimeofday({1136823729, 277856}, NULL) = 0
gettimeofday({1136823729, 277877}, NULL) = 0
gettimeofday({1136823729, 277910}, NULL) = 0
gettimeofday({1136823729, 277939}, NULL) = 0
ioctl(3, FIONREAD, [0])                 = 0
gettimeofday({1136823729, 278015}, NULL) = 0
select(9, [3], [], [3 8], {0, 0})       = 0 (Timeout)
gettimeofday({1136823729, 278212}, NULL) = 0
gettimeofday({1136823729, 289711}, NULL) = 0
gettimeofday({1136823729, 289751}, NULL) = 0
gettimeofday({1136823729, 289813}, NULL) = 0
gettimeofday({1136823729, 289854}, NULL) = 0
...

=== but nothing was served to sockets 10/11 nor written to file #9.  
Now some X data arrives on socket #3

...
gettimeofday({1136823748, 170924}, NULL) = 0
gettimeofday({1136823748, 170956}, NULL) = 0
gettimeofday({1136823748, 170984}, NULL) = 0
ioctl(3, FIONREAD, [32])                = 0
read(3, "\7\1W\204A\306\371\257%\0\0\0\1\0\200\0\2\0\200\0\217
\003"..., 32) = 32
ioctl(3, FIONREAD, [32])                = 0
read(3, "\6\0W\204B\306\371\257%\0\0\0\2\0\200\0\0\0\0\0\211\003"...,  
32) = 32
gettimeofday({1136823748, 171177}, NULL) = 0
ioctl(3, FIONREAD, [0])                 = 0
gettimeofday({1136823748, 171221}, NULL) = 0
select(9, [3], [], [3 8], {0, 0})       = 0 (Timeout)
...

==== and 5 seconds later it answers the (pending?) request on sockets  
10/11

ioctl(3, FIONREAD, [0])                 = 0
gettimeofday({1136823753, 233667}, NULL) = 0
select(9, [3], [], [3 8], {0, 0})       = 0 (Timeout)
gettimeofday({1136823753, 234002}, NULL) = 0
gettimeofday({1136823753, 234042}, NULL) = 0
gettimeofday({1136823753, 234070}, NULL) = 0
gettimeofday({1136823753, 234110}, NULL) = 0
ioctl(10, FIONBIO, [1])                 = 0
gettimeofday({1136823753, 234306}, NULL) = 0
gettimeofday({1136823753, 234330}, NULL) = 0
gettimeofday({1136823753, 234352}, NULL) = 0
gettimeofday({1136823753, 234377}, NULL) = 0
gettimeofday({1136823753, 234410}, NULL) = 0
gettimeofday({1136823753, 234431}, NULL) = 0
gettimeofday({1136823753, 234456}, NULL) = 0
getpeername(10, {sa_family=AF_INET, sin_port=htons(44431),  
sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
time(NULL)                              = 1136823753
gettimeofday({1136823753, 234716}, NULL) = 0
gettimeofday({1136823753, 234738}, NULL) = 0
recv(10, "G", 1, MSG_PEEK)              = 1
read(10, "GET /3d/?C=M;O=D HTTP/1.1\r\nHost:"..., 2000) = 266
time(NULL)                              = 1136823753
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn()                             = ? (mask now [])
gettimeofday({1136823753, 239059}, NULL) = 0
gettimeofday({1136823753, 239600}, NULL) = 0
gettimeofday({1136823753, 239624}, NULL) = 0
time(NULL)                              = 1136823753
gettimeofday({1136823753, 243498}, NULL) = 0
select(11, NULL, [10], NULL, {0, 0})    = 1 (out [10], left {0, 0})
write(10, "HTTP/1.1 200 OK\r\nKeep-Alive: tim"..., 190) = 190
gettimeofday({1136823753, 243785}, NULL) = 0
select(11, NULL, [10], NULL, {0, 0})    = 1 (out [10], left {0, 0})
write(10, "<?xml version=\"1.0\" encoding=\"is"..., 3811) = -1 EPIPE  
(Broken pipe)
--- SIGPIPE (Broken pipe) @ 0 (0) ---


=== immediately followed by writing to file #9

gettimeofday({1136823753, 244020}, NULL) = 0
getpeername(10, 0xbffff9d0, [16])       = -1 ENOTCONN (Transport  
endpoint is not connected)
time(NULL)                              = 1136823753
setsockopt(10, SOL_SOCKET, SO_LINGER, {onoff=0, linger=0}, 8) = 0
close(10)                               = 0
write(9, "\10", 1)                      = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\1", 1)                       = 1
write(9, "\t", 1)                       = 1
_llseek(9, 3579904, [3579904], SEEK_SET) = 0
read(9, "ubclasses.\20\3\37\312\20\5\37\325\4\0\0\0\0\n\0\34\315"...,  
1356) = 1356
read(9, "\t", 4096)                     = 1
_llseek(9, 3581261, [3581261], SEEK_SET) = 0
_llseek(9, 3581261, [3581261], SEEK_SET) = 0
_llseek(9, -1, [3581260], SEEK_CUR)     = 0
write(9, "\20", 1)                      = 1
write(9, "\t", 1)                       = 1
write(9, "\37", 1)                      = 1
write(9, "\3", 1)                       = 1

=== 11 seconds later the saving was done (yes storing is that slow):

close(9)                                = 0
munmap(0x40188000, 4096)                = 0
gettimeofday({1136823764, 800773}, NULL) = 0

=== ... and everything is back to normal ...

All this is on Linux, 2.4 kernel, Ian's 3.7.7 VM.

Did anybody see something similar? Any ideas?

- Bert -


Reply | Threaded
Open this post in threaded view
|

Re: Obscure hang

johnmci
Bert have you thought about using that message sending patch we did  
to use signals to get method stack tracebacks?

On 9-Jan-06, at 9:43 AM, Bert Freudenberg wrote:

> Hi folks,
>
> I've had some rather obscure hangs on my squeak-source server  
> lately. The funny thing is, as soon as I connect to the Xvnc server  
> and move the mouse, things start rolling again. Otherwise, it takes  
> a couple of minutes until serving continues. I have no idea why.
>
========================================================================
===
John M. McIntosh <[hidden email]> 1-800-477-2659
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
========================================================================
===


Reply | Threaded
Open this post in threaded view
|

Re: Obscure hang

Bert Freudenberg-3
Yes, that would have been helpful, but I haven't built a VM with that  
patch, yet.

- Bert -

Am 09.01.2006 um 18:54 schrieb John M McIntosh:

> Bert have you thought about using that message sending patch we did  
> to use signals to get method stack tracebacks?
>
> On 9-Jan-06, at 9:43 AM, Bert Freudenberg wrote:
>
>> Hi folks,
>>
>> I've had some rather obscure hangs on my squeak-source server  
>> lately. The funny thing is, as soon as I connect to the Xvnc  
>> server and move the mouse, things start rolling again. Otherwise,  
>> it takes a couple of minutes until serving continues. I have no  
>> idea why.
>>
> ======================================================================
> =====
> John M. McIntosh <[hidden email]> 1-800-477-2659
> Corporate Smalltalk Consulting Ltd.  http://
> www.smalltalkconsulting.com
> ======================================================================
> =====