Hi folks,
I've had some rather obscure hangs on my squeak-source server lately. The funny thing is, as soon as I connect to the Xvnc server and move the mouse, things start rolling again. Otherwise, it takes a couple of minutes until serving continues. I have no idea why. Here are the file descriptors for reference. #3 is the X server (Xvnc) connection: squeaksou 30574 source 0r CHR 1,3 31583 /dev/ null squeaksou 30574 source 1w REG 3,5 32895 16482319 .../ some.log squeaksou 30574 source 2w REG 3,5 32895 16482319 .../ some.log squeaksou 30574 source 3u unix 0xf2119ac0 3367464 socket squeaksou 30574 source 4r REG 3,5 23206828 12402706 .../ some.image squeaksou 30574 source 5r DIR 3,5 4096 12402698 .../ somedir squeaksou 30574 source 6r REG 3,5 14542313 17317896 .../ SqueakV3.sources squeaksou 30574 source 7u REG 3,5 20230368 12402705 .../ some.changes squeaksou 30574 source 8u IPv4 4290750 TCP *:someport (LISTEN) Unfortunately I took this only after the hang. Socket #9 must be a file that was interrupted while writing, 10 and 11 is an HTTP connection pair. There were already 3581255 bytes of 4457385 bytes written to file #9 at this point. === According to strace the VM was chugging along just fine: gettimeofday({1136823729, 265939}, NULL) = 0 gettimeofday({1136823729, 265985}, NULL) = 0 gettimeofday({1136823729, 266050}, NULL) = 0 gettimeofday({1136823729, 266092}, NULL) = 0 --- SIGALRM (Alarm clock) @ 0 (0) --- sigreturn() = ? (mask now []) gettimeofday({1136823729, 277789}, NULL) = 0 gettimeofday({1136823729, 277819}, NULL) = 0 gettimeofday({1136823729, 277856}, NULL) = 0 gettimeofday({1136823729, 277877}, NULL) = 0 gettimeofday({1136823729, 277910}, NULL) = 0 gettimeofday({1136823729, 277939}, NULL) = 0 ioctl(3, FIONREAD, [0]) = 0 gettimeofday({1136823729, 278015}, NULL) = 0 select(9, [3], [], [3 8], {0, 0}) = 0 (Timeout) gettimeofday({1136823729, 278212}, NULL) = 0 gettimeofday({1136823729, 289711}, NULL) = 0 gettimeofday({1136823729, 289751}, NULL) = 0 gettimeofday({1136823729, 289813}, NULL) = 0 gettimeofday({1136823729, 289854}, NULL) = 0 ... === but nothing was served to sockets 10/11 nor written to file #9. Now some X data arrives on socket #3 ... gettimeofday({1136823748, 170924}, NULL) = 0 gettimeofday({1136823748, 170956}, NULL) = 0 gettimeofday({1136823748, 170984}, NULL) = 0 ioctl(3, FIONREAD, [32]) = 0 read(3, "\7\1W\204A\306\371\257%\0\0\0\1\0\200\0\2\0\200\0\217 \003"..., 32) = 32 ioctl(3, FIONREAD, [32]) = 0 read(3, "\6\0W\204B\306\371\257%\0\0\0\2\0\200\0\0\0\0\0\211\003"..., 32) = 32 gettimeofday({1136823748, 171177}, NULL) = 0 ioctl(3, FIONREAD, [0]) = 0 gettimeofday({1136823748, 171221}, NULL) = 0 select(9, [3], [], [3 8], {0, 0}) = 0 (Timeout) ... ==== and 5 seconds later it answers the (pending?) request on sockets 10/11 ioctl(3, FIONREAD, [0]) = 0 gettimeofday({1136823753, 233667}, NULL) = 0 select(9, [3], [], [3 8], {0, 0}) = 0 (Timeout) gettimeofday({1136823753, 234002}, NULL) = 0 gettimeofday({1136823753, 234042}, NULL) = 0 gettimeofday({1136823753, 234070}, NULL) = 0 gettimeofday({1136823753, 234110}, NULL) = 0 ioctl(10, FIONBIO, [1]) = 0 gettimeofday({1136823753, 234306}, NULL) = 0 gettimeofday({1136823753, 234330}, NULL) = 0 gettimeofday({1136823753, 234352}, NULL) = 0 gettimeofday({1136823753, 234377}, NULL) = 0 gettimeofday({1136823753, 234410}, NULL) = 0 gettimeofday({1136823753, 234431}, NULL) = 0 gettimeofday({1136823753, 234456}, NULL) = 0 getpeername(10, {sa_family=AF_INET, sin_port=htons(44431), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0 time(NULL) = 1136823753 gettimeofday({1136823753, 234716}, NULL) = 0 gettimeofday({1136823753, 234738}, NULL) = 0 recv(10, "G", 1, MSG_PEEK) = 1 read(10, "GET /3d/?C=M;O=D HTTP/1.1\r\nHost:"..., 2000) = 266 time(NULL) = 1136823753 --- SIGALRM (Alarm clock) @ 0 (0) --- sigreturn() = ? (mask now []) gettimeofday({1136823753, 239059}, NULL) = 0 gettimeofday({1136823753, 239600}, NULL) = 0 gettimeofday({1136823753, 239624}, NULL) = 0 time(NULL) = 1136823753 gettimeofday({1136823753, 243498}, NULL) = 0 select(11, NULL, [10], NULL, {0, 0}) = 1 (out [10], left {0, 0}) write(10, "HTTP/1.1 200 OK\r\nKeep-Alive: tim"..., 190) = 190 gettimeofday({1136823753, 243785}, NULL) = 0 select(11, NULL, [10], NULL, {0, 0}) = 1 (out [10], left {0, 0}) write(10, "<?xml version=\"1.0\" encoding=\"is"..., 3811) = -1 EPIPE (Broken pipe) --- SIGPIPE (Broken pipe) @ 0 (0) --- === immediately followed by writing to file #9 gettimeofday({1136823753, 244020}, NULL) = 0 getpeername(10, 0xbffff9d0, [16]) = -1 ENOTCONN (Transport endpoint is not connected) time(NULL) = 1136823753 setsockopt(10, SOL_SOCKET, SO_LINGER, {onoff=0, linger=0}, 8) = 0 close(10) = 0 write(9, "\10", 1) = 1 write(9, "\0", 1) = 1 write(9, "\0", 1) = 1 write(9, "\0", 1) = 1 write(9, "\1", 1) = 1 write(9, "\t", 1) = 1 _llseek(9, 3579904, [3579904], SEEK_SET) = 0 read(9, "ubclasses.\20\3\37\312\20\5\37\325\4\0\0\0\0\n\0\34\315"..., 1356) = 1356 read(9, "\t", 4096) = 1 _llseek(9, 3581261, [3581261], SEEK_SET) = 0 _llseek(9, 3581261, [3581261], SEEK_SET) = 0 _llseek(9, -1, [3581260], SEEK_CUR) = 0 write(9, "\20", 1) = 1 write(9, "\t", 1) = 1 write(9, "\37", 1) = 1 write(9, "\3", 1) = 1 === 11 seconds later the saving was done (yes storing is that slow): close(9) = 0 munmap(0x40188000, 4096) = 0 gettimeofday({1136823764, 800773}, NULL) = 0 === ... and everything is back to normal ... All this is on Linux, 2.4 kernel, Ian's 3.7.7 VM. Did anybody see something similar? Any ideas? - Bert - |
Bert have you thought about using that message sending patch we did
to use signals to get method stack tracebacks? On 9-Jan-06, at 9:43 AM, Bert Freudenberg wrote: > Hi folks, > > I've had some rather obscure hangs on my squeak-source server > lately. The funny thing is, as soon as I connect to the Xvnc server > and move the mouse, things start rolling again. Otherwise, it takes > a couple of minutes until serving continues. I have no idea why. > ======================================================================== === John M. McIntosh <[hidden email]> 1-800-477-2659 Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com ======================================================================== === |
Yes, that would have been helpful, but I haven't built a VM with that
patch, yet. - Bert - Am 09.01.2006 um 18:54 schrieb John M McIntosh: > Bert have you thought about using that message sending patch we did > to use signals to get method stack tracebacks? > > On 9-Jan-06, at 9:43 AM, Bert Freudenberg wrote: > >> Hi folks, >> >> I've had some rather obscure hangs on my squeak-source server >> lately. The funny thing is, as soon as I connect to the Xvnc >> server and move the mouse, things start rolling again. Otherwise, >> it takes a couple of minutes until serving continues. I have no >> idea why. >> > ====================================================================== > ===== > John M. McIntosh <[hidden email]> 1-800-477-2659 > Corporate Smalltalk Consulting Ltd. http:// > www.smalltalkconsulting.com > ====================================================================== > ===== |
Free forum by Nabble | Edit this page |