Oddd image lock-up on Unix

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Oddd image lock-up on Unix

Andreas.Raab
 
Hi Guys -

We just had a very strange image lockup on our server. After a variety
of attempts to connect to the image we finally fired up gdb and had the
image do a printAllStacks. The dump is attached below, what is
interesting is that it appears as if the Delay mutex was entered but not
left successfully, e.g.,

Process
-1737213740 Semaphore>critical:
-1755652204 Delay class>timerInterruptWatcher
-1755651864 [] in Delay class>startTimerInterruptWatcher

Process
-1737212772 Semaphore>critical:
-1737212956 Delay>unschedule
-1737226548 Semaphore>waitTimeoutMSecs:

And from there things go downhill quickly. The one change that I did do
in this image was to include Tim's "large delay fix" and I am wondering
if that may be related to it?

In any case, if anyone has ever seen something similar, I'd really
appreciate finding out more about it. Seeing how it keeps things just
entirely locked is quite disturbing.

Cheers,
   - Andreas

The full dump:

Process
-1742550148 ProcessorScheduler class>idleProcess
-1743108016 [] in ProcessorScheduler class>startUp
-1743107924 [] in BlockContext>newProcess

Process
-1737212168 Semaphore>wait
-1743080028 WeakArray class>finalizationProcess
-1743080212 [] in WeakArray class>restartFinalizationProcess
-1743080120 [] in BlockContext>newProcess

Process
-1737219904 Semaphore>critical:
-1737219996 Delay>unschedule
-1737229232 Semaphore>waitTimeoutMSecs:
-1737229692 Socket>waitForDataFor:ifClosed:ifTimedOut:
-1737229784 Socket>waitForDataFor:
-1737229600 [] in TMessageRelay>runReaderProcess
-1737229876 BlockContext>on:do:
-1738663052 TMessageRelay>runReaderProcess
-1738658624 [] in Object>fork:at:
-1738658532 [] in BlockContext>newProcess

Process
-1737218372 Semaphore>critical:
-1737218580 Delay>schedule
-1737218672 Delay class>timeoutSemaphore:afterMSecs:
-1737218764 Semaphore>waitTimeoutMSecs:
-1737219256 Socket>waitForDataFor:ifClosed:ifTimedOut:
-1737219348 Socket>waitForDataFor:
-1737219132 [] in TMessageRelay>runReaderProcess
-1737219440 BlockContext>on:do:
-1742529444 TMessageRelay>runReaderProcess
-1742525016 [] in Object>fork:at:
-1742524924 [] in BlockContext>newProcess

Process
-1737217168 Semaphore>critical:
-1737217376 Delay>schedule
-1737217468 Delay class>timeoutSemaphore:afterMSecs:
-1737217560 Semaphore>waitTimeoutMSecs:
-1737218052 Socket>waitForDataFor:ifClosed:ifTimedOut:
-1737218144 Socket>waitForDataFor:
-1737217928 [] in TMessageRelay>runReaderProcess
-1737218236 BlockContext>on:do:
-1742510448 TMessageRelay>runReaderProcess
-1742506020 [] in Object>fork:at:
-1742505928 [] in BlockContext>newProcess

Process
-1737215980 Semaphore>critical:
-1737216188 Delay>schedule
-1737216280 Delay class>timeoutSemaphore:afterMSecs:
-1737216372 Semaphore>waitTimeoutMSecs:
-1737216864 Socket>waitForDataFor:ifClosed:ifTimedOut:
-1737216956 Socket>waitForDataFor:
-1737216740 [] in TMessageRelay>runReaderProcess
-1737217048 BlockContext>on:do:
-1742501032 TMessageRelay>runReaderProcess
-1742496604 [] in Object>fork:at:
-1742496512 [] in BlockContext>newProcess

Process
-1737215020 Semaphore>critical:
-1737215112 Delay>schedule
-1737215204 Delay>wait
-1742522336 [] in TMediaControlEndpoint>runPinger
-1742522612 BlockContext>repeat
-1742522704 TMediaControlEndpoint>runPinger
-1742522520 [] in TMediaControlEndpoint>startPinger
-1742522796 BlockContext>on:do:
-1742523112 [] in TMediaControlEndpoint>startPinger
-1742522928 [] in BlockContext>newProcess

Process
-1737214400 Semaphore>critical:
-1737214492 Delay>schedule
-1737214584 Delay>wait
-1742493924 [] in TMediaControlEndpoint>runPinger
-1742494200 BlockContext>repeat
-1742494292 TMediaControlEndpoint>runPinger
-1742494108 [] in TMediaControlEndpoint>startPinger
-1742494384 BlockContext>on:do:
-1742494700 [] in TMediaControlEndpoint>startPinger
-1742494516 [] in BlockContext>newProcess

Process
-1737213740 Semaphore>critical:
-1755652204 Delay class>timerInterruptWatcher
-1755651864 [] in Delay class>startTimerInterruptWatcher
-1755651772 [] in BlockContext>newProcess

Process
-1737212772 Semaphore>critical:
-1737212956 Delay>unschedule
-1737226548 Semaphore>waitTimeoutMSecs:
-1737227040 Socket>waitForDataFor:ifClosed:ifTimedOut:
-1737227132 Socket>waitForDataFor:
-1737226916 [] in TMessageRelay>runReaderProcess
-1737227224 BlockContext>on:do:
-1738650508 TMessageRelay>runReaderProcess
-1738650272 [] in Object>fork:at:
-1738650180 [] in BlockContext>newProcess

Process
-1737212496 Semaphore>critical:
-1737212680 Delay>unschedule
-1737225588 Semaphore>waitTimeoutMSecs:
-1737226080 Socket>waitForDataFor:ifClosed:ifTimedOut:
-1737226172 Socket>waitForDataFor:
-1737225956 [] in TMessageRelay>runReaderProcess
-1737226264 BlockContext>on:do:
-1738361832 TMessageRelay>runReaderProcess
-1738361596 [] in Object>fork:at:
-1738361504 [] in BlockContext>newProcess

Process
-1737175104 Semaphore>critical:
-1737175288 Delay>unschedule
-1737266784 Semaphore>waitTimeoutMSecs:
-1737266876 Socket>waitForDataFor:ifClosed:ifTimedOut:
-1737266968 Socket>waitForDataFor:
-1737275344 Socket>receiveDataSignallingTimeout:into:startingAt:
-1737275436 SocketStream>receiveData
-1737275528 SocketStream>next
-1737275620 XMLTokenizer>peek
-1737275712 XMLTokenizer>nextWhitespace
-1737276052 XMLTokenizer>nextEntity
-1737276144 XMLDOMParser>nextEntity
-1737276236 QServiceProviderV2>processRequest:
-1737275952 [] in QServiceProviderV2>runServiceProcess
-1737276328 BlockContext>on:do:
-1743007756 QServiceProviderV2>runServiceProcess
-1743044068 [] in Object>fork:at:
-1743043976 [] in BlockContext>newProcess

Process
-1737173864 Semaphore>critical:
-1737174000 Delay>schedule
-1737174092 Delay>wait
-1737174184 WorldState>interCyclePause:
-1737174920 WorldState>doOneCycleFor:
-1737175012 PasteUpMorph>doOneCycle
-1738256796 [] in Project class>spawnNewProcess
-1738256612 [] in BlockContext>newProcess

Process
-1737173496 Semaphore>critical:
-1737173680 Delay>unschedule
-1737288304 Semaphore>waitTimeoutMSecs:
-1737288764 Socket>waitForDataFor:ifClosed:ifTimedOut:
-1737288856 Socket>waitForDataFor:
-1737288672 [] in TMessageRelay>runReaderProcess
-1737288948 BlockContext>on:do:
-1742548340 TMessageRelay>runReaderProcess
-1742543912 [] in Object>fork:at:
-1742543820 [] in BlockContext>newProcess

Process
-1737173220 Semaphore>critical:
-1737173404 Delay>unschedule
-1737277292 Semaphore>waitTimeoutMSecs:
-1737277752 Socket>waitForDataFor:ifClosed:ifTimedOut:
-1737277844 Socket>waitForDataFor:
-1737277660 [] in TMessageRelay>runReaderProcess
-1737277936 BlockContext>on:do:
-1738378076 TMessageRelay>runReaderProcess
-1738373648 [] in Object>fork:at:
-1738373556 [] in BlockContext>newProcess

Process
-1737172944 Semaphore>critical:
-1737173128 Delay>unschedule
-1737231056 Semaphore>waitTimeoutMSecs:
-1737231548 Socket>waitForDataFor:ifClosed:ifTimedOut:
-1737231640 Socket>waitForDataFor:
-1737231424 [] in TMessageRelay>runReaderProcess
-1737231732 BlockContext>on:do:
-1742538860 TMessageRelay>runReaderProcess
-1742534432 [] in Object>fork:at:
-1742534340 [] in BlockContext>newProcess

Process
-1737172668 Semaphore>critical:
-1737172852 Delay>unschedule
-1737230128 Semaphore>waitTimeoutMSecs:
-1737230620 Socket>waitForDataFor:ifClosed:ifTimedOut:
-1737230712 Socket>waitForDataFor:
-1737230496 [] in TMessageRelay>runReaderProcess
-1737230804 BlockContext>on:do:
-1742519864 TMessageRelay>runReaderProcess
-1742515436 [] in Object>fork:at:
-1742515344 [] in BlockContext>newProcess

Process
-1737172392 Semaphore>critical:
-1737172576 Delay>unschedule
-1737279024 Semaphore>waitTimeoutMSecs:
-1737279116 Socket>waitForConnectionFor:ifTimedOut:
-1737279484 Socket>waitForConnectionFor:
-1737279392 [] in Socket>waitForAcceptFor:
-1737279588 BlockContext>on:do:
-1737279696 Socket>waitForAcceptFor:
-1743045624 TConnectionDispatcher>runServer
-1743045808 [] in Object>fork:at:
-1743045716 [] in BlockContext>newProcess

Process
-1737172116 Semaphore>critical:
-1737172300 Delay>unschedule
-1737311692 Semaphore>waitTimeoutMSecs:
-1737312152 Socket>waitForDataFor:ifClosed:ifTimedOut:
-1737312244 Socket>waitForDataFor:
-1737312060 [] in TMessageRelay>runReaderProcess
-1737312336 BlockContext>on:do:
-1737314988 TMessageRelay>runReaderProcess
-1737314752 [] in Object>fork:at:
-1737314660 [] in BlockContext>newProcess

Process
-1737171840 Semaphore>critical:
-1737172024 Delay>unschedule
-1737302368 Semaphore>waitTimeoutMSecs:
-1737302460 Socket>waitForConnectionFor:ifTimedOut:
-1737302644 Socket>waitForConnectionFor:
-1737302920 [] in Socket>waitForAcceptFor:
-1737302736 BlockContext>on:do:
-1737303012 Socket>waitForAcceptFor:
-1737303196 [] in RFBServer>runLoop
-1737303288 BlockContext>on:do:
-1742896304 RFBServer>runLoop
-1742896092 [] in RFBServer>start:
-1742896000 [] in BlockContext>newProcess

Process
-1749883968 Semaphore>wait
-1749884060 Python class>runCallbackProcess
-1749884336 [] in Object>fork:at:
-1749884152 [] in BlockContext>newProcess

Process
-1743109816 Semaphore>wait
-1743109948 InputSensor>userInterruptWatcher
-1743110152 [] in InputSensor>installInterruptWatcher
-1743110040 [] in BlockContext>newProcess

Process
-1737224332 Semaphore>wait
-1737224424 Delay>wait
-1737224608 [] in EventSensor>eventTickler
-1737224700 BlockContext>on:do:
-1743109356 EventSensor>eventTickler
-1743109632 [] in EventSensor>installEventTickler
-1743109448 [] in BlockContext>newProcess

Process
-1743108200 Semaphore>wait
-1743108444 SystemDictionary>lowSpaceWatcher
-1743108628 [] in SystemDictionary>installLowSpaceWatcher
-1743108352 [] in BlockContext>newProcess

Process
-1743103636 Semaphore>wait
-1743103544 [] in UnixOSProcessAccessor>grimReaperProcess
-1743103728 BlockContext>repeat
-1743106440 [] in UnixOSProcessAccessor>grimReaperProcess
-1743106348 [] in BlockContext>newProcess

Process
-1737181332 Semaphore>wait
-1737181424 SharedQueue>next
-1743047256 TSessionDispatcher>runLoggerProcess
-1743047000 [] in Object>fork:at:
-1743046908 [] in BlockContext>newProcess

Process
-1737453796 Semaphore>wait
-1737453888 Delay>wait
-1743007480 QServiceProviderV2>runUpdateProcess
-1743043708 [] in Object>fork:at:
-1743043524 [] in BlockContext>newProcess

Process
-1737287448 Semaphore>wait
-1737287540 SharedQueue>next
-1742549512 TMessageRouter>runEventLoop
-1742549336 [] in Object>fork:at:
-1742549244 [] in BlockContext>newProcess

Process
-1737287884 Semaphore>wait
-1737287976 SharedQueue>next
-1742543240 TMessageRelay>runWriterProcess
-1742543424 [] in Object>fork:at:
-1742543332 [] in BlockContext>newProcess

Process
-1737223064 Semaphore>wait
-1737223156 SharedQueue>next
-1742540032 TMessageRouter>runEventLoop
-1742539856 [] in Object>fork:at:
-1742539764 [] in BlockContext>newProcess

Process
-1737223840 Semaphore>wait
-1737223932 SharedQueue>next
-1742533760 TMessageRelay>runWriterProcess
-1742533944 [] in Object>fork:at:
-1742533852 [] in BlockContext>newProcess

Process
-1737215796 Semaphore>wait
-1737215888 SharedQueue>next
-1742530616 TMessageRouter>runEventLoop
-1742530440 [] in Object>fork:at:
-1742530348 [] in BlockContext>newProcess

Process
-1737222124 Semaphore>wait
-1737222216 SharedQueue>next
-1742524344 TMessageRelay>runWriterProcess
-1742524528 [] in Object>fork:at:
-1742524436 [] in BlockContext>newProcess

Process
-1737222880 Semaphore>wait
-1737222972 SharedQueue>next
-1742521036 TMessageRouter>runEventLoop
-1742520860 [] in Object>fork:at:
-1742520768 [] in BlockContext>newProcess

Process
-1737223536 Semaphore>wait
-1737223628 SharedQueue>next
-1742514764 TMessageRelay>runWriterProcess
-1742514948 [] in Object>fork:at:
-1742514856 [] in BlockContext>newProcess

Process
-1737214824 Semaphore>wait
-1737214928 SharedQueue>next
-1742511620 TMessageRouter>runEventLoop
-1742511444 [] in Object>fork:at:
-1742511352 [] in BlockContext>newProcess

Process
-1737215492 Semaphore>wait
-1737215584 SharedQueue>next
-1742505348 TMessageRelay>runWriterProcess
-1742505532 [] in Object>fork:at:
-1742505440 [] in BlockContext>newProcess

Process
-1737215308 Semaphore>wait
-1737215400 SharedQueue>next
-1742502204 TMessageRouter>runEventLoop
-1742502028 [] in Object>fork:at:
-1742501936 [] in BlockContext>newProcess

Process
-1737223248 Semaphore>wait
-1737223340 SharedQueue>next
-1742495932 TMessageRelay>runWriterProcess
-1742496116 [] in Object>fork:at:
-1742496024 [] in BlockContext>newProcess

Process
-1742275584 Semaphore>wait
-1742275676 Socket>waitForDataIfClosed:
-1742275768 Socket>waitForData
-1742275860 TConnectionDispatcher>acceptConnectionFrom:
-1742276136 [] in TConnectionDispatcher>futureAcceptConnectionFrom:
-1742275952 BlockContext>on:do:
-1742277028 [] in TConnectionDispatcher>futureAcceptConnectionFrom:
-1742276936 [] in BlockContext>newProcess

Process
-1737204640 Semaphore>wait
-1737204732 SharedQueue>next
-1738680880 TMessageRouter>runEventLoop
-1738680752 [] in Object>fork:at:
-1738680660 [] in BlockContext>newProcess

Process
-1737221816 Semaphore>wait
-1737221908 SharedQueue>next
-1738657952 TMessageRelay>runWriterProcess
-1738658136 [] in Object>fork:at:
-1738658044 [] in BlockContext>newProcess

Process
-1737211292 Semaphore>wait
-1737211384 SharedQueue>next
-1738649596 TMessageRelay>runWriterProcess
-1738649872 [] in Object>fork:at:
-1738649688 [] in BlockContext>newProcess

Process
-1737211800 Semaphore>wait
-1737220784 Delay>wait
-1738506672 TMessageRouterClient>runHeartbeat
-1738506996 [] in Object>fork:at:
-1738506904 [] in BlockContext>newProcess

Process
-1737276872 Semaphore>wait
-1737276964 SharedQueue>next
-1738372976 TMessageRelay>runWriterProcess
-1738373160 [] in Object>fork:at:
-1738373068 [] in BlockContext>newProcess

Process
-1737211108 Semaphore>wait
-1737211200 SharedQueue>next
-1738369928 TMessageRouter>runEventLoop
-1738369800 [] in Object>fork:at:
-1738369708 [] in BlockContext>newProcess

Process
-1737212260 Semaphore>wait
-1737212352 SharedQueue>next
-1738360920 TMessageRelay>runWriterProcess
-1738361196 [] in Object>fork:at:
-1738361012 [] in BlockContext>newProcess

Process
-1737211892 Semaphore>wait
-1737220876 Delay>wait
-1738305304 TMessageRouterClient>runHeartbeat
-1738305592 [] in Object>fork:at:
-1738305500 [] in BlockContext>newProcess

Process
-1737299840 Semaphore>wait
-1737299932 Delay>wait
-1738303108 [] in TAudioStreamInput>runPinger
-1738303244 BlockContext>repeat
-1738303376 TAudioStreamInput>runPinger
-1738303652 [] in TAudioStreamInput>startPinger
-1738303468 [] in BlockContext>newProcess

Process
-1737301448 Semaphore>wait
-1737301540 Delay>wait
-1737950096 [] in TAudioStreamInput>runPinger
-1737950232 BlockContext>repeat
-1737950364 TAudioStreamInput>runPinger
-1737950640 [] in TAudioStreamInput>startPinger
-1737950456 [] in BlockContext>newProcess

Process
-1737211984 Semaphore>wait
-1737212076 SharedQueue>next
-1737323020 TMessageRouter>runEventLoop
-1737322892 [] in Object>fork:at:
-1737322800 [] in BlockContext>newProcess

Process
-1737320664 Semaphore>wait
-1737320756 SharedQueue>next
-1737320572 [] in TErrorReportPlugin class>runLoggerProcess
-1737320848 BlockContext>on:do:
-1737321304 TErrorReportPlugin class>runLoggerProcess
-1737321212 [] in Object>fork:at:
-1737321120 [] in BlockContext>newProcess

Process
-1737219576 Semaphore>wait
-1737219668 SharedQueue>next
-1737314076 TMessageRelay>runWriterProcess
-1737314352 [] in Object>fork:at:
-1737314168 [] in BlockContext>newProcess

Process
-1737220992 Semaphore>wait
-1737221084 Delay>wait
-1737312544 TMessageRouterClient>runHeartbeat
-1737312924 [] in Object>fork:at:
-1737312740 [] in BlockContext>newProcess

Process
-1737294120 Semaphore>wait
-1737294268 Delay>wait
-1742503340 [] in TMediaControlEndpoint>runPinger
-1742503616 BlockContext>repea


Reply | Threaded
Open this post in threaded view
|

Re: Oddd image lock-up on Unix

Bert Freudenberg
 
A backtrace from gdb might have been useful, too.

There was an increasing number of reports of Linux VMs freezing or  
crashing lately. We suspect some possibly unrelated problems. One is  
due to oops-as-signed, a symptom is that GC fails to find the end of  
memory. The other seems related to async i/o. In yet others (and this  
may be yours) we see nanosleeps that are called with huge negative  
microsecond counts, yet higher up in the stack the value was 1000  
microseconds.

- Bert -

On Apr 2, 2007, at 21:22 , Andreas Raab wrote:

> Hi Guys -
>
> We just had a very strange image lockup on our server. After a  
> variety of attempts to connect to the image we finally fired up gdb  
> and had the image do a printAllStacks. The dump is attached below,  
> what is interesting is that it appears as if the Delay mutex was  
> entered but not left successfully, e.g.,
>
> Process
> -1737213740 Semaphore>critical:
> -1755652204 Delay class>timerInterruptWatcher
> -1755651864 [] in Delay class>startTimerInterruptWatcher
>
> Process
> -1737212772 Semaphore>critical:
> -1737212956 Delay>unschedule
> -1737226548 Semaphore>waitTimeoutMSecs:
>
> And from there things go downhill quickly. The one change that I  
> did do in this image was to include Tim's "large delay fix" and I  
> am wondering if that may be related to it?
>
> In any case, if anyone has ever seen something similar, I'd really  
> appreciate finding out more about it. Seeing how it keeps things  
> just entirely locked is quite disturbing.
>
> Cheers,
>   - Andreas
>
> The full dump:
>
> Process
> -1742550148 ProcessorScheduler class>idleProcess
> -1743108016 [] in ProcessorScheduler class>startUp
> -1743107924 [] in BlockContext>newProcess
>
> Process
> -1737212168 Semaphore>wait
> -1743080028 WeakArray class>finalizationProcess
> -1743080212 [] in WeakArray class>restartFinalizationProcess
> -1743080120 [] in BlockContext>newProcess
>
> Process
> -1737219904 Semaphore>critical:
> -1737219996 Delay>unschedule
> -1737229232 Semaphore>waitTimeoutMSecs:
> -1737229692 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737229784 Socket>waitForDataFor:
> -1737229600 [] in TMessageRelay>runReaderProcess
> -1737229876 BlockContext>on:do:
> -1738663052 TMessageRelay>runReaderProcess
> -1738658624 [] in Object>fork:at:
> -1738658532 [] in BlockContext>newProcess
>
> Process
> -1737218372 Semaphore>critical:
> -1737218580 Delay>schedule
> -1737218672 Delay class>timeoutSemaphore:afterMSecs:
> -1737218764 Semaphore>waitTimeoutMSecs:
> -1737219256 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737219348 Socket>waitForDataFor:
> -1737219132 [] in TMessageRelay>runReaderProcess
> -1737219440 BlockContext>on:do:
> -1742529444 TMessageRelay>runReaderProcess
> -1742525016 [] in Object>fork:at:
> -1742524924 [] in BlockContext>newProcess
>
> Process
> -1737217168 Semaphore>critical:
> -1737217376 Delay>schedule
> -1737217468 Delay class>timeoutSemaphore:afterMSecs:
> -1737217560 Semaphore>waitTimeoutMSecs:
> -1737218052 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737218144 Socket>waitForDataFor:
> -1737217928 [] in TMessageRelay>runReaderProcess
> -1737218236 BlockContext>on:do:
> -1742510448 TMessageRelay>runReaderProcess
> -1742506020 [] in Object>fork:at:
> -1742505928 [] in BlockContext>newProcess
>
> Process
> -1737215980 Semaphore>critical:
> -1737216188 Delay>schedule
> -1737216280 Delay class>timeoutSemaphore:afterMSecs:
> -1737216372 Semaphore>waitTimeoutMSecs:
> -1737216864 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737216956 Socket>waitForDataFor:
> -1737216740 [] in TMessageRelay>runReaderProcess
> -1737217048 BlockContext>on:do:
> -1742501032 TMessageRelay>runReaderProcess
> -1742496604 [] in Object>fork:at:
> -1742496512 [] in BlockContext>newProcess
>
> Process
> -1737215020 Semaphore>critical:
> -1737215112 Delay>schedule
> -1737215204 Delay>wait
> -1742522336 [] in TMediaControlEndpoint>runPinger
> -1742522612 BlockContext>repeat
> -1742522704 TMediaControlEndpoint>runPinger
> -1742522520 [] in TMediaControlEndpoint>startPinger
> -1742522796 BlockContext>on:do:
> -1742523112 [] in TMediaControlEndpoint>startPinger
> -1742522928 [] in BlockContext>newProcess
>
> Process
> -1737214400 Semaphore>critical:
> -1737214492 Delay>schedule
> -1737214584 Delay>wait
> -1742493924 [] in TMediaControlEndpoint>runPinger
> -1742494200 BlockContext>repeat
> -1742494292 TMediaControlEndpoint>runPinger
> -1742494108 [] in TMediaControlEndpoint>startPinger
> -1742494384 BlockContext>on:do:
> -1742494700 [] in TMediaControlEndpoint>startPinger
> -1742494516 [] in BlockContext>newProcess
>
> Process
> -1737213740 Semaphore>critical:
> -1755652204 Delay class>timerInterruptWatcher
> -1755651864 [] in Delay class>startTimerInterruptWatcher
> -1755651772 [] in BlockContext>newProcess
>
> Process
> -1737212772 Semaphore>critical:
> -1737212956 Delay>unschedule
> -1737226548 Semaphore>waitTimeoutMSecs:
> -1737227040 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737227132 Socket>waitForDataFor:
> -1737226916 [] in TMessageRelay>runReaderProcess
> -1737227224 BlockContext>on:do:
> -1738650508 TMessageRelay>runReaderProcess
> -1738650272 [] in Object>fork:at:
> -1738650180 [] in BlockContext>newProcess
>
> Process
> -1737212496 Semaphore>critical:
> -1737212680 Delay>unschedule
> -1737225588 Semaphore>waitTimeoutMSecs:
> -1737226080 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737226172 Socket>waitForDataFor:
> -1737225956 [] in TMessageRelay>runReaderProcess
> -1737226264 BlockContext>on:do:
> -1738361832 TMessageRelay>runReaderProcess
> -1738361596 [] in Object>fork:at:
> -1738361504 [] in BlockContext>newProcess
>
> Process
> -1737175104 Semaphore>critical:
> -1737175288 Delay>unschedule
> -1737266784 Semaphore>waitTimeoutMSecs:
> -1737266876 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737266968 Socket>waitForDataFor:
> -1737275344 Socket>receiveDataSignallingTimeout:into:startingAt:
> -1737275436 SocketStream>receiveData
> -1737275528 SocketStream>next
> -1737275620 XMLTokenizer>peek
> -1737275712 XMLTokenizer>nextWhitespace
> -1737276052 XMLTokenizer>nextEntity
> -1737276144 XMLDOMParser>nextEntity
> -1737276236 QServiceProviderV2>processRequest:
> -1737275952 [] in QServiceProviderV2>runServiceProcess
> -1737276328 BlockContext>on:do:
> -1743007756 QServiceProviderV2>runServiceProcess
> -1743044068 [] in Object>fork:at:
> -1743043976 [] in BlockContext>newProcess
>
> Process
> -1737173864 Semaphore>critical:
> -1737174000 Delay>schedule
> -1737174092 Delay>wait
> -1737174184 WorldState>interCyclePause:
> -1737174920 WorldState>doOneCycleFor:
> -1737175012 PasteUpMorph>doOneCycle
> -1738256796 [] in Project class>spawnNewProcess
> -1738256612 [] in BlockContext>newProcess
>
> Process
> -1737173496 Semaphore>critical:
> -1737173680 Delay>unschedule
> -1737288304 Semaphore>waitTimeoutMSecs:
> -1737288764 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737288856 Socket>waitForDataFor:
> -1737288672 [] in TMessageRelay>runReaderProcess
> -1737288948 BlockContext>on:do:
> -1742548340 TMessageRelay>runReaderProcess
> -1742543912 [] in Object>fork:at:
> -1742543820 [] in BlockContext>newProcess
>
> Process
> -1737173220 Semaphore>critical:
> -1737173404 Delay>unschedule
> -1737277292 Semaphore>waitTimeoutMSecs:
> -1737277752 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737277844 Socket>waitForDataFor:
> -1737277660 [] in TMessageRelay>runReaderProcess
> -1737277936 BlockContext>on:do:
> -1738378076 TMessageRelay>runReaderProcess
> -1738373648 [] in Object>fork:at:
> -1738373556 [] in BlockContext>newProcess
>
> Process
> -1737172944 Semaphore>critical:
> -1737173128 Delay>unschedule
> -1737231056 Semaphore>waitTimeoutMSecs:
> -1737231548 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737231640 Socket>waitForDataFor:
> -1737231424 [] in TMessageRelay>runReaderProcess
> -1737231732 BlockContext>on:do:
> -1742538860 TMessageRelay>runReaderProcess
> -1742534432 [] in Object>fork:at:
> -1742534340 [] in BlockContext>newProcess
>
> Process
> -1737172668 Semaphore>critical:
> -1737172852 Delay>unschedule
> -1737230128 Semaphore>waitTimeoutMSecs:
> -1737230620 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737230712 Socket>waitForDataFor:
> -1737230496 [] in TMessageRelay>runReaderProcess
> -1737230804 BlockContext>on:do:
> -1742519864 TMessageRelay>runReaderProcess
> -1742515436 [] in Object>fork:at:
> -1742515344 [] in BlockContext>newProcess
>
> Process
> -1737172392 Semaphore>critical:
> -1737172576 Delay>unschedule
> -1737279024 Semaphore>waitTimeoutMSecs:
> -1737279116 Socket>waitForConnectionFor:ifTimedOut:
> -1737279484 Socket>waitForConnectionFor:
> -1737279392 [] in Socket>waitForAcceptFor:
> -1737279588 BlockContext>on:do:
> -1737279696 Socket>waitForAcceptFor:
> -1743045624 TConnectionDispatcher>runServer
> -1743045808 [] in Object>fork:at:
> -1743045716 [] in BlockContext>newProcess
>
> Process
> -1737172116 Semaphore>critical:
> -1737172300 Delay>unschedule
> -1737311692 Semaphore>waitTimeoutMSecs:
> -1737312152 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737312244 Socket>waitForDataFor:
> -1737312060 [] in TMessageRelay>runReaderProcess
> -1737312336 BlockContext>on:do:
> -1737314988 TMessageRelay>runReaderProcess
> -1737314752 [] in Object>fork:at:
> -1737314660 [] in BlockContext>newProcess
>
> Process
> -1737171840 Semaphore>critical:
> -1737172024 Delay>unschedule
> -1737302368 Semaphore>waitTimeoutMSecs:
> -1737302460 Socket>waitForConnectionFor:ifTimedOut:
> -1737302644 Socket>waitForConnectionFor:
> -1737302920 [] in Socket>waitForAcceptFor:
> -1737302736 BlockContext>on:do:
> -1737303012 Socket>waitForAcceptFor:
> -1737303196 [] in RFBServer>runLoop
> -1737303288 BlockContext>on:do:
> -1742896304 RFBServer>runLoop
> -1742896092 [] in RFBServer>start:
> -1742896000 [] in BlockContext>newProcess
>
> Process
> -1749883968 Semaphore>wait
> -1749884060 Python class>runCallbackProcess
> -1749884336 [] in Object>fork:at:
> -1749884152 [] in BlockContext>newProcess
>
> Process
> -1743109816 Semaphore>wait
> -1743109948 InputSensor>userInterruptWatcher
> -1743110152 [] in InputSensor>installInterruptWatcher
> -1743110040 [] in BlockContext>newProcess
>
> Process
> -1737224332 Semaphore>wait
> -1737224424 Delay>wait
> -1737224608 [] in EventSensor>eventTickler
> -1737224700 BlockContext>on:do:
> -1743109356 EventSensor>eventTickler
> -1743109632 [] in EventSensor>installEventTickler
> -1743109448 [] in BlockContext>newProcess
>
> Process
> -1743108200 Semaphore>wait
> -1743108444 SystemDictionary>lowSpaceWatcher
> -1743108628 [] in SystemDictionary>installLowSpaceWatcher
> -1743108352 [] in BlockContext>newProcess
>
> Process
> -1743103636 Semaphore>wait
> -1743103544 [] in UnixOSProcessAccessor>grimReaperProcess
> -1743103728 BlockContext>repeat
> -1743106440 [] in UnixOSProcessAccessor>grimReaperProcess
> -1743106348 [] in BlockContext>newProcess
>
> Process
> -1737181332 Semaphore>wait
> -1737181424 SharedQueue>next
> -1743047256 TSessionDispatcher>runLoggerProcess
> -1743047000 [] in Object>fork:at:
> -1743046908 [] in BlockContext>newProcess
>
> Process
> -1737453796 Semaphore>wait
> -1737453888 Delay>wait
> -1743007480 QServiceProviderV2>runUpdateProcess
> -1743043708 [] in Object>fork:at:
> -1743043524 [] in BlockContext>newProcess
>
> Process
> -1737287448 Semaphore>wait
> -1737287540 SharedQueue>next
> -1742549512 TMessageRouter>runEventLoop
> -1742549336 [] in Object>fork:at:
> -1742549244 [] in BlockContext>newProcess
>
> Process
> -1737287884 Semaphore>wait
> -1737287976 SharedQueue>next
> -1742543240 TMessageRelay>runWriterProcess
> -1742543424 [] in Object>fork:at:
> -1742543332 [] in BlockContext>newProcess
>
> Process
> -1737223064 Semaphore>wait
> -1737223156 SharedQueue>next
> -1742540032 TMessageRouter>runEventLoop
> -1742539856 [] in Object>fork:at:
> -1742539764 [] in BlockContext>newProcess
>
> Process
> -1737223840 Semaphore>wait
> -1737223932 SharedQueue>next
> -1742533760 TMessageRelay>runWriterProcess
> -1742533944 [] in Object>fork:at:
> -1742533852 [] in BlockContext>newProcess
>
> Process
> -1737215796 Semaphore>wait
> -1737215888 SharedQueue>next
> -1742530616 TMessageRouter>runEventLoop
> -1742530440 [] in Object>fork:at:
> -1742530348 [] in BlockContext>newProcess
>
> Process
> -1737222124 Semaphore>wait
> -1737222216 SharedQueue>next
> -1742524344 TMessageRelay>runWriterProcess
> -1742524528 [] in Object>fork:at:
> -1742524436 [] in BlockContext>newProcess
>
> Process
> -1737222880 Semaphore>wait
> -1737222972 SharedQueue>next
> -1742521036 TMessageRouter>runEventLoop
> -1742520860 [] in Object>fork:at:
> -1742520768 [] in BlockContext>newProcess
>
> Process
> -1737223536 Semaphore>wait
> -1737223628 SharedQueue>next
> -1742514764 TMessageRelay>runWriterProcess
> -1742514948 [] in Object>fork:at:
> -1742514856 [] in BlockContext>newProcess
>
> Process
> -1737214824 Semaphore>wait
> -1737214928 SharedQueue>next
> -1742511620 TMessageRouter>runEventLoop
> -1742511444 [] in Object>fork:at:
> -1742511352 [] in BlockContext>newProcess
>
> Process
> -1737215492 Semaphore>wait
> -1737215584 SharedQueue>next
> -1742505348 TMessageRelay>runWriterProcess
> -1742505532 [] in Object>fork:at:
> -1742505440 [] in BlockContext>newProcess
>
> Process
> -1737215308 Semaphore>wait
> -1737215400 SharedQueue>next
> -1742502204 TMessageRouter>runEventLoop
> -1742502028 [] in Object>fork:at:
> -1742501936 [] in BlockContext>newProcess
>
> Process
> -1737223248 Semaphore>wait
> -1737223340 SharedQueue>next
> -1742495932 TMessageRelay>runWriterProcess
> -1742496116 [] in Object>fork:at:
> -1742496024 [] in BlockContext>newProcess
>
> Process
> -1742275584 Semaphore>wait
> -1742275676 Socket>waitForDataIfClosed:
> -1742275768 Socket>waitForData
> -1742275860 TConnectionDispatcher>acceptConnectionFrom:
> -1742276136 [] in TConnectionDispatcher>futureAcceptConnectionFrom:
> -1742275952 BlockContext>on:do:
> -1742277028 [] in TConnectionDispatcher>futureAcceptConnectionFrom:
> -1742276936 [] in BlockContext>newProcess
>
> Process
> -1737204640 Semaphore>wait
> -1737204732 SharedQueue>next
> -1738680880 TMessageRouter>runEventLoop
> -1738680752 [] in Object>fork:at:
> -1738680660 [] in BlockContext>newProcess
>
> Process
> -1737221816 Semaphore>wait
> -1737221908 SharedQueue>next
> -1738657952 TMessageRelay>runWriterProcess
> -1738658136 [] in Object>fork:at:
> -1738658044 [] in BlockContext>newProcess
>
> Process
> -1737211292 Semaphore>wait
> -1737211384 SharedQueue>next
> -1738649596 TMessageRelay>runWriterProcess
> -1738649872 [] in Object>fork:at:
> -1738649688 [] in BlockContext>newProcess
>
> Process
> -1737211800 Semaphore>wait
> -1737220784 Delay>wait
> -1738506672 TMessageRouterClient>runHeartbeat
> -1738506996 [] in Object>fork:at:
> -1738506904 [] in BlockContext>newProcess
>
> Process
> -1737276872 Semaphore>wait
> -1737276964 SharedQueue>next
> -1738372976 TMessageRelay>runWriterProcess
> -1738373160 [] in Object>fork:at:
> -1738373068 [] in BlockContext>newProcess
>
> Process
> -1737211108 Semaphore>wait
> -1737211200 SharedQueue>next
> -1738369928 TMessageRouter>runEventLoop
> -1738369800 [] in Object>fork:at:
> -1738369708 [] in BlockContext>newProcess
>
> Process
> -1737212260 Semaphore>wait
> -1737212352 SharedQueue>next
> -1738360920 TMessageRelay>runWriterProcess
> -1738361196 [] in Object>fork:at:
> -1738361012 [] in BlockContext>newProcess
>
> Process
> -1737211892 Semaphore>wait
> -1737220876 Delay>wait
> -1738305304 TMessageRouterClient>runHeartbeat
> -1738305592 [] in Object>fork:at:
> -1738305500 [] in BlockContext>newProcess
>
> Process
> -1737299840 Semaphore>wait
> -1737299932 Delay>wait
> -1738303108 [] in TAudioStreamInput>runPinger
> -1738303244 BlockContext>repeat
> -1738303376 TAudioStreamInput>runPinger
> -1738303652 [] in TAudioStreamInput>startPinger
> -1738303468 [] in BlockContext>newProcess
>
> Process
> -1737301448 Semaphore>wait
> -1737301540 Delay>wait
> -1737950096 [] in TAudioStreamInput>runPinger
> -1737950232 BlockContext>repeat
> -1737950364 TAudioStreamInput>runPinger
> -1737950640 [] in TAudioStreamInput>startPinger
> -1737950456 [] in BlockContext>newProcess
>
> Process
> -1737211984 Semaphore>wait
> -1737212076 SharedQueue>next
> -1737323020 TMessageRouter>runEventLoop
> -1737322892 [] in Object>fork:at:
> -1737322800 [] in BlockContext>newProcess
>
> Process
> -1737320664 Semaphore>wait
> -1737320756 SharedQueue>next
> -1737320572 [] in TErrorReportPlugin class>runLoggerProcess
> -1737320848 BlockContext>on:do:
> -1737321304 TErrorReportPlugin class>runLoggerProcess
> -1737321212 [] in Object>fork:at:
> -1737321120 [] in BlockContext>newProcess
>
> Process
> -1737219576 Semaphore>wait
> -1737219668 SharedQueue>next
> -1737314076 TMessageRelay>runWriterProcess
> -1737314352 [] in Object>fork:at:
> -1737314168 [] in BlockContext>newProcess
>
> Process
> -1737220992 Semaphore>wait
> -1737221084 Delay>wait
> -1737312544 TMessageRouterClient>runHeartbeat
> -1737312924 [] in Object>fork:at:
> -1737312740 [] in BlockContext>newProcess
>
> Process
> -1737294120 Semaphore>wait
> -1737294268 Delay>wait
> -1742503340 [] in TMediaControlEndpoint>runPinger
> -1742503616 BlockContext>repea
>
>




Reply | Threaded
Open this post in threaded view
|

Re: Oddd image lock-up on Unix

Andreas.Raab
 
Bert Freudenberg wrote:
> A backtrace from gdb might have been useful, too.

Damn, didn't think of that. We *did* poke it via SIGUSR before attaching
GDB though and the result looked perfectly reasonable:

Received user signal, printing stack trace:

-1742550148 ProcessorScheduler class>idleProcess
-1743108016 [] in ProcessorScheduler class>startUp
-1743107924 [] in BlockContext>newProcess

Given that the VM responded properly I wouldn't suspect there to be a
problem with nanosleep() or so.

> There was an increasing number of reports of Linux VMs freezing or
> crashing lately. We suspect some possibly unrelated problems. One is due
> to oops-as-signed, a symptom is that GC fails to find the end of memory.
> The other seems related to async i/o. In yet others (and this may be
> yours) we see nanosleeps that are called with huge negative microsecond
> counts, yet higher up in the stack the value was 1000 microseconds.

It is the very first time we've seen this kind of problem - usually the
symptoms are 100% CPU not 0% CPU consumption ;-) This is also why I
suspected that the delay fix might have to do with it.

Cheers,
   - Andreas
Reply | Threaded
Open this post in threaded view
|

Re: Oddd image lock-up on Unix

timrowledge
 

On 2-Apr-07, at 1:28 PM, Andreas Raab wrote:

>
> It is the very first time we've seen this kind of problem - usually  
> the symptoms are 100% CPU not 0% CPU consumption ;-) This is also  
> why I suspected that the delay fix might have to do with it.
What does nanosleep do with -ve params? Might it have bad behaviour  
under such circumstances?

The long delay changes were tested on special VMs that pretended  
quite short periods were the maximum delay length and appeared to  
have no problems but it is certainly possible some problem crept into  
the conversion to use full-length delays. For example the length of a  
guardianDelay (see Delay>beGuardianDelay) is SmallInteger maxVal and  
maybe there is some clash with the 'MillisecondClockMask' value used  
interp.c? It is, after all, half of maxVal and used a lot in tick  
related code. We must also note that the guardian delay stuff is only  
ever used if a Delay with  a non-SmallInterger length is activated,  
so it is unlikely to have any effect in normal usage.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: PUS: PUrge System


Reply | Threaded
Open this post in threaded view
|

Re: Oddd image lock-up on Unix

Philippe Marschall
In reply to this post by Andreas.Raab
 
2007/4/2, Andreas Raab <[hidden email]>:

>
> Hi Guys -
>
> We just had a very strange image lockup on our server. After a variety
> of attempts to connect to the image we finally fired up gdb and had the
> image do a printAllStacks. The dump is attached below, what is
> interesting is that it appears as if the Delay mutex was entered but not
> left successfully, e.g.,
>
> Process
> -1737213740 Semaphore>critical:
> -1755652204 Delay class>timerInterruptWatcher
> -1755651864 [] in Delay class>startTimerInterruptWatcher

All these active Semaphore>#critical: look awfully familiar. I've seen
this happening on a connection pool in a server application. It would
run for months and then suddenly Semaphore>#critical: wouldn't exit
anymore.

That's either a 3.6 or 3.7 VM, don't remember exactly anymore.

Philippe

> Process
> -1737212772 Semaphore>critical:
> -1737212956 Delay>unschedule
> -1737226548 Semaphore>waitTimeoutMSecs:
>
> And from there things go downhill quickly. The one change that I did do
> in this image was to include Tim's "large delay fix" and I am wondering
> if that may be related to it?
>
> In any case, if anyone has ever seen something similar, I'd really
> appreciate finding out more about it. Seeing how it keeps things just
> entirely locked is quite disturbing.
>
> Cheers,
>    - Andreas
>
> The full dump:
>
> Process
> -1742550148 ProcessorScheduler class>idleProcess
> -1743108016 [] in ProcessorScheduler class>startUp
> -1743107924 [] in BlockContext>newProcess
>
> Process
> -1737212168 Semaphore>wait
> -1743080028 WeakArray class>finalizationProcess
> -1743080212 [] in WeakArray class>restartFinalizationProcess
> -1743080120 [] in BlockContext>newProcess
>
> Process
> -1737219904 Semaphore>critical:
> -1737219996 Delay>unschedule
> -1737229232 Semaphore>waitTimeoutMSecs:
> -1737229692 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737229784 Socket>waitForDataFor:
> -1737229600 [] in TMessageRelay>runReaderProcess
> -1737229876 BlockContext>on:do:
> -1738663052 TMessageRelay>runReaderProcess
> -1738658624 [] in Object>fork:at:
> -1738658532 [] in BlockContext>newProcess
>
> Process
> -1737218372 Semaphore>critical:
> -1737218580 Delay>schedule
> -1737218672 Delay class>timeoutSemaphore:afterMSecs:
> -1737218764 Semaphore>waitTimeoutMSecs:
> -1737219256 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737219348 Socket>waitForDataFor:
> -1737219132 [] in TMessageRelay>runReaderProcess
> -1737219440 BlockContext>on:do:
> -1742529444 TMessageRelay>runReaderProcess
> -1742525016 [] in Object>fork:at:
> -1742524924 [] in BlockContext>newProcess
>
> Process
> -1737217168 Semaphore>critical:
> -1737217376 Delay>schedule
> -1737217468 Delay class>timeoutSemaphore:afterMSecs:
> -1737217560 Semaphore>waitTimeoutMSecs:
> -1737218052 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737218144 Socket>waitForDataFor:
> -1737217928 [] in TMessageRelay>runReaderProcess
> -1737218236 BlockContext>on:do:
> -1742510448 TMessageRelay>runReaderProcess
> -1742506020 [] in Object>fork:at:
> -1742505928 [] in BlockContext>newProcess
>
> Process
> -1737215980 Semaphore>critical:
> -1737216188 Delay>schedule
> -1737216280 Delay class>timeoutSemaphore:afterMSecs:
> -1737216372 Semaphore>waitTimeoutMSecs:
> -1737216864 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737216956 Socket>waitForDataFor:
> -1737216740 [] in TMessageRelay>runReaderProcess
> -1737217048 BlockContext>on:do:
> -1742501032 TMessageRelay>runReaderProcess
> -1742496604 [] in Object>fork:at:
> -1742496512 [] in BlockContext>newProcess
>
> Process
> -1737215020 Semaphore>critical:
> -1737215112 Delay>schedule
> -1737215204 Delay>wait
> -1742522336 [] in TMediaControlEndpoint>runPinger
> -1742522612 BlockContext>repeat
> -1742522704 TMediaControlEndpoint>runPinger
> -1742522520 [] in TMediaControlEndpoint>startPinger
> -1742522796 BlockContext>on:do:
> -1742523112 [] in TMediaControlEndpoint>startPinger
> -1742522928 [] in BlockContext>newProcess
>
> Process
> -1737214400 Semaphore>critical:
> -1737214492 Delay>schedule
> -1737214584 Delay>wait
> -1742493924 [] in TMediaControlEndpoint>runPinger
> -1742494200 BlockContext>repeat
> -1742494292 TMediaControlEndpoint>runPinger
> -1742494108 [] in TMediaControlEndpoint>startPinger
> -1742494384 BlockContext>on:do:
> -1742494700 [] in TMediaControlEndpoint>startPinger
> -1742494516 [] in BlockContext>newProcess
>
> Process
> -1737213740 Semaphore>critical:
> -1755652204 Delay class>timerInterruptWatcher
> -1755651864 [] in Delay class>startTimerInterruptWatcher
> -1755651772 [] in BlockContext>newProcess
>
> Process
> -1737212772 Semaphore>critical:
> -1737212956 Delay>unschedule
> -1737226548 Semaphore>waitTimeoutMSecs:
> -1737227040 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737227132 Socket>waitForDataFor:
> -1737226916 [] in TMessageRelay>runReaderProcess
> -1737227224 BlockContext>on:do:
> -1738650508 TMessageRelay>runReaderProcess
> -1738650272 [] in Object>fork:at:
> -1738650180 [] in BlockContext>newProcess
>
> Process
> -1737212496 Semaphore>critical:
> -1737212680 Delay>unschedule
> -1737225588 Semaphore>waitTimeoutMSecs:
> -1737226080 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737226172 Socket>waitForDataFor:
> -1737225956 [] in TMessageRelay>runReaderProcess
> -1737226264 BlockContext>on:do:
> -1738361832 TMessageRelay>runReaderProcess
> -1738361596 [] in Object>fork:at:
> -1738361504 [] in BlockContext>newProcess
>
> Process
> -1737175104 Semaphore>critical:
> -1737175288 Delay>unschedule
> -1737266784 Semaphore>waitTimeoutMSecs:
> -1737266876 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737266968 Socket>waitForDataFor:
> -1737275344 Socket>receiveDataSignallingTimeout:into:startingAt:
> -1737275436 SocketStream>receiveData
> -1737275528 SocketStream>next
> -1737275620 XMLTokenizer>peek
> -1737275712 XMLTokenizer>nextWhitespace
> -1737276052 XMLTokenizer>nextEntity
> -1737276144 XMLDOMParser>nextEntity
> -1737276236 QServiceProviderV2>processRequest:
> -1737275952 [] in QServiceProviderV2>runServiceProcess
> -1737276328 BlockContext>on:do:
> -1743007756 QServiceProviderV2>runServiceProcess
> -1743044068 [] in Object>fork:at:
> -1743043976 [] in BlockContext>newProcess
>
> Process
> -1737173864 Semaphore>critical:
> -1737174000 Delay>schedule
> -1737174092 Delay>wait
> -1737174184 WorldState>interCyclePause:
> -1737174920 WorldState>doOneCycleFor:
> -1737175012 PasteUpMorph>doOneCycle
> -1738256796 [] in Project class>spawnNewProcess
> -1738256612 [] in BlockContext>newProcess
>
> Process
> -1737173496 Semaphore>critical:
> -1737173680 Delay>unschedule
> -1737288304 Semaphore>waitTimeoutMSecs:
> -1737288764 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737288856 Socket>waitForDataFor:
> -1737288672 [] in TMessageRelay>runReaderProcess
> -1737288948 BlockContext>on:do:
> -1742548340 TMessageRelay>runReaderProcess
> -1742543912 [] in Object>fork:at:
> -1742543820 [] in BlockContext>newProcess
>
> Process
> -1737173220 Semaphore>critical:
> -1737173404 Delay>unschedule
> -1737277292 Semaphore>waitTimeoutMSecs:
> -1737277752 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737277844 Socket>waitForDataFor:
> -1737277660 [] in TMessageRelay>runReaderProcess
> -1737277936 BlockContext>on:do:
> -1738378076 TMessageRelay>runReaderProcess
> -1738373648 [] in Object>fork:at:
> -1738373556 [] in BlockContext>newProcess
>
> Process
> -1737172944 Semaphore>critical:
> -1737173128 Delay>unschedule
> -1737231056 Semaphore>waitTimeoutMSecs:
> -1737231548 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737231640 Socket>waitForDataFor:
> -1737231424 [] in TMessageRelay>runReaderProcess
> -1737231732 BlockContext>on:do:
> -1742538860 TMessageRelay>runReaderProcess
> -1742534432 [] in Object>fork:at:
> -1742534340 [] in BlockContext>newProcess
>
> Process
> -1737172668 Semaphore>critical:
> -1737172852 Delay>unschedule
> -1737230128 Semaphore>waitTimeoutMSecs:
> -1737230620 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737230712 Socket>waitForDataFor:
> -1737230496 [] in TMessageRelay>runReaderProcess
> -1737230804 BlockContext>on:do:
> -1742519864 TMessageRelay>runReaderProcess
> -1742515436 [] in Object>fork:at:
> -1742515344 [] in BlockContext>newProcess
>
> Process
> -1737172392 Semaphore>critical:
> -1737172576 Delay>unschedule
> -1737279024 Semaphore>waitTimeoutMSecs:
> -1737279116 Socket>waitForConnectionFor:ifTimedOut:
> -1737279484 Socket>waitForConnectionFor:
> -1737279392 [] in Socket>waitForAcceptFor:
> -1737279588 BlockContext>on:do:
> -1737279696 Socket>waitForAcceptFor:
> -1743045624 TConnectionDispatcher>runServer
> -1743045808 [] in Object>fork:at:
> -1743045716 [] in BlockContext>newProcess
>
> Process
> -1737172116 Semaphore>critical:
> -1737172300 Delay>unschedule
> -1737311692 Semaphore>waitTimeoutMSecs:
> -1737312152 Socket>waitForDataFor:ifClosed:ifTimedOut:
> -1737312244 Socket>waitForDataFor:
> -1737312060 [] in TMessageRelay>runReaderProcess
> -1737312336 BlockContext>on:do:
> -1737314988 TMessageRelay>runReaderProcess
> -1737314752 [] in Object>fork:at:
> -1737314660 [] in BlockContext>newProcess
>
> Process
> -1737171840 Semaphore>critical:
> -1737172024 Delay>unschedule
> -1737302368 Semaphore>waitTimeoutMSecs:
> -1737302460 Socket>waitForConnectionFor:ifTimedOut:
> -1737302644 Socket>waitForConnectionFor:
> -1737302920 [] in Socket>waitForAcceptFor:
> -1737302736 BlockContext>on:do:
> -1737303012 Socket>waitForAcceptFor:
> -1737303196 [] in RFBServer>runLoop
> -1737303288 BlockContext>on:do:
> -1742896304 RFBServer>runLoop
> -1742896092 [] in RFBServer>start:
> -1742896000 [] in BlockContext>newProcess
>
> Process
> -1749883968 Semaphore>wait
> -1749884060 Python class>runCallbackProcess
> -1749884336 [] in Object>fork:at:
> -1749884152 [] in BlockContext>newProcess
>
> Process
> -1743109816 Semaphore>wait
> -1743109948 InputSensor>userInterruptWatcher
> -1743110152 [] in InputSensor>installInterruptWatcher
> -1743110040 [] in BlockContext>newProcess
>
> Process
> -1737224332 Semaphore>wait
> -1737224424 Delay>wait
> -1737224608 [] in EventSensor>eventTickler
> -1737224700 BlockContext>on:do:
> -1743109356 EventSensor>eventTickler
> -1743109632 [] in EventSensor>installEventTickler
> -1743109448 [] in BlockContext>newProcess
>
> Process
> -1743108200 Semaphore>wait
> -1743108444 SystemDictionary>lowSpaceWatcher
> -1743108628 [] in SystemDictionary>installLowSpaceWatcher
> -1743108352 [] in BlockContext>newProcess
>
> Process
> -1743103636 Semaphore>wait
> -1743103544 [] in UnixOSProcessAccessor>grimReaperProcess
> -1743103728 BlockContext>repeat
> -1743106440 [] in UnixOSProcessAccessor>grimReaperProcess
> -1743106348 [] in BlockContext>newProcess
>
> Process
> -1737181332 Semaphore>wait
> -1737181424 SharedQueue>next
> -1743047256 TSessionDispatcher>runLoggerProcess
> -1743047000 [] in Object>fork:at:
> -1743046908 [] in BlockContext>newProcess
>
> Process
> -1737453796 Semaphore>wait
> -1737453888 Delay>wait
> -1743007480 QServiceProviderV2>runUpdateProcess
> -1743043708 [] in Object>fork:at:
> -1743043524 [] in BlockContext>newProcess
>
> Process
> -1737287448 Semaphore>wait
> -1737287540 SharedQueue>next
> -1742549512 TMessageRouter>runEventLoop
> -1742549336 [] in Object>fork:at:
> -1742549244 [] in BlockContext>newProcess
>
> Process
> -1737287884 Semaphore>wait
> -1737287976 SharedQueue>next
> -1742543240 TMessageRelay>runWriterProcess
> -1742543424 [] in Object>fork:at:
> -1742543332 [] in BlockContext>newProcess
>
> Process
> -1737223064 Semaphore>wait
> -1737223156 SharedQueue>next
> -1742540032 TMessageRouter>runEventLoop
> -1742539856 [] in Object>fork:at:
> -1742539764 [] in BlockContext>newProcess
>
> Process
> -1737223840 Semaphore>wait
> -1737223932 SharedQueue>next
> -1742533760 TMessageRelay>runWriterProcess
> -1742533944 [] in Object>fork:at:
> -1742533852 [] in BlockContext>newProcess
>
> Process
> -1737215796 Semaphore>wait
> -1737215888 SharedQueue>next
> -1742530616 TMessageRouter>runEventLoop
> -1742530440 [] in Object>fork:at:
> -1742530348 [] in BlockContext>newProcess
>
> Process
> -1737222124 Semaphore>wait
> -1737222216 SharedQueue>next
> -1742524344 TMessageRelay>runWriterProcess
> -1742524528 [] in Object>fork:at:
> -1742524436 [] in BlockContext>newProcess
>
> Process
> -1737222880 Semaphore>wait
> -1737222972 SharedQueue>next
> -1742521036 TMessageRouter>runEventLoop
> -1742520860 [] in Object>fork:at:
> -1742520768 [] in BlockContext>newProcess
>
> Process
> -1737223536 Semaphore>wait
> -1737223628 SharedQueue>next
> -1742514764 TMessageRelay>runWriterProcess
> -1742514948 [] in Object>fork:at:
> -1742514856 [] in BlockContext>newProcess
>
> Process
> -1737214824 Semaphore>wait
> -1737214928 SharedQueue>next
> -1742511620 TMessageRouter>runEventLoop
> -1742511444 [] in Object>fork:at:
> -1742511352 [] in BlockContext>newProcess
>
> Process
> -1737215492 Semaphore>wait
> -1737215584 SharedQueue>next
> -1742505348 TMessageRelay>runWriterProcess
> -1742505532 [] in Object>fork:at:
> -1742505440 [] in BlockContext>newProcess
>
> Process
> -1737215308 Semaphore>wait
> -1737215400 SharedQueue>next
> -1742502204 TMessageRouter>runEventLoop
> -1742502028 [] in Object>fork:at:
> -1742501936 [] in BlockContext>newProcess
>
> Process
> -1737223248 Semaphore>wait
> -1737223340 SharedQueue>next
> -1742495932 TMessageRelay>runWriterProcess
> -1742496116 [] in Object>fork:at:
> -1742496024 [] in BlockContext>newProcess
>
> Process
> -1742275584 Semaphore>wait
> -1742275676 Socket>waitForDataIfClosed:
> -1742275768 Socket>waitForData
> -1742275860 TConnectionDispatcher>acceptConnectionFrom:
> -1742276136 [] in TConnectionDispatcher>futureAcceptConnectionFrom:
> -1742275952 BlockContext>on:do:
> -1742277028 [] in TConnectionDispatcher>futureAcceptConnectionFrom:
> -1742276936 [] in BlockContext>newProcess
>
> Process
> -1737204640 Semaphore>wait
> -1737204732 SharedQueue>next
> -1738680880 TMessageRouter>runEventLoop
> -1738680752 [] in Object>fork:at:
> -1738680660 [] in BlockContext>newProcess
>
> Process
> -1737221816 Semaphore>wait
> -1737221908 SharedQueue>next
> -1738657952 TMessageRelay>runWriterProcess
> -1738658136 [] in Object>fork:at:
> -1738658044 [] in BlockContext>newProcess
>
> Process
> -1737211292 Semaphore>wait
> -1737211384 SharedQueue>next
> -1738649596 TMessageRelay>runWriterProcess
> -1738649872 [] in Object>fork:at:
> -1738649688 [] in BlockContext>newProcess
>
> Process
> -1737211800 Semaphore>wait
> -1737220784 Delay>wait
> -1738506672 TMessageRouterClient>runHeartbeat
> -1738506996 [] in Object>fork:at:
> -1738506904 [] in BlockContext>newProcess
>
> Process
> -1737276872 Semaphore>wait
> -1737276964 SharedQueue>next
> -1738372976 TMessageRelay>runWriterProcess
> -1738373160 [] in Object>fork:at:
> -1738373068 [] in BlockContext>newProcess
>
> Process
> -1737211108 Semaphore>wait
> -1737211200 SharedQueue>next
> -1738369928 TMessageRouter>runEventLoop
> -1738369800 [] in Object>fork:at:
> -1738369708 [] in BlockContext>newProcess
>
> Process
> -1737212260 Semaphore>wait
> -1737212352 SharedQueue>next
> -1738360920 TMessageRelay>runWriterProcess
> -1738361196 [] in Object>fork:at:
> -1738361012 [] in BlockContext>newProcess
>
> Process
> -1737211892 Semaphore>wait
> -1737220876 Delay>wait
> -1738305304 TMessageRouterClient>runHeartbeat
> -1738305592 [] in Object>fork:at:
> -1738305500 [] in BlockContext>newProcess
>
> Process
> -1737299840 Semaphore>wait
> -1737299932 Delay>wait
> -1738303108 [] in TAudioStreamInput>runPinger
> -1738303244 BlockContext>repeat
> -1738303376 TAudioStreamInput>runPinger
> -1738303652 [] in TAudioStreamInput>startPinger
> -1738303468 [] in BlockContext>newProcess
>
> Process
> -1737301448 Semaphore>wait
> -1737301540 Delay>wait
> -1737950096 [] in TAudioStreamInput>runPinger
> -1737950232 BlockContext>repeat
> -1737950364 TAudioStreamInput>runPinger
> -1737950640 [] in TAudioStreamInput>startPinger
> -1737950456 [] in BlockContext>newProcess
>
> Process
> -1737211984 Semaphore>wait
> -1737212076 SharedQueue>next
> -1737323020 TMessageRouter>runEventLoop
> -1737322892 [] in Object>fork:at:
> -1737322800 [] in BlockContext>newProcess
>
> Process
> -1737320664 Semaphore>wait
> -1737320756 SharedQueue>next
> -1737320572 [] in TErrorReportPlugin class>runLoggerProcess
> -1737320848 BlockContext>on:do:
> -1737321304 TErrorReportPlugin class>runLoggerProcess
> -1737321212 [] in Object>fork:at:
> -1737321120 [] in BlockContext>newProcess
>
> Process
> -1737219576 Semaphore>wait
> -1737219668 SharedQueue>next
> -1737314076 TMessageRelay>runWriterProcess
> -1737314352 [] in Object>fork:at:
> -1737314168 [] in BlockContext>newProcess
>
> Process
> -1737220992 Semaphore>wait
> -1737221084 Delay>wait
> -1737312544 TMessageRouterClient>runHeartbeat
> -1737312924 [] in Object>fork:at:
> -1737312740 [] in BlockContext>newProcess
>
> Process
> -1737294120 Semaphore>wait
> -1737294268 Delay>wait
> -1742503340 [] in TMediaControlEndpoint>runPinger
> -1742503616 BlockContext>repea
>
>
>
Reply | Threaded
Open this post in threaded view
|

Re: Odd image lock-up on Unix

Andreas.Raab
In reply to this post by Bert Freudenberg
 
Bert Freudenberg wrote:
> There was an increasing number of reports of Linux VMs freezing or
> crashing lately. We suspect some possibly unrelated problems. One is due
> to oops-as-signed, a symptom is that GC fails to find the end of memory.
> The other seems related to async i/o. In yet others (and this may be
> yours) we see nanosleeps that are called with huge negative microsecond
> counts, yet higher up in the stack the value was 1000 microseconds.

No, it wasn't the problem. I think we got *that* problem in our second
lockup today:

(gdb) where
#0  0x007e0402 in ?? ()
#1  0x0019d7d0 in __nanosleep_nocancel () from /lib/libc.so.6
#2  0x0806fc03 in aioSleep (microSeconds=-1075955024) at
/home/araab/QwaqVM/platforms/unix/vm/aio.c:273
#3  0x00b34001 in display_ioRelinquishProcessorForMicroseconds
(microSeconds=1000)
     at /home/araab/QwaqVM/platforms/unix/vm-display-X11/sqUnixX11.c:2304
#4  0x08058d20 in ioRelinquishProcessorForMicroseconds (us=1000) at
/home/araab/QwaqVM/platforms/unix/vm/sqUnixMain.c:471
#5  0x0805ed9b in primitiveRelinquishProcessor () at gnu-interp.c:17940
#6  0x0806d64e in interpret () at gnu-interp.c:7882
#7  0x0805a4e8 in main (argc=8, argv=0xbfde4f14, envp=0xbfde4f38) at
/home/araab/QwaqVM/platforms/unix/vm/sqUnixMain.c:1396


Note the negative microSeconds in aioSleep() and also note how the
process dump (attached at the end) doesn't have the critical:[] sections
all over it.

So there are at least two problems here although it does make me wonder
that we got this so suddenly after (previously) zero problems. The two
things that have changed on our end were both inside the image: For one
the delay changes, but the other one is that we started to aggressively
timeout network connections (too aggressively actually) which causes a
lot of processes to be terminated, sockets destroyed etc. Previously we
never did that, so maybe there is a race condition or something that
leaves the unix socket code in a sick state?

Cheers,
   - Andreas

Process dump:
Process
-1741776716 ProcessorScheduler class>idleProcess
-1742284720 [] in ProcessorScheduler class>startUp
-1742284628 [] in BlockContext>newProcess

Process
-1714822836 Semaphore>wait
-1742256716 WeakArray class>finalizationProcess
-1742256900 [] in WeakArray class>restartFinalizationProcess
-1742256808 [] in BlockContext>newProcess

Process
-1714565076 Semaphore>wait
-1754828908 Delay class>timerInterruptWatcher
-1754828568 [] in Delay class>startTimerInterruptWatcher
-1754828476 [] in BlockContext>newProcess

Process
-1749060672 Semaphore>wait
-1749060764 Python class>runCallbackProcess
-1749061040 [] in Object>fork:at:
-1749060856 [] in BlockContext>newProcess

Process
-1742286520 Semaphore>wait
-1742286652 InputSensor>userInterruptWatcher
-1742286856 [] in InputSensor>installInterruptWatcher
-1742286744 [] in BlockContext>newProcess

Process
-1714738232 Semaphore>wait
-1714739244 Delay>wait
-1714738968 [] in EventSensor>eventTickler
-1714739336 BlockContext>on:do:
-1742286060 EventSensor>eventTickler
-1742286336 [] in EventSensor>installEventTickler
-1742286152 [] in BlockContext>newProcess

Process
-1742284904 Semaphore>wait
-1742285148 SystemDictionary>lowSpaceWatcher
-1742285332 [] in SystemDictionary>installLowSpaceWatcher
-1742285056 [] in BlockContext>newProcess

Process
-1718494260 Semaphore>wait
-1742275748 [] in UnixOSProcessAccessor>grimReaperProcess
-1742275840 BlockContext>repeat
-1742283044 [] in UnixOSProcessAccessor>grimReaperProcess
-1742282952 [] in BlockContext>newProcess

Process
-1714766652 Semaphore>wait
-1714766744 SharedQueue>next
-1742228012 QLogger>runLoggerProcess
-1742228196 [] in Object>fork:at:
-1742228104 [] in BlockContext>newProcess

Process
-1718568664 Semaphore>wait
-1718568756 SharedQueue>next
-1742198140 TIslandController>runEventLoop
-1742197864 [] in Object>fork:at:
-1742197772 [] in BlockContext>newProcess

Process
-1718567340 Semaphore>wait
-1718567588 TPromise>wait
-1742189848 [] in TMediaControlEndpoint>runPinger
-1742190124 BlockContext>repeat
-1742190216 TMediaControlEndpoint>runPinger
-1742190032 [] in TMediaControlEndpoint>startPinger
-1742190308 BlockContext>on:do:
-1742190624 [] in TMediaControlEndpoint>startPinger
-1742190440 [] in BlockContext>newProcess

Process
-1714963228 Semaphore>wait
-1714963320 SharedQueue>next
-1718535404 TIslandController>runEventLoop
-1718535128 [] in Object>fork:at:
-1718535036 [] in BlockContext>newProcess

Process
-1714563460 Semaphore>wait
-1714564432 Delay>wait
-1718492696 [] in QVNCScreen>runWatchdog
-1718492832 BlockContext>repeat
-1718492964 QVNCScreen>runWatchdog
-1718493240 [] in QVNCScreen>startWatchdog
-1718493056 [] in BlockContext>newProcess

Process
-1715192564 Semaphore>wait
-1715192656 Socket>waitForDataIfClosed:
-1715192748 Socket>waitForData
-1718489960 [] in RFBClientTalker>clientRunLoop
-1718489408 BlockContext>whileTrue:
-1718489684 [] in RFBSocket>runSafely:
-1718490052 BlockContext>on:do:
-1718489868 [] in RFBSocket>runSafely:
-1718490144 BlockContext>ensure:
-1718490236 RFBSocket>runSafely:
-1718490328 RFBClientTalker>clientRunLoop
-1718491004 [] in RFBClientTalker>connectTo:display:portOffset:
-1718490912 [] in BlockContext>newProcess

Process
-1714961064 Semaphore>wait
-1714961312 TPromise>wait
-1718527168 [] in TMediaControlEndpoint>runPinger
-1718527076 BlockContext>repeat
-1718527260 TMediaControlEndpoint>runPinger
-1718527444 [] in TMediaControlEndpoint>startPinger
-1718527536 BlockContext>on:do:
-1718527852 [] in TMediaControlEndpoint>startPinger
-1718527668 [] in BlockContext>newProcess

Process
-1714596252 Semaphore>wait
-1714596344 SharedQueue>next
-1714810692 TIslandController>runEventLoop
-1714810416 [] in Object>fork:at:
-1714810324 [] in BlockContext>newProcess

Process
-1714634468 Semaphore>wait
-1714634560 SharedQueue>next
-1714804448 TMessageRelay>runWriterProcess
-1714804632 [] in Object>fork:at:
-1714804540 [] in BlockContext>newProcess

Process
-1714670904 Semaphore>wait
-1714671848 Semaphore>waitTimeoutMSecs:
-1714672616 Socket>waitForDataFor:ifClosed:ifTimedOut:
-1714672708 Socket>waitForDataFor:
-1714672216 [] in TMessageRelay>runReaderProcess
-1714672800 BlockContext>on:do:
-1714805036 TMessageRelay>runReaderProcess
-1714805220 [] in Object>fork:at:
-1714805128 [] in BlockContext>newProcess

Process
-1714594604 Semaphore>wait
-1714594696 SharedQueue>next
-1714799248 TIslandController>runEventLoop
-1714798972 [] in Object>fork:at:
-1714798880 [] in BlockContext>newProcess

Process
-1714622296 Semaphore>wait
-1714622388 SharedQueue>next
-1714793004 TMessageRelay>runWriterProcess
-1714793188 [] in Object>fork:at:
-1714793096 [] in BlockContext>newProcess

Process
-1714662828 Semaphore>wait
-1714663772 Semaphore>waitTimeoutMSecs:
-1714664540 Socket>waitForDataFor:ifClosed:ifTimedOut:
-1714664632 Socket>waitForDataFor:
-1714664140 [] in TMessageRelay>runReaderProcess
-1714664724 BlockContext>on:do:
-1714793592 TMessageRelay>runReaderProcess
-1714793776 [] in Object>fork:at:
-1714793684 [] in BlockContext>newProcess

Process
-1714592956 Semaphore>wait
-1714593048 SharedQueue>next
-1714787096 TIslandController>runEventLoop
-1714786820 [] in Object>fork:at:
-1714786728 [] in BlockContext>newProcess

Process
-1714610116 Semaphore>wait
-1714610208 SharedQueue>next
-1714780388 TMessageRelay>runWriterProcess
-1714780572 [] in Object>fork:at:
-1714780480 [] in BlockContext>newProcess

Process
-1714654728 Semaphore>wait
-1714655672 Semaphore>waitTimeoutMSecs:
-1714656440 Socket>waitForDataFor:ifClosed:ifTimedOut:
-1714656532 Socket>waitForDataFor:
-1714656040 [] in TMessageRelay>runReaderProcess
-1714656624 BlockContext>on:do:
-1714780976 TMessageRelay>runReaderProcess
-1714781160 [] in Object>fork:at:
-1714781068 [] in BlockContext>newProcess

Process
-1714591308 Semaphore>wait
-1714591400 SharedQueue>next
-1714772080 TIslandController>runEventLoop
-1714771804 [] in Object>fork:at:
-1714771712 [] in BlockContext>newProcess

Process
-1714597936 Semaphore>wait
-1714598028 SharedQueue>next
-1714765440 TMessageRelay>runWriterProcess
-1714765624 [] in Object>fork:at:
-1714765532 [] in BlockContext>newProcess

Process
-1714646620 Semaphore>wait
-1714647564 Semaphore>waitTimeoutMSecs:
-1714648332 Socket>waitForDataFor:ifClosed:ifTimedOut:
-1714648424 Socket>waitForDataFor:
-1714647932 [] in TMessageRelay>runReaderProcess
-1714648516 BlockContext>on:do:
-1714766028 TMessageRelay>runReaderProcess
-1714766212 [] in Object>fork:at:
-1714766120 [] in BlockContext>newProcess

Process
-1714756184 Semaphore>wait
-1714756320 Delay>wait
-1742085088 QAppServer>runUpdateProcess
-1742137448 [] in Object>fork:at:
-1742137264 [] in BlockContext>newProcess

Process
-1714596528 Semaphore>wait
-1714597428 Delay>wait
-1714802356 [] in TMediaControlEndpoint>runPinger
-1714802632 BlockContext>repeat
-1714802724 TMediaControlEndpoint>runPinger
-1714802540 [] in TMediaControlEndpoint>startPinger
-1714802816 BlockContext>on:do:
-1714803132 [] in TMediaControlEndpoint>startPinger
-1714802948 [] in BlockContext>newProcess

Process
-1714594880 Semaphore>wait
-1714595780 Delay>wait
-1714791004 [] in TMediaControlEndpoint>runPinger
-1714790912 BlockContext>repeat
-1714791096 TMediaControlEndpoint>runPinger
-1714791280 [] in TMediaControlEndpoint>startPinger
-1714791372 BlockContext>on:do:
-1714791688 [] in TMediaControlEndpoint>startPinger
-1714791504 [] in BlockContext>newProcess

Process
-1714593232 Semaphore>wait
-1714594132 Delay>wait
-1714777792 [] in TMediaControlEndpoint>runPinger
-1714777700 BlockContext>repeat
-1714777884 TMediaControl
Reply | Threaded
Open this post in threaded view
|

Re: Odd image lock-up on Unix

johnmci
 
I've attached a sqUnixSocket.c that I had poked at earlier in the  
year. It has a lot more error checking for errors after
system calls.

Note the call to SysBeep() can be commented out, they were in there  
for debugging purposes.

On Apr 2, 2007, at 3:57 PM, Andreas Raab wrote:

> aioSleep

--
========================================================================
===
John M. McIntosh <[hidden email]>
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
========================================================================
===




sqUnixSocket.c (552 bytes) Download Attachment
sqUnixSocket.c (40K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Odd image lock-up on Unix

johnmci
In reply to this post by Andreas.Raab
 
The other thing I noted was in

aio.c


aioFini()
did not check for errno == EINTR on close(), perhaps we don't care.

      n= select(maxFd, &rd, &wr, &ex, &tv);
we don't check for EINTR, perhaps we don't care.  I'll note on the  
select() if we choose that logic that EINTR is a fairly usual case.
Mind I wonder if we have to do the select again to fully service any  
pending interrupts?

and
          nanosleep(&rqtp, &rmtp);   /* EINTR here, but likely we want to  
wake up? */


On Apr 2, 2007, at 3:57 PM, Andreas Raab wrote:

> aioSleep

--
========================================================================
===
John M. McIntosh <[hidden email]>
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
========================================================================
===


Reply | Threaded
Open this post in threaded view
|

Re: Odd image lock-up on Unix

Andreas.Raab
 
What I'm much more curious about is this:

#2  0x0806fc03 in aioSleep (microSeconds=-1075955024) at
/home/araab/QwaqVM/platforms/unix/vm/aio.c:273
#3  0x00b34001 in display_ioRelinquishProcessorForMicroseconds
(microSeconds=1000)
     at /home/araab/QwaqVM/platforms/unix/vm-display-X11/sqUnixX11.c:2304

How can a call to display_ioRelinquishProcessorForMicroseconds(1000) end
up to be a call to aioSleep(-1075955024) given that it is implemented
like here:

static sqInt display_ioRelinquishProcessorForMicroseconds(sqInt
microSeconds)
{
   aioSleep(handleEvents() ? 0 : microSeconds);
   return 0;
}

The only difference I see is that the argument is declared as sqInt but
that aioSleep takes an int - but I wouldn't expect that to cause
problems. Unless it's been declared even differently from that, but in
this case I would expect every call to go wrong.

Any theories to offer anyone? (other than cosmic rays)

Cheers,
   - Andreas
Reply | Threaded
Open this post in threaded view
|

Re: Odd image lock-up on Unix

johnmci
 
well no, microSeconds shouldn't be -1075955024

I'll note in aioPoll we have

       now= ioMSecs();
       microSeconds -= (now - ms) * 1000;
       if (microSeconds <= 0)
        return 0;


maybe the aioPoll is inlined into the aioSleep ?
Did you look at the assembler?

On Apr 2, 2007, at 8:15 PM, Andreas Raab wrote:

> What I'm much more curious about is this:
>
> #2  0x0806fc03 in aioSleep (microSeconds=-1075955024) at /home/
> araab/QwaqVM/platforms/unix/vm/aio.c:273
> #3  0x00b34001 in display_ioRelinquishProcessorForMicroseconds  
> (microSeconds=1000)
>     at /home/araab/QwaqVM/platforms/unix/vm-display-X11/sqUnixX11.c:
> 2304
>
> How can a call to display_ioRelinquishProcessorForMicroseconds
> (1000) end up to be a call to aioSleep(-1075955024) given that it  
> is implemented like here:
>
> static sqInt display_ioRelinquishProcessorForMicroseconds(sqInt  
> microSeconds)
> {
>   aioSleep(handleEvents() ? 0 : microSeconds);
>   return 0;
> }
>
> The only difference I see is that the argument is declared as sqInt  
> but that aioSleep takes an int - but I wouldn't expect that to  
> cause problems. Unless it's been declared even differently from  
> that, but in this case I would expect every call to go wrong.
>
> Any theories to offer anyone? (other than cosmic rays)
>
> Cheers,
>   - Andreas

--
========================================================================
===
John M. McIntosh <[hidden email]>
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
========================================================================
===