On Debian8, CogSpur64 5.0-201612221637, Cuis 5.0 3043... after lots of user interaction, views opening and closing, and animation, the image becomes unresponsive. Cmd+. does nothing and the clock in the Cuis taskbar no longer updates. The length of time to produce this varies from 5 to 20 minutes. There seems to be no dump and no log. 'Smalltalk garbageCollectMost' reports 25550736 to 30517872 over 34 samples. Sorry to be so vague. This has occurred with several images - sometimes scrolling through a senders list, or stepping through a debugger, but repeatably with one of my images. I can supply this image, warmed up so that it might not take so long to reproduce the problem if you would like. Or give me a hint as to how to narrow down the problem - Dan |
Hi Dan, You can send the USR1 signal to the VM process to make it write some debug information to the console. With this information you can easily tell what's happening. I suspect it's stuck in a long GC. Levente On Fri, 27 Jan 2017, Dan Norton wrote: > > On Debian8, CogSpur64 5.0-201612221637, Cuis 5.0 3043... > > after lots of user interaction, views opening and closing, and > animation, the image becomes unresponsive. Cmd+. does nothing and the > clock in the Cuis taskbar no longer updates. The length of time to > produce this varies from 5 to 20 minutes. > > There seems to be no dump and no log. 'Smalltalk garbageCollectMost' > reports 25550736 to 30517872 over 34 samples. > > Sorry to be so vague. This has occurred with several images - sometimes > scrolling through a senders list, or stepping through a debugger, but > repeatably with one of my images. I can supply this image, warmed up so > that it might not take so long to reproduce the problem if you would > like. Or give me a hint as to how to narrow down the problem > > - Dan > > |
In reply to this post by Dan Norton
Hi Levente, I directed stdout to a file and prior to the freeze, did: pkill -USR1 -n -x squeak a few times and noticed "stack overflow" mentioned twice. After the freeze, pkill sent nothing else to stdout. Maybe something in stderr? I'll try again and concatenate stderr to the end of stdout. Meanwhile, the file with the stack overflows is attached. Thanks for the help. - Dan On Sat, 28 Jan 2017, Levente Uzonyi wrote: >Hi Dan, >You can send the USR1 signal to the VM process to make it write some debug >information to the console. With this information you can easily tell >what's happening. I suspect it's stuck in a long GC. >Levente >On Fri, 27 Jan 2017, Dan Norton wrote: >> >> On Debian8, CogSpur64 5.0-201612221637, Cuis 5.0 3043... >> >> after lots of user interaction, views opening and closing, and >> animation, the image becomes unresponsive. Cmd+. does nothing and the >> clock in the Cuis taskbar no longer updates. The length of time to >> produce this varies from 5 to 20 minutes. >> >> There seems to be no dump and no log. 'Smalltalk garbageCollectMost' >> reports 25550736 to 30517872 over 34 samples. >> >> Sorry to be so vague. This has occurred with several images - sometimes >> scrolling through a senders list, or stepping through a debugger, but >> repeatably with one of my images. I can supply this image, warmed up so >> that it might not take so long to reproduce the problem if you would >> like. Or give me a hint as to how to narrow down the problem freeze.bz2 (3K) Download Attachment |
Highly speculative idea... from a shell could you log your system time to see if anything there correlates. cheers -ben On Sun, Jan 29, 2017 at 11:40 AM, Dan Norton <[hidden email]> wrote: > > > Hi Levente, > > I directed stdout to a file and prior to the freeze, did: > > pkill -USR1 -n -x squeak > > a few times and noticed "stack overflow" mentioned twice. After the freeze, > pkill sent nothing else to stdout. Maybe something in stderr? I'll try > again and concatenate stderr to the end of stdout. Meanwhile, the file with > the stack overflows is attached. > > Thanks for the help. > > - Dan > > On Sat, 28 Jan 2017, Levente Uzonyi wrote: > > >>Hi Dan, > >>You can send the USR1 signal to the VM process to make it write some debug >>information to the console. With this information you can easily tell >>what's happening. I suspect it's stuck in a long GC. > >>Levente > >>On Fri, 27 Jan 2017, Dan Norton wrote: > >>> >>> On Debian8, CogSpur64 5.0-201612221637, Cuis 5.0 3043... >>> >>> after lots of user interaction, views opening and closing, and >>> animation, the image becomes unresponsive. Cmd+. does nothing and the >>> clock in the Cuis taskbar no longer updates. The length of time to >>> produce this varies from 5 to 20 minutes. >>> >>> There seems to be no dump and no log. 'Smalltalk garbageCollectMost' >>> reports 25550736 to 30517872 over 34 samples. >>> >>> Sorry to be so vague. This has occurred with several images - sometimes >>> scrolling through a senders list, or stepping through a debugger, but >>> repeatably with one of my images. I can supply this image, warmed up so >>> that it might not take so long to reproduce the problem if you would >>> like. Or give me a hint as to how to narrow down the problem > > |
In reply to this post by Dan Norton
Hi Dan, The expected output is in the file you attached. There seem to be no active processes to run, so only the idle process is being executed: All Smalltalk process stacks (active first): Process 0x1c9a9c8 priority 10 0x7ffc012e81c0 M ProcessorScheduler class>idleProcess 0x15d2a30: a(n) ProcessorScheduler class 0x7ffc012e8200 I [] in ProcessorScheduler class>startUp 0x15d2a30: a(n) ProcessorScheduler class 0x7ffc012e8240 I [] in BlockClosure>newProcess 0x1d1aa40: a(n) BlockClosure Most recent primitives relinquishProcessorForMicroseconds: relinquishProcessorForMicroseconds: relinquishProcessorForMicroseconds: relinquishProcessorForMicroseconds: relinquishProcessorForMicroseconds: relinquishProcessorForMicroseconds: ... Among the suspended processes is the UI process, which is waiting on a Delay. So, there may be issues on the image side with Delays. Process 0x1c9a790 priority 40 0x7ffc012e60d8 M [] in Delay>wait 0x1c9ab48: a(n) Delay 0x7ffc012e6118 M BlockClosure>ifCurtailed: 0x11b1228: a(n) BlockClosure 0x7ffc012e6150 M Delay>wait 0x1c9ab48: a(n) Delay 0x7ffc012e6190 M WorldState>doOneCycleFor: 0x15cd6e0: a(n) WorldState 0x7ffc012e61c8 M PasteUpMorph>doOneCycle 0x165f890: a(n) PasteUpMorph 0x7ffc012e6200 M [] in ProjectX class>spawnNewMorphicProcessFor: 0x15d7900: a(n) ProjectX class 0x7ffc012e6240 I [] in BlockClosure>newProcess 0x1d1a940: a(n) BlockClosure I don't see any sign of stack overflow you mentioned. Levente |
On Sun, Jan 29, 2017 at 7:02 PM, Levente Uzonyi <[hidden email]> wrote:
So as part of exploring the problem, an awful hack would be deleting that #wait from #dowOneCycleFor:. cheers -ben 0x7ffc012e61c8 M PasteUpMorph>doOneCycle 0x165f890: a(n) PasteUpMorph |
In reply to this post by Levente Uzonyi
On 01/29/2017 06:02 AM, Levente Uzonyi wrote: > > Hi Dan, > > The expected output is in the file you attached. There seem to be no > active processes to run, so only the idle process is being executed: > > All Smalltalk process stacks (active first): > Process 0x1c9a9c8 priority 10 > 0x7ffc012e81c0 M ProcessorScheduler class>idleProcess 0x15d2a30: > a(n) ProcessorScheduler class > 0x7ffc012e8200 I [] in ProcessorScheduler class>startUp 0x15d2a30: > a(n) ProcessorScheduler class > 0x7ffc012e8240 I [] in BlockClosure>newProcess 0x1d1aa40: a(n) > BlockClosure > > Most recent primitives > relinquishProcessorForMicroseconds: > relinquishProcessorForMicroseconds: > relinquishProcessorForMicroseconds: > relinquishProcessorForMicroseconds: > relinquishProcessorForMicroseconds: > relinquishProcessorForMicroseconds: > ... > > Among the suspended processes is the UI process, which is waiting on a > Delay. So, there may be issues on the image side with Delays. > > Process 0x1c9a790 priority 40 > 0x7ffc012e60d8 M [] in Delay>wait 0x1c9ab48: a(n) Delay > 0x7ffc012e6118 M BlockClosure>ifCurtailed: 0x11b1228: a(n) > BlockClosure > 0x7ffc012e6150 M Delay>wait 0x1c9ab48: a(n) Delay > 0x7ffc012e6190 M WorldState>doOneCycleFor: 0x15cd6e0: a(n) WorldState > 0x7ffc012e61c8 M PasteUpMorph>doOneCycle 0x165f890: a(n) PasteUpMorph > 0x7ffc012e6200 M [] in ProjectX class>spawnNewMorphicProcessFor: > 0x15d7900: a(n) ProjectX class > 0x7ffc012e6240 I [] in BlockClosure>newProcess 0x1d1a940: a(n) > BlockClosure > > I don't see any sign of stack overflow you mentioned. There are 3 sets of USR1 outputs. The overflows are at lines 600 and 622 in the second(?) output. - Dan |
In reply to this post by Ben Coman
On 01/29/2017 07:48 AM, Ben Coman
wrote:
Hi Ben, I started the image by doing this, in effect: /home/dan/cuis/cogspur64/squeak /home/dan/cuis/Cuis-Smalltalk-Dev/a64.image > freeze.log 2>&1 After loading packages containing my code I deleted: (Delay for seconds: ... ) wait. and ran for a while, did the 'pkill -USR1 -n -x squeak'; ran some more until it froze, did another 'pkill'. The stdout looked like the other runs. There was no sign of stderr output. - Dan
|
In reply to this post by Ben Coman
I'd rather check if the delay would be longer than a second. If it would, then log the value and trim it to a second. Levente |
On Sun, Jan 29, 2017 at 6:35 PM, Levente Uzonyi <[hidden email]> wrote:
Does Cmd-. normally work on this VM+image if the UI process is waiting on a Delay? - Bert - |
In reply to this post by Ben Coman
On 01/29/2017 12:11 AM, Ben Coman wrote: > > Highly speculative idea... from a shell could you log your system time > to see if anything there correlates. Hi Ben, This is what I did: # tail -f /var/log/messages > /home/dan/systime.log Up to a freeze, this was written: Jan 29 09:52:51 debian kernel: [ 12.957276] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready Jan 29 09:52:55 debian org.a11y.Bus[1050]: Activating service name='org.a11y.atspi.Registry' Jan 29 09:52:55 debian org.a11y.Bus[1050]: Successfully activated service 'org.a11y.atspi.Registry' Jan 29 09:52:55 debian org.a11y.atspi.Registry[1056]: SpiRegistry daemon is running with well-known name - org.a11y.atspi.Registry Jan 29 09:53:08 debian org.a11y.Bus[1127]: Activating service name='org.a11y.atspi.Registry' Jan 29 09:53:08 debian org.a11y.Bus[1127]: Successfully activated service 'org.a11y.atspi.Registry' Jan 29 09:53:08 debian org.a11y.atspi.Registry[1182]: SpiRegistry daemon is running with well-known name - org.a11y.atspi.Registry Jan 29 09:57:50 debian rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="859" x-info="http://www.rsyslog.com"] rsyslogd was HUPed Jan 29 10:07:44 debian rsyslogd0: action 'action 17' resumed (module 'builtin:ompipe') [try http://www.rsyslog.com/e/0 ] Jan 29 10:07:44 debian rsyslogd-2359: action 'action 17' resumed (module 'builtin:ompipe') [try http://www.rsyslog.com/e/2359 ] - Dan > cheers -ben > > On Sun, Jan 29, 2017 at 11:40 AM, Dan Norton <[hidden email]> wrote: >> >> Hi Levente, >> >> I directed stdout to a file and prior to the freeze, did: >> >> pkill -USR1 -n -x squeak >> >> a few times and noticed "stack overflow" mentioned twice. After the freeze, >> pkill sent nothing else to stdout. Maybe something in stderr? I'll try >> again and concatenate stderr to the end of stdout. Meanwhile, the file with >> the stack overflows is attached. >> >> Thanks for the help. >> >> - Dan >> >> On Sat, 28 Jan 2017, Levente Uzonyi wrote: >> >> >>> Hi Dan, >>> You can send the USR1 signal to the VM process to make it write some debug >>> information to the console. With this information you can easily tell >>> what's happening. I suspect it's stuck in a long GC. >>> Levente >>> On Fri, 27 Jan 2017, Dan Norton wrote: >>>> On Debian8, CogSpur64 5.0-201612221637, Cuis 5.0 3043... >>>> >>>> after lots of user interaction, views opening and closing, and >>>> animation, the image becomes unresponsive. Cmd+. does nothing and the >>>> clock in the Cuis taskbar no longer updates. The length of time to >>>> produce this varies from 5 to 20 minutes. >>>> >>>> There seems to be no dump and no log. 'Smalltalk garbageCollectMost' >>>> reports 25550736 to 30517872 over 34 samples. >>>> >>>> Sorry to be so vague. This has occurred with several images - sometimes >>>> scrolling through a senders list, or stepping through a debugger, but >>>> repeatably with one of my images. I can supply this image, warmed up so >>>> that it might not take so long to reproduce the problem if you would >>>> like. Or give me a hint as to how to narrow down the problem >> |
In reply to this post by Bert Freudenberg
Hi Levente and Bert, The arg to #forSeconds: is 0.4 and Cmd-. does not interrupt this image during a Delay. - Dan On 01/29/2017 12:58 PM, Bert
Freudenberg wrote:
|
On Sun, Jan 29, 2017 at 8:27 PM, Dan Norton <[hidden email]> wrote:
Hmm, that might be worth investigating. It works fine with Cuis5.0-3040-spur-64 on a Mac VM. - Bert - |
On 01/29/2017 03:49 PM, Bert
Freudenberg wrote:
Correction: fat fingers here was holding down the wrong key. Cmd-. (as in alt-.) does interrupt this image. - Dan |
On Sun, Jan 29, 2017 at 10:31 PM, Dan Norton <[hidden email]> wrote:
Good. At least a modest amount of sanity prevails ;) But you're saying that in the abnormal case, alt-. does not work, yes? - Bert - |
In reply to this post by Dan Norton
btw, I was looking for time jumping backwards. Seems okay.
cheers -ben On Mon, Jan 30, 2017 at 2:56 AM, Dan Norton <[hidden email]> wrote:
|
In reply to this post by Bert Freudenberg
On 01/29/2017 06:13 PM, Bert
Freudenberg wrote:
When the freeze happens, alt-. has no effect. - Dan |
In reply to this post by Ben Coman
OK - I was not sure how to "log my system time".
- Dan
On 01/29/2017 06:29 PM, Ben Coman
wrote:
|
In reply to this post by Bert Freudenberg
The plot thickens, without clarity... On two occasions when the freeze occurred, things locked up to the extent that the system pointer in Debian could not be moved. Some keyboard events were accepted: ctl-esc followed by arrow keys alt-space followed by arrow keys I noticed that 'pkill -USR1 -n -x squeak' will append to crash.dmp in the Cuis-Smalltalk-Dev directory. No redirection of stdout needed. In the last run there were 4 stack overflows. Are these of any relevance? copyBits @ @ @ @ @ @ @ @ @ @ fractionPart truncated fractionPart truncated @ **StackOverflow** copyBits @ @ @ @ @ @ @ @ @ @ fractionPart truncated fractionPart truncated @ **StackOverflow** scanCharactersFrom:to:in:rightX:stopConditions:kern: @ fractionPart truncated fractionPart truncated @ **StackOverflow** scanCharactersFrom:to:in:rightX:stopConditions:kern: @ fractionPart truncated fractionPart truncated @ **StackOverflow** - Dan |
In reply to this post by Dan Norton
On Mon, Jan 30, 2017 at 9:22 AM, Dan Norton <[hidden email]> wrote: > > > OK - I was not sure how to "log my system time". Something like this would do... #!/bin/bash while true do before=$now now=`date +%H:%M:%S` echo "$now" b=`date --date="$before" +%s` n=`date --date="$now" +%s` if [ $b -gt $n ]; then echo "!!!" fi sleep 1 done cheers -ben |
Free forum by Nabble | Edit this page |