On Tue, 2011-08-23 at 13:32 -0700, Jon Paynter wrote:
> This may or may not help..... > > But I ran into exactly the same symptoms while upgrading from vw77 to > vw78. The image would launch, the cpu would peg at 100%, and memory > use would grow to 500mb and then just sit there. > > In my case the problem was I was using a vw77 VM to open a vw78 image > -- easy to fix. It's definitely a 7.7.1 VM with a 7.7 image. That led to the problems finding the source on windows (the origin of this thread, under "vw7.7.1nc can't find source on Windows"), but other than that it ran OK on Windows. I was only running on Windows because of the problems discussed here running it in squeeze. Under what OS or OS's did you notice your 7.7 -> 7.8 problem? Since the image has stuff in it, migrating the code and objects out is slightly non-trivial (also discussed in this thread, though with different subject). But, at this point, it might be the least trivial solution:) I know that, under squeeze, the 7.7.1 VM runs a stock 7.7.1 image fine. Still, I wonder what the problem is and would like to diagnose it. Ross > > maybe that can point your investigation in a useful direction. > > > On Tue, Aug 23, 2011 at 1:18 PM, Ross Boylan <[hidden email]> > wrote: > On Tue, 2011-08-23 at 02:16 -0700, Andres Valloud wrote: > > Under -o11s, 100% CPU utilization is a side effect of very > expensive > > checks being performed while running the IGC in > interruptible mode. > > Basically, the length of the checks effectively ensure the > IGC will > > always be interrupted before you can do any work. Go to the > memory > > policy and you will find something equivalent to > > > > idleLoopIGCAction > > > > [ > > self igcObjects: nil interruptibly: true. > > self updateMemoryStatus. > > self memoryStatus incGCIsResting > > ] whileFalse. > > self updateNumScavengesAsOfLastGC. > > ObjectMemory resetSpaceLimitsIfHigher. > > (self shouldForkCompactData or: [self > dataNeedsCompacting]) > > ifTrue: [self forkCompactData]. > > self shrinkMemoryIfRequired > > > > > > (your implementation may vary since I've been cleaning up > that stuff). > > Either comment out the IGC code, or change the code so that > the idle > > loop never decides to use the IGC. > > > > Is all the code above IGC code, or are there parts of it that > should > stay? > Ross > > > > > On 8/23/2011 12:03 AM, Holger Kleinsorgen wrote: > > > Hi Ross > > > > > > try to launch the image with > > > > > > vmlinux86dbg -o10s visual.im > debug.log > > > > > > All message sends will be printed to stdout. I assume that > there is an > > > infinite recursion, which should be diagnosable by looking > at the output. > > > > > > If it's a headfull image, my buest guess is an infinite > debugger > > > recursion. You might want to try to add the following > parcel to the > > > image and run it on Squeeze: > > > > > > http://dl.dropbox.com/u/21555916/StartupPatch.zip > > > > > > it will create an error.log file when the windowing system > is not active. > > > > > > On 23.08.2011 06:33, Ross Boylan wrote: > > >> I started the program with vmlinux86dbg -o11s about 4.5 > days ago. It > > >> continues to run, spitting out messages on the launching > terminal like > > >> checkMarksInOt: incremenal GC was conservative by xxx > objects. > > >> It uses all of one CPU, and has slowly grown to 442M. > > >> > > >> Since the process is 51M when it runs successfully on > lenny, and the > > >> image file is 37M on disk, something has likely already > gone very wrong. > > >> > > >> I attached the debugger to it, but I am unable to > reattach. When I do > > >> so, I get the message > > >> Attaching to program ... > > >> Cannot access memory at address xxxx > > >> A program is being debugged already. Kill it? (y or n) > > > > > > > > > _______________________________________________ > > > vwnc mailing list > > > [hidden email] > > > http://lists.cs.uiuc.edu/mailman/listinfo/vwnc > > _______________________________________________ > > vwnc mailing list > > [hidden email] > > http://lists.cs.uiuc.edu/mailman/listinfo/vwnc > > _______________________________________________ > vwnc mailing list > [hidden email] > http://lists.cs.uiuc.edu/mailman/listinfo/vwnc > > > _______________________________________________ > vwnc mailing list > [hidden email] > http://lists.cs.uiuc.edu/mailman/listinfo/vwnc _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
On Tue, Aug 23, 2011 at 2:02 PM, Ross Boylan <[hidden email]> wrote:
This was win7 x64 Since the image has stuff in it, migrating the code and objects out is Do a sanity check & Verify you are running the right VM? _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
In reply to this post by Ross Boylan
> Even with a regular vm and no debug options, the process used 100% CPU
> until it was around 450M, and then CPU dropped to 0. No windows ever > opened; sometimes I'm pretty sure it eventually crashed. Ok, then let's use -o10s, redirect the output to a file, and see what happens. _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
In reply to this post by Ross Boylan
You should be able to comment out the whole thing without ill effect.
Or you can go to the sender and change the boolean expression so this method is never called. On 8/23/2011 1:18 PM, Ross Boylan wrote: > On Tue, 2011-08-23 at 02:16 -0700, Andres Valloud wrote: >> Under -o11s, 100% CPU utilization is a side effect of very expensive >> checks being performed while running the IGC in interruptible mode. >> Basically, the length of the checks effectively ensure the IGC will >> always be interrupted before you can do any work. Go to the memory >> policy and you will find something equivalent to >> >> idleLoopIGCAction >> >> [ >> self igcObjects: nil interruptibly: true. >> self updateMemoryStatus. >> self memoryStatus incGCIsResting >> ] whileFalse. >> self updateNumScavengesAsOfLastGC. >> ObjectMemory resetSpaceLimitsIfHigher. >> (self shouldForkCompactData or: [self dataNeedsCompacting]) >> ifTrue: [self forkCompactData]. >> self shrinkMemoryIfRequired >> >> >> (your implementation may vary since I've been cleaning up that stuff). >> Either comment out the IGC code, or change the code so that the idle >> loop never decides to use the IGC. >> > Is all the code above IGC code, or are there parts of it that should > stay? > Ross >> >> On 8/23/2011 12:03 AM, Holger Kleinsorgen wrote: >>> Hi Ross >>> >>> try to launch the image with >>> >>> vmlinux86dbg -o10s visual.im> debug.log >>> >>> All message sends will be printed to stdout. I assume that there is an >>> infinite recursion, which should be diagnosable by looking at the output. >>> >>> If it's a headfull image, my buest guess is an infinite debugger >>> recursion. You might want to try to add the following parcel to the >>> image and run it on Squeeze: >>> >>> http://dl.dropbox.com/u/21555916/StartupPatch.zip >>> >>> it will create an error.log file when the windowing system is not active. >>> >>> On 23.08.2011 06:33, Ross Boylan wrote: >>>> I started the program with vmlinux86dbg -o11s about 4.5 days ago. It >>>> continues to run, spitting out messages on the launching terminal like >>>> checkMarksInOt: incremenal GC was conservative by xxx objects. >>>> It uses all of one CPU, and has slowly grown to 442M. >>>> >>>> Since the process is 51M when it runs successfully on lenny, and the >>>> image file is 37M on disk, something has likely already gone very wrong. >>>> >>>> I attached the debugger to it, but I am unable to reattach. When I do >>>> so, I get the message >>>> Attaching to program ... >>>> Cannot access memory at address xxxx >>>> A program is being debugged already. Kill it? (y or n) >>> >>> >>> _______________________________________________ >>> vwnc mailing list >>> [hidden email] >>> http://lists.cs.uiuc.edu/mailman/listinfo/vwnc >> _______________________________________________ >> vwnc mailing list >> [hidden email] >> http://lists.cs.uiuc.edu/mailman/listinfo/vwnc > > vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
In reply to this post by Ross Boylan
Hello,
On 23.08.2011 18:27, Ross Boylan wrote: > On Tue, 2011-08-23 at 09:03 +0200, Holger Kleinsorgen wrote: >> Hi Ross >> >> try to launch the image with >> >> vmlinux86dbg -o10s visual.im> debug.log >> >> All message sends will be printed to stdout. I assume that there is an >> infinite recursion, which should be diagnosable by looking at the output. > > Is -o10s going to show me anything that -o11s doesn't already? > -o10s and -o11s are completely different switches. -o11s enables additional assertions -o10s enables printing all message sends to stdout it will look like that: ---FromImm---a Snapshot>>postSnapshotBootstrap 1 ---FromImm---a DeferredBinding>>value 2 ---FromImm---a DeferredBinding>>resolveBinding 3 ---FromOrd---a ByteSymbol>>isSymbol 4 ---FromOrd---a CompiledMethod>>mclass 5 ---FromOrd---a Class (Snapshot)>>isNil 6 ---FromOrd---a Class (Snapshot)>>bindingFor: 7 ---FromImm---a ResolvedDeferredConstant>>value 8 ---FromOrd---a Class (NameSpaceSearchRules)>>new 9 ---FromSup---a Class (NameSpaceSearchRules)>>new 10 ---FromOrd---a NameSpaceSearchRules>>initialize 11 ---FromImm---a ResolvedDeferredConstant>>value 12 ---FromOrd---a Class (OrderedCollection)>>new: 13 ---FromSup---a Class (OrderedCollection)>>new: 14 ---FromOrd---a OrderedCollection>>setIndices 15 ---FromImm---a Class (Snapshot)>>bindingFor:modifiers: 16 ---FromImm---a Class (Snapshot)>>classPool 17 ---FromOrd---a NameSpaceOfClass>>bindingFor:modifiers: 18 ---FromOrd---a NameSpaceSearchRules>>hasVisited: 19 ---FromOrd---a OrderedCollection>>includes: 20 ---FromImm---a NameSpaceOfClass>>asKey: 21 ---FromOrd---a ByteSymbol>>asSymbol 22 and so on... so my suggestion is to 1. launch your image with vmlinux86dbg -o10s visual.im > debug.log 2. kill the process after a few seconds 3. have a look at debug.log and see what's going on >> >> If it's a headfull image, my buest guess is an infinite debugger >> recursion. > > Why do you think that? Although I'm running under the debugger, the > misbehavior (huge memory, high CPU, no windows) occurred with a regular > VM. I did not mean the gdb debugger, but the Smalltalk debugger, which handles exceptions. It will fail to handle exceptions if the windowing system is not active. But it will usually cause a core dump quite fast, so I don't think anymore that this is what you're experiencing. _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
In reply to this post by Andres Valloud-6
On Tue, 2011-08-23 at 23:37 -0700, Andres Valloud wrote:
> > Even with a regular vm and no debug options, the process used 100% CPU > > until it was around 450M, and then CPU dropped to 0. No windows ever > > opened; sometimes I'm pretty sure it eventually crashed. > > Ok, then let's use -o10s, redirect the output to a file, and see what > happens. I get a very big file! I may be combining different pieces of advice in ways not intended. Here's the setup: 1. I took a slightly later version of the problem image and loaded the parcel http://dl.dropbox.com/u/21555916/StartupPatch.zip as Holger suggested. This was under Debian lenny with the 7.7 VM. 2. I left the idle loop untouched, since the code is quite a bit different in my image. I think the operative part is MemoryPolicy>>idleLoopAction "Run the incremental GC until interrupted if the potential benefits seem to warrant the effort. This message should be sent only if the system has nothing better to do." self updateMemoryStatus. self tooManyAllocationProbes ifTrue: [self primeThreadedDataList ifFalse: [[self compactData] forkAt: Processor userInterruptPriority]]. self idleLoopGCJustified ifTrue: [[ObjectMemory primIncrementalGCIsInterruptable: true objects: SmallInteger maxVal bytes: SmallInteger maxVal. self updateMemoryStatus. memoryStatus incrementalGCState == #resting] whileFalse. NumScavengesAsOfLastGC := memoryStatus numScavenges. ObjectMemory resetSpaceLimitsIfHigher. self dataNeedsCompacting ifTrue: [[self compactData] forkAt: Processor userInterruptPriority]] There is no idleLoopIGCAction. 3. I copied the image to squeeze, with the 7.7.1 VM. 4. launch with -o10s. The log shows various Prim fail in src/mman/mmAllocate.c @ various addresses (many 245) Prim fail in src/prim/prArithmetic.c @ 229 (while doing FontDescription calls) It shows some of my application code executing (some application windows are open in the original image). The bottom 98% of the log is dominated by --FromPerf---a Class (ColorValue)>>scaledRed:scaledGren:scaledBlue: NNNNN with intermittent blocks of Prim fail in scr/mman/mmSubscript.c @ NNN (e.g., 802) I'll try to send some more details from the machine where this was running. Ross _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
In reply to this post by Andres Valloud-6
On Tue, 2011-08-23 at 23:37 -0700, Andres Valloud wrote:
> > Even with a regular vm and no debug options, the process used 100% CPU > > until it was around 450M, and then CPU dropped to 0. No windows ever > > opened; sometimes I'm pretty sure it eventually crashed. > > Ok, then let's use -o10s, redirect the output to a file, and see what > happens. Here is the first 1.6M of the log file, gzipped. The log file was up to 78M when I stopped it, but the last 98%, as far as I can tell, is a repeat of the ColorValue>>scaledRed:scaledGreen:scaledBlue messages interspersed with Prim fail in src/mman/mmSubscript.c @ 802 (among other addresses). I also attach the stack trace from gdb when I broke into the process (when I attached to it; it was launched outside of the debugger). Ross _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
In reply to this post by Ross Boylan
> 2. I left the idle loop untouched, since the code is quite a bit
> different in my image. I think the operative part is > MemoryPolicy>>idleLoopAction > "Run the incremental GC until interrupted if the potential benefits > seem to warrant the effort. This message should be sent only if the > system has nothing better to do." > > self updateMemoryStatus. > self tooManyAllocationProbes ifTrue: > [self primeThreadedDataList ifFalse: > [[self compactData] forkAt: Processor userInterruptPriority]]. > self idleLoopGCJustified ifTrue: > [[ObjectMemory > primIncrementalGCIsInterruptable: true > objects: SmallInteger maxVal > bytes: SmallInteger maxVal. > self updateMemoryStatus. > memoryStatus incrementalGCState == #resting] whileFalse. > NumScavengesAsOfLastGC := memoryStatus numScavenges. > ObjectMemory resetSpaceLimitsIfHigher. > self dataNeedsCompacting ifTrue: > [[self compactData] forkAt: Processor userInterruptPriority]] > > There is no idleLoopIGCAction. Yes, that has been refactored. Just replace "self idleLoopGCJustified" with "false". > 3. I copied the image to squeeze, with the 7.7.1 VM. > 4. launch with -o10s. > > The log shows various > Prim fail in src/mman/mmAllocate.c @ various addresses (many 245) Basically, #new failed. The result is to do GC or some such. Normal. > Prim fail in src/prim/prArithmetic.c @ 229 (while doing FontDescription > calls) Harmless, it's because the code is doing aSmallInteger * aFloat or stuff like that, which is handled by the image in the failure code. But newer code prevents the failures so you don't have to deal with them so much during debugging. > It shows some of my application code executing (some application windows > are open in the original image). > > The bottom 98% of the log is dominated by > --FromPerf---a Class (ColorValue)>>scaledRed:scaledGren:scaledBlue: > NNNNN Typical GUI code. > with intermittent blocks of > Prim fail in scr/mman/mmSubscript.c @ NNN (e.g., 802) That's another failure that shouldn't have been there to begin with. The latest build has a fix for this and other failures. The net result is a general speedup of background operations. Andres. _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
In reply to this post by Ross Boylan
Sorry, I don't see anything interesting in the file. There are only 26k
message sends logged there. On 8/25/2011 10:41 PM, Ross Boylan wrote: > On Tue, 2011-08-23 at 23:37 -0700, Andres Valloud wrote: >>> Even with a regular vm and no debug options, the process used 100% CPU >>> until it was around 450M, and then CPU dropped to 0. No windows ever >>> opened; sometimes I'm pretty sure it eventually crashed. >> >> Ok, then let's use -o10s, redirect the output to a file, and see what >> happens. > Here is the first 1.6M of the log file, gzipped. The log file was up to > 78M when I stopped it, but the last 98%, as far as I can tell, is a > repeat of the ColorValue>>scaledRed:scaledGreen:scaledBlue messages > interspersed with > Prim fail in src/mman/mmSubscript.c @ 802 (among other addresses). > > I also attach the stack trace from gdb when I broke into the process > (when I attached to it; it was launched outside of the debugger). > > Ross vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
In reply to this post by Holger Kleinsorgen-4
On Tue, 2011-08-23 at 09:03 +0200, Holger Kleinsorgen wrote:
> Hi Ross > > try to launch the image with > > vmlinux86dbg -o10s visual.im > debug.log > > All message sends will be printed to stdout. I assume that there is an > infinite recursion, which should be diagnosable by looking at the output. > > If it's a headfull image, my buest guess is an infinite debugger > recursion. You might want to try to add the following parcel to the > image and run it on Squeeze: > > http://dl.dropbox.com/u/21555916/StartupPatch.zip > > it will create an error.log file when the windowing system is not active. error.log file showed up in any directories I checked. I did at least get an error at src/mman/mmScavenge.c:2427 Fatal error. Out of memory. It ran for just over 24 hours. Ross > > On 23.08.2011 06:33, Ross Boylan wrote: > > I started the program with vmlinux86dbg -o11s about 4.5 days ago. It > > continues to run, spitting out messages on the launching terminal like > > checkMarksInOt: incremenal GC was conservative by xxx objects. > > It uses all of one CPU, and has slowly grown to 442M. > > > > Since the process is 51M when it runs successfully on lenny, and the > > image file is 37M on disk, something has likely already gone very wrong. > > > > I attached the debugger to it, but I am unable to reattach. When I do > > so, I get the message > > Attaching to program ... > > Cannot access memory at address xxxx > > A program is being debugged already. Kill it? (y or n) > _______________________________________________ > vwnc mailing list > [hidden email] > http://lists.cs.uiuc.edu/mailman/listinfo/vwnc _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
> I ran vmlinux86dbg without any options, but with StartupPatch. No
> error.log file showed up in any directories I checked. > > I did at least get an error at > src/mman/mmScavenge.c:2427 > Fatal error. > Out of memory. > > It ran for just over 24 hours. Ok, use the -xq switch on the same VM and try to replicate the mmScavenge problem. When it crashes, you should get a file called stack.txt. Post it here, it should have clues as to what's happening. Also, to make it crash faster, first save the image with a memory upper bound of e.g.: 128mb. To do that, go to the memory tab under the image settings, then save the image, and use that image to run the test. Andres. _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
On Sun, 2011-08-28 at 02:20 -0700, Andres Valloud wrote:
> > I ran vmlinux86dbg without any options, but with StartupPatch. No > > error.log file showed up in any directories I checked. > > > > I did at least get an error at > > src/mman/mmScavenge.c:2427 > > Fatal error. > > Out of memory. > > > > It ran for just over 24 hours. > > Ok, use the -xq switch on the same VM and try to replicate the > mmScavenge problem. When it crashes, you should get a file called > stack.txt. Post it here, it should have clues as to what's happening. > Also, to make it crash faster, first save the image with a memory upper > bound of e.g.: 128mb. To do that, go to the memory tab under the image > settings, then save the image, and use that image to run the test. > > Andres. font-related. This doesn't appear to be the same problem as before. I've also include the error.log, which I believe was produced by the StartupPatch mentioned earlier in this thread. Ross _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
I think I've found and fixed the problem, at least with the current
image. I was missing a font the image needed. A couple nagging issues remain. Details below. On Tue, 2011-08-30 at 20:26 -0700, Ross Boylan wrote: > On Sun, 2011-08-28 at 02:20 -0700, Andres Valloud wrote: > > > I ran vmlinux86dbg without any options, but with StartupPatch. No > > > error.log file showed up in any directories I checked. > > > > > > I did at least get an error at > > > src/mman/mmScavenge.c:2427 > > > Fatal error. > > > Out of memory. > > > > > > It ran for just over 24 hours. > > > > Ok, use the -xq switch on the same VM and try to replicate the > > mmScavenge problem. When it crashes, you should get a file called > > stack.txt. Post it here, it should have clues as to what's happening. > > Also, to make it crash faster, first save the image with a memory upper > > bound of e.g.: 128mb. To do that, go to the memory tab under the image > > settings, then save the image, and use that image to run the test. > > > > Andres. > Here are the results (lowered memory to 128M and used -xq with debug > vm). The program ended almost immediately, and it seems to be > font-related. This doesn't appear to be the same problem as before. > > I've also include the error.log, which I believe was produced by the > StartupPatch mentioned earlier in this thread. The failure mode has changed. When I run with no options it fails quickly, with error.log again showing "There are no fonts matching this description". Run with -o10s I get a different set of message sends than before, most notably one that is relatively brief (still 1MB, with 15k message sends--but this is for the entire session). I (with Cincom's help) tweaked my image (including its code) to use Verdana as the font. This was to fix horrible fonts I saw with the default image. I didn't specify a fall-back list of fonts. All that was way before the current problem arose. Verdana is present on my Debian Lenny system, but absent from the squeeze system. It's an MS font; I assume it's on Windows. So the real issue is the availability of the font, not the OS or flavor of OS. When I installed the font on squeeze the image launched! Yea! Actually, it failed until I logged out and logged back in, although the installation claimed it was registering the fonts right away. There are two remaining loose ends. First, the successful launch (with vwlinux86dbg but no options) did print out (on the terminal from which it was launched) many "lost time signal in waitForIOWithTimeout, milliseconds 4" messages, with milliseconds ranging from 0 through 10. Does this indicate a problem, or is it part of the normal chatter? Second, it's odd that the startup failure seems to have changed. It's easy to imagine an infinite loop as the system attempts to open a window saying it can't find the font, and it fails to find the font for the new window, etc. That would explain the previous behavior of high CPU use, long runtime, slowly increasing memory, and ultimate failure with memory exhaustion. But why I should have got an infinite loop before and a single failure now I don't know. I did open a new window (system settings) when I reset the memory, and saved the image with that window open. Ross _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
In reply to this post by Ross Boylan
> Here are the results (lowered memory to 128M and used -xq with debug
> vm). The program ended almost immediately, and it seems to be > font-related. This doesn't appear to be the same problem as before. At first glance, it looks like stack.txt shows an MNU problem. I don't immediately see how to reproduce the MNU. Can you play with it a little bit (increase memory, or run it again, or something like that) and see if you can get the error.log to show what the MNU is? _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
In reply to this post by Ross Boylan
For now, this is normal debug message chatter.
On 8/30/2011 10:15 PM, Ross Boylan wrote: > First, the successful launch (with vwlinux86dbg but no options) did > print out (on the terminal from which it was launched) many > "lost time signal in waitForIOWithTimeout, milliseconds 4" > messages, with milliseconds ranging from 0 through 10. Does this > indicate a problem, or is it part of the normal chatter? _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
Free forum by Nabble | Edit this page |