On Sat, Mar 1, 2014 at 4:09 PM, Jan Vrany <[hidden email]> wrote: Maybe I'm missing something obvious, but from my point of I disagree. What I use for serious VM profiling is a hybrid approach that collects data in the VM but uses the image to display and interact with it. The VMProfiler, which is in CogTools in the VMMaker repository is IMNAAHO a very nice VM profiler.
In the VM there is a thread that interrupts the main VM thread periodically, sampling its pc and storing this in a circular buffer. The thread and buffer are manipulated by a couple of primitives. Alongside is a platform-specific plugin to reveal the dlls loaded in the VM. The JIT reports current machine code methods and PICs through a primitive.
Above the VM, the profiler uses OSProcess to invoke tools like objdump or nm to extract symbols and their addresses from the VM and loaded dlls, and uses morphic to draw graphs of the pc activity, allowing one to zoom in to look at specific code sequences. It generates reports such as the following:
/Users/eliot/Cog/oscogvm/spurcogbuild/macbuild/Fast.app/Contents/MacOS/Squeak 3/4/2014 eden size: 2,603,344 stack pages: 160 code size: 1,048,576 0 tinyBenchmarks
gc prior. clear prior. 4.945 seconds; sampling frequency 1451 hz 7149 samples in the VM (7175 samples in the entire program) 99.64% of total
6533 samples in generated vm code 91.38% of entire vm (91.05% of total) 616 samples in vanilla vm code 8.62% of entire vm ( 8.59% of total) % of generated vm code (% of total) (samples) (cumulative)
49.04% (44.66%) Integer>>benchFib (3204) (49.04%) 19.10% (17.39%) Integer>>benchmark (1248) (68.15%)
15.18% (13.83%) Object>>at:put: (992) (83.33%) 12.23% (11.14%) SmallInteger>>+ (799) (95.56%)
4.18% ( 3.80%) Object>>at: (273) (99.74%) 0.15% ( 0.14%) ceMethodAbort0Args (10) (99.89%)
0.05% ( 0.04%) ceBaseFram...Trampoline(3) (99.94%) 0.02% ( 0.01%) ceEnterCog...ceiverReg (1) (99.95%)
0.02% ( 0.01%) Array>>repla...startingAt:(1) (99.97%) 0.02% ( 0.01%) Sequenceabl...om:to:put: (1) (99.98%)
0.02% ( 0.01%) SmallInteger>>- (1) (100.0%) % of vanilla vm code (% of total) (samples) (cumulative) 27.27% ( 2.34%) scavengingGCTenuringIf (168) (27.27%)
22.89% ( 1.97%) moveFramesInthroughtoPage (141) (50.16%) 20.13% ( 1.73%) primitiveStringReplace (124) (70.29%)
7.63% ( 0.66%) instantiateClassindexableSize (47) (77.92%) 6.49% ( 0.56%) marryFrameSP (40) (84.42%)
4.71% ( 0.40%) handleStackOverflow (29) (89.12%) 4.06% ( 0.35%) ceBaseFrameReturn (25) (93.18%)
1.79% ( 0.15%) ceStackOverflow (11) (94.97%) 1.30% ( 0.11%) scavengeReferentsOf (8) (96.27%)
0.65% ( 0.06%) scavengeLoop (4) (96.92%) 0.65% ( 0.06%) numStrongSlots...eronInactiveIf (4) (97.56%)
0.49% ( 0.04%) ioUTCMicroseconds (3) (98.05%) 0.32% ( 0.03%) checkForEvent...ontextSwitch (2) (98.38%)
0.32% ( 0.03%) primitiveNewWithArg (2) (98.70%) 0.32% ( 0.03%) unlinkSolitaryFreeTreeNode (2) (99.03%)
0.32% ( 0.03%) minCogMethodAddress (2) (99.35%) 0.16% ( 0.01%) allocateSlotsInO...ormatclassIndex(1) (99.51%)
0.16% ( 0.01%) copyAndForward (1) (99.68%) 0.16% ( 0.01%) processWeaklings (1) (99.84%)
0.16% ( 0.01%) returnToExecuti...tContextSwitch(1) (100.0%) **Memory**
old +412,400 bytes young -412,400 bytes
used +0 bytes free +0 bytes
**GCs** full 0 totalling 0ms (0% elapsed time) incr 425 totalling 138ms (2.8000000000000003% elapsed time), avg 0.32ms
tenures 0 root table 0 overflows
**Compiled Code Compactions** 0 totalling 0ms (0% elapsed time) **Events** Process switches 52 (11 per second)
ioProcessEvents calls 245 (50 per second) Interrupt checks 2866 (580 per second) Event checks 3189 (645 per second)
Stack overflows 1043888 (211100 per second) Stack page divorces 0 (0 per second) Here's what the interface looks like: In this case the region of code being displayed is where the JIT keeps the machine code for methods, and the list to the left is some of the methods in the code zone. The pink graph in the main region is the integral of the black pc histogram.
Jan best, Eliot
|
Free forum by Nabble | Edit this page |