Weird problem when adding a method to StackInterpreter

Mariano Martinez Peck
 
Hi guys. I am migrating some old stuff I have about tracing object usage to the latest version of Cog. I have a method in StackInterpreter which looks like this:

traceObjectUsage: anOop
    ((self isIntegerObject: anOop) not and: [hasToTrace])
        ifTrue: [objectMemory setExperimentalBitOf: anOop to: true]

So I have also added #setExperimentalBitOf:to: to ObjectMemory, and I have added the instVar 'hasToTrace' to StackInterpreter. From several StackInterpreter methods I now add sends to this method to trace oops. This works fine, but there is one particular method where I cannot do it: #lookupInMethodCacheSel:class:. I am trying to do something like:

lookupInMethodCacheSel: selector class: class
    "This method  ....."

    | hash probe rcvr |
    <inline: true>
    <asmLabel: false>
    self traceObjectUsage: class.
    rcvr := self internalStackValue: argumentCount.
    self traceObjectUsage: rcvr.
    hash := selector bitXor: class.  "shift drops two low-order zeros from addresses"
 .......

In this particular method, the problem is with the two lines:
    rcvr := self internalStackValue: argumentCount.
    self traceObjectUsage: rcvr.

If I do NOT put those two lines, it compiles perfectly. If I put those lines, I get a link error:

Undefined symbols:
  "_lookupInMethodCacheSelclass", referenced from:
      _lookupreceiver in gcc3x-cointerpmt.c.o
      _handleMNUInMachineCodeToclassForMessage in gcc3x-cointerpmt.c.o
      _ceSendsupertonumArgs in gcc3x-cointerpmt.c.o
      _ceSendFromInLineCacheMiss in gcc3x-cointerpmt.c.o
      _ceSendAborttonumArgs in gcc3x-cointerpmt.c.o
      _interpret in gcc3x-cointerpmt.c.o
      _interpret in gcc3x-cointerpmt.c.o
      _sendInvokeCallbackStackRegistersJmpbuf in gcc3x-cointerpmt.c.o
      _sendInvokeCallbackContext in gcc3x-cointerpmt.c.o
  "_findNewMethodInClass", referenced from:
      _primitivePerform in gcc3x-cointerpmt.c.o
      _primitiveObjectperformwithArgumentslookedUpIn in gcc3x-cointerpmt.c.o
      _primitiveInvokeObjectAsMethod in gcc3x-cointerpmt.c.o
ld: symbol(s) not found
collect2: ld returned 1 exit status
make[2]: *** [/Users/mariano/Pharo/VM/git/cogVMBlessedSSH/blessed/results/CogMTVM.app/Contents/MacOS/CogMTVM] Error 1
make[1]: *** [CMakeFiles/CogMTVM.dir/all] Error 2
make: *** [all] Error 2


Moreover, when I ADD those lines, apart from the make error, when translating from Slang to C I see these "suspicious" lines in the Transcript:

Removed findNewMethodInClass: because it refers to the local variable localSP of interpret.
But it is either used outside of interpret or exported!!
Removed lookupInMethodCacheSel:class: because it refers to the local variable localSP of interpret.
But it is either used outside of interpret or exported!!

BUT none of those methods accesses localSP.

Any idea what could be wrong?

Thanks in advance,


--
Mariano
http://marianopeck.wordpress.com


Re: Weird problem when adding a method to StackInterpreter

Eliot Miranda-2
 


On Wed, Dec 21, 2011 at 2:04 PM, Mariano Martinez Peck <[hidden email]> wrote:
 

For some reason lookupInMethodCacheSel:class: is not being inlined. Since it uses localSP via internalStackValue: it must always be inlined into the body of interpret. What I don't know is why adding a send of traceObjectUsage: after internalStackValue: causes inlining to fail. You're going to have to delve into the inliner in Slang. This is, um, not fun. I liken it to getting hit on the head with a stick by your guru, except that no enlightenment results. Good luck.







--
best,
Eliot


Re: Weird problem when adding a method to StackInterpreter

Mariano Martinez Peck
 



For some reason lookupInMethodCacheSel:class: is not being inlined. Since it uses localSP via internalStackValue: it must always be inlined into the body of interpret. What I don't know is why adding a send of traceObjectUsage: after internalStackValue: causes inlining to fail.


Hi Eliot. I did some more tests and it even fails WITHOUT using #traceObjectUsage:. Just adding the line "rcvr := self internalStackValue: argumentCount." at the beginning of #lookupInMethodCacheSel:class: causes the problem. So even the following fails:

lookupInMethodCacheSel: selector class: class
    "This method....."

    | hash probe rcvr |
    <inline: true>
    <asmLabel: false>
    rcvr := self internalStackValue: argumentCount.
    hash := selector bitXor: class.  "shift drops two low-order zeros from addresses"
......


Weird, eh, because you use #internalStackValue: throughout StackInterpreter in a lot of other places and you don't have problems there.

 
 You're going to have to delve into the inliner in Slang.  This is, um, not fun.  I liken it to getting hit on the head with a stick by your guru, except that no enlightenment results.  Good luck.


:(   thanks.

 



Re: Weird problem when adding a method to StackInterpreter

Eliot Miranda-2
 
Hi Mariano,

On Thu, Dec 22, 2011 at 1:12 AM, Mariano Martinez Peck <[hidden email]> wrote:
 



Weird, eh, because you use #internalStackValue: throughout StackInterpreter in a lot of other places and you don't have problems there.

Turns out it's not weird at all. Since lookupInMethodCacheSel:class: is used outside of interpret, in findNewMethodInClass: and in callback lookup, it can't be inlined and hence can't access localSP. If you want to get the receiver you'll need to use stackValue: *and* you'll need to ensure that stackPointer is updated when calling lookupInMethodCacheSel:class: from internalFindNewMethod (see externalizeFPandSP), which may slow down the interpreter slightly.


 


Re: Weird problem when adding a method to StackInterpreter

Mariano Martinez Peck
 


Turns out it's not weird at all.  Since  lookupInMethodCacheSel:class: is used outside of interpret in findNewMethodInClass: and in callback lookup it can't be inlined and hence can't access localSP.  


Hi Eliot. Thanks for your answer. It also turns out that I don't know enough about Slang ;) so it was not weird at all but expected. OK, I am learning along the way. So I understand that sentence. But (see below)
 
If you want to get the receiver you'll need to use stackValue: *and* you'll need to ensure that stackPointer is updated when calling lookupInMethodCacheSel:class: from internalFindNewMethod (see externalizeFPandSP), which may slow down the interpreter slightly.


I DO understand what #externalizeFPandSP does, but what I don't understand is why I should only do it in #internalFindNewMethod. I mean, what happens with all the rest of the senders of #lookupInMethodCacheSel:class:? Maybe if one of those senders does not update stackPointer (externalizeFPandSP), then in #lookupInMethodCacheSel:class: I will be accessing something wrong?

Anyway, I wanted to trace the receiver in #lookupInMethodCacheSel:class: to avoid doing it in all its senders. But given the problem found, I worked around it by tracing the receiver in its senders (only those inlined), and that seems to work :)


 

 


Re: Weird problem when adding a method to StackInterpreter

Mariano Martinez Peck
 
Eliot, there is one place I am missing and cannot intercept :( which I think makes sense.
If there is a method which has been jitted (#foo: in the following example) and that method sends messages to another method which was also jitted (#name in the example), then I can never intercept the receiver of the second method. Test example:

testTraceOnlyReceiverMethodInsideMethod

    |  obj1 obj2 |
   
    obj1 := 'anObject'.
    obj2 := 'anotherObject'.
    self deny: (tracer isMarked: obj1).
    self deny: (tracer isMarked: obj2).
   
    obj1 foo: obj2.
    obj1 foo: obj2.
    obj1 foo: obj2.
    obj1 foo: obj2.
   
    tracer trace:  [obj1 foo: obj2].
   
    self assert: (tracer isMarked: obj1).
    self assert: (tracer isMarked: obj2).


So if I have

foo: anObject
    anObject name


Then the test fails at self assert: (tracer isMarked: obj2). I imagine it is because it is executing the machine code of #foo:. So my question is whether there is a way I could intercept and trace the receiver there as well? I tried to do it but failed.

Thanks a lot in advance,





Re: Weird problem when adding a method to StackInterpreter

Eliot Miranda-2
 


On Fri, Dec 23, 2011 at 2:59 AM, Mariano Martinez Peck <[hidden email]> wrote:
 
Then the test fails at self assert: (tracer isMarked: obj2). I imagine it is because it is executing the machine code of #foo:. So my question is whether there is a way I could intercept and trace the receiver there as well? I tried to do it but failed.

See the flag word traceLinkedSends in cogit.c.  A bit in the flags causes the JIT to generate a call at the start of a method for tracing:

#define recordSendTrace() (traceLinkedSends & 2)

The result is that ceTraceLinkedSend is called on every send.

HTH
Eliot




Re: Weird problem when adding a method to StackInterpreter

Mariano Martinez Peck
 



See the flag word traceLinkedSends in cogit.c.  A bit in the flags causes the JIT to generate a call at the start of a method for tracing:

#define recordSendTrace() (traceLinkedSends & 2)

The result is that ceTraceLinkedSend is called on every send.


Wow. I cannot believe how easy it was :) Thanks Eliot. So what I did is change Cogit class >> #declareCVarsIn: to set 2 rather than 8:

        var: #traceLinkedSends
            declareC: 'int traceLinkedSends = 2';
   

And then I just added my tracing stuff in #ceTraceLinkedSend.

Thank you very much Eliot, and Happy Christmas to all VM hackers!

 


Re: Weird problem when adding a method to StackInterpreter

Mariano Martinez Peck
 
Hi Eliot. Now I found another thing which caught my attention. I would also like to trace when objects receive messages via the special selectors (those with an associated special bytecode). So for example, I would like to trace an object that receives the message #new, #x, etc. With a StackVM I need to call my method #traceObjectUsage: from the bytecodePrim* methods, usually only when those methods answer before falling through to #normalSend. For example, in #bytecodePrimAdd I trace both the argument and the receiver when they are floats. If I do not add my sends to #traceObjectUsage:, then the receivers are not marked (logically).

Now, what I don't understand is what happens with CogVM. In Cog, even if I don't put my calls to #traceObjectUsage:, the receiver is always marked. I guess this is because I have put #traceObjectUsage: in a lot of general places in Cog. The "problem" is that with #class and #== the receiver is not marked (right now I don't want to discuss whether I should trace this or not). Previously, with the StackVM, if I had the call to #traceObjectUsage: in #bytecodePrimClass and #bytecodePrimEquivalent, then the receiver was marked perfectly. But with Cog I noticed that it doesn't matter what I put in the #bytecodePrim* methods, because they seem never to be executed. Is this correct? Are these special bytecodes always jitted from the very first time, or are they jitted on demand (when they are found in the cache) like the rest of the normal methods? And the main question: what can be the cause of why I can trace with all the #bytecodePrim* methods but not with #class and #==? I am obviously missing a place where I should trace...

Thanks a lot in advance,





Re: Weird problem when adding a method to StackInterpreter

Eliot Miranda-2
 
Hi Mariano,

On Tue, Dec 27, 2011 at 7:05 AM, Mariano Martinez Peck <[hidden email]> wrote:
 
#class and #== are always inlined in jitted code, so if you want to trace them you'll have to modify the JIT to add the tracing code as part of the inlined code. Note that #class and #== must be inlined, not sent, for the semantics to be the same as the interpreter. In the interpreter these are never sent either; the bytecodes for them are executed, and just as in jitted code, the class fetch and the comparison are executed but nothing is sent.

But given that the StackVM and the CogVM are semantically equivalent, do you even need to add tracing code to the JIT? If you're tracing e.g. to discover how much of the object graph a given computation uses, and you're going to use this information for something later on, like creating a kernel image, why not just use the StackVM for tracing?
 

Thanks a lot in advance,



On Mon, Dec 26, 2011 at 10:00 AM, Mariano Martinez Peck <[hidden email]> wrote:


Then the test fails in self assert: (tracer isMarked: obj2).  I imagine it is because it is executing the machine code of #foo: . So my question is if there is a way where I could intercept and trace the receiver also there?   I tried to do it but I failed.

See the flag word traceLinkedSends in cogit.c.  A bit in the flags causes the JIT to generate a call at the start of a method for tracing:

#define recordSendTrace() (traceLinkedSends & 2)

The result is that ceTraceLinkedSend is called on every send.


Wow. I cannot believe how easy it was :)  Thanks Eliot. So what I did is to change Cogit class >> declareCVarsIn: 
to set 2 rather than 8:

        var: #traceLinkedSends
            declareC: 'int traceLinkedSends = 2';
   

And then just add my tracing stuff in #ceTraceLinkedSend

Thank you very much Eliot and Happy Christmas to all VM hackers

 
HTH
Eliot


Thanks a lot in advance,



On Fri, Dec 23, 2011 at 11:23 AM, Mariano Martinez Peck <[hidden email]> wrote:


Weird ehh, because you use #internalStackValue: throughout StackInterpreter in a lot of other places and you don't have problems there.

Turns out it's not weird at all.  Since lookupInMethodCacheSel:class: is used outside of interpret, in findNewMethodInClass: and in callback lookup, it can't be inlined and hence can't access localSP.


Hi Eliot. Thanks for your answer. It also turns out that I don't know enough about Slang ;) so it was not weird at all but expected. Ok, I am learning along the way. So I understand that sentence. But (see below)
 
If you want to get the receiver you'll need to use stackValue: *and* you'll need to ensure that stackPointer is updated when calling lookupInMethodCacheSel:class: from internalFindNewMethod (see externalizeFPandSP), which may slow down the interpreter slightly.


I DO understand what #externalizeFPandSP does, but what I don't understand is why I should only do it in #internalFindNewMethod. I mean, what happens with all the rest of the senders of #lookupInMethodCacheSel:class:? If one of those senders does not update stackPointer (via externalizeFPandSP), then in #lookupInMethodCacheSel:class: will I be accessing something wrong?
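If it helps, the situation can be sketched in C (the names here are mine, not the VM's): interpret() caches the stack pointer in a local, ideally a register, and any function that is not inlined into it can only see the global variable, so each out-of-line caller must write the local back first. A sender that skipped this step would indeed leave the callee reading a stale stack.

```c
/* Hedged sketch of why externalizing the cached SP matters.  The
 * interpreter keeps the stack pointer in a local for speed; code that
 * is NOT inlined into the interpreter loop can only see the global
 * stackPointer, so the local must be written back before the call
 * (cf. externalizeFPandSP). */
#define STACK_SLOTS 16
static long  stack_memory[STACK_SLOTS];
static long *stackPointer;            /* the global, seen by out-of-line code */

/* out-of-line helper: can only use the global pointer */
static long stack_value(int offset) { return stackPointer[offset]; }

/* a caller keeping a cached localSP must sync it before calling out */
static long top_via_helper(long *localSP) {
    stackPointer = localSP;           /* externalize the cached value */
    return stack_value(0);
}
```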

Anyway, I wanted to trace the receiver in #lookupInMethodCacheSel:class: to avoid doing it in all its senders. But given the problem found, I worked around it by tracing the receiver in its senders (only the inlined ones), and that seems to work :)


 

 
 You're going to have to delve into the inliner in Slang.  This is, um, not fun.  I liken it to getting hit on the head with a stick by your guru, except that no enlightenment results.  Good luck.


:(   thanks.

 

Thanks in advance,


--
Mariano
http://marianopeck.wordpress.com





--
best,
Eliot







Re: Weird problem when adding a method to StackInterpreter

Mariano Martinez Peck
 


On Tue, Dec 27, 2011 at 7:02 PM, Eliot Miranda <[hidden email]> wrote:
 
Hi Mariano,

On Tue, Dec 27, 2011 at 7:05 AM, Mariano Martinez Peck <[hidden email]> wrote:
 
Hi Eliot. Now I found another thing which caught my attention. I would also like to trace when objects receive messages via the special selectors (those with an associated special bytecode). So for example, I would like to trace an object that receives the message #new, #x, etc. With a StackVM I need to call my method #traceObjectUsage: from the bytecodePrim* methods, usually only when those methods answer before reaching #normalSend. For example, in #bytecodePrimAdd I trace both the argument and the receiver when they are floats. If I do not add my sends to #traceObjectUsage:, then the receivers are not marked (logically).

Now, what I don't understand is what happens with CogVM. In Cog, even if I don't put my calls to #traceObjectUsage:  the receiver is always marked. I guess this is because I have put #traceObjectUsage: in a lot of general places of Cog. The "problem" is that with #class and #== the receiver is not marked (right now I don't want to discuss whether I should trace this or not) . Previously, with StackVM, if I have the call to #traceObjectUsage: in #bytecodePrimClass and #bytecodePrimEquivalent  then the receiver is marked perfectly. But with Cog I noticed that it doesn't matter what I put in #bytecodePrim*   because they seem they are never executed.  Is this correct?  Are these special bytecode always jitted from the very first time?  or they are jitted on demand (when they are found in the cache) like the rest of the normal methods ?    And the main question, what can be the cause of why I can trace with all the #bytecodePrim*  but not with #class and #== ?   I am obviously missing a place where I should trace....

#class and #== are always inlined in jitted code and so if you want to trace you'll have to modify the jit to add the tracing code as part of the inlined code.

Ahhh, that was it :)  I didn't know that. So now I see that in #initializeBytecodeTableForClosureV3 (or friends) you define them as notMapped:
        #(1 198 198 genSpecialSelectorEqualsEquals needsFrameNever: notMapped -1). "not mapped because it is directly inlined (for now)"
        #(1 199 199 genSpecialSelectorClass needsFrameNever: notMapped 0). "not mapped because it is directly inlined (for now)"
And you have comments there and in the beginning of the method. Ok got it :)
 
 Note that #class and #== must be inlined and not sent for the semantics to be the same as the interpreter.   In the interpreter these are never sent, but the bytecode for them is executed, just as in jitted code, the fetch of class and the comparison are executed but not sent.


I understand and it makes sense. I have only one small doubt. With the rest of the special shortcut bytecodes, such as #bytecodePrimAdd, #bytecodePrimNew, #bytecodePrimGreaterThan, etc., there is usually the same behavior: check whether the receiver is of a certain type (like integers, floats, booleans, arrays, etc.) and if so execute C code instead of the regular message send. Then, if the receiver or argument is not of the expected type, you fall through to #commonSend. Some other shortcut bytecodes just set the selector and argument count, such as #bytecodePrimAtEnd. And then of course you have #class and #==.

Now, in the jit, you seem to use the same method for all of them (all but #class and #==), and it is #genSpecialSelectorSend. That method seems to only set the selector and argument count, in the style of the #bytecodePrimAtEnd case I mentioned. So my question is: is it ok to assume that when you JIT those special methods they "stop making much sense" (in fact, they make less sense), since the only thing you do is set the selector and argumentCount? What I mean is that the jitted version of #+, for example, will be generated as a regular send (using genSend: selector numArgs: numArgs) rather than checking that the receivers are integers and, if so, answering directly (as #bytecodePrimAdd does). Am I correct?
 
But given that the stack vm and the cog vm are semantically equivalent do you even need to add tracing code to the jit? If you're tracing e.g. to discover how much of the object graph a given computation uses and you;re going to use this information for something later on, like creating a kernel image or something, why not just use the stack vm for tracing?

Indeed :)
Thanks for going beyond my questions. For this thing I am doing (I call it ObjectUsageTracer) we have two users so far:
- Luc is trying to do a bootstrap/kernel. In that scenario he can PERFECTLY use the StackVM, since the computation of used/unused objects is mostly done once and then that information is used.
- In Marea (what I am doing for my PhD), I want to dynamically detect unused objects, swap them out and replace them with proxies. That means the system needs to detect these unused objects all the time; it is not something I do just once. Anyway, I could use the StackVM, no problem. But with Cog I can improve the performance of my solution hehehe. So I wanted to give it a try and see if I could make the ObjectUsageTracer work in Cog. So far it is working more or less well, and I have only found the problem with #class and #==. And I am not even sure that's a problem in my case (I need to think a little bit about it).

Best regards,


 
 



Re: Weird problem when adding a method to StackInterpreter

Eliot Miranda-2
 


On Wed, Dec 28, 2011 at 3:14 AM, Mariano Martinez Peck <[hidden email]> wrote:
 


On Tue, Dec 27, 2011 at 7:02 PM, Eliot Miranda <[hidden email]> wrote:
 
Hi Mariano,

On Tue, Dec 27, 2011 at 7:05 AM, Mariano Martinez Peck <[hidden email]> wrote:
 
Hi Eliot. Now I found another thing which took my attention. I would also like to trace when objects receives messages from the special selectors (special bytecode associated). So for example, I would like to trace an object that receives the message #new, #x, etc etc etc. With a StackVM I need to call my method #traceObjectUsage: from the bytecodePrim* methods. Usually, only when those methods answers before than the #normalSend. For example, in #bytecodePrimAdd I trace both the argument and the receiver when they are floats. If I do not add my sends to #traceObjectUsage:, then they receivers are not marked (logically).

Now, what I don't understand is what happens with CogVM. In Cog, even if I don't put my calls to #traceObjectUsage:  the receiver is always marked. I guess this is because I have put #traceObjectUsage: in a lot of general places of Cog. The "problem" is that with #class and #== the receiver is not marked (right now I don't want to discuss whether I should trace this or not) . Previously, with StackVM, if I have the call to #traceObjectUsage: in #bytecodePrimClass and #bytecodePrimEquivalent  then the receiver is marked perfectly. But with Cog I noticed that it doesn't matter what I put in #bytecodePrim*   because they seem they are never executed.  Is this correct?  Are these special bytecode always jitted from the very first time?  or they are jitted on demand (when they are found in the cache) like the rest of the normal methods ?    And the main question, what can be the cause of why I can trace with all the #bytecodePrim*  but not with #class and #== ?   I am obviously missing a place where I should trace....

#class and #== are always inlined in jitted code and so if you want to trace you'll have to modify the jit to add the tracing code as part of the inlined code.

Ahhh that was is :)  I didn't know that. So now I see that in #initializeBytecodeTableForClosureV3 or friends, you define them as notMapped:
        #(1 198 198 genSpecialSelectorEqualsEquals needsFrameNever: notMapped -1). "not mapped because it is directly inlined (for now)"
        #(1 199 199 genSpecialSelectorClass needsFrameNever: notMapped 0). "not mapped because it is directly inlined (for now)"
And you have comments there and in the beginning of the method. Ok got it :)
 
 Note that #class and #== must be inlined and not sent for the semantics to be the same as the interpreter.   In the interpreter these are never sent, but the bytecode for them is executed, just as in jitted code, the fetch of class and the comparison are executed but not sent.


I understand and it makes sense. I have only one small doubt. With the rest of the special shortcut bytecodes such us #bytecodePrimAdd, #bytecodePrimNew, #bytecodePrimGraterThan, etc. there is usually the same behavior: check whether the receiver is of a certain type (like integers, floats, booleans, arrays etc)  and if true then perform a C code instead of the regular message send. Then, if the receiver or argument are not of the expected type, then you follow with a #commonSend. Some other shortcut bytecodes just set the selector and argument count, such us #bytecodePrimAtEnd. And then of course you have #class and #==.

Now, in the jit, you seem to use the same method for all of them (all but #class and #==) and it is #genSpecialSelectorSend. Such method seems to only set the selector and argument count. That is the style of the #bytecodePrimAtEnd that I mentioned.  So..... my question is... is it ok to assume that when you JIT those special method they "stop making much sense" (in fact, they have less sense) since the only thing you do is to just set the selector and argumentCount?   What I mean is that the jitted version of #+ for example will be generated as a regular jit (using genSend: selector numArgs: numArgs) rather than checking that the receivers are integers and if true answer directly (as #bytecodePrimAdd does). Am I correct?

Nearly correct :)  There are two JITs, SimpleStackBasedCogit that does no inlining except for #class and #== (because Squeak assumes these are executed without lookup) and StackToRegisterMappingCogit that inlines SmallInteger arithmetic and comparison, #class and #==, and short-cuts SmallInteger comparison followed by conditional jumps.  SimpleStackBasedCogit compiles the special selector bytecodes for #+, #-, #<, #> et al merely by generating normal sends.  StackToRegisterMappingCogit compiles (currently) #+ #- #bitAnd: #bitOr: as a test for SmallInteger arguments, inlined code, possibly an overflow test, and a fall-back conventional send if not SmallIntegers or overflow (see genSpecialSelectorArithmetic).  It compiles #< #> #<= #>= #= #~= as tests for SmallIntegers followed by inline comparison, and possibly, if followed by a conditional branch, the inlined conditional branch, with a fall-back conventional send if not SmallIntegers (see genSpecialSelectorComparison).  It will also constant fold #+, #-, #bitAnd: & #bitOr: so that e.g. (1 + 2 bitAnd: 5) bitOr: 8 is compiled to a load of 9.  And I reserve the right to add additional optimizations as time passes ;)
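As a rough illustration of what StackToRegisterMappingCogit emits for #+ (the tag scheme and helper names below are assumptions for the sketch, not Cog's actual generated code), the inlined path amounts to: test both operands for the SmallInteger tag, add the untagged values, check for overflow, and otherwise fall back to a conventional send.

```c
/* Illustrative sketch of an inlined SmallInteger #+ with fallback. */
#include <stdint.h>

typedef intptr_t oop;

#define TAG_BIT 1                          /* low tag bit marks SmallIntegers */
#define IS_SMALLINT(o)  ((o) & TAG_BIT)
#define SMALLINT_VAL(o) ((o) >> 1)
#define TO_SMALLINT(v)  (((oop)(v) << 1) | TAG_BIT)

static int fallback_sends;                 /* counts conventional-send fallbacks */

static oop send_plus(oop rcvr, oop arg) {  /* stand-in for a real send of #+ */
    (void)rcvr; (void)arg;
    fallback_sends++;
    return 0;
}

static oop inlined_plus(oop rcvr, oop arg) {
    if (IS_SMALLINT(rcvr) && IS_SMALLINT(arg)) {
        intptr_t r = SMALLINT_VAL(rcvr) + SMALLINT_VAL(arg);
        /* overflow check: the sum must survive re-tagging unchanged */
        if (r == SMALLINT_VAL(TO_SMALLINT(r)))
            return TO_SMALLINT(r);
    }
    return send_plus(rcvr, arg);           /* non-SmallInts or overflow */
}
```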

So in summary, the old simple JIT did nothing special, compiling the special selectors to normal sends, the new JIT does some simple inlining, just for SmallIntegers.

I think this has no implications for your tracing code.  You're unlikely to unload the SmallIntegers and so you don't need to trace them.  Instead, I would try and define the abstract semantic model for Smalltalk and come up with the minimal set of trace points.  For example, for any non-immediate object not created as a side-effect of execution (by which I mean contexts, blocks and indirection vectors for closed-over variables) it can only be accessed via a send.  So it seems to me that the only six places in the VM you need to trace objects are sends in the interpreter, sends in jitted code, and the inlining of #class & #== in the interpreter and jitted code.  For performance, you could inline the bit test on the receiver in jitted code either into each method's prolog or into the ceTraceLinkedSend trampoline, avoiding going to C on every send, which kills performance.
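A hedged sketch of that last suggestion, with an invented header bit and helper names: the method prolog tests the mark inline and only drops into the C trampoline for objects that have not yet been traced, so the expensive call happens at most once per object.

```c
/* Sketch: inline the "already traced?" bit test so jitted code only
 * calls into C the first time an object is seen.  The bit position
 * and names are illustrative assumptions. */
#include <stdint.h>

#define EXPERIMENTAL_BIT (1u << 30)        /* assumed trace mark in the header */

typedef struct { uint32_t header; } obj;

static int slow_path_calls;

static void ce_trace_linked_send(obj *rcvr) {  /* the expensive C call */
    slow_path_calls++;
    rcvr->header |= EXPERIMENTAL_BIT;
}

/* what the method prolog would do inline: test, call only when unmarked */
static void prolog_trace(obj *rcvr) {
    if (!(rcvr->header & EXPERIMENTAL_BIT))
        ce_trace_linked_send(rcvr);
}
```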

 
But given that the stack vm and the cog vm are semantically equivalent do you even need to add tracing code to the jit? If you're tracing e.g. to discover how much of the object graph a given computation uses and you;re going to use this information for something later on, like creating a kernel image or something, why not just use the stack vm for tracing?

Indeed :)
Thanks for going beyond my questions. For this thing I am doing (I call it ObjectUsageTracer) we have so far 2 users:
- Luc is trying to do boostrap/kernel. In such scenario he can PERFECTLY use the StackVM since the computation of used/unused objects is mostly done once and then such information is used.
- In Marea (what I am doing for my PhD), I want to dynamically detect unused objects, swap them out and replace them by proxies. It means that the system needs detecting these unused objects all the time. It is not something I just do once. Anyway, I could use the StackVM, no problem. But....with Cog I can improve the performance of my solution hehehhe. So I wanted to give it a try and see if I could make the ObjectUsageTracer work in Cog. So far it is working more or less good and I only found the problem of #class and #==. And I am not even sure if that's a problem in my case (I need to think a little bit about it).

Think about the abstract semantics.  How can an object be used?  Encapsulation is your friend.


Best regards,


 
 



Re: Weird problem when adding a method to StackInterpreter

Mariano Martinez Peck
 


On Wed, Dec 28, 2011 at 8:28 PM, Eliot Miranda <[hidden email]> wrote:
 


On Wed, Dec 28, 2011 at 3:14 AM, Mariano Martinez Peck <[hidden email]> wrote:
 


On Tue, Dec 27, 2011 at 7:02 PM, Eliot Miranda <[hidden email]> wrote:
 
Hi Mariano,

On Tue, Dec 27, 2011 at 7:05 AM, Mariano Martinez Peck <[hidden email]> wrote:
 
Hi Eliot. Now I found another thing which took my attention. I would also like to trace when objects receives messages from the special selectors (special bytecode associated). So for example, I would like to trace an object that receives the message #new, #x, etc etc etc. With a StackVM I need to call my method #traceObjectUsage: from the bytecodePrim* methods. Usually, only when those methods answers before than the #normalSend. For example, in #bytecodePrimAdd I trace both the argument and the receiver when they are floats. If I do not add my sends to #traceObjectUsage:, then they receivers are not marked (logically).

Now, what I don't understand is what happens with CogVM. In Cog, even if I don't put my calls to #traceObjectUsage:  the receiver is always marked. I guess this is because I have put #traceObjectUsage: in a lot of general places of Cog. The "problem" is that with #class and #== the receiver is not marked (right now I don't want to discuss whether I should trace this or not) . Previously, with StackVM, if I have the call to #traceObjectUsage: in #bytecodePrimClass and #bytecodePrimEquivalent  then the receiver is marked perfectly. But with Cog I noticed that it doesn't matter what I put in #bytecodePrim*   because they seem they are never executed.  Is this correct?  Are these special bytecode always jitted from the very first time?  or they are jitted on demand (when they are found in the cache) like the rest of the normal methods ?    And the main question, what can be the cause of why I can trace with all the #bytecodePrim*  but not with #class and #== ?   I am obviously missing a place where I should trace....

#class and #== are always inlined in jitted code and so if you want to trace you'll have to modify the jit to add the tracing code as part of the inlined code.

Ahhh that was is :)  I didn't know that. So now I see that in #initializeBytecodeTableForClosureV3 or friends, you define them as notMapped:
        #(1 198 198 genSpecialSelectorEqualsEquals needsFrameNever: notMapped -1). "not mapped because it is directly inlined (for now)"
        #(1 199 199 genSpecialSelectorClass needsFrameNever: notMapped 0). "not mapped because it is directly inlined (for now)"
And you have comments there and in the beginning of the method. Ok got it :)
 
 Note that #class and #== must be inlined and not sent for the semantics to be the same as the interpreter.   In the interpreter these are never sent, but the bytecode for them is executed, just as in jitted code, the fetch of class and the comparison are executed but not sent.


I understand and it makes sense. I have only one small doubt. With the rest of the special shortcut bytecodes such us #bytecodePrimAdd, #bytecodePrimNew, #bytecodePrimGraterThan, etc. there is usually the same behavior: check whether the receiver is of a certain type (like integers, floats, booleans, arrays etc)  and if true then perform a C code instead of the regular message send. Then, if the receiver or argument are not of the expected type, then you follow with a #commonSend. Some other shortcut bytecodes just set the selector and argument count, such us #bytecodePrimAtEnd. And then of course you have #class and #==.

Now, in the jit, you seem to use the same method for all of them (all but #class and #==) and it is #genSpecialSelectorSend. Such method seems to only set the selector and argument count. That is the style of the #bytecodePrimAtEnd that I mentioned.  So..... my question is... is it ok to assume that when you JIT those special method they "stop making much sense" (in fact, they have less sense) since the only thing you do is to just set the selector and argumentCount?   What I mean is that the jitted version of #+ for example will be generated as a regular jit (using genSend: selector numArgs: numArgs) rather than checking that the receivers are integers and if true answer directly (as #bytecodePrimAdd does). Am I correct?

Nearly correct :)  There are two JITs, SimpleStackBasedCogit that does no inlining except for #class and #== (because Squeak assumes these are executed without lookup) and StackToRegisterMappingCogit that inlines SmallInteger arithmetic and comparison, #class and #==, and short-cuts SmallInteger comparison followed by conditional jumps.  SimpleStackBasedCogit compiles the special selector bytecodes for #+, #-, #<, #> et al merely by generating normal sends.  StackToRegisterMappingCogit compiles (currently) #+ #- #bitAnd: #bitOr: as a test for SmallInteger arguments, inlined code, possibly an overflow test, and a fall-back conventional send if not SmallIntegers or overflow (see genSpecialSelectorArithmetic).  It compiles #< #> #<= #>= #= #~= as tests for SmallIntegers followed by inline comparison, and possibly, if followed by a conditional branch, the inlined conditional branch, with a fall-back conventional send if not SmallIntegers (see genSpecialSelectorComparison).  It will also constant fold #+, #-, #bitAnd: & #bitOr: so that e.g. (1 + 2 bitAnd: 5) bitOr: 8 is compiled to a load of 9.  And I reserve the right to add additional optimizations as time passes ;)

Thanks Eliot. Much clearer now. I understand. Indeed, I was looking at SimpleStackBasedCogit when I wrote that ;)


So in summary, the old simple JIT did nothing special, compiling the special selectors to normal sends, the new JIT does some simple inlining, just for SmallIntegers.

I think this has no implications for your tracing code.  You're unlikely to unload the SmallIntegers and so you don't need to trace them.

Exactly. I don't care at all about tracing SmallIntegers. Even if I wanted to, I can't right now, because they are immediate objects and I am using a bit in the object header to store the mark.
 
 Instead, I would try and define the abstract semantic model for Smalltalk and come up with the minimal set of trace points.  For example, for any non-immediate object not created as a side-effect of execution (by which I mean contexts, blocks and indirection vectors for closed-over variables) it can only be accessed via a send.  So it seems to me that the only six places in the VM you need to trace objects are sends in the interpreter, sends in jitted code, and the inlining of #class & #== in the interpreter and jitted code.


I think that if you only want to trace when an object receives a message, then you could be right. In my particular case I need to go a little further: I need to trace when an object receives a message or when it is "directly used by the VM". For example, if I send a message to anObject, an instance of MyClass, I would like to trace its class, its method dictionary, its compiled method, and all the classes/method dictionaries involved in the lookup (assuming it was a full lookup). In this case those objects (classes, methodDict and compiledMethod) do not receive any message, but they are used by the VM. Since I am tracing object usage to then decide whether to swap objects out or not, this is important. This was just an example.

 
 For performance, you could inline the bit test on the receiver in jitted code either into each method's prolog or into the ceTraceLinkedSend trampoline, avoiding going to C on every send, which kills performance.

mmmm interesting. I am not sure how to start with this. Any deeper hint? An example to look at? :)  My methods so far are:

traceObjectUsage: anOop
    ((self isIntegerObject: anOop) not and: [hasToTrace])
        ifTrue: [
            objectMemory setExperimentalBitOf: anOop to: true.
            ]


setExperimentalBitOf: anOop to: boolean
    | hdr |
    self inline: true.
    "Dont check here if it is integer. Check in the sender of this."
    hdr := self baseHeader: anOop.
    boolean
        ifTrue: [ self baseHeader: anOop put: (hdr bitOr: ExperimentalObjectBit). ]
        ifFalse: [ self baseHeader: anOop put: (hdr bitAnd: AllButExperimentalBit). ]
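The Slang above is plain header masking; an equivalent C sketch follows (the mask value is an assumption, the real ExperimentalObjectBit lives wherever a spare header bit is free):

```c
/* C rendering of setExperimentalBitOf:to: with assumed mask values:
 * the mark is one spare bit in the object header, set with bitOr and
 * cleared with bitAnd of the complement. */
#include <stdint.h>

#define EXPERIMENTAL_OBJECT_BIT   ((uint32_t)1 << 30)        /* assumption */
#define ALL_BUT_EXPERIMENTAL_BIT  (~EXPERIMENTAL_OBJECT_BIT)

static uint32_t set_experimental_bit(uint32_t header, int on) {
    return on ? (header | EXPERIMENTAL_OBJECT_BIT)
              : (header & ALL_BUT_EXPERIMENTAL_BIT);
}
```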



 
But given that the stack vm and the cog vm are semantically equivalent do you even need to add tracing code to the jit? If you're tracing e.g. to discover how much of the object graph a given computation uses and you;re going to use this information for something later on, like creating a kernel image or something, why not just use the stack vm for tracing?

Indeed :)
Thanks for going beyond my questions. For this thing I am doing (I call it ObjectUsageTracer) we have so far 2 users:
- Luc is trying to do boostrap/kernel. In such scenario he can PERFECTLY use the StackVM since the computation of used/unused objects is mostly done once and then such information is used.
- In Marea (what I am doing for my PhD), I want to dynamically detect unused objects, swap them out and replace them by proxies. It means that the system needs detecting these unused objects all the time. It is not something I just do once. Anyway, I could use the StackVM, no problem. But....with Cog I can improve the performance of my solution hehehhe. So I wanted to give it a try and see if I could make the ObjectUsageTracer work in Cog. So far it is working more or less good and I only found the problem of #class and #==. And I am not even sure if that's a problem in my case (I need to think a little bit about it).

Think about the abstract semantics.  How can an object be used?  Encapsulation is your friend.


Not sure I understood. Do you mean intercepting the bytecodes for variable access rather than message receipt?

Thanks

 

Best regards,


 
 

Thanks a lot in advance,



On Mon, Dec 26, 2011 at 10:00 AM, Mariano Martinez Peck <[hidden email]> wrote:


Then the test fails in self assert: (tracer isMarked: obj2). I imagine it is because it is executing the machine code of #foo:. So my question is whether there is a way I could intercept and trace the receiver there as well. I tried to do it but failed.

See the flag word traceLinkedSends in cogit.c.  A bit in the flags causes the JIT to generate a call at the start of a method for tracing:

#define recordSendTrace() (traceLinkedSends & 2)

The result is that ceTraceLinkedSend is called on every send.


Wow. I cannot believe how easy it was :)  Thanks Eliot. So what I did was change Cogit class >> declareCVarsIn: to set 2 rather than 8:

        var: #traceLinkedSends
            declareC: 'int traceLinkedSends = 2';
   

And then just add my tracing stuff in #ceTraceLinkedSend

Thank you very much Eliot and Happy Christmas to all VM hackers

 
HTH
Eliot


Thanks a lot in advance,



On Fri, Dec 23, 2011 at 11:23 AM, Mariano Martinez Peck <[hidden email]> wrote:


Weird, eh? Because you use #internalStackValue: in a lot of other places throughout StackInterpreter and you don't have problems there.

Turns out it's not weird at all.  Since lookupInMethodCacheSel:class: is used outside of interpret, in findNewMethodInClass: and in callback lookup, it can't be inlined and hence can't access localSP.


Hi Eliot. Thanks for your answer. It also turns out that I don't know enough about Slang ;) so it was not weird at all but expected. Ok, I am learning along the way. So I understand that sentence. But (see below)
 
If you want to get the receiver you'll need to use stackValue: *and* you'll need to ensure that stackPointer is updated when calling lookupInMethodCacheSel:class: from internalFindNewMethod (see externalizeFPandSP), which may slow down the interpreter slightly.


I DO understand what #externalizeFPandSP does, but what I don't understand is why I should only do it in #internalFindNewMethod. I mean, what happens with all the rest of the senders of #lookupInMethodCacheSel:class: ?  Maybe if one of those senders does not update stackPointer (externalizeFPandSP), then in #lookupInMethodCacheSel:class: I will be accessing something wrong?

Anyway, I wanted to trace the receiver in #lookupInMethodCacheSel:class: to avoid doing it in all its senders. But given the problem I found, I worked around it by tracing the receiver in its senders (only the inlined ones), and that seems to work :)


 

 
 You're going to have to delve into the inliner in Slang.  This is, um, not fun.  I liken it to getting hit on the head with a stick by your guru, except that no enlightenment results.  Good luck.


:(   thanks.

 

Thanks in advance,


--
Mariano
http://marianopeck.wordpress.com






Re: Weird problem when adding a method to StackInterpreter

Eliot Miranda-2
 


On Fri, Dec 30, 2011 at 3:13 AM, Mariano Martinez Peck <[hidden email]> wrote:
 


On Wed, Dec 28, 2011 at 8:28 PM, Eliot Miranda <[hidden email]> wrote:
 


On Wed, Dec 28, 2011 at 3:14 AM, Mariano Martinez Peck <[hidden email]> wrote:
 


On Tue, Dec 27, 2011 at 7:02 PM, Eliot Miranda <[hidden email]> wrote:
 
Hi Mariano,

On Tue, Dec 27, 2011 at 7:05 AM, Mariano Martinez Peck <[hidden email]> wrote:
 
Hi Eliot. Now I found another thing which caught my attention. I would also like to trace when objects receive messages via the special selectors (those with an associated special bytecode). So, for example, I would like to trace an object that receives the message #new, #x, etc. With a StackVM I need to call my method #traceObjectUsage: from the bytecodePrim* methods, usually only when those methods answer before reaching #normalSend. For example, in #bytecodePrimAdd I trace both the argument and the receiver when they are floats. If I do not add my sends to #traceObjectUsage:, then the receivers are not marked (logically).

Now, what I don't understand is what happens with the CogVM. In Cog, even if I don't put my calls to #traceObjectUsage:, the receiver is always marked. I guess this is because I have put #traceObjectUsage: in a lot of general places in Cog. The "problem" is that with #class and #== the receiver is not marked (right now I don't want to discuss whether I should trace these or not). Previously, with the StackVM, if I had the call to #traceObjectUsage: in #bytecodePrimClass and #bytecodePrimEquivalent, then the receiver was marked perfectly. But with Cog I noticed that it doesn't matter what I put in #bytecodePrim*, because they seem never to be executed. Is this correct? Are these special bytecodes always jitted from the very first time, or are they jitted on demand (when they are found in the cache) like the rest of the normal methods? And the main question: why can I trace with all the other #bytecodePrim* methods but not with #class and #==? I am obviously missing a place where I should trace....

#class and #== are always inlined in jitted code and so if you want to trace you'll have to modify the jit to add the tracing code as part of the inlined code.

Ahhh, that was it :)  I didn't know that. So now I see that in #initializeBytecodeTableForClosureV3 or friends, you define them as notMapped:
        #(1 198 198 genSpecialSelectorEqualsEquals needsFrameNever: notMapped -1). "not mapped because it is directly inlined (for now)"
        #(1 199 199 genSpecialSelectorClass needsFrameNever: notMapped 0). "not mapped because it is directly inlined (for now)"
And you have comments there and in the beginning of the method. Ok got it :)
 
 Note that #class and #== must be inlined and not sent for the semantics to be the same as in the interpreter.  In the interpreter these are never sent; the bytecode for them is executed directly, just as in jitted code the class fetch and the comparison are executed inline rather than sent.


I understand and it makes sense. I have only one small doubt. With the rest of the special shortcut bytecodes such as #bytecodePrimAdd, #bytecodePrimNew, #bytecodePrimGreaterThan, etc., there is usually the same behavior: check whether the receiver is of a certain type (like integers, floats, booleans, arrays, etc.) and if so execute C code instead of the regular message send. Then, if the receiver or argument is not of the expected type, you fall through to #commonSend. Some other shortcut bytecodes just set the selector and argument count, such as #bytecodePrimAtEnd. And then of course you have #class and #==.

Now, in the jit, you seem to use the same method for all of them (all but #class and #==), and it is #genSpecialSelectorSend. That method seems to only set the selector and argument count, in the style of the #bytecodePrimAtEnd case I mentioned. So... my question is: is it ok to assume that when you jit those special selectors the shortcuts "stop making much sense", since the only thing you do is set the selector and argumentCount? What I mean is that the jitted version of #+, for example, will be generated as a regular send (using genSend: selector numArgs: numArgs) rather than checking that the receivers are integers and, if so, answering directly (as #bytecodePrimAdd does). Am I correct?

Nearly correct :)  There are two JITs, SimpleStackBasedCogit that does no inlining except for #class and #== (because Squeak assumes these are executed without lookup) and StackToRegisterMappingCogit that inlines SmallInteger arithmetic and comparison, #class and #==, and short-cuts SmallInteger comparison followed by conditional jumps.  SimpleStackBasedCogit compiles the special selector bytecodes for #+, #-, #<, #> et al merely by generating normal sends.  StackToRegisterMappingCogit compiles (currently) #+ #- #bitAnd: #bitOr: as a test for SmallInteger arguments, inlined code, possibly an overflow test, and a fall-back conventional send if not SmallIntegers or overflow (see genSpecialSelectorArithmetic).  It compiles #< #> #<= #>= #= #~= as tests for SmallIntegers followed by inline comparison, and possibly, if followed by a conditional branch, the inlined conditional branch, with a fall-back conventional send if not SmallIntegers (see genSpecialSelectorComparison).  It will also constant fold #+, #-, #bitAnd: & #bitOr: so that e.g. (1 + 2 bitAnd: 5) bitOr: 8 is compiled to a load of 9.  And I reserve the right to add additional optimizations as time passes ;)

Thanks Eliot. Much clearer now; I understand. Indeed, I was looking at SimpleStackBasedCogit when I wrote that ;)


So in summary, the old simple JIT did nothing special, compiling the special selectors to normal sends, the new JIT does some simple inlining, just for SmallIntegers.

I think this has no implications for your tracing code.  You're unlikely to unload the SmallIntegers and so you don't need to trace them.

Exactly. I don't care at all about tracing SmallIntegers. Even if I wanted to, I couldn't right now because they are immediate objects and I am using a bit in the object header to store the mark.
 
 Instead, I would try and define the abstract semantic model for Smalltalk and come up with the minimal set of trace points.  For example, for any non-immediate object not created as a side-effect of execution (by which I mean contexts, blocks and indirection vectors for closed-over variables) it can only be accessed via a send.  So it seems to me that the only six places in the VM you need to trace objects are sends in the interpreter, sends in jitted code, and the inlining of #class & #== in the interpreter and jitted code.


I think that if you only want to trace when an object receives a message, then you could be right. In my particular case, I need to go a little bit further: I need to trace when an object receives a message or when it is "directly used by the VM". For example, if I send a message to anObject, an instance of MyClass, I would like to trace its class, its method dictionary, its compiled method, and all the classes/methodDicts involved in the lookup (assuming it was a hard lookup). In this case, those objects (classes, methodDicts and compiledMethods) do not receive any message, but instead are used by the VM. Since I am tracing object usage to then decide whether to swap objects out or not, this is important. This was just an example.

 
 For performance, you could inline the bit test on the receiver in jitted code either into each method's prolog or into the ceTraceLinkedSend trampoline, avoiding going to C on every send, which kills performance.

mmmm, interesting. I am not sure how to start with this. Any deeper hint? An example to take a look at? :)  My methods so far are:

traceObjectUsage: anOop
    ((self isIntegerObject: anOop) not and: [hasToTrace])
        ifTrue: [
            objectMemory setExperimentalBitOf: anOop to: true.
            ]


setExperimentalBitOf: anOop to: boolean
    | hdr |
    self inline: true.
    "Dont check here if it is integer. Check in the sender of this."
    hdr := self baseHeader: anOop.
    boolean
        ifTrue: [ self baseHeader: anOop put: (hdr bitOr: ExperimentalObjectBit). ]
        ifFalse: [ self baseHeader: anOop put: (hdr bitAnd: AllButExperimentalBit). ]

Look at CogObjectRepresentationForSqueakV3's methods.  There are numerous examples of fetching values from an object's header, and some methods that store fields.  There are no examples that set bits in the object header, but it's not that hard.

 
Think about the abstract semantics.  How can an object be used?  Encapsulation is your friend.


Not sure if I understood. You mean intercepting the bytecodes for variable access rather than messages received?

The only access to an object is either if it is the receiver of a message or if it is passed as an argument of a primitive that accesses the state of its arguments/sends a message to its arguments, etc.  So as far as the execution machinery you only need to trace message sends.  Certain primitives need special consideration (perform:, the mirror primitives).

 

--
best,
Eliot