how slower is called a named primitive over a numbered primitive?


EstebanLM
Hi,

Any idea how much slower it is? I mean, is there any measurement, estimate, or anything else around?

cheers,
Esteban



Re: [Vm-dev] how slower is called a named primitive over a numbered primitive?

Eliot Miranda-2
Hi Esteban,

    you can set up a test using the LargeInteger comparison primitives.  They're both named and numbered, e.g.

23 primitiveLessThanLargeIntegers

So you can write e.g.

LargePositiveInteger>>#< anInteger 
"Primitive. Compare the receiver with the argument and answer true if
the receiver is less than the argument. Otherwise answer false. Fail if the
argument is not a SmallInteger or a LargePositiveInteger less than 2-to-the-30th (1073741824).
Optional. See Object documentation whatIsAPrimitive."

<primitive: 23>
^super < anInteger

as

numberedLessThan: anInteger 
"Primitive. Compare the receiver with the argument and answer true if
the receiver is less than the argument. Otherwise answer false. Fail if the
argument is not a SmallInteger or a LargePositiveInteger less than 2-to-the-30th (1073741824).
Optional. See Object documentation whatIsAPrimitive."

<primitive: 23>
^super < anInteger


namedLessThan: anInteger 
"Primitive. Compare the receiver with the argument and answer true if
the receiver is less than the argument. Otherwise answer false. Fail if the
argument is not a SmallInteger or a LargePositiveInteger less than 2-to-the-30th (1073741824).
Optional. See Object documentation whatIsAPrimitive."

<primitive: 'primitiveLessThanLargeIntegers'>
^super < anInteger

and test it with two suitable large integers.  Will you report back?  I'd like to know the answer.  Named primitive invocation should be slightly slower.  As Clément says, a different return address is written to the stack, overwriting the primitive code, but that return path is essentially the same as for numbered primitives.  So I expect that there will be almost no measurable difference.

This is all to do with callbacks.  Numbered primitives are assumed never to call back (so far that's a valid assumption).  But named primitives (such as an FFI call) may indeed call back, and hence, by the time the primitive finally returns, the code zone may have been compacted and the original method containing the callout may have moved.  So the VM can't simply return to a primitive that may have called back, and then have that primitive's code return from the primitive, because that code may have moved.  The solution is to provide a piece of code at a fixed address that returns from a named primitive call, and have the return sequence run that code.

On Mon, Jun 22, 2015 at 5:13 AM, Esteban Lorenzano <[hidden email]> wrote:

Hi,

Any idea how much slower it is? I mean, is there any measurement, estimate, or anything else around?

cheers,
Esteban




--
best,
Eliot

Re: [Vm-dev] how slower is called a named primitive over a numbered primitive?

Eliot Miranda-2


On Mon, Jun 22, 2015 at 10:40 AM, Eliot Miranda <[hidden email]> wrote:
This is all to do with callbacks.  Numbered primitives are assumed never to call back (so far that's a valid assumption).  But named primitives (such as an FFI call) may indeed call back, and hence, by the time the primitive finally returns, the code zone may have been compacted and the original method containing the callout may have moved.  So the VM can't simply return to a primitive that may have called back, and then have that primitive's code return from the primitive, because that code may have moved.  The solution is to provide a piece of code at a fixed address that returns from a named primitive call, and have the return sequence run that code.

I should have said that numbered primitives other than 117 (primitiveExternalCall) & 120 (primitiveCalloutToFFI) are assumed never to call back.  In fact, the VM code in primitivePropertyFlagsForSpur: & primitivePropertyFlagsForV3: won't set the required flags to tell the Cogit to substitute the return address if you use primitiveCalloutWithArgs as a named primitive instead of 120 as a numbered primitive.  So please use 120.  Anyway, the test should demonstrate that there's no difference.

If you /do/ want to use primitiveCalloutWithArgs instead of 120 then primitivePropertyFlagsForSpur: & primitivePropertyFlagsForV3: are going to get more complicated.  The system is currently set up for the FFI plugin to be unloaded, and hence for the system to be made secure by not shipping the FFI plugin.  But including a reference to primitiveCalloutWithArgs in the main VM needs to be done carefully to avoid having to link the FFI plugin into the VM.  So this should be done with e.g. a 
    self cppIf: PharoVM ifTrue: ...
idiom.


--
best,
Eliot

Re: [Vm-dev] how slower is called a named primitive over a numbered primitive?

EstebanLM

On 22 Jun 2015, at 19:47, Eliot Miranda <[hidden email]> wrote:



I should have said that numbered primitives other than 117 (primitiveExternalCall) & 120 (primitiveCalloutToFFI) are assumed never to call back.  In fact, the VM code in primitivePropertyFlagsForSpur: & primitivePropertyFlagsForV3: won't set the required flags to tell the Cogit to substitute the return address if you use primitiveCalloutWithArgs as a named primitive instead of 120 as a numbered primitive.  So please use 120.  Anyway, the test should demonstrate that there's no difference.

If you /do/ want to use primitiveCalloutWithArgs instead of 120 then primitivePropertyFlagsForSpur: & primitivePropertyFlagsForV3: are going to get more complicated.  The system is currently set up for the FFI plugin to be unloaded, and hence for the system to be made secure by not shipping the FFI plugin.  But including a reference to primitiveCalloutWithArgs in the main VM needs to be done carefully to avoid having to link the FFI plugin into the VM.  So this should be done with e.g. a 
    self cppIf: PharoVM ifTrue: ...
idiom.

I will not change anything until I actually find a reason to do it… and if I find a reason, I will discuss it here, so do not worry about it.
Right now I’m just wondering, because I’m writing the NB to FFI backend and I’m thinking about better ways to do it… and since this is all while we wait for the new FFI implementation, most probably everything can wait…
but it would be nice to actually know the numbers, instead of just guessing… :)

Esteban





Re: [Vm-dev] how slower is called a named primitive over a numbered primitive?

Eliot Miranda-2


On Mon, Jun 22, 2015 at 9:35 AM, David T. Lewis <[hidden email]> wrote:

That sounds right to me too. But it would be a worthwhile experiment to
set up a test to confirm it. Maybe take one or more methods that call
numbered primitives, and recode them to call the primitives by name. Then
measure and see if anything got slower.

Dave

On Mon, Jun 22, 2015 at 10:40 AM, Eliot Miranda <[hidden email]> wrote:
and test it with two suitable large integers.  Will you report back?  I'd like to know the answer.  Named primitive invocation should be slightly slower.  As Clément says, a different return address is written to the stack, overwriting the primitive code, but that return path is essentially the same as for numbered primitives.  So I expect that there will be almost no measurable difference.

Wow, it is indeed a significant difference.  Substituting the return address must incur all sorts of costs in an x86 CPU.  Here are 2 x 6 runs:


| i |
i := SmallInteger maxVal + 1.
(1 to: 6) collect: [:j| {[1 to: 10000000 do: [:k| i numberedLessThan: i]] timeToRun. [1 to: 10000000 do: [:k| i namedLessThan: i]] timeToRun}]

#(#(191 283) #(211 375) #(281 405) #(300 411) #(281 421) #(296 409))
#(#(186 267) #(201 273) #(210 364) #(294 410) #(313 400) #(292 405))

So the overhead is of the order of (100ms / 10,000,000) per call, i.e. around 10ns per named primitive call.  Interesting :-)
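
As a rough check of that estimate, the per-call overhead can be worked out from the twelve timing pairs above. This is only a workspace sketch, not part of the original measurement: the variable names are made up for illustration, and it assumes each pair is {numbered ms. named ms} for 10,000,000 sends, as in the benchmark expression.

```smalltalk
| runs sends totalDiffMs meanDiffMs nsPerCall |
"Timing pairs copied from the two runs above: {numbered ms. named ms}."
runs := #(#(191 283) #(211 375) #(281 405) #(300 411) #(281 421) #(296 409)
          #(186 267) #(201 273) #(210 364) #(294 410) #(313 400) #(292 405)).
"Assumption: every timing covers this many sends, as in the benchmark loop."
sends := 10000000.
"Extra milliseconds spent by the named variant, summed over all pairs."
totalDiffMs := runs inject: 0 into: [:sum :pair | sum + (pair last - pair first)].
meanDiffMs := totalDiffMs / runs size.
"1 ms = 1,000,000 ns, so scale the mean extra time and divide by the sends."
nsPerCall := meanDiffMs * 1000000 / sends.
nsPerCall asFloat
```

Evaluated in a workspace, this should come out a little above 11 ns per call, which is in line with the "around 10ns" figure.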
 



--
best,
Eliot

Re: [Vm-dev] how slower is called a named primitive over a numbered primitive?

David T. Lewis
On Mon, Jun 22, 2015 at 11:05:30AM -0700, Eliot Miranda wrote:

> Wow, it is indeed a significant difference.  Substituting the return
> address must incur all sorts of costs in an x86 CPU.  Here are 2 x 6 runs:
>
>
> | i | i := SmallInteger maxVal + 1.
> (1 to: 6) collect: [:j| {[1 to: 10000000 do: [:k| i numberedLessThan: i]]
> timeToRun. [1 to: 10000000 do: [:k| i namedLessThan: i]] timeToRun}]
>
> #(#(191 283) #(211 375) #(281 405) #(300 411) #(281 421) #(296 409))
> #(#(186 267) #(201 273) #(210 364) #(294 410) #(313 400) #(292 405))
>
> So the overhead is of the order of (100ms / 10,000,000) per call, i.e.
> around 10ns per named primitive call.  Interesting :-)

On an interpreter VM, the results are as Tim and I initially expected:

   | i | i := SmallInteger maxVal + 1.
   (1 to: 6) collect: [:j| {[1 to: 10000000 do: [:k| i numberedLessThan: i]]
   timeToRun. [1 to: 10000000 do: [:k| i namedLessThan: i]] timeToRun}]
   
   ==> #(#(791 789) #(793 794) #(793 790) #(791 791) #(790 794) #(795 789))
   
With a Cog VM, the numbered primitives are significantly faster:
   
   | i | i := SmallInteger maxVal + 1.
   (1 to: 6) collect: [:j| {[1 to: 10000000 do: [:k| i numberedLessThan: i]]
   timeToRun. [1 to: 10000000 do: [:k| i namedLessThan: i]] timeToRun}]
   
   ==> #(#(542 670) #(542 668) #(544 678) #(546 680) #(540 666) #(540 680))
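
For comparison, the two result sets above can be reduced to a mean extra cost per send of the named variant. This is a hedged workspace sketch only: the names are illustrative, and it assumes each pair is {numbered ms. named ms} over 10,000,000 sends.

```smalltalk
| sends meanExtraNs interp cog |
"Assumption: every timing covers this many sends, as in the benchmark loop."
sends := 10000000.
"Answer the mean extra nanoseconds per send of the named variant."
meanExtraNs := [:pairs |
	((pairs inject: 0 into: [:sum :pair | sum + (pair last - pair first)])
		/ pairs size) * 1000000 / sends].
"Results copied from the interpreter VM and Cog VM runs above."
interp := #(#(791 789) #(793 794) #(793 790) #(791 791) #(790 794) #(795 789)).
cog := #(#(542 670) #(542 668) #(544 678) #(546 680) #(540 666) #(540 680)).
{(meanExtraNs value: interp) asFloat. (meanExtraNs value: cog) asFloat}
```

The interpreter figure should be within noise of zero (about -0.1 ns), while the Cog figure should be roughly 13 ns per send.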

On Mon, Jun 22, 2015 at 07:16:43PM +0200, Clément Bera wrote:
>
> Well, it also depends on whether the primitive is generated by the JIT. If
> you rewrite SmallInteger>>#+ from primitive 1 to a named primitive, the
> overhead will be greater than just the searching/loading/linking because the
> JIT won't compile it to n-code anymore.

So maybe this is the reason for the difference.

Note: this is with an old Cog VM, because I lost my primary PC and can't
restore it right now. But I think the results are relevant WRT this discussion.

 /usr/local/lib/squeak/4.0-2776/squeak
 Croquet Closure Cog VM [CoInterpreter VMMaker.oscog-eem.331]

Dave