Hi everyone,
I'm trying to reduce the computation time of the following pseudo-code:

- memory allocation (~40 doubles)
- object heap to C heap copying
- NativeBoost call (nbCall:)
- memory freeing

The time profiling results are below:

- 24*3600 calls: > 1 minute
- 24*3600 calls with only memory allocation and copying: < 1 second
- 1 call with a 24*3600 loop inside the C code: < 1 second

So it appears that the very costly step is the transition from Pharo to C. I was wondering whether it is possible to drastically reduce this time by doing something like generating the machine code once and calling it multiple times?

Thanks in advance!

Thomas.
|
Hi Thomas,

2014-08-07 17:25 GMT+02:00 Thomas Bany <[hidden email]>:

The machine code for marshalling the arguments is generated once and for all, so the penalty does not come from there. Please send the code you wrote for these micro-benchmarks so I can better understand what happens.

Luc
|
In reply to this post by Thomas Bany
Hi Thomas,
Please share with us how it goes. Your experience is important to us.

Alexandre

On 07-08-2014 at 11:25, Thomas Bany <[hidden email]> wrote:
|
In reply to this post by Thomas Bany
I think that if you posted the code, preferably only the part that exhibits the problem, it would be easier to test, debug and investigate.
On Thu, Aug 7, 2014 at 6:25 PM, Thomas Bany <[hidden email]> wrote:
|
@ Alexandre: sure, no problem!

@ Luc: I'm not sure how much code I can provide without being too specific, but here is how it goes:
MyClass>>withNBCall
    externalArray := NBExternalArrayOfDoubles new: self internalArray size.
    output := NBExternalArrayOfDoubles new: 4.
    [self actualNBCallWith: externalArray adress storeResultIn: output adress]
        ensure: [externalArray free. output free].

MyClass>>withNBCallCommented
    externalArray := NBExternalArrayOfDoubles new: self internalArray size.
    output := NBExternalArrayOfDoubles new: 4.
    ["self actualNBCallWith: externalArray adress storeResultIn: output adress"]
        ensure: [externalArray free. output free].

MyClass>>actualNBCallWith: externalArray storeResultIn: output
    <primitive: 'primitiveNativeCall' module: 'NativeBoostPlugin' error: errorCode>
    ^self
        nbCall: #(void callToC(double * externalArray, double * output))
        module: 'lib/myModule.dll'
void callToC(double * externalArray, double * output)
{
    computationWith(externalArray, output);
}

void specialCallToC(double * externalArray, double * output)
{
    unsigned int i;
    for (i = 0; i < 24*3600; i++)
        computationWith(externalArray, output);
}
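For comparison, the C side can also be timed on its own, with no Pharo involved, which gives a baseline for what the computation itself costs. The following is only a minimal sketch under assumptions: the body of `computationWith` here is a hypothetical stand-in (the real one is not shown in the thread), and `timeCalls`, `INPUT_LEN` and `OUTPUT_LEN` are names invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define INPUT_LEN 40  /* "~40 doubles", as in the original post */
#define OUTPUT_LEN 4  /* size of the output array used in the thread */

/* Hypothetical stand-in for the real computationWith():
   writes the sum, minimum, maximum and mean of the input. */
void computationWith(double *externalArray, double *output)
{
    double sum = 0.0;
    double min = externalArray[0];
    double max = externalArray[0];
    size_t i;

    for (i = 0; i < INPUT_LEN; i++) {
        double v = externalArray[i];
        sum += v;
        if (v < min) min = v;
        if (v > max) max = v;
    }
    output[0] = sum;
    output[1] = min;
    output[2] = max;
    output[3] = sum / INPUT_LEN;
}

/* Calls computationWith() `calls` times and returns the CPU time
   spent, in seconds, as measured by the standard clock() function. */
double timeCalls(unsigned int calls, double *externalArray, double *output)
{
    clock_t start;
    unsigned int i;

    assert(externalArray != NULL && output != NULL);
    start = clock();
    for (i = 0; i < calls; i++)
        computationWith(externalArray, output);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling `timeCalls(24 * 3600, input, output)` from a small test program and comparing the result with the timings measured from Pharo shows how much of the total is spent inside the C code itself.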
object := (MyClass new) variousInitialization; yourself.
24*3600 timesRepeat: [object withNBCall]

>> Over 1 minute of computation time, of which over 99% is spent in primitives. Also, I don't see the nbCall: in the tree.

object := (MyClass new) variousInitialization; yourself.
24*3600 timesRepeat: [object withNBCallCommented]

>> Less than 1 second.

object := (MyClass new) variousInitialization; yourself.
object withNBCall

>> Less than 1 millisecond.

object := (MyClass new) variousInitialization; yourself.
object withNBSpecialCall "This time, I use the specialCallToC() function"

>> Around 20 milliseconds.

All right, that's a pile of code, but I hope it helps :)

On a side note:
Again, thanks for the interest in my issue!

Thomas.

2014-08-07 18:39 GMT+02:00 kilon alios <[hidden email]>:
|
I forgot the copying of the data from the object heap to the C heap:

MyClass>>withNBCall
    externalArray := NBExternalArrayOfDoubles new: self internalArray size.
    output := NBExternalArrayOfDoubles new: 4.
    1 to: self internalArray size do: [ :index |
        externalArray at: index put: (self internalArray at: index) ].
    [self actualNBCallWith: externalArray adress storeResultIn: output adress]
        ensure: [externalArray free. output free].

MyClass>>withNBCallCommented
    externalArray := NBExternalArrayOfDoubles new: self internalArray size.
    output := NBExternalArrayOfDoubles new: 4.
    1 to: self internalArray size do: [ :index |
        externalArray at: index put: (self internalArray at: index) ].
    ["self actualNBCallWith: externalArray adress storeResultIn: output adress"]
        ensure: [externalArray free. output free].

Thomas.

2014-08-07 19:15 GMT+02:00 Thomas Bany <[hidden email]>:
|
In reply to this post by Thomas Bany
NativeBoost methods compile native code only once (the first time they are executed), and again when the sessionId changes (because you may be on a different platform). So this is already the case. The assembly code is cached in the method literal.

Stef

On 7/8/14 17:25, Thomas Bany wrote:
|
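The "generate once, reuse until the session changes" behaviour described above can be pictured with a small C sketch. This is only an illustration of the caching pattern, not NativeBoost's actual implementation: `CachedCall`, `generateCode`, `exampleNative` and `callCached` are all hypothetical names, and the "generated" code is simply an ordinary C function.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*NativeFn)(double *externalArray, double *output);

/* Hypothetical cache slot, playing the role of the per-method cache. */
typedef struct {
    NativeFn code;            /* cached entry point, NULL until first use */
    unsigned long sessionId;  /* session in which `code` was generated */
    unsigned int generations; /* how many times generation actually ran */
} CachedCall;

/* Stand-in for the generated native code. */
void exampleNative(double *externalArray, double *output)
{
    output[0] = externalArray[0] * 2.0;
}

/* Stand-in for the expensive code generation step. */
NativeFn generateCode(CachedCall *cache)
{
    cache->generations++;
    return exampleNative;
}

/* Regenerates only on first use or when the session id has changed;
   every other call goes straight to the cached entry point. */
void callCached(CachedCall *cache, unsigned long currentSessionId,
                double *externalArray, double *output)
{
    assert(cache != NULL);
    if (cache->code == NULL || cache->sessionId != currentSessionId) {
        cache->code = generateCode(cache);
        cache->sessionId = currentSessionId;
    }
    cache->code(externalArray, output);
}
```

With this shape, 24*3600 calls within one session pay for generation once; only a change of session id triggers a second generation.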
OK, I found the issue, and it was me doing lazy benchmarking: I had forgotten a debug printing function in the C code, which I had removed between the benchmarks.

Thanks again for your time!

Thomas.

2014-08-07 21:56 GMT+02:00 stepharo <[hidden email]>:
|
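One way to keep a forgotten debug print from skewing timings like this is to route all tracing through a macro that compiles to nothing unless it is explicitly enabled. The following is a minimal sketch under assumptions: the `TRACE` macro and the `DEBUG_TRACE` flag are hypothetical names, and the body of `computationWith` is a placeholder, since the real function is not shown in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Tracing is compiled in only when DEBUG_TRACE is defined,
   for example by building with -DDEBUG_TRACE. */
#ifdef DEBUG_TRACE
#define TRACE(...) fprintf(stderr, __VA_ARGS__)
#else
#define TRACE(...) ((void)0)
#endif

/* Placeholder computation: adds the first two inputs. */
void computationWith(double *externalArray, double *output)
{
    assert(externalArray != NULL && output != NULL);
    TRACE("computationWith: first input = %f\n", externalArray[0]);
    output[0] = externalArray[0] + externalArray[1];
}
```

A benchmark build made without `-DDEBUG_TRACE` then contains no printing code at all, so the measured time reflects only the computation and the call overhead.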