Hello again,
this is about running native code stored in compiled methods.

All we need is:

a) Allow code execution in object memory. On most platforms this is trivial: just pass additional flags to the virtual memory allocation functions.
b) Add a single primitive, 'call native code'.
c) Use a special compiled method trailer that stores the native code inside itself.

Then we could place any native code inside a compiled method and use a single primitive to run it.

This is quite simple, given that both Pharo and Squeak have adopted method trailers. Before running the code, the primitive needs only a few checks:

- the method's trailer format is valid (we could reserve a terminal byte value for that)
- the native code placed inside the method trailer targets the same platform we are currently running on

If not, the primitive simply fails and the interpreter runs the method's bytecode, which, depending on the situation, can check whether it can recompile the code (if the platform doesn't match) or use a workaround.

The only limitation of this approach is that native code placed in a compiled method must be relocation-agnostic, i.e. never use absolute jumps, only relative ones, and so on.

Also, we could require the code to use a specific calling convention, so that it is easy to fetch values from the stack and other useful things from the interpreter/object memory (like passing an interpreterProxy pointer).

This could be a good answer for generating trampoline code for FFI/Alien on the fly.

What do you think?

--
Best regards,
Igor Stasenko AKA sig.
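The two checks listed in the message above could look roughly like this in C. This is only a sketch under assumptions: the NativeTrailer struct, the trailer kind value 0xFC, the platform id numbers and the function name are all hypothetical, since the thread does not specify the trailer layout or the reserved byte value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the reserved "terminal byte value" that marks a trailer
   as carrying native code. The thread does not name an actual value. */
enum { TRAILER_KIND_NATIVE_CODE = 0xFC };

/* Hypothetical platform ids, for illustration only. */
enum { PLATFORM_WIN32 = 1, PLATFORM_LINUX32 = 2 };

/* Hypothetical decoded view of a native-code method trailer. */
typedef struct {
    uint8_t        kind;        /* trailer kind byte (last byte of the method) */
    uint32_t       platform_id; /* platform the code was generated for */
    const uint8_t *code;        /* start of the native code bytes */
    size_t         code_size;   /* number of native code bytes */
} NativeTrailer;

/* Returns 1 if the primitive may jump into the code, 0 if it should fail
   so that the interpreter falls back to the method's bytecode. */
static int can_run_native(const NativeTrailer *t, uint32_t current_platform_id)
{
    assert(t != NULL);
    if (t->kind != TRAILER_KIND_NATIVE_CODE) return 0;    /* wrong trailer format */
    if (t->platform_id != current_platform_id) return 0;  /* built for another platform */
    if (t->code == NULL || t->code_size == 0) return 0;   /* nothing to execute */
    return 1;
}
```

Returning 0 here corresponds to primitive failure: nothing is executed and the bytecode of the method takes over.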
Here is the proof-of-concept implementation.

1. Build the VM with the new primitive (in Interpreter-primitiveNativeCall.st).
2. Run the image with the new VM:
   - file in NativeCodeTrailers
   - then NativeCodeTests

I found that on my WinXP I didn't even have to change any platform-specific code to run it, only this tiny primitive!! It seems someone has already put the right flags into the memory allocation procedure :)

If you look at the primitive, it is quite primitive :)
The one thing that bothers me most is the sanity check, which should be done more properly. The primitive checks that the trailer contains the right platform code, and fails if it contains something else. This is done, obviously, to prevent running native code generated for a different platform.
The problem is that VMMaker doesn't have any notion of a platform code, to my knowledge, so I put '1' there for now. For correct behavior there should be one, set by VMMaker at initialization, so that a VM built for each platform answers a different code, e.g. windoze32 = 1, linux32 = 2, and so on.

So, here are the questions I'd like answered:
1. reserve a numbered primitive
2. discuss the calling convention
3. add a 'platform id' code to VMMaker, and let the language side know which platform id the VM is currently running on (through an additional primitive)

P.S. And yes, ' MethodNativeCodeTest new testAnswer42 ' runs flawlessly, at least on my box :)
I also already have a tiny assembler lying around, which I can use for generating the native stuff. It would be cool to adapt this for generating FFI/Alien trampoline code.

--
Best regards,
Igor Stasenko AKA sig.

Attachments: NativeCodeTrailers.1.cs (8K), NativeCodeTests.2.cs (1K), Interpreter-primitiveNativeCall.st (1K)
On Mon, Apr 05, 2010 at 03:13:51PM +0300, Igor Stasenko wrote:
> Here the proof of concept implementation.

> So, here the questions, which i like to be answered:
> 1. reserve a numbered primitive

Probably a named primitive in the interpreter is best, see comment in John's new primitiveMicrosecondClock. I don't think that there is any performance implication once the primitive name has been resolved once.

> 2. discuss the calling convention
> 3. add a 'platform id' code to VMMaker, and let language side know ,
> on which platform id VM currently runs on (through additional
> primitive)

I'm not sure how to handle #3, but there are a lot of different CPUs and operating systems out there, not to mention all the different modes of say an Intel CPU. So it may not be trivial to answer the question "can this snippet of compiled code be executed in my current runtime environment".

Dave
In reply to this post by Igor Stasenko
Igor Stasenko wrote:
> Hello again,
>
> it is about running a native code , stored in compiled methods.
>
> All we need is:
>
> a) allow code execution for object memory. On most platforms this is
> trivial - just pass additional flags to virtual memory allocation
> functions.
> b) add a single primitive 'call a native code'
> c) use a special compiled method trailer, which will store the native
> code inside itself
>

That's a very nice route to follow. You might be amused to know that I tried something along that line about 12 years ago: http://www.heeg.de/~hmm/squeak/Translator-981121/

My approach was to translate methods with some simple type annotations into corresponding native code, which was then stored in the image and executed mostly as in your sketch. Alas, my computer at that time was a Macintosh Performa 6200, which was not exactly a speed wonder, and then there was not enough time to squash a bug in the register coloring algorithm, and then the project went to sleep...

With the current number of very bright people around, I'm pretty sure that a second stab at that problem can yield more tangible results.

About setting a piece of memory to executable: your version of Windows probably does not use the NX bit to prevent code execution in data areas. Newer Windows versions do. Other processor architectures have separate instruction and data caches, and there you'd need to flush the caches after you've generated code or moved it around. This should be designed into such an approach.

Cheers,
Hans-Martin
In reply to this post by David T. Lewis
On 5 April 2010 15:42, David T. Lewis <[hidden email]> wrote:
> On Mon, Apr 05, 2010 at 03:13:51PM +0300, Igor Stasenko wrote:
>> Here the proof of concept implementation.
>
>> So, here the questions, which i like to be answered:
>> 1. reserve a numbered primitive
>
> Probably a named primitive in the interpreter is best, see comment in
> John's new primitiveMicrosecondClock. I don't think that there is any
> performance implication once the primitive name has been resolved once.
>

Then this will make some difference :)

>> 2. discuss the calling convention
>> 3. add a 'platform id' code to VMMaker, and let language side know ,
>> on which platform id VM currently runs on (through additional
>> primitive)
>
> I'm not sure how to handle #3, but there are a lot of different CPUs
> and operating systems out there, not to mention all the different modes
> of say an Intel CPU. So it may not be trivial to answer the question
> "can this snippet of compiled code be executed in my current runtime
> environment".
>

enough, we could use 32 bits for the platform id code.
Also, I'm thinking maybe we could use masks for the platform id, so that one could generate/use the same native code for multiple kinds of OSes. But then the number of variants would be much more limited, because the platform id would be treated as a bit field instead of an integer value.

> Dave

--
Best regards,
Igor Stasenko AKA sig.
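The mask idea from the message above can be sketched in C as follows. The bit assignments and the function name are hypothetical. The point is only to show the trade-off: a 32-bit field used as a mask can distinguish at most 32 platforms, whereas the same field used as a plain integer could number about four billion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit assignments: one bit per platform the VM can be built for. */
#define PLATFORM_BIT_WIN32   (UINT32_C(1) << 0)
#define PLATFORM_BIT_LINUX32 (UINT32_C(1) << 1)
#define PLATFORM_BIT_MACOS32 (UINT32_C(1) << 2)

/* code_mask:       set of platforms the native code claims to run on.
   vm_platform_bit: the single bit identifying the running VM.
   Returns 1 if the code may run on this VM, 0 otherwise. */
static int platform_mask_accepts(uint32_t code_mask, uint32_t vm_platform_bit)
{
    /* The running VM must identify itself with exactly one bit. */
    assert(vm_platform_bit != 0 &&
           (vm_platform_bit & (vm_platform_bit - 1)) == 0);
    return (code_mask & vm_platform_bit) != 0;
}
```

A snippet that uses no OS services could then be tagged, for example, with PLATFORM_BIT_WIN32 | PLATFORM_BIT_LINUX32 and be accepted by both VMs.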
I would think it was great, as long as I could do something like

Smalltalk hasNativeCode

to check if there's any of it in the image, and

Smalltalk purgeNativeCode

(e.g., before a release of the image).

I definitely don't think we should ship any native code in the image in releases...
On Mon, Apr 5, 2010 at 12:31 PM, Igor Stasenko <[hidden email]> wrote:
On 5 April 2010 23:15, Casey Ransberger <[hidden email]> wrote:
> I would think it was great, as long as I could do something like
> Smalltalk hasNativeCode
> To check if there's any of it in the image, and
> Smalltalk purgeNativeCode
> (e.g., before a release of the image)
> I definitely don't think we should ship any native code in the image in
> releases...
>

This is pretty straightforward:

nativeMethods := CompiledMethod allInstances collect: [:m | m trailer nativeCode notNil ].
purgedMethods := Array new: nativeMethods size.
1 to: nativeMethods size do: [:i |
    purgedMethods at: i put: ((nativeMethods at: i) copyWithTrailerBytes: CompiledMethodTrailer empty) ].

nativeMethods elementsExchangeIdentityWith: purgedMethods.

--
Best regards,
Igor Stasenko AKA sig.
On Mon, 5 Apr 2010, Igor Stasenko wrote:
> On 5 April 2010 23:15, Casey Ransberger <[hidden email]> wrote:
>> I would think it was great, as long as I could do something like
>> Smalltalk hasNativeCode
>> To check if there's any of it in the image, and
>> Smalltalk purgeNativeCode
>> (e.g., before a release of the image)
>> I definitely don't think we should ship any native code in the image in
>> releases...
>>
> this is pretty straightforward:
>
> nativeMethods := CompiledMethod allInstances collect: [:m | m trailer

select: ;)

Levente

> nativeCode notNil ].
> purgedMethods := Array new: nativeMethods size.
> 1 to: nativeMethods size do: [:i |
> purgedMethods at: i put: (( nativeMethods at:i)
> copyWithTrailerBytes: CompiledMethodTrailer empty) ].
>
> nativeMethods elementsExchangeIdentityWith: purgedMethods.
>
> --
> Best regards,
> Igor Stasenko AKA sig.
In reply to this post by Igor Stasenko
Hi,
I always liked how Smalltalk X did it... they have

-> an ivar in CompiledMethod that is the "native code pointer".
-> Primitive methods are just methods that have this pointer set to the primitive in the VM.
-> Methods with embedded C code are compiled and linked with the Smtk->C cross-compiler; the pointer then points to that function.
-> The JIT just puts a pointer to the code it generates.

So they merge primitives, static compiling (+ embedded C code) *and* the JIT into one not-that-ugly mechanism. In STX, the memory is managed by the VM, though (the code is not allocated in the GCed object memory by the JIT).

So: Yes, I like this :-) and your mechanism is a nice way to get it easily integrated into the current system.

Q: what happens when code is moved by the GC?

--
Marcus Denker -- http://www.marcusdenker.de
INRIA Lille -- Nord Europe. Team RMoD.
On 6 April 2010 14:46, Marcus Denker <[hidden email]> wrote:
> Hi,
>
> I always liked how Smalltalk X did it... they have
>
> -> an ivar in CompiledMethod that is the "native code pointer".
> -> Primitive methods are just methods that have this pointer set
> to the primitive in the vm.
> -> Methods with embedded C-code are compiled and linked with
> the Smtk->C cross-compiler, the pointer than points to that function
> -> the JIT just put a pointer to the code it generates.
>
> So they merge primitives / static compiling(+embedded c-code) *and* the JIT
> into one not that ugly mechanism. In STX, the memory is managed by the VM, though (the
> code is not allocated in the GCed object memory by the JIT).
>
> So: Yes, I like this :-) and your mechanism is a nice way to get it easily integrated into the current
> system.
>
> Q: what happens when code is moved by the GC?
>

Yeah, it will break the whole thing :)
To avoid that, native code should not trigger any memory allocation that may cause a GC.

Perhaps there is a way to counter this while still being able to move the code.
The primitive could set a flag saying that it is about to call native code embedded in a compiled method. Then, after a GC, if this flag is set, the VM knows that the memory allocation was issued from native code and should fix the return address (1) on the stack before returning to the native code, which has moved to a new location:

primitive -> native code --(1)-> VM allocation procedure -> GC

(1) - the return address that should be fixed.

> --
> Marcus Denker -- http://www.marcusdenker.de
> INRIA Lille -- Nord Europe. Team RMoD.
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>

--
Best regards,
Igor Stasenko AKA sig.
Okay, here is a small snippet of code which does the trick! :)

Compile it with:

gcc calltest.c callgate.s -o calltest

Here, fn1() and fn2() are (almost) identical functions that imitate native code which was moved during a VM call.
fnCallGate() is a special function which must be called instead of the actual function that could potentially move our code.

extern void fnCallGate(void* fn, ...);

It takes as its first argument the function that needs to be called, followed by the rest of the arguments (if any), which belong to that function.

This gate function uses another function, which returns the current value of the method oop. In terms of the Squeak VM, this is the method oop, which the gate retrieves both before and after calling the VM function (interpreterProxy->primitiveMethod).
So, if the method oop moved during GC, the gate changes the return address by the same offset as the difference between the old method oop value and the new one.

With this trick, native code can safely call any VM function (through the gate function) without the risk that, if the native code is moved during GC, it returns to the wrong address and crashes the system.

On 6 April 2010 23:49, Igor Stasenko <[hidden email]> wrote:
> On 6 April 2010 14:46, Marcus Denker <[hidden email]> wrote:
>> Hi,
>>
>> I always liked how Smalltalk X did it... they have
>>
>> -> an ivar in CompiledMethod that is the "native code pointer".
>> -> Primitive methods are just methods that have this pointer set
>> to the primitive in the vm.
>> -> Methods with embedded C-code are compiled and linked with
>> the Smtk->C cross-compiler, the pointer than points to that function
>> -> the JIT just put a pointer to the code it generates.
>>
>> So they merge primitives / static compiling(+embedded c-code) *and* the JIT
>> into one not that ugly mechanism. In STX, the memory is managed by the VM, though (the
>> code is not allocated in the GCed object memory by the JIT).
>>
>> So: Yes, I like this :-) and your mechanism is a nice way to get it easily integrated into the current
>> system.
>>
>> Q: what happens when code is moved by the GC?
>>
> Yeah, it will break the whole thing :)
> To avoid that, a native code should not issue any memory allocation,
> which may cause GC.
>
> Perhaps, there is a way to counter this , while still being able to
> move the code.
> A primitive could set a flag that its going to call a native code,
> embedded into compiled method,
> so, then, after GC, if this flag is set, VM knows, that memory
> allocation is issued from native code,
> and should fix the return address (1) on stack, before returning to
> native code, which moved to the new code location:
>
> primitive -> native code --(1)-> VM allocation procedure -> GC
>
> (1) - a return address , which should be fixed.
>
>> --
>> Marcus Denker -- http://www.marcusdenker.de
>> INRIA Lille -- Nord Europe. Team RMoD.
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>
> --
> Best regards,
> Igor Stasenko AKA sig.
>

--
Best regards,
Igor Stasenko AKA sig.
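The address fix that the gate function performs can be written down as plain arithmetic. The C sketch below is not the attached calltest.c or callgate.s. It only illustrates the offset calculation described in the message above, with a hypothetical function name, and it assumes that oops and return addresses can be treated as integers of type uintptr_t.

```c
#include <assert.h>
#include <stdint.h>

/* Shifts a return address by the distance the method moved during GC.
   old_method_oop: method oop sampled before the VM call.
   new_method_oop: method oop sampled after the VM call.
   Unsigned arithmetic wraps around, so the same expression also works
   when the method moved to a lower address. */
static uintptr_t relocate_return_address(uintptr_t return_address,
                                         uintptr_t old_method_oop,
                                         uintptr_t new_method_oop)
{
    /* The return address points into the method body, which starts at
       or after the old oop. */
    assert(return_address >= old_method_oop);
    return return_address + (new_method_oop - old_method_oop);
}
```

For example, a return address of 0x1040 inside a method that moved from 0x1000 to 0x2000 becomes 0x2040. If the method did not move, the address is returned unchanged.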