A small idea how to get closer to hardware with small efforts :)


Igor Stasenko
Hello again,

This is about running native code stored in compiled methods.

All we need is to:

a) allow code execution in object memory. On most platforms this is
trivial - just pass additional flags to the virtual memory allocation
functions.
b) add a single primitive, 'call native code'
c) use a special compiled method trailer, which stores the native
code inside itself

Then we could place any native code inside a compiled method and use a
single primitive to run it.
This is quite simple, given that both Pharo and Squeak have adopted
method trailers. Before running the code, the primitive needs a few checks:
 - the method's trailer format is valid (we could reserve a terminal
byte value for that)
 - the native code placed inside the method trailer targets the same
platform we are currently running on

If not, the primitive simply fails and the interpreter runs the
method's bytecode, which, depending on the situation, can check whether
it can recompile the code (if the platform doesn't match) or use a
workaround.
The only limitation of this approach is that native code placed in a
compiled method must be position-independent, i.e. it must
never use absolute jumps, only relative ones, and so on.
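
The checks above could look roughly like this in C. This is only a sketch under assumptions that are not part of the proposal: the trailer layout (native code, then a 2-byte platform id, a 2-byte code length and a final kind byte), the value `NATIVE_TRAILER_KIND` and the name `find_native_code` are all hypothetical. It only locates the code and does not execute it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values: neither is defined by Squeak or Pharo. */
enum { NATIVE_TRAILER_KIND = 0xFE, TRAILER_HEADER_SIZE = 5 };

/* Assumed layout at the end of the method, low byte first:
   [native code][platform id: 2 bytes][code length: 2 bytes][kind byte]
   Returns a pointer to the embedded code, or NULL when the primitive
   should fail and let the interpreter run the bytecode. */
const uint8_t *find_native_code(const uint8_t *method, size_t size,
                                uint16_t current_platform, size_t *code_size)
{
    assert(code_size != NULL);
    if (method == NULL || size < TRAILER_HEADER_SIZE)
        return NULL;

    const uint8_t *end = method + size;
    if (end[-1] != NATIVE_TRAILER_KIND)          /* check 1: trailer format */
        return NULL;

    size_t len = (size_t)end[-3] | ((size_t)end[-2] << 8);
    uint16_t platform = (uint16_t)(end[-5] | (end[-4] << 8));
    if (platform != current_platform)            /* check 2: same platform */
        return NULL;
    if (len > size - TRAILER_HEADER_SIZE)        /* code must fit in the method */
        return NULL;

    *code_size = len;
    return end - TRAILER_HEADER_SIZE - len;
}
```

Returning NULL corresponds to the primitive failing, so the bytecode fallback described above stays in charge of recompiling or working around a mismatch.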

Also, we could require that the code use a specific calling convention,
so that it is easy to fetch values from the stack and other useful stuff
from the interpreter/object memory (for example, by passing an
interpreterProxy pointer).
This could be a good answer for generating trampoline code for
FFI/Alien on the fly.
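
One possible shape for such a calling convention, sketched in C: every piece of native code is entered with a single pointer argument through which it reaches the stack. The `Proxy` struct and its helpers are hypothetical, cut-down stand-ins for the real interpreterProxy, and an ordinary C function (`native_add`) stands in for generated machine code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the VM's interpreterProxy. */
typedef struct Proxy {
    long stack[8];
    int  sp;        /* index of the top of stack */
    int  failed;    /* set when the primitive fails */
} Proxy;

/* Every piece of native code is entered through this one signature. */
typedef void (*NativeEntry)(Proxy *proxy);

static long stack_value(const Proxy *p, int offset)
{
    return p->stack[p->sp - offset];
}

static void pop_then_push(Proxy *p, int n, long value)
{
    p->sp -= n;
    p->stack[++p->sp] = value;
}

/* The 'call native code' primitive: fail when there is nothing to call. */
void primitive_native_call(Proxy *p, NativeEntry entry)
{
    assert(p != NULL);
    if (entry == NULL) {
        p->failed = 1;
        return;
    }
    p->failed = 0;
    entry(p);
}

/* Stand-in for generated code: answer receiver + argument. */
void native_add(Proxy *p)
{
    long sum = stack_value(p, 1) + stack_value(p, 0);
    pop_then_push(p, 2, sum);
}
```

With one fixed entry signature, the primitive does not need to know anything about what the embedded code does.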

What do you think?

--
Best regards,
Igor Stasenko AKA sig.


Re: A small idea how to get closer to hardware with small efforts :)

Igor Stasenko
Here is a proof-of-concept implementation.

1. Build the VM with the new primitive (in Interpreter-primitiveNativeCall.st)

2. Run the image with the new VM
 - file in NativeCodeTrailers
 - then NativeCodeTests

I found that on my WinXP box I didn't even have to change any
platform-specific code to run it - only this tiny primitive!!
Seems like someone already put the right flags into the memory allocation procedure :)

If you look at the primitive, it is quite primitive :)
The one thing that bothers me most is the sanity check,
which should be done more properly.
The primitive checks that the trailer contains the right platform code,
and fails if it contains anything else.
This is done, obviously, to prevent running native code
generated for a different platform.
The problem is that VMMaker doesn't have any notion of a platform code,
to my knowledge,
so I put '1' there for now. For correct behavior, though, there should be
one set by VMMaker at initialization,
so that a VM built for a given platform answers its own
code, e.g. windoze32 = 1, linux32 = 2 ... and so on.

So, here are the points I would like to settle:
1. reserve a numbered primitive
2. discuss the calling convention
3. add a 'platform id' code to VMMaker, and let the language side know
which platform id the VM is currently running on (through an additional
primitive)

P.S. And yes, ' MethodNativeCodeTest new testAnswer42 ' runs
flawlessly, at least on my box :)
And I already have a tiny assembler lying around, which I can
use for generating the native stuff..
It would be cool to adapt this for generating FFI/Alien
trampoline code.

--
Best regards,
Igor Stasenko AKA sig.



Attachments:
NativeCodeTrailers.1.cs (8K)
NativeCodeTests.2.cs (1K)
Interpreter-primitiveNativeCall.st (1K)

Re: A small idea how to get closer to hardware with small efforts :)

David T. Lewis
On Mon, Apr 05, 2010 at 03:13:51PM +0300, Igor Stasenko wrote:
> Here the proof of concept implementation.


> So, here the questions, which i like to be answered:
> 1. reserve a numbered primitive

Probably a named primitive in the interpreter is best, see comment in
John's new primitiveMicrosecondClock. I don't think that there is any
performance implication once the primitive name has been resolved once.

> 2. discuss the calling convention
> 3. add a 'platform id' code to VMMaker, and let language side know ,
> on which platform id VM currently runs on (through additional
> primitive)

I'm not sure how to handle #3, but there are a lot of different CPUs
and operating systems out there, not to mention all the different modes
of say an Intel CPU. So it may not be trivial to answer the question
"can this snippet of compiled code be executed in my current runtime
environment".

Dave



Re: A small idea how to get closer to hardware with small efforts :)

Hans-Martin Mosner
In reply to this post by Igor Stasenko
Igor Stasenko schrieb:

> Hello again,
>
> it is about running a native code , stored in compiled methods.
>
> All we need is:
>
> a) allow code execution for object memory. On most platforms this is
> trivial - just pass additional flags to virtual memory allocation
> functions.
> b) add a single primitive 'call a native code'
> c) use a special compiled method trailer, which will store the native
> code inside itself
>
>  

That's a very nice route to follow. You might be amused to know that I
tried something along that line about 12 years ago:
http://www.heeg.de/~hmm/squeak/Translator-981121/
My approach was to translate methods with some simple type annotations
into corresponding native code which was then stored in the image and
executed mostly as in your sketch.
Alas, my computer at that time was a Macintosh Performa 6200 which was
not exactly a speed wonder, and then there was not enough time to squash
a bug in the register coloring algorithm, and then the project went
asleep...
With the current number of very bright people around, I'm pretty sure
that a second stab at that problem can yield more tangible results.

About setting a piece of memory to executable:
Your version of Windows probably does not use the NX bit to prevent code
execution in data areas. Newer Windows versions do. Other processor
architectures have separate instruction and data caches, and there you'd
need to flush the caches after you've generated code or moved it around.
This should be designed into such an approach.
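
A minimal C sketch of both points for POSIX systems, with several assumptions of my own: `install_code` is a made-up name, the code is copied into a separate mapping rather than living in object memory, and the memory is written first and only then switched to executable. `__builtin___clear_cache` is a GCC/Clang builtin, not part of standard C.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Copy 'size' bytes of machine code into fresh memory and make that
   memory executable. Returns NULL on failure. POSIX only; on Windows
   the counterparts would be VirtualAlloc and VirtualProtect. */
void *install_code(const void *code, size_t size)
{
    assert(code != NULL && size > 0);

    void *mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;

    memcpy(mem, code, size);

    /* Matters on CPUs with separate instruction and data caches;
       on x86 it does nothing observable. */
    __builtin___clear_cache((char *)mem, (char *)mem + size);

    /* With NX enforced, data pages are not executable until asked for. */
    if (mprotect(mem, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, size);
        return NULL;
    }
    return mem;
}
```

The same flush would have to be repeated whenever the code is moved, which is why it needs to be part of the design rather than an afterthought.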

Cheers,
Hans-Martin


Re: A small idea how to get closer to hardware with small efforts :)

Igor Stasenko
In reply to this post by David T. Lewis
On 5 April 2010 15:42, David T. Lewis <[hidden email]> wrote:

> On Mon, Apr 05, 2010 at 03:13:51PM +0300, Igor Stasenko wrote:
>> Here the proof of concept implementation.
>
>
>> So, here the questions, which i like to be answered:
>> 1. reserve a numbered primitive
>
> Probably a named primitive in the interpreter is best, see comment in
> John's new primitiveMicrosecondClock. I don't think that there is any
> performance implication once the primitive name has been resolved once.
>
This is true, but a numbered primitive could be inlined into the interpret() loop.
Then it would make some difference :)
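
To illustrate Dave's side of the argument, here is a small C model of a named primitive that is looked up by name only once and then called through a cached pointer. The table, `NamedCallSite` and the function names are hypothetical and do not mirror the actual VM data structures; the inlining of numbered primitives into interpret() is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*Primitive)(void);

static int prim_native_call(void) { return 42; }

/* Hypothetical export table of named primitives. */
static const struct { const char *name; Primitive fn; } named_table[] = {
    { "primitiveNativeCall", prim_native_call },
};

static int lookups;   /* counts string lookups, for illustration only */

static Primitive lookup_named(const char *name)
{
    lookups++;
    for (size_t i = 0; i < sizeof named_table / sizeof named_table[0]; i++)
        if (strcmp(named_table[i].name, name) == 0)
            return named_table[i].fn;
    return NULL;
}

/* Per-method call site: the name is resolved on first use only. */
typedef struct { const char *name; Primitive cached; } NamedCallSite;

/* Answers the primitive's result, or -1 when the name cannot be resolved. */
int call_named(NamedCallSite *site)
{
    assert(site != NULL);
    if (site->cached == NULL)
        site->cached = lookup_named(site->name);
    return site->cached ? site->cached() : -1;
}

int lookup_count(void) { return lookups; }
```

After the first call the remaining overhead is one indirect call, which is what a numbered primitive inlined into the interpreter loop could still avoid.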

>> 2. discuss the calling convention
>> 3. add a 'platform id' code to VMMaker, and let language side know ,
>> on which platform id VM currently runs on (through additional
>> primitive)
>
> I'm not sure how to handle #3, but there are a lot of different CPUs
> and operating systems out there, not to mention all the different modes
> of say an Intel CPU. So it may not be trivial to answer the question
> "can this snippet of compiled code be executed in my current runtime
> environment".
>
Well, 2 bytes for the code gives us 65536 variants. If this is not
enough, we could use 32 bits for the platform id code.
Also, I'm thinking maybe we could use masks for the platform id,
so that one could generate/use the same native code for multiple
kinds of OSes.
But then the number of variants would be much more limited, because
the platform id would be treated as a bit field instead of an integer value.
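
The two schemes side by side, as a C sketch. The bit assignments are made up for illustration; VMMaker defines no such codes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit assignments, one bit per platform. */
enum {
    PLATFORM_WIN32   = 1u << 0,
    PLATFORM_LINUX32 = 1u << 1,
    PLATFORM_MACOS32 = 1u << 2
};

/* Integer scheme: the code must name exactly the running platform. */
int matches_exact(uint16_t code_platform, uint16_t vm_platform)
{
    return code_platform == vm_platform;
}

/* Mask scheme: the code lists every platform it can run on, and the
   VM reports exactly one bit. */
int matches_mask(uint16_t code_platforms, uint16_t vm_platform)
{
    /* the VM's id must be a single bit */
    assert(vm_platform != 0 && (vm_platform & (vm_platform - 1)) == 0);
    return (code_platforms & vm_platform) != 0;
}
```

With 16 bits the integer scheme distinguishes 65536 platforms, while the mask scheme distinguishes only 16, which is the trade-off mentioned above.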




--
Best regards,
Igor Stasenko AKA sig.


Re: A small idea how to get closer to hardware with small efforts :)

Casey Ransberger-2
I would think it was great, as long as I could do something like

Smalltalk hasNativeCode

To check if there's any of it in the image, and

Smalltalk purgeNativeCode

(e.g., before a release of the image)

I definitely don't think we should ship any native code in the image in releases...






Re: A small idea how to get closer to hardware with small efforts :)

Igor Stasenko
On 5 April 2010 23:15, Casey Ransberger <[hidden email]> wrote:
> I would think it was great, as long as I could do something like
> Smalltalk hasNativeCode
> To check if there's any of it in the image, and
> Smalltalk purgeNativeCode
> (e.g., before a release of the image)
> I definitely don't think we should ship any native code in the image in
> releases...
>
this is pretty straightforward:

nativeMethods := CompiledMethod allInstances collect: [:m | m trailer
nativeCode notNil ].
purgedMethods := Array new: nativeMethods size.
1 to: nativeMethods size do: [:i |
  purgedMethods at: i put:  (( nativeMethods at:i)
copyWithTrailerBytes: CompiledMethodTrailer empty) ].

nativeMethods elementsExchangeIdentityWith: purgedMethods.

--
Best regards,
Igor Stasenko AKA sig.


Re: A small idea how to get closer to hardware with small efforts :)

Levente Uzonyi-2
On Mon, 5 Apr 2010, Igor Stasenko wrote:

> On 5 April 2010 23:15, Casey Ransberger <[hidden email]> wrote:
>> I would think it was great, as long as I could do something like
>> Smalltalk hasNativeCode
>> To check if there's any of it in the image, and
>> Smalltalk purgeNativeCode
>> (e.g., before a release of the image)
>> I definitely don't think we should ship any native code in the image in
>> releases...
>>
> this is pretty straightforward:
>
> nativeMethods := CompiledMethod allInstances collect: [:m | m trailer

select: ;)

Levente

> nativeCode notNil ].
> purgedMethods := Array new: nativeMethods size.
> 1 to: nativeMethods size do: [:i |
>  purgedMethods at: i put:  (( nativeMethods at:i)
> copyWithTrailerBytes: CompiledMethodTrailer empty) ].
>
> nativeMethods elementsExchangeIdentityWith: purgedMethods.
>
> --
> Best regards,
> Igor Stasenko AKA sig.
>
>


Re: [Pharo-project] A small idea how to get closer to hardware with small efforts :)

Marcus Denker-4
In reply to this post by Igor Stasenko
Hi,

I always liked how Smalltalk X did it... they have

        -> an ivar in CompiledMethod that is the "native code pointer".
        -> Primitive methods are just methods that have this pointer set
             to the primitive in the vm.
        -> Methods with embedded C-code are compiled and linked with
             the Smtk->C cross-compiler, the pointer then points to that function
        -> the JIT just puts a pointer to the code it generates.

So they merge primitives / static compiling(+embedded c-code) *and* the JIT
into one not that ugly mechanism. In STX, the memory is managed by the VM, though (the
code is not allocated in the GCed object memory by the JIT).
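
A toy C model of that single mechanism, as I understand the description above. The struct layout and all names are hypothetical and are not taken from Smalltalk/X; plain C functions stand in for a VM primitive and for JIT output.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*Code)(int arg);

/* Hypothetical method layout: one slot that may point at machine code. */
typedef struct Method {
    Code native_code;       /* NULL: no native version, interpret */
    int  bytecode_result;   /* stand-in for running the bytecode */
} Method;

int vm_primitive(int arg) { return arg + 1; }   /* built into the VM */
int jit_output(int arg)   { return arg * 2; }   /* stand-in for JIT code */

/* One dispatch path, regardless of who produced the code. */
int invoke(const Method *m, int arg)
{
    assert(m != NULL);
    if (m->native_code != NULL)
        return m->native_code(arg);
    return m->bytecode_result;   /* fall back to the interpreter */
}
```

The caller never needs to know whether the pointer came from the VM, from statically compiled code or from the JIT.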

So: Yes, I like this :-) and your mechanism is a nice way to get it easily integrated into the current
system.

Q: what happens when code is moved by the GC?

--
Marcus Denker  -- http://www.marcusdenker.de
INRIA Lille -- Nord Europe. Team RMoD.



Re: [Pharo-project] A small idea how to get closer to hardware with small efforts :)

Igor Stasenko
On 6 April 2010 14:46, Marcus Denker <[hidden email]> wrote:

> Hi,
>
> I always liked how Smalltalk X did it... they have
>
>        -> an ivar in CompiledMethod that is the "native code pointer".
>        -> Primitive methods are just methods that have this pointer set
>             to the primitive in the vm.
>        -> Methods with embedded C-code are compiled and linked with
>             the Smtk->C cross-compiler, the pointer than points to that function
>        -> the JIT just put a pointer to the code it generates.
>
> So they merge primitives / static compiling(+embedded c-code) *and* the JIT
> into one not that ugly mechanism. In STX, the memory is managed by the VM, though (the
> code is not allocated in the GCed object memory by the JIT).
>
> So: Yes, I like this :-) and your mechanism is a nice way to get it easily integrated into the current
> system.
>
> Q: what happens when code is moved by the GC?
>
Yeah, it will break the whole thing :)
To avoid that, native code should not trigger any memory allocation
which may cause a GC.

Perhaps there is a way to counter this while still being able to
move the code.
The primitive could set a flag saying that it is going to call native code
embedded in a compiled method.
Then, after a GC, if this flag is set, the VM knows that the memory
allocation was issued from native code,
and should fix the return address (1) on the stack before returning to
the native code, which has moved to a new location:

primitive -> native code --(1)-> VM allocation procedure -> GC

(1) - the return address, which should be fixed.




--
Best regards,
Igor Stasenko AKA sig.


Re: [Pharo-project] A small idea how to get closer to hardware with small efforts :)

Igor Stasenko
Okay, here is a small snippet of code which does the trick! :)

Compile it with:

 gcc calltest.c callgate.s -o calltest

Here, fn1() and fn2() are (almost) identical functions, which imitate
native code that has moved during a VM call.

And fnCallGate() is a special function, which needs to be called
instead of the actual function
that could potentially move our code.

extern void fnCallGate(void* fn, ...);

It takes as its first argument the function which needs to be called,
followed by the rest of the arguments (if any),
which belong to that function.

This gate function uses another function, which returns the current
value of the method oop.
In terms of the Squeak VM, this is the method oop
(interpreterProxy->primitiveMethod), which the gate retrieves
before calling the VM function and again after calling it.
So, if the method oop has moved during GC, the gate changes the
return address by the same offset
as the difference between the old method oop value and the new one.

And so, with this trick, native code can safely call any VM
function (through the gate function)
without the risk that, if the native code is moved during GC, it
returns to the wrong address and causes a system crash.



--
Best regards,
Igor Stasenko AKA sig.



Attachments:
callgate.s (886 bytes)
calltest.c (1K)