callback explanation

stepharo


Begin forwarded message:

From: Eliot Miranda <[hidden email]>
Subject: Re: [Vm-dev] FFI callbacks
Date: 25 Nov 2014 03:19:49 GMT+1
To: Squeak Virtual Machine Development Discussion
<[hidden email]>
Reply-To: Squeak Virtual Machine Development Discussion
<[hidden email]>

Hi Bert,

On Mon, Nov 24, 2014 at 2:49 PM, Bert Freudenberg <[hidden email]>
wrote:

How do they actually work? I want to know what information the thunk
needs, and what happens when it is activated. I think SqueakJS would
benefit from callback support.

I'll try and write this up properly in a blog post tomorrow.  But...

Callback is a wrapper round a piece of executable code, a block, and a
"signature" (see below).  The executable code is wrapped by an
FFICallbackThunk, an Alien, which is just a pointer to this code. The
code is allocated from a special pool of memory marked as executable.  
The address of the code is unique and is hence used as a key to identify
the Callback and hence locate the block from the thunk.  The FFI
marshalling code should pass the address of the thunk when passing the
Callback.  Right now that has to be done explicitly.  The thunk, when
called, invokes thunkEntry, which is defined in e.g.
platforms/Cross/plugins/IA32ABI/ia32abicc.c (it needs a different
definition for each ABI).  On x86 thunkEntry needs only to be invoked
with the thunk and the stack pointer, since on x86 all arguments are
accessible from the stack.  On other platforms thunkEntry will need to
be invoked with the register arguments.
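
The idea of using the thunk's address as a key can be pictured as a small table lookup.  What follows is only a sketch in C of that idea, not how the image or the plugin does it: the names (callback_entry, register_callback, find_callback, compare_ints) are made up for illustration, and a plain C function pointer stands in for the Smalltalk block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration: a C function pointer stands in for the block. */
typedef int (*handler_fn)(const void *, const void *);

typedef struct {
    const void *thunk;    /* unique address of the thunk's code, used as key */
    handler_fn  handler;  /* what to run when that thunk is called */
} callback_entry;

#define MAX_CALLBACKS 16
static callback_entry registry[MAX_CALLBACKS];
static size_t registry_count = 0;

/* Example handler with the qsort comparison signature. */
int compare_ints(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Returns 0 on success, -1 if the table is full or the key is already used. */
int register_callback(const void *thunk, handler_fn handler) {
    assert(thunk != NULL);
    if (registry_count == MAX_CALLBACKS) return -1;
    for (size_t i = 0; i < registry_count; i++)
        if (registry[i].thunk == thunk) return -1;
    registry[registry_count].thunk = thunk;
    registry[registry_count].handler = handler;
    registry_count++;
    return 0;
}

/* Returns the handler registered for this thunk address, or NULL if none. */
handler_fn find_callback(const void *thunk) {
    for (size_t i = 0; i < registry_count; i++)
        if (registry[i].thunk == thunk) return registry[i].handler;
    return NULL;
}
```

Because each thunk has its own address, the address alone is enough to tell callbacks apart, even when several share the same signature.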

thunkEntry's job is to store the state necessary to return from the
callback in an instance of either VMCallbackContext32 or
VMCallbackContext64.  These are aliens for the following structure:

typedef struct {
     void *thunkp;
     char *stackptr;
     long *intRegArgs;
     double *floatRegArgs;
     void *savedCStackPointer;
     void *savedCFramePointer;
     union {
         long vallong;
         struct { int low, high; } valleint64;
         struct { int high, low; } valbeint64;
         double valflt64;
         struct { void *addr; long size; } valstruct;
     } rvs;
     jmp_buf trampoline;
} VMCallbackContext;

The structure lives in the stack frame of thunkEntry, hence one per
callback.

It does three things:

1. it identifies all the input arguments.  These are accessible,
depending on platform, through stackptr, intRegArgs and floatRegArgs.
2. it maintains the jmp_buf which can be used to longjmp back to the
callback to return from it.
3. it has slots to hold the result to be returned from the callback (the
union rvs).  The argument to the longjmp (the value returned from the
setjmp in thunkEntry) tells thunkEntry how to return the result, i.e. as
a 32-bit value, a 64-bit value, a double or a struct.
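
Points 2 and 3 can be sketched in plain C.  This is a cut-down illustration under stated assumptions, not the code in ia32abicc.c: sketch_context keeps only two result slots and the jmp_buf, the type codes other than 1 are invented here, and run_callback stands in for everything the VM does between the call-in and the return primitive.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Cut-down stand-in for VMCallbackContext: result slots plus the jmp_buf. */
typedef struct {
    union {
        long   vallong;
        double valflt64;
    } rvs;
    jmp_buf trampoline;
} sketch_context;

/* Hypothetical type codes; the message only shows 1 being used for a word. */
enum { RETURN_WORD = 1, RETURN_DOUBLE = 2 };

/* Plays the VM's part: store a result, then longjmp back with its type code. */
static void run_callback(sketch_context *ctx, int want_double) {
    assert(ctx != NULL);
    if (want_double) {
        ctx->rvs.valflt64 = 2.5;
        longjmp(ctx->trampoline, RETURN_DOUBLE);
    }
    ctx->rvs.vallong = 42;
    longjmp(ctx->trampoline, RETURN_WORD);
}

/* setjmp yields 0 on the first pass and the longjmp argument afterwards;
   that argument selects which slot of rvs holds the result. */
static double dispatch(sketch_context *ctx, int want_double) {
    switch (setjmp(ctx->trampoline)) {
    case 0:
        run_callback(ctx, want_double);   /* leaves via longjmp */
        return -1.0;                      /* not reached */
    case RETURN_WORD:
        return (double)ctx->rvs.vallong;
    case RETURN_DOUBLE:
        return ctx->rvs.valflt64;
    default:
        return -1.0;
    }
}

/* Plays thunkEntry's part: the context lives in this frame, one per call.
   The result is widened to double only so one function can show both paths. */
double sketch_thunk_entry(int want_double) {
    sketch_context ctx;
    return dispatch(&ctx, want_double);
}
```

The context is declared in the outer function and handed to dispatch by pointer so that the function calling setjmp does not itself own a local that changes before the longjmp.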

Once thunkEntry has packed up the state in its local VMCallbackContext,
it uses

     if ((flags = interpreterProxy->ownVM(0)) < 0) {
         fprintf(stderr,"Warning; callback failed to own the VM\n");
         return -1;
     }

to "own" the VM.  If the VM is single-threaded then this can check that
the callback is being made on the same thread as the VM, and fail
otherwise.  If the VM is multi-threaded ownVM can block until the thread
can enter the VM.

It then calls back into the VM using

         interpreterProxy->sendInvokeCallbackContext(&vmcc);

which sends the message (found in the specialObjectsArray)
#invokeCallbackContext: which is understood by Alien (see Alien
class>>#invokeCallbackContext:).  This method wraps up the input
argument (the raw address of thunkEntry's VMCallbackContext) in the
relevant Alien and invokes Callback's entry point:

invokeCallbackContext: vmCallbackContextAddress "<Integer>" "^<FFICallbackReturnValue>"
     "The low-level entry-point for callbacks sent from the VM/IA32ABI plugin.
      Return via primReturnFromContext:through:.  thisContext's sender is the
      call-out context."
     | callbackAlien type |
     callbackAlien := (Smalltalk wordSize = 4
                         ifTrue: [VMCallbackContext32]
                         ifFalse: [VMCallbackContext64])
                             atAddress: vmCallbackContextAddress.
     [type := Callback evaluateCallbackForContext: callbackAlien]
         ifCurtailed: [self error: 'attempt to non-local return across a callback'].
     type ifNil:
         [type := 1. callbackAlien wordResult: -1].
     callbackAlien primReturnAs: type fromContext: thisContext

The sendInvokeCallbackContext machinery just constructs an activation of
invokeCallbackContext: on top of the current process's stack, which is
usually the process that called out through the FFI.  But it doesn't
have to be.  Threaded callbacks are too detailed for this brief message.

Callback then locates the relevant marshalling method for the callback's
signature (actually it does this when the Callback is created, but the
effect is the same).  Callback uses methods that identify themselves via
pragmas to choose the relevant marshaller. e.g. for a qsort sort
function callback the marshaller on x86 is

Callback methods for signatures
voidstarvoidstarRetint: callbackContext sp: spAlien
     <signature: #(int (*)(const void *, const void *)) abi: 'IA32'>
     ^callbackContext wordResult:
         (block
             value: (Alien forPointer: (spAlien unsignedLongAt: 1))
             value: (Alien forPointer: (spAlien unsignedLongAt: 5)))


So it fetches the arguments from the stack using Alien accessors,
evaluates the block with them and then assigns the result via
wordResult, and answers the type code back to invokeCallbackContext:, e.g.

VMCallbackContext32 methods for accessing
wordResult: anInteger
     "Accept any value in the -2^31 to 2^32-1 range."
     anInteger >= 0
         ifTrue: [self unsignedLongAt: 25 put: anInteger]
         ifFalse: [self signedLongAt: 25 put: anInteger].
     ^1
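
The indices 1, 5 and 25 in the Alien accessors above are 1-based byte offsets.  A minimal C sketch of that arithmetic follows, assuming the 32-bit layout of the struct shown earlier with 4-byte pointers.  The names context32_prefix, alien_index, unsigned_long_at and unsigned_long_at_put are made up for illustration and are not part of the plugin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Leading fields of the 32-bit context, with each pointer modelled as a
   uint32_t so the offsets do not depend on the machine compiling this. */
typedef struct {
    uint32_t thunkp;
    uint32_t stackptr;
    uint32_t intRegArgs;
    uint32_t floatRegArgs;
    uint32_t savedCStackPointer;
    uint32_t savedCFramePointer;
    uint32_t rvs_word;   /* first word of the rvs union */
} context32_prefix;

/* Convert a 0-based C byte offset into a 1-based Alien-style index. */
size_t alien_index(size_t byte_offset) {
    return byte_offset + 1;
}

/* Read a 32-bit word at a 1-based byte index, in the spirit of unsignedLongAt:. */
uint32_t unsigned_long_at(const void *base, size_t index) {
    uint32_t word;
    assert(index >= 1);
    memcpy(&word, (const uint8_t *)base + (index - 1), sizeof word);
    return word;
}

/* Store a 32-bit word at a 1-based byte index, in the spirit of unsignedLongAt:put:. */
void unsigned_long_at_put(void *base, size_t index, uint32_t value) {
    assert(index >= 1);
    memcpy((uint8_t *)base + (index - 1), &value, sizeof value);
}
```

Six 4-byte pointers precede rvs, so its C offset is 24 and its 1-based index is 25, which is the index wordResult: writes to.  Likewise indices 1 and 5 select the first two 32-bit words at the stack pointer, i.e. the two pointer arguments of the qsort comparison function.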

Then invokeCallbackContext: invokes the primitive to longjmp back to
thunkEntry, supplying the return code that will allow thunkEntry to
return the result correctly:

VMCallbackContext32 methods for primitives
primReturnAs: typeCode "<SmallInteger>" fromContext: context "<MethodContext>"
     <primitive: 'primReturnAsFromContextThrough' module: 'IA32ABI' error: ec>
     ^self primitiveFailed


Then thunkEntry switches on the return code and returns to the caller.


Note that sendInvokeCallbackContext and primReturnAsFromContextThrough
conspire to save, set and restore the VM's notion of what the C stack is
in the VMCallbackContext's savedCStackPointer & savedCFramePointer,
growing the stack on callback, and cutting it back on return.  There's a
variable in the VM, previousCallbackContext, that
primReturnAsFromContextThrough uses to make sure returns are LIFO.
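
The LIFO check can be pictured as a stack of active callback contexts.  The sketch below is an assumption about the shape of such a check, not the VM's actual code: the message does not say how contexts are linked, so the previous field, lifo_ctx, enter_callback and return_from_callback are all hypothetical, and most_recent merely plays the role of previousCallbackContext.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context record that remembers the enclosing callback. */
typedef struct lifo_ctx {
    struct lifo_ctx *previous;
} lifo_ctx;

/* Plays the role of previousCallbackContext: the innermost active callback. */
static lifo_ctx *most_recent = NULL;

/* Called on the way in: push this context on top of the active ones. */
void enter_callback(lifo_ctx *ctx) {
    assert(ctx != NULL);
    ctx->previous = most_recent;
    most_recent = ctx;
}

/* Called on the way out: succeed (0) only for the innermost callback,
   popping it; refuse (-1) an out-of-order return and leave state alone. */
int return_from_callback(lifo_ctx *ctx) {
    if (ctx != most_recent) return -1;
    most_recent = ctx->previous;
    return 0;
}
```

With two nested callbacks, an attempt to return from the outer one first is refused, and it only succeeds once the inner one has returned.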

I imagine that somehow the VM's state needs to be saved, then the
context that defined the block is activated, and then the VM would run
the block until it returns (or until the return prim is called?), then
the VM's state would need to be restored to what it was, and the result
is passed back by the thunk returning.

That's right.

Am I close? Since the callback can happen at any time, and it could do
anything, saving the whole VM state seems daunting.

Provided that the callback occurs from the context of a callout there
are no problems.  Threaded callbacks take some more doing. Basically the
VM needs to be sharable between threads.  This is the threaded VM
prototype.  If you absolutely need threaded callbacks we should have a
serious talk.  This is not trivial to productise.


HTH
--
best,
Eliot