rearchitecting the FFI implementation for reentrancy

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

rearchitecting the FFI implementation for reentrancy

Eliot Miranda-2
 
Hi All,

    I'm looking at making the Squeak FFI reentrant to support nested calls and possibly threading.  The current FFI has a couple of issues which render it non-reentrant.

1. The FFI plugin has a number of static state variables which convey information from the pre-call side of a call-out to the return-side of a call-out.  These include ffiRetOop, ffiRetClass (oops), ffiRetSpec (IIUC a pointer into an object body, the result box for a struct result), ffiRetSpecSize (the size of the struct result box), etc.  Since these variables are set-up before a call-out and used once the call-out has returned, any nested call-out that happens within a call-back from the call-out will overwrite these variables.  Further, any GC that occurs during a call-back can move the struct result box object, rendering the ffiRetSpec pointer invalid, and likely resulting in heap corruption.

2. The platform support code for argument marshalling has static areas in which to marshall the outgoing arguments (which is fine since they're only used on the way out).  But there are also static areas used to hold copies of string arguments.  These are malloced, stored in a static array of pointers, and freed on the return side by ffiCleanup.  Any nested callout will either overwrite the outer call's strings and/or prematurely free the string copies when the nested call-outdoes an ffiCleanup.

Making the variables in 1. reentrant is straight-forward.  All state is derived from the external call spec in an external callout method (primititive 120 method).  The callback machinery is careful to restore newMethod to its state on return form a call-back.  So the state can be refetched from newMethod on the return side of the call-out.  This also avoids having to update ffiRetOop & ffiRetClass in any GC, and allows them to be kept as local variables of the call-out side.

Making 2. reentrant is a little more involved, which is why I wanted to raise this on the list.  here is a sketch of a generic solution that I want y'all to sanity-check for different architectures with which you're familiar.


The basic idea is to use alloca to stack allocate all non-movable state for a particular call-out, using stack discipline to reclaim it automatically when the FFI call-out primitive returns.  The alloca'ed state comprises three regions.  In the quasi-diagram below the stack grows down.
    region 1, space for any copied strings and a temporary result for structure returns
    region 2, an array of pointers, one for each argument, each pointing to the start of its corresponding argument further on in the alloca'ed space
    region 3, the marshalled arguments, ready for call-out

The call-out machinery can either make a pass over the arguments before marshalling to compute the size of the alloca'ed region, or it can guestimate, based on e.g. a precomputing of the size of any struct arguments.

One question is whether marshaling can be done generically in the FFI plugin based on e.g. functions such as alignStruct, alignDouble, alignLongLong implemented in the platform to tell the generic code how to align elements greater than a word in size, or whether it would be better to put the code in the platform.  I'm guessing the alignStruct approach would work fine and could be implemented with macros.

An important question is whether the array-of-pointers scheme will allow platforms that pass values in registers to locate any and all arguments.  Perhaps some platforms will require a separate array of pointers to floating-point values?  Please speak up if you know of any issues here.

Finally I'm assuming that to actually make the call once the arguments have been marshalled some assembler stub (perhaps one for each argument count up to the number of register arguments times the number of basic calling conventions, integer/pointer result, double, struct result) can be called which takes pointers to the three regions, and the function to be called, cuts back the stack to the top of the alloca'ed area, putting the return address in the right place, and tail-calls/jumps to the function.

Does this sound sane?  I'm sure it'll work on a number of platforms I'm familiar with (which e.g. does not include iPhone). I'm sure it won't work for x86-64 structures that partytially fit within registers, but the current implementation is broken for those anyway.  I've used alloca successfully in the VW FFI but there were different implementations of marshalling for each ABI.  Can you say either way whether you think this would work for particular platforms you're familiar with?

TIA
Eliot
Reply | Threaded
Open this post in threaded view
|

Re: rearchitecting the FFI implementation for reentrancy

Igor Stasenko

2009/8/6 Eliot Miranda <[hidden email]>:

>
> Hi All,
>     I'm looking at making the Squeak FFI reentrant to support nested calls and possibly threading.  The current FFI has a couple of issues which render it non-reentrant.
> 1. The FFI plugin has a number of static state variables which convey information from the pre-call side of a call-out to the return-side of a call-out.  These include ffiRetOop, ffiRetClass (oops), ffiRetSpec (IIUC a pointer into an object body, the result box for a struct result), ffiRetSpecSize (the size of the struct result box), etc.  Since these variables are set-up before a call-out and used once the call-out has returned, any nested call-out that happens within a call-back from the call-out will overwrite these variables.  Further, any GC that occurs during a call-back can move the struct result box object, rendering the ffiRetSpec pointer invalid, and likely resulting in heap corruption.
> 2. The platform support code for argument marshalling has static areas in which to marshall the outgoing arguments (which is fine since they're only used on the way out).  But there are also static areas used to hold copies of string arguments.  These are malloced, stored in a static array of pointers, and freed on the return side by ffiCleanup.  Any nested callout will either overwrite the outer call's strings and/or prematurely free the string copies when the nested call-outdoes an ffiCleanup.
> Making the variables in 1. reentrant is straight-forward.  All state is derived from the external call spec in an external callout method (primititive 120 method).  The callback machinery is careful to restore newMethod to its state on return form a call-back.  So the state can be refetched from newMethod on the return side of the call-out.  This also avoids having to update ffiRetOop & ffiRetClass in any GC, and allows them to be kept as local variables of the call-out side.
> Making 2. reentrant is a little more involved, which is why I wanted to raise this on the list.  here is a sketch of a generic solution that I want y'all to sanity-check for different architectures with which you're familiar.
>
> The basic idea is to use alloca to stack allocate all non-movable state for a particular call-out, using stack discipline to reclaim it automatically when the FFI call-out primitive returns.  The alloca'ed state comprises three regions.  In the quasi-diagram below the stack grows down.
>     region 1, space for any copied strings and a temporary result for structure returns
>     region 2, an array of pointers, one for each argument, each pointing to the start of its corresponding argument further on in the alloca'ed space
>     region 3, the marshalled arguments, ready for call-out
> The call-out machinery can either make a pass over the arguments before marshalling to compute the size of the alloca'ed region, or it can guestimate, based on e.g. a precomputing of the size of any struct arguments.

+1 to use alloca().

And, i think , a single pass could be enough, if do something like :

int need2ndpass = 0;
argPtrs = alloca(numArgs * sizeof(void*));
for (i=0 ; i<numArgs; i++) {
  pArg = alloca(argSize(i));
  need2ndpass |= ConvertAndPushArg(pArg, i);
}
if (need2ndpass) {
  ....
}

then, a 2nd pass can be avoided, if none of arguments/retvalue
requires non-movable memory buffers.
There's also a question how alloca() behaves in respect of memory
alignment. But this can be tested by simple snippet:
ptr1 = alloca(1);
ptr2 = alloca(1);
assert( ptr1 - ptr2 == ... expected value...)

> One question is whether marshaling can be done generically in the FFI plugin based on e.g. functions such as alignStruct, alignDouble, alignLongLong implemented in the platform to tell the generic code how to align elements greater than a word in size, or whether it would be better to put the code in the platform.  I'm guessing the alignStruct approach would work fine and could be implemented with macros.
> An important question is whether the array-of-pointers scheme will allow platforms that pass values in registers to locate any and all arguments.  Perhaps some platforms will require a separate array of pointers to floating-point values?  Please speak up if you know of any issues here.
> Finally I'm assuming that to actually make the call once the arguments have been marshalled some assembler stub (perhaps one for each argument count up to the number of register arguments times the number of basic calling conventions, integer/pointer result, double, struct result) can be called which takes pointers to the three regions, and the function to be called, cuts back the stack to the top of the alloca'ed area, putting the return address in the right place, and tail-calls/jumps to the function.
> Does this sound sane?  I'm sure it'll work on a number of platforms I'm familiar with (which e.g. does not include iPhone). I'm sure it won't work for x86-64 structures that partytially fit within registers, but the current implementation is broken for those anyway.  I've used alloca successfully in the VW FFI but there were different implementations of marshalling for each ABI.  Can you say either way whether you think this would work for particular platforms you're familiar with?

can't help with x64 . never used it.

> TIA
> Eliot
>



--
Best regards,
Igor Stasenko AKA sig.
Reply | Threaded
Open this post in threaded view
|

Re: rearchitecting the FFI implementation for reentrancy

Göran Krampe
In reply to this post by Eliot Miranda-2
 
Hi Eliot and all!

Eliot Miranda wrote:
> Hi All,
>     I'm looking at making the Squeak FFI reentrant to support nested calls
> and possibly threading.  The current FFI has a couple of issues which render
> it non-reentrant.

The tech stuff is over my head, but I do have three questions related to
this:

1. What about Alien? Shouldn't we try to move towards Alien instead of
current FFI? Or is that too much work at this point?

2. Callbacks has been a sore point in Squeak for a long time. AFAIK
there is a patch available on www.squeakgtk.org/wiki/download.html, not
sure what it does or if it is the original patch from Andreas when
wxSqueak was being built. wxSqueak had a patched VM I recall. Perhaps
that stuff is not related.

3. Could we possibly ask for a status update on Cog and related
activities? We are itching for news! :) Also curious about your interest
in Factor and its lower bowels (definitely cool stuff going on there).

regards, Göran

Reply | Threaded
Open this post in threaded view
|

Re: rearchitecting the FFI implementation for reentrancy

David T. Lewis
In reply to this post by Eliot Miranda-2
 
On Wed, Aug 05, 2009 at 05:56:02PM -0700, Eliot Miranda wrote:
>  
> Hi All,
>     I'm looking at making the Squeak FFI reentrant to support nested calls
> and possibly threading.  The current FFI has a couple of issues which render
> it non-reentrant.

Your approach sounds fairly straightforward. Slightly off of your intended
topic, but I want to mention also that we have some changes queued up for
FFI to address 64 bit pointer issues. The changes are fairly extensive, so
I'd like to lobby for getting these done before too much bit rot sets in.

For a description of the status of the 64-bit FFI updates:
  http://lists.squeakfoundation.org/pipermail/vm-dev/2008-April/001900.html

Patches on Mantis:
  http://bugs.squeak.org/view.php?id=7237

Eliot, I don't know what your time frame is for doing the reentrant FFI work,
but if we could do the 64-bit updates first that would be a very good thing
from my point of view.

Is there any interest in taking this on right now? The affected folks are
Andreas, Ian, and John. Opinions?

Dave

Reply | Threaded
Open this post in threaded view
|

Re: rearchitecting the FFI implementation for reentrancy

Andreas.Raab
 
David -

My recommendation would be not to get held up by whatever Eliot is
working towards. There is no telling how long it is going to take us
until any of it is released since we always go through a full
discussion-review-decision cycle before making such choices (incl.
approval by the board). In fact, at this point there isn't even any
telling whether we're going to release any of it or not ;-)

So what I'd recommend doing is to move forward with whatever has been on
the list of things to do. Eliot is pretty good in folding these things
into his work where it makes sense, and 64bit stuff is important for us
too; in particular with server deployments.

Cheers,
   - Andreas

David T. Lewis wrote:

>  
> On Wed, Aug 05, 2009 at 05:56:02PM -0700, Eliot Miranda wrote:
>>  
>> Hi All,
>>     I'm looking at making the Squeak FFI reentrant to support nested calls
>> and possibly threading.  The current FFI has a couple of issues which render
>> it non-reentrant.
>
> Your approach sounds fairly straightforward. Slightly off of your intended
> topic, but I want to mention also that we have some changes queued up for
> FFI to address 64 bit pointer issues. The changes are fairly extensive, so
> I'd like to lobby for getting these done before too much bit rot sets in.
>
> For a description of the status of the 64-bit FFI updates:
>   http://lists.squeakfoundation.org/pipermail/vm-dev/2008-April/001900.html
>
> Patches on Mantis:
>   http://bugs.squeak.org/view.php?id=7237
>
> Eliot, I don't know what your time frame is for doing the reentrant FFI work,
> but if we could do the 64-bit updates first that would be a very good thing
> from my point of view.
>
> Is there any interest in taking this on right now? The affected folks are
> Andreas, Ian, and John. Opinions?
>
> Dave
>
Reply | Threaded
Open this post in threaded view
|

Re: rearchitecting the FFI implementation for reentrancy

David T. Lewis
 
Thanks, good advice.

I had listed "64-bit FFI" as last on the suggested to-do list:
  http://lists.squeakfoundation.org/pipermail/vm-dev/2009-July/002853.html

That was on the theory that it's a complex change that requires lots of
coordination, so I figure that it's better to take on some easier changes
first. I guess I had better keep that priority unchanged, and if Eliot
gets his changes done first, well ... too bad for me ;)

Dave

On Fri, Aug 07, 2009 at 01:05:00AM -0700, Andreas Raab wrote:

>
> David -
>
> My recommendation would be not to get held up by whatever Eliot is
> working towards. There is no telling how long it is going to take us
> until any of it is released since we always go through a full
> discussion-review-decision cycle before making such choices (incl.
> approval by the board). In fact, at this point there isn't even any
> telling whether we're going to release any of it or not ;-)
>
> So what I'd recommend doing is to move forward with whatever has been on
> the list of things to do. Eliot is pretty good in folding these things
> into his work where it makes sense, and 64bit stuff is important for us
> too; in particular with server deployments.
>
> Cheers,
>   - Andreas
>
> David T. Lewis wrote:
> >
> >On Wed, Aug 05, 2009 at 05:56:02PM -0700, Eliot Miranda wrote:
> >>
> >>Hi All,
> >>    I'm looking at making the Squeak FFI reentrant to support nested calls
> >>and possibly threading.  The current FFI has a couple of issues which
> >>render
> >>it non-reentrant.
> >
> >Your approach sounds fairly straightforward. Slightly off of your intended
> >topic, but I want to mention also that we have some changes queued up for
> >FFI to address 64 bit pointer issues. The changes are fairly extensive, so
> >I'd like to lobby for getting these done before too much bit rot sets in.
> >
> >For a description of the status of the 64-bit FFI updates:
> >  http://lists.squeakfoundation.org/pipermail/vm-dev/2008-April/001900.html
> >
> >Patches on Mantis:
> >  http://bugs.squeak.org/view.php?id=7237
> >
> >Eliot, I don't know what your time frame is for doing the reentrant FFI
> >work,
> >but if we could do the 64-bit updates first that would be a very good thing
> >from my point of view.
> >
> >Is there any interest in taking this on right now? The affected folks are
> >Andreas, Ian, and John. Opinions?
> >
> >Dave
> >