
[squeak-dev] Better VM <-> plugin API


Igor Stasenko

Re: [squeak-dev] Better VM <-> plugin API

2008/11/24 Greg A. Woods; Planix, Inc. <[hidden email]>:

>
> On 21-Nov-2008, at 4:30 PM, Igor Stasenko wrote:
>>
>> 2008/11/21 Greg A. Woods; Planix, Inc. <[hidden email]>:
>>>
>>> How is another plugin supposed to access this 'bitBlitAtom' handle that
>>> is
>>> effectively always in the first plugin's private storage and literally
>>> (as
>>> you've declared it above) also in the other plugin module's private
>>> symbol
>>> namespace? (i.e. if it's compiled as a "static" variable in the first
>>> plugin module then there's no way the second plugin can use the same
>>> symbol
>>> to refer to the same storage)
>>
>> It's not in private storage. When you call makeAtom(), you
>> receive an atom id for the given name, which is shared among all plugins.
>
> Perhaps I'm assuming your syntax is literal C code when that is not what you
> meant?
>
> The code you wrote explicitly defines a _static_ (i.e. private) symbol, so
> only the one compiled source file will be able to see the referenced
> storage, and only within the scope of the definition.

I'm not sure which code you are referring to (there was a lot of it). A shared
namespace is, by definition, not private.
Any module can access a symbol's value by its name, in the same way that we
use the system dictionary in Smalltalk.

>
> But I think Andreas Raab hit the nail on the head with his suggested
> ioLoadFunctionFrom() idea.
>

ioLoadFunctionFrom() has static behavior: no matter how
many times you call it, you'll receive the same result for the same
arguments. That's because, once compiled, exported symbols
(function/variable pointers) can't change their values anymore.

> --
> Greg A. Woods; Planix, Inc.
> <[hidden email]>
>

--
Best regards,
Igor Stasenko AKA sig.

Eliot Miranda-2

Re: [squeak-dev] Re: [Vm-dev] Better VM <-> plugin API

In reply to this post by Igor Stasenko

Hi Igor,

one question Andreas raised that I hadn't thought of was how with the new scheme you deal with multiple interpreters? With the old interpreterProxy scheme I guess the interpreterProxy could redirect a function to the relevant interpreter (although this would require a mutex to make it thread safe). How do you do it in Hydra with the new scheme? Do you load the plugin multiple times, once per VM, or do you do some redirecting behind the scenes?

On Sat, Nov 22, 2008 at 11:11 AM, Igor Stasenko <[hidden email]> wrote:

2008/11/22 Eliot Miranda <[hidden email]>:

> OK, good :) So Igor, what are the three functions interpreterProxy still
> has?

Here is the relevant part of sqVirtualMachine.h:

typedef struct ObjVirtualMachine {
sqInt (*minorVersion)(void);
sqInt (*majorVersion)(void);

/* IMPORTANT!!!
* The rest of the functions can be obtained by a plugin by calling
* getVMFunctionPointerBySelector().
* There is no longer any need to define additional functions in this struct.
*/
void * (*getVMFunctionPointerBySelector)(char * selector);

} ObjVirtualMachine;

> TIA
>
> (P.S. this can be backward-compatible, right? We can still provide the old
> interface for a while to allow external plugins to still load)

Yes. First the VM calls setInterpreter() with the new struct.
An old plugin checks the version and refuses to work.
Then the VM calls setInterpreter() with the old InterpreterProxy, to make the plugin happy.
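
For readers following along, the two-step handshake described above might be sketched roughly like this. This is only an illustration: the ProxyHeader type, the connectPlugin() helper and the version numbers 1 and 2 are assumptions made up for the sketch, not names taken from the Hydra sources.

```c
/* Hypothetical common prefix of the old and the new proxy struct. */
typedef struct ProxyHeader {
    int (*minorVersion)(void);
    int (*majorVersion)(void);
} ProxyHeader;

typedef int (*SetInterpreterFn)(ProxyHeader *proxy);

enum PluginKind { PLUGIN_REJECTED = 0, PLUGIN_OLD = 1, PLUGIN_NEW = 2 };

/* Version numbers are invented for this sketch. */
static int minorZero(void) { return 0; }
static int oldMajor(void)  { return 1; }
static int newMajor(void)  { return 2; }

static ProxyHeader oldProxy = { minorZero, oldMajor };
static ProxyHeader newProxy = { minorZero, newMajor };

/* VM side: offer the new struct first, then fall back to the old proxy. */
enum PluginKind connectPlugin(SetInterpreterFn setInterpreter) {
    if (setInterpreter(&newProxy)) return PLUGIN_NEW;
    if (setInterpreter(&oldProxy)) return PLUGIN_OLD;
    return PLUGIN_REJECTED;
}

/* Plugin side: each generation accepts only the major version it knows. */
int oldPluginSetInterpreter(ProxyHeader *proxy) {
    return proxy->majorVersion() == 1;
}

int newPluginSetInterpreter(ProxyHeader *proxy) {
    return proxy->majorVersion() == 2;
}
```

With these stand-ins, connectPlugin(oldPluginSetInterpreter) ends up using the old proxy after the first call is refused, while a new plugin is accepted on the first call.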

> On Sat, Nov 22, 2008 at 11:06 AM, Igor Stasenko <[hidden email]> wrote:
>>

--

Best regards,
Igor Stasenko AKA sig.

Igor Stasenko

Re: [squeak-dev] Re: [Vm-dev] Better VM <-> plugin API

2008/11/26 Eliot Miranda <[hidden email]>:
> Hi Igor,
> one question Andreas raised that I hadn't thought of was how with the
> new scheme you deal with multiple interpreters? With the old
> interpreterProxy scheme I guess the interpreterProxy could redirect a
> function to the relevant interpreter (although this would require a mutex to
> make it thread safe). How do you do it in Hydra with the new scheme? Do
> you load the plugin multiple times, once per VM, or do you do some
> redirecting behind the scenes?
>

To maintain backward compatibility, Hydra VM needs to support two
kinds of plugins: old and new ones.

Old plugins have a limitation: they can work only with a single
interpreter, the main one. Any attempt to use a primitive of an old plugin in
a non-main interpreter will lead to primitive failure.
Since we can't predict what a plugin does, or what internal state it
uses, it's really dangerous to use its primitives in a
concurrent world. So we say a big NO to using old plugins from non-main
interpreters.
In this way, we guarantee that Hydra VM works with old plugins
identically to the Squeak VM, but only for the main interpreter.

Of course, things are different for new plugins: they are aware that
their primitives could run concurrently in multiple threads.

In a concurrent world we have to distinguish between global state
and thread-local (per-interpreter) state.
For accessing global state there is nothing new: we should use
synchronization objects (mutex/semaphore).
But most of the plugins I converted to Hydra either don't use mutable
global state, or initialize it once (at plugin load time) and then
use it in a read-only manner.
So the major issue is how to deal with the per-interpreter state of plugins.
My solution is simple: I added helper functions to the VM which allow
users to 'attach' a state buffer to all interpreter instances.

By calling #attachStateBuffer: numBytes initializeFn: fn1 finalizeFn:
fn2, a plugin tells the VM that for each interpreter instance it needs a
numBytes-long memory buffer to hold its thread-local state.
As a result, it receives a unique state id, which it should use later
to access the attached state buffer of a particular interpreter instance.

So each time it needs to access its own thread-local state, it simply calls
#getAttachedStateBuffer: aStateHandle of: interpreter

And, of course, the plugin can provide two function pointers, for functions
which initialize and finalize the state buffer.
These functions are called each time an interpreter instance is
created or destroyed.
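
To make the idea concrete, here is a minimal sketch of how such attached state buffers could work. It is an assumption-laden illustration, not the Hydra implementation: the Interpreter layout, the fixed MAX_STATES limit, the callback signature and the newInterpreter()/destroyInterpreter() helpers are all hypothetical, and the sketch assumes every buffer is attached before any interpreter is created.

```c
#include <stdlib.h>

/* All names below are hypothetical stand-ins for this sketch. */
#define MAX_STATES 16

typedef struct Interpreter Interpreter;
typedef void (*AttachedStateFn)(Interpreter *intr, void *state);

typedef struct StateSlot {
    size_t numBytes;
    AttachedStateFn initFn;
    AttachedStateFn finalizeFn;
} StateSlot;

struct Interpreter {
    void *buffers[MAX_STATES];   /* one buffer per registered state id */
};

static StateSlot stateSlots[MAX_STATES];
static int stateSlotCount = 0;

/* Register a per-interpreter buffer; returns a state id, or -1 when full. */
int attachStateBuffer(size_t numBytes, AttachedStateFn initFn,
                      AttachedStateFn finalizeFn) {
    if (stateSlotCount >= MAX_STATES) return -1;
    stateSlots[stateSlotCount].numBytes = numBytes;
    stateSlots[stateSlotCount].initFn = initFn;
    stateSlots[stateSlotCount].finalizeFn = finalizeFn;
    return stateSlotCount++;
}

/* Look up the buffer that belongs to one interpreter instance. */
void *getAttachedStateBuffer(int stateId, Interpreter *intr) {
    if (stateId < 0 || stateId >= stateSlotCount) return NULL;
    return intr->buffers[stateId];
}

/* Creating an interpreter allocates and initializes every attached buffer. */
Interpreter *newInterpreter(void) {
    Interpreter *intr = calloc(1, sizeof *intr);
    if (intr == NULL) return NULL;
    for (int i = 0; i < stateSlotCount; i++) {
        intr->buffers[i] = calloc(1, stateSlots[i].numBytes);
        if (intr->buffers[i] != NULL && stateSlots[i].initFn != NULL)
            stateSlots[i].initFn(intr, intr->buffers[i]);
    }
    return intr;
}

/* Destroying an interpreter finalizes and frees every attached buffer. */
void destroyInterpreter(Interpreter *intr) {
    if (intr == NULL) return;
    for (int i = 0; i < stateSlotCount; i++) {
        if (intr->buffers[i] != NULL && stateSlots[i].finalizeFn != NULL)
            stateSlots[i].finalizeFn(intr, intr->buffers[i]);
        free(intr->buffers[i]);
    }
    free(intr);
}

/* Example plugin state with an initializer (hypothetical). */
typedef struct ExampleState { int counter; } ExampleState;

static void exampleInit(Interpreter *intr, void *state) {
    (void)intr;
    ((ExampleState *)state)->counter = 1;
}
```

A plugin would call attachStateBuffer(sizeof(ExampleState), exampleInit, NULL) once at load time and then fetch its own buffer with getAttachedStateBuffer(id, intr) inside each primitive; two interpreters get two independent buffers.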

In Hydra's VMMaker, I made modifications which deal with per-interpreter
state seamlessly (for the plugin writer).
As in the old scheme, all you need to do is declare some
ivars in the plugin's class.
But the code generator, instead of placing them at module scope,
wraps all such variables in a struct.

I think this doesn't require much explanation for you, Eliot, but for
other readers, let me show the difference between old and new code
generation using B2DPlugin as an example.

The old generator simply puts all variables into the C file:

/*** Variables ***/
static int* aetBuffer;
static char bbPluginName[256] = "BitBltPlugin";
static void * copyBitsFn;
static sqInt dispatchReturnValue;
static sqInt dispatchedValue;
static int doProfileStats = 0;
static sqInt engine;
static sqInt engineStopped;
static sqInt formArray;
static sqInt geProfileTime;
static int* getBuffer;

#ifdef SQUEAK_BUILTIN_PLUGIN
extern
#endif
struct VirtualMachine* interpreterProxy;
static void * loadBBFn;
static const char *moduleName =
#ifdef SQUEAK_BUILTIN_PLUGIN
"B2DPlugin 11 November 2008 (i)"
#else
"B2DPlugin 11 November 2008 (e)"
#endif
;
static int* objBuffer;
static sqInt objUsed;
static unsigned int* spanBuffer;
static int* workBuffer;

The new one is a bit more clever: all global variables still go into the
C file, but thread-local ones go into a special struct, declared in
<plugin-name>_imports.h :

/* Per-image plugin state */
typedef struct PluginState {
int* workBuffer;
int* objBuffer;
int* getBuffer;
int* aetBuffer;
unsigned int* spanBuffer;
sqInt engine;
sqInt formArray;
sqInt engineStopped;
sqInt geProfileTime;
sqInt dispatchedValue;
sqInt dispatchReturnValue;
sqInt objUsed;
#ifdef PLUGIN_STATE_EXTRAS
PLUGIN_STATE_EXTRAS
#endif
} PluginState;

And it defines two helper macros:

#define DECLARE_PLUGIN_STATE() struct PluginState * pstate = \
vmFunction(getAttachedStateBufferof)(pluginStateId,currentInterpreter())
#define plugin_state(name) pstate->name

Now let's take a look at the method code generated by VMMaker:

/* old */
static sqInt addEdgeToGET(sqInt edge) {
if (!(allocateGETEntry(1))) {
return 0;
}
getBuffer[workBuffer[GWGETUsed]] = edge;
workBuffer[GWGETUsed] = ((workBuffer[GWGETUsed]) + 1);
}

/* new */
static sqInt addEdgeToGET _iargs(sqInt edge) {
DECLARE_PLUGIN_STATE();
if (!(allocateGETEntry _iparams(1))) {
return 0;
}
plugin_state(getBuffer)[plugin_state(workBuffer)[GWGETUsed]] = edge;
plugin_state(workBuffer)[GWGETUsed] =
((plugin_state(workBuffer)[GWGETUsed]) + 1);
}

Here the _iargs() macro is defined as
#define _iargs(args...) (PInterpreter intr, args)
and the currentInterpreter() macro expands to 'intr'.

So, in expanded form, the code could look like:

static sqInt addEdgeToGET (PInterpreter intr, sqInt edge) {
PluginState * pstate = getAttachedStateBufferof(pluginStateId, intr);

pstate->foo = blabla
}

A few words about vmFunction(name):

#ifdef SQUEAK_BUILTIN_PLUGIN

#include "interp_prototypes.h"
#define vmFunction(name) name

#else

#define vmFunction(name) vmFunctions.name

#endif

So, as you see, this macro expands to 'name' in internal plugins and,
YES, they call VM functions directly without using interpreterProxy.

For external plugins it expands to vmFunctions.name, where vmFunctions is
a struct of the form:

typedef struct VMFunctions {
char * sel1; sqInt (*attachStateBufferinitializeFnfinalizeFn) (sqInt
numberOfBytes, AttachedStateFn stateInitializeFn, AttachedStateFn
stateFinalizeFn);
char * sel2; sqInt (*booleanValueOf) _iargs(sqInt obj);
char * sel3; sqInt (*byteSizeOf) (sqInt oop);
char * sel4; sqInt (*classBitmap) _iarg();
char * sel5; sqInt (*classPoint) _iarg();
char * sel6; sqInt (*failed) _iarg();
...
void * _dummy; void * _dummyfn;
} VMFunctions;

In the C file we have a variable vmFunctions:

struct VMFunctions vmFunctions = {
"attachStateBuffer:initializeFn:finalizeFn:", NULL,
"booleanValueOf:", NULL,
"byteSizeOf:", NULL,
"classBitmap", NULL,
"classPoint", NULL,
"failed", NULL,
"fetchClassOf:", NULL,
"fetchInteger:ofObject:", NULL,
"fetchPointer:ofObject:", NULL,
"firstIndexableField:", NULL,
.....
NULL, NULL };

And in setInterpreter(), the first function called by the VM, this
structure is filled with the appropriate pointers:

EXPORT(sqInt) setInterpreter (InterpreterProxy * anInterpreter) {
sqInt i;
char * ptr;
char** table;
sqInt ok;

ok = anInterpreter->majorVersion() == OBJVM_PROXY_MAJOR;
if (ok == 0) {
return 0;
}
ok = anInterpreter->minorVersion() >= OBJVM_PROXY_MINOR;
if (ok == 0) {
return 0;
}
i = 0;
table = (char**) &vmFunctions;
while ((table[i]) != null) {
ptr = (char*)anInterpreter->getVMFunctionPointerBySelector(table[i]);
if (ptr == null) {
dprintf(("Plugin unable to find vm function: %s\n", table[i]));
return 0;
}
table[i + 1] = ptr;
i += 2;
}
;
ok = INIT_PLUGIN_STATE();

return ok;
}

As you can see, the plugin fails to initialize (returns 0) if any of the
functions it uses is null.
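
The same resolve-by-selector idea can be tried out in isolation. The sketch below deliberately swaps the generated struct-cast-to-char** walk for an explicit array of {selector, function pointer} entries, which avoids storing function pointers through char *. The ImportEntry type, the resolveImports() helper and the two toy VM functions are assumptions made up for this example.

```c
#include <stddef.h>
#include <string.h>

typedef void (*GenericFn)(void);

/* One import: the selector to look up and the slot for its pointer. */
typedef struct ImportEntry {
    const char *selector;
    GenericFn fn;
} ImportEntry;

/* VM side: two toy exported functions (hypothetical behavior). */
static int vmFailed(void) { return 0; }
static int vmByteSizeOf(int oop) { return oop * 4; }

static const ImportEntry vmExports[] = {
    { "failed",      (GenericFn)vmFailed },
    { "byteSizeOf:", (GenericFn)vmByteSizeOf },
    { NULL, NULL }
};

/* VM side: linear lookup of an exported function by its selector. */
GenericFn getVMFunctionPointerBySelector(const char *selector) {
    for (const ImportEntry *e = vmExports; e->selector != NULL; e++) {
        if (strcmp(e->selector, selector) == 0) return e->fn;
    }
    return NULL;
}

/* Plugin side: fill every slot; fail (return 0) on the first missing one. */
int resolveImports(ImportEntry *imports) {
    for (ImportEntry *e = imports; e->selector != NULL; e++) {
        e->fn = getVMFunctionPointerBySelector(e->selector);
        if (e->fn == NULL) return 0;
    }
    return 1;
}
```

A caller casts each resolved entry back to its real prototype before calling it, for example ((int (*)(int))imports[0].fn)(3) for "byteSizeOf:".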

There are more things under the hood concerning where I get the function
prototypes and how the VM generates its list of public functions... but I
think this is enough for this post :)

--
Best regards,
Igor Stasenko AKA sig.

Igor Stasenko

Re: [squeak-dev] Re: [Vm-dev] Better VM <-> plugin API

.. continuing
In the light of the current topic about a shared namespace, things could be
more uniform.

Let's suppose the VM has the following functions:

int makeAtom(char * atomName) -- which interns a unique name (in the same
way as symbols are interned in Squeak), returns its atom id, and sets
its default value to null (if it wasn't interned before).
Any subsequent call to this function with the same argument will return the same id.

Two accessor functions:
void * getAtomValue(int atomId);
void * setAtomValue(int atomId, void * newValue); /* returns old value */
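
As a rough illustration of these three functions, a toy atom table could look like the sketch below. The fixed-size array, the linear search and the -1 error return are assumptions of the sketch; it takes a const char * name for convenience and has no locking, so it is not thread safe.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_ATOMS 256   /* arbitrary limit chosen for this sketch */

typedef struct Atom {
    char *name;    /* private copy of the interned name */
    void *value;   /* NULL until someone sets it */
} Atom;

static Atom atomTable[MAX_ATOMS];
static int atomTableCount = 0;

/* Intern a name: the same name always yields the same id. */
int makeAtom(const char *atomName) {
    for (int i = 0; i < atomTableCount; i++) {
        if (strcmp(atomTable[i].name, atomName) == 0) return i;
    }
    if (atomTableCount >= MAX_ATOMS) return -1;
    size_t len = strlen(atomName) + 1;
    char *copy = malloc(len);
    if (copy == NULL) return -1;
    memcpy(copy, atomName, len);
    atomTable[atomTableCount].name = copy;
    atomTable[atomTableCount].value = NULL;
    return atomTableCount++;
}

/* Read the current value of an atom (NULL for unknown ids). */
void *getAtomValue(int atomId) {
    if (atomId < 0 || atomId >= atomTableCount) return NULL;
    return atomTable[atomId].value;
}

/* Store a new value and return the old one. */
void *setAtomValue(int atomId, void *newValue) {
    if (atomId < 0 || atomId >= atomTableCount) return NULL;
    void *oldValue = atomTable[atomId].value;
    atomTable[atomId].value = newValue;
    return oldValue;
}
```

Two modules that both call makeAtom("BitBltPlugin.copyBits") receive the same id, so a value stored by one of them is visible to the other through getAtomValue().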

And surely we need special thread-local storage:

makeThreadLocalAtom(char * atomName, initFn, destroyFn)

This function behaves similarly to makeAtom(), except that
getAtomValue/setAtomValue will use thread-local storage to access the atom
value.
initFn/destroyFn are functions of the form:

void Fn(Interpreter * intr, int atomId);

I'm not sure about the init/destroy functions. Maybe they are not needed, and
instead the VM should notify all plugins when a new interpreter is created
or an existing one is destroyed, so that plugins take care of
initializing/finalizing/allocating/deallocating their state themselves.
This is just a choice of where to put the handling logic: in the VM
or in each plugin.

Now, how things would look if we applied this unification throughout the VM:

1. The VM could use atoms to publish its own public function pointers:

setAtomValue(makeAtom("KERNEL.fetchClassOf:") , & fetchClassOf);

Then an external plugin doesn't need interpreterProxy. All it has to do is:

fetchClassOfFnPtr = getAtomValue(makeAtom("KERNEL.fetchClassOf:"));

and to make a call it can simply use:
fetchClassOfFnPtr (foo, bar);

2. Plugin state (Hydra specific):

int stateId = makeThreadLocalAtom("BitBltPlugin.TLS", initFn, destroyFn);

and to access it:
struct PluginState* pstate = (struct PluginState* ) getAtomValue(stateId);

and, of course, we can use similar macros for it:
DECLARE_STATE() / pstate(name)

So things remain pretty much the same for code generation; only the macros differ.

3. Primitives.

Plugins at load time should:

- register an atom "moduleName" and set its value to non-nil,
e.g.
setAtomValue(makeAtom("BitBltPlugin"), 1);
or... we could put a version there; that's not really relevant.

Through this atom we let the VM know that the plugin is loaded, so it
shouldn't try to find/load the module again.
It tries to load a module only if the atom value is null.

Then, as you may guess, primitive pointers are registered by the plugin in
the namespace by defining atom names of the form
"<moduleName>.<primitiveName>" and setting their values.

If the VM encounters an unknown primitive (primitiveIndex is 0, but the method
has a named primitive),
it does the following:
first, it checks whether there is a non-null atom value for the given module name:

moduleLoaded = getAtomValue(makeAtom(moduleName));
if (!moduleLoaded) ioLoadModule(...blabla)

Then, instead of storing a direct function pointer for the given primitive
in primitiveCache,
it stores the atom id for the name "<moduleName>.<primitiveName>",

so that primitiveCache[primitiveIndex] = atomId,
and to get the function pointer it reads it with getAtomValue().

In this way, a plugin may override primitive pointer values at run
time (by setting an atom value to null or to a different function),
while for the VM it's a simple task: before the call, fetch the atom value by id, then
call the primitive, or report failure if the value is null.
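
A small sketch of this indirection, under stated assumptions: the table below stores typed function pointers rather than void *, and the names internPrimitiveAtom(), setPrimitiveAtom(), callPrimitiveAtom() and PRIM_FAILED are invented for the illustration (they stand in for makeAtom/setAtomValue/getAtomValue plus the VM's primitive call path).

```c
#include <stddef.h>
#include <string.h>

#define MAX_PRIM_ATOMS 64    /* arbitrary limit for the sketch */
#define PRIM_FAILED (-1)     /* hypothetical "primitive failed" result */

typedef int (*PrimitiveFn)(void);

/* Names are not copied here, so they must outlive the table
   (string literals in this example). */
static const char *primAtomNames[MAX_PRIM_ATOMS];
static PrimitiveFn primAtomValues[MAX_PRIM_ATOMS];
static int primAtomCount = 0;

/* Intern "<moduleName>.<primitiveName>" and return its atom id. */
int internPrimitiveAtom(const char *name) {
    for (int i = 0; i < primAtomCount; i++) {
        if (strcmp(primAtomNames[i], name) == 0) return i;
    }
    if (primAtomCount >= MAX_PRIM_ATOMS) return -1;
    primAtomNames[primAtomCount] = name;
    primAtomValues[primAtomCount] = NULL;
    return primAtomCount++;
}

/* Plugin side: publish, replace, or (with NULL) withdraw a primitive. */
void setPrimitiveAtom(int atomId, PrimitiveFn fn) {
    if (atomId >= 0 && atomId < primAtomCount) primAtomValues[atomId] = fn;
}

/* VM side: the cache holds the atom id, so the pointer is re-read
   on every call and overrides take effect immediately. */
int callPrimitiveAtom(int atomId) {
    if (atomId < 0 || atomId >= primAtomCount) return PRIM_FAILED;
    PrimitiveFn fn = primAtomValues[atomId];
    return fn != NULL ? fn() : PRIM_FAILED;
}

/* A primitive and a logging wrapper another plugin might install. */
static int copyBitsCalls = 0;
static int primitiveCopyBits(void) { return 1; }
static int loggingCopyBits(void) { copyBitsCalls++; return primitiveCopyBits(); }
```

Installing loggingCopyBits over the same atom makes every later call go through the wrapper without the VM's cache entry changing, and setting the atom back to NULL turns the call into a primitive failure.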

It's really helpful for debugging/monitoring purposes. As I described earlier,
one plugin may override the primitive pointers of another plugin in order to
collect statistics, log activity,
etc.
Of course such usage introduces some risks, but as our folks say:
he who takes no risks drinks no champagne :)

--
Best regards,
Igor Stasenko AKA sig.