Folks -
For a variety of reasons I am in dire need of the ability to vector shared variables (globals, class vars and pool vars) through an extra indirection vector per process (really per island, but binding per process seems to be simpler for now). Since I need this for *each and every shared variable* it needs to be *very* efficient.

The question is: What is the most efficient way to implement such a scheme? There are a couple of ways I can think of:

1) Just use a dictionary. The main disadvantage is the lookup cost, which could be handled by making it a special kind of dictionary and implementing the lookup in a primitive. This is a good fallback position but probably just a little slow in general. It could be implemented by something along the lines of:

    ProtoObject>>lookup: sharedBinding
        "Look up the value of the given shared binding in the currently executing process."
        ^Processor activeProcess scope at: sharedBinding ifAbsent: [nil]

which is pretty straightforward.

2) Use message lookup, i.e., send a message. This is simple to describe but not necessarily simple to implement correctly. Here is how the simulation would look:

    ProtoObject>>lookup: sharedBinding
        "Look up the value of the given shared binding in the currently executing process."
        ^[Processor activeProcess scope perform: sharedBinding key]
            on: MessageNotUnderstood
            do: [:ex | ex return: nil]

One problem here is that the key needs to be unique among all possible keys, which breaks down if there is a name conflict. This can be resolved by implicitly prefixing names with the place where they are defined, so it's not such a big deal conceptually, but practically the impact of that change might be more visible. The other problem is that the scope object needs to hold all the objects, which means quite a number of them. OTOH, one could argue that in many ways "Smalltalk" is just an object with a few thousand iVars, so having a class representing the namespace defined by Smalltalk may be quite reasonable.
3) Use "some" integer index caching scheme. The main idea here is in realizing that really, option #2 doesn't quite work since classes can't have more than 256 iVars, so we'd need to have an indirection through an array to be able to access these variables. If that is so, then why can't we inline the entire access pattern and have the scope just be an array that we index directly? This is actually the most interesting approach to me because (as far as I can tell) it would be by far the most efficient.

The basic idea goes like this: If all shared variables are assigned a "global index" then only this index is required to use them. Any use of the shared variable Foo would be inlined to "Processor activeProcess scope at: FooIndex" which (given proper primitive support) would probably be by far the fastest version (if offered a byte code it should rival the current speed of accessing shared variables).

[I'll admit that there are some tricky issues with this approach as well, like the size needed for the scope object and whether or not to use hash lookup instead of indexing]

In any case, I'm trying to gather options. If any of you have any new ideas or have tried one or the other (successfully or not) or have any other comments to make I'd love to hear about it.

Cheers,
  - Andreas
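The index scheme in option 3 can be tried out in a workspace without touching the compiler or the VM. What follows is only a rough sketch under stated assumptions: Process has no "scope" accessor in a stock image, so the temporary scopes (a dictionary from process to Array) stands in for it, and the names indexOf, nextIndex, scopeFor, fooIndex and barIndex are all made up for the illustration.

```smalltalk
"Workspace sketch of option 3. Every name below is hypothetical;
 'scopes' stands in for the proposed 'Processor activeProcess scope'."
| indexOf nextIndex scopes scopeFor fooIndex barIndex |
indexOf := IdentityDictionary new.   "shared variable name -> global index"
nextIndex := 0.
scopes := IdentityDictionary new.    "process -> its scope Array"

"Assign each shared variable a fixed global index, once."
#(#Foo #Bar) do: [:name |
	indexOf at: name put: (nextIndex := nextIndex + 1)].
fooIndex := indexOf at: #Foo.
barIndex := indexOf at: #Bar.

"Each process lazily gets an Array with one slot per shared variable."
scopeFor := [:process |
	scopes at: process ifAbsentPut: [Array new: nextIndex]].

"Roughly what a write to Foo and a read of Foo would be inlined to."
(scopeFor value: Processor activeProcess) at: fooIndex put: 42.
(scopeFor value: Processor activeProcess) at: fooIndex.
```

The last expression answers 42 in the process that stored it, while the Bar slot is still nil. The sketch sizes each Array when it is first created, so it sidesteps the open question from the bracketed remark above: what happens to existing scope Arrays when a new shared variable is defined later.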
Hi Andreas,
on Tue, 24 Oct 2006 06:46:26 +0200, you wrote:
> Folks -
>
> For a variety of reasons I am in dire need of the ability to vector
> shared variables (globals, class vars and pool vars) through an extra
> indirection vector per process (really per island but binding per
> process seems to be simpler for now). Since I need this for *each and
> every shared variable* it needs to be *very* efficient.
>
> The question is: What is the most efficient way to implement such a
> scheme?

The fastest indirect access is through literal variables (limited only by the # of literals allowed per method).

Since you are willing to spend a #symbol per variable, formally declare a "descriptor" to be a class var (or use a pool). Take #PerProcessThing as an example; initialize PerProcessThing to a subinstance of Association which holds a fast and fixed Array index.

Then all you need in the scope of activeProcess is a shared Array which is indexed by the above machinery. Example use:

    PerProcessThing localSharedValue
    PerProcessThing localSharedValue: somethingElse

Not counting "Processor activeProcess scope", the above is the fastest double-indirect access that I can think of.

/Klaus
Hi Nicolas,
on Tue, 24 Oct 2006 09:39:31 +0200, you wrote:
> Hi Klaus and Andreas
> I find the local shared variable feature most useful.
> I am trying to understand your suggestions, ...
> Klaus:
> - have a single SharedPool with values being an array, and an index per
> process?

No, one Array per process and one integer index per shared variable.

> pseudo code for PerProcessThing:
> PerProcessThing localSharedValue
> where localSharedValue is (^self at: Processor activeProcess
> processIndex)

No, Association subclass #LocalSharedVariable and then

    LocalSharedVariable>>localSharedValue
        <primitive: 4711>
        "this is what the primitive does faster:"
        ^Processor activeProcess scope localSharedArray at: value

LocalSharedVariable's key is the same as the key in the PerProcessThing association (for convenience), and LocalSharedVariable's value is the integer index into the localSharedArray.

If you spend a primitive implementation, suppose that the example compiles to

    pushLiteralVariable: (#PerProcessThing -> aLocalSharedVariable)
    send: #localSharedValue "handled by primitive 4711"

So exactly two bytecodes (without context switch) and, since you at least need to name a "descriptor" and say what you want from it (get or set a value), this is the least number of bytecodes necessary.

> - have to reset all the shared value arrays each time a process is
> created or dies...

This is independent of any proposal; you always have to allocate the shared value array per process, like in

    [[self allocateSharedValueArray.
      self doTheJob]
        ensure: [self destroySharedValueArray]] fork

> I am not sure I understood Klaus's proposition well.
> Did I get it?

I think so :)

/Klaus

> Nicolas
>
> On Tuesday 24 October 2006 07:33, Klaus D. Witzel wrote:
>> [...]
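To see the descriptor scheme and the allocate/destroy lifecycle together, here is a small workspace simulation. It is a sketch under stated assumptions, not the proposed implementation: there is no primitive and no Association subclass, a plain Association plays the descriptor (its value is the Array index), and the temporary arrays stands in for "Processor activeProcess scope localSharedArray". The names arrays, descriptor, get, set, done and seenInChild are invented for the example.

```smalltalk
"Workspace sketch; 'arrays' stands in for the hypothetical
 'Processor activeProcess scope localSharedArray'."
| arrays descriptor get set done seenInChild |
arrays := IdentityDictionary new.       "process -> its shared value Array"
descriptor := #PerProcessThing -> 1.    "key: name, value: fixed Array index"
get := [:d | (arrays at: Processor activeProcess) at: d value].
set := [:d :v | (arrays at: Processor activeProcess) at: d value put: v].

"Parent process: allocate its Array and store a value."
arrays at: Processor activeProcess put: (Array new: 1).
set value: descriptor value: #parent.

"Child process: allocate on entry, release on exit, following the
 fork/ensure pattern from the reply above."
done := Semaphore new.
[[arrays at: Processor activeProcess put: (Array new: 1).
  set value: descriptor value: #child.
  seenInChild := get value: descriptor]
	ensure: [
		arrays removeKey: Processor activeProcess ifAbsent: [].
		done signal]] fork.
done wait.

seenInChild.
get value: descriptor.
```

After the child has finished, seenInChild holds #child while the parent still reads #parent through the same descriptor, so the child's write did not leak. The parent blocks on the semaphore while the child runs, which is what keeps the unprotected dictionary safe in this sketch; a real implementation would keep the Array in the process itself.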