2008/7/5 tim Rowledge <[hidden email]>:
>
> On 4-Jul-08, at 6:35 AM, Eliot Miranda wrote:
>>
>> [snip]
>> Thanks Tim! That's what I needed. Being pointed to the right place. It
>> has taken 20 minutes to understand the code and 20 minutes to fix it.
>> Thanks so much!!
>
> Nice to have actually achieved something this week; it's been one of those
> weeks...
>
> Simulating simulating the VM to gather type data seems like a pretty complex
> project. I can't help feeling it would be simpler and faster to simply write
> the VM cleanly, with decent documentation and specs.
>
> Igor, if you can produce a Better Slang With Lambdas, do please share the
> code. There has to be some way of cleaning up the current mess.
>

It is already done. :)
Native methods, which fully replace primitives in CorruptVM, have a special syntax.
Here is an example of a <native> method in CorruptVM:

    lookup: selector
        <native>
        <variable: #delegate knownAs: #VTable>
        <variable: #vec knownAs: #Vector>
        <variable: #assoc knownAs: #Association>

        | vec delegate |
        delegate := self.
        delegate equal: nil whileFalse: [
            vec := delegate bindings.
            1 to: vec size do: [:i | | assoc |
                assoc := vec at: i.
                assoc key equal: selector ifTrue: [ ^ assoc value ]
            ].
            delegate := delegate delegate.
        ].
        ^ nil

The <native> pragma tells the compiler to switch to different logic. The logic is simple: all variables and values are machine words (32/64/arbitrary-bit ints); messages like +, -, *, / and bit shifts are pointer arithmetic, and they behave exactly like the corresponding CPU instructions.

Blocks are not allowed in the syntax except with some special messages, like #equal:whileTrue:, #equal:ifTrue: and variants (replace 'equal' with 'greater', 'less', etc.), which are simply translated to branching instructions.

To read memory you simply write:

    memValue := someAddressValue readWord.  "there is also a #readByte"

To write, you type:

    someAddressValue writeWord: value

But if you look at the method example, it looks quite similar to regular Smalltalk. That is because it uses static inlining.
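For readers more at home in C, the lookup: method above can be roughly sketched as follows. This is only an illustration: the struct layouts and the name vtable_lookup are assumptions made here (the post names the VTable, Vector and Association classes but not their fields), and selectors are compared by pointer identity to mirror #equal: on machine words.

```c
#include <stddef.h>

/* Hypothetical layouts: the post only names VTable, Vector and Association. */
typedef struct Association {
    const char *key;    /* selector, compared by identity like #equal: */
    void       *value;  /* whatever is bound to that selector */
} Association;

typedef struct Vector {
    size_t       size;
    Association *items;
} Vector;

typedef struct VTable {
    Vector         bindings;
    struct VTable *delegate;  /* parent table, NULL plays the role of nil */
} VTable;

/* Walk the delegate chain; return the bound value, or NULL if none is found. */
void *vtable_lookup(const VTable *self, const char *selector)
{
    for (const VTable *d = self; d != NULL; d = d->delegate) {
        for (size_t i = 0; i < d->bindings.size; i++) {
            if (d->bindings.items[i].key == selector)
                return d->bindings.items[i].value;
        }
    }
    return NULL;
}
```

The two nested loops correspond one-to-one to the whileFalse: and to:do: in the native method; everything else in the Smalltalk version (bindings, size, at:, key, value) is what the compiler would inline.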
The principle is simple: whenever the compiler sees a message send that is not a special message (the selector is not found in the CVSpecialMessages class), it does a method lookup and inlines the method it finds. Pragmas like

    <variable: #vec knownAs: #Vector>

give the compiler hints about where it should look for the message implementation. In the example above, I am sending #at: to the 'vec' variable. As a result the compiler inlines the #at: method found either in the Vector class or in its parent. If the class of a variable is not specified, it is assumed to be ProtoObject by default. If no implementation is found, the compiler raises a compile error.

You can't recursively inline the same method. There is a simple check which throws an error when you try this.

Messages to thisContext are treated as messages to the compiler itself, and can therefore be used as preprocessing directives. For instance, I found the following directive (or call it a macro) quite useful:

    thisContext ifInlined: [ ... ] ifNotInlined: [ .. ].

With this, I can determine whether the current method is being compiled for inlining or for a call from Smalltalk, and provide a different implementation for each case. An example is tagging small integers: the method #size returns the size of an array. When inlined, it returns the number of array elements as a machine word; when called from Smalltalk, it returns a small integer. This avoids excessive tagging/detagging in many places.

So, what is it about in essence:
- with native methods you can provide an implementation of any low-level basic behavior.
- things are quite simple and you write code very similar to regular Smalltalk, except that you need to keep in mind that all message sends in a native method are either inlined or special low-level messages.

So, in a future system, you can write a package which contains both Smalltalk code and native code. You don't need plugins or anything else external to make the code work right after it is loaded into the image.
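The #size example mentioned above (a raw machine word when inlined, a tagged small integer when called from Smalltalk) might look like this in C. It is a sketch under assumptions: the Array layout and all function names are invented here, and the tag format (value shifted left by one, low bit set) is the one used by the Squeak VM code quoted later in this thread.

```c
#include <stdint.h>

typedef intptr_t word;  /* one machine word, as in a native method */

/* Assumed tag format: value shifted left by one, low bit set to 1. */
word tag_small_int(word raw)   { return (word)(((uintptr_t)raw << 1) | 1); }

/* Relies on arithmetic right shift for negative values (true on common compilers). */
word untag_small_int(word oop) { return oop >> 1; }

/* Hypothetical array layout, for illustration only. */
typedef struct Array {
    word  size;
    word *slots;
} Array;

/* The "ifInlined:" flavour: native callers get the raw element count. */
word array_size_raw(const Array *a) { return a->size; }

/* The "ifNotInlined:" flavour: Smalltalk callers get a tagged small integer. */
word array_size_tagged(const Array *a) { return tag_small_int(array_size_raw(a)); }

/* A native loop works on the raw count, so nothing is tagged or untagged per step. */
word array_sum(const Array *a)
{
    word sum = 0;
    for (word i = 0; i < array_size_raw(a); i++)
        sum += a->slots[i];
    return sum;
}
```

In CorruptVM both flavours would live in one method body, selected at compile time by thisContext ifInlined:ifNotInlined:; C needs two separate functions to express the same split.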
Imagine a BitBlt package which contains everything in one place. Once you load it, you have BitBlt, and you don't need to care about compiling or downloading plugins.

As for translation to C: it is really easy to write such a translator. You basically need to write a transformation which turns the lambdas into C code. This is quite straightforward once you have low-level lambdas.

Simulation: it took me about 2 hours to implement a basic CVCPUSimulator. It is really dumb and you won't find any complex logic in it.

--
Best regards,
Igor Stasenko AKA sig.
Hi Igor,
Very nice work Igor.

What about special assembly language instructions for the various architectures? Such as test and set (for managing concurrency control)? How do you control the mapping to native code?

By lambda, do you mean to say that you're doing this at the level of block closures rather than only the level of methods? Block closures being more general purpose so that the entire method need not be a "primitive". Never liked primitive methods... always thought that blocks were the better place to hang primitives - as well as a host of other capabilities - from.

You mentioned you do byte code to native code conversion with Exupery. How does this work and how can one fine tune the generated code for the various architectures and calling conventions (for interfacing with ugly things like DLLs or Shared Libraries that static <as in frozen in time> systems generate)?

How much work would it take to put the Squeak code base upon a Hydra rewritten in the new improved Slang+Exupery?

How many architectures does Exupery currently support?

What are your plans on sharing your "lambda" code improvements?

All the best,
Peter
2008/7/5 Peter William Lount <[hidden email]>:
> Hi Igor,
>
> Very nice work Igor.
>
> What about special assembly language instructions for the various
> architectures? Such as test and set (for managing concurrency control)? How
> do you control the mapping to native code?
>
> By lambda, do you mean to say that you're doing this at the level of block
> closures rather than only the level of methods? Block closures being more
> general purpose so that the entire method need not be a "primitive". Never
> liked primitive methods... always thought that blocks were the better place
> to hang primitives - as well as a host of other capabilities - from.
>

Native methods, Smalltalk methods, and block closures from Smalltalk methods are compiled to a common form: CompiledMethod. The difference lies just in the preprocessing style. A compiled method always expects that arguments are passed via the stack, but the rest is up to the implementor. The compiler inlines the code for the method prologue/epilogue (the code sits in CVStackConext & friends), so if you don't like how it is currently done, you can always invent your own stack layout & formats.

Polymorphic sends (Smalltalk method lookup & call) are implemented in the corresponding classes, and the compiler just inlines this code with very few assumptions about what the code does.

> You mentioned you do byte code to native code conversion with Exupery. How
> does this work and how can one fine tune the generated code for the various
> architectures and calling conventions (for interfacing with ugly things like
> DLLs or Shared Libraries that static <as in frozen in time> system generate?
>

No, I'm not using bytecodes at all. Translation is performed from method source to lambdas. Currently it uses the AST created by the default Squeak Parser, but that will be replaced by its own parser, which translates sources directly to lambdas. That parser, written by Klaus, is already in the package, but I haven't wired it up yet.

Interfacing with DLLs & other FFI stuff is up to the implementor.
It is as easy as writing code which does a call to an address stored in memory. You can write such code yourself as:

    result := someAddress call.

This generates a call to address = someAddress. Maybe later I'll add a foreign function callout code generator, which will automatically generate code for argument coercion & pushing, and for converting the returned value. But that is optional.

There are some nuances with returned values. Currently it assumes that the returned value is in eax, but that is not true for C functions which return floats or 64-bit values on a 32-bit platform. It would require adding some special instructions later, like:

    someAddress fcall: pointerForStoringFloat
    "here you provide two addresses: the function pointer and a pointer to the memory location where the returned floating point value should be stored"

> How much work would it take to put the squeak code base upon a hydra
> rewritten in the new improved slang+exupery?
>

This project is not related to Hydra. The earlier you start writing it, the faster you'll get results :)
Seriously though, currently it is a big playground. Freedom from C opens a huge field of possibilities for how a system can be implemented from scratch. Currently there are only a few bits which make basic things work: the Context format, the CompiledMethod format, and the VTable format needed to perform polymorphic sends.

Low-level lambdas (the result of compiling a method) are intentionally platform independent. They assume you have a virtual CPU with an unlimited number of registers: a special register holding the stack value, a special register holding the context value, and a register for the return value.

And I'm using Exupery to get results fast. It lacks some instructions (like ones working with byte-sized memory operands) and float math support, but it is not hard to add them later. Since 99% of the code can already work using current Exupery features, I don't think it is a big issue.
It is not using Exupery at full scale, only the last few classes, responsible for register allocation, instruction selection and assembly.

> How many architectures does exupery currently support?
>

Well, originally it supported only i386. But I heard there are ports to ARM. I think I'm not the one who should answer this question; Bryce knows better. :)

> What are your plans on sharing your "lambda" code improvements?
>

It is free for anyone who may want to use it (MIT license). There are a lot of things which need to be done before it can become a working system, and the amount of work needed to replicate an environment such as Squeak is enormous for a single person. So I have no illusions that I could do it alone. But a team of people can.

--
Best regards,
Igor Stasenko AKA sig.
As a follow-up, here is an illustration of how much more easily you can implement low-level behavior compared to the old ways.

Consider the current SmallInteger>>+ implementation. A method in SmallInteger:

--
+ aNumber
    "Primitive. Add the receiver to the argument and answer with the result
    if it is a SmallInteger. Fail if the argument or the result is not a
    SmallInteger  Essential  No Lookup. See Object documentation whatIsAPrimitive."

    <primitive: 1>
    ^ super + aNumber
---
And the method in VMMaker:
---
primitiveAdd

    self pop2AndPushIntegerIfOK: (self stackIntegerValue: 1) + (self stackIntegerValue: 0)

which is translated to the following C code:

sqInt primitiveAdd(void) {
    sqInt integerResult;
    sqInt sp;

    /* begin pop2AndPushIntegerIfOK: */
    integerResult = (stackIntegerValue(1)) + (stackIntegerValue(0));
    if (successFlag) {
        if ((integerResult ^ (integerResult << 1)) >= 0) {
            /* begin pop:thenPush: */
            longAtput(sp = stackPointer - ((2 - 1) * BytesPerWord), ((integerResult << 1) | 1));
            stackPointer = sp;
        } else {
            successFlag = 0;
        }
    }
}
---
Now compare the two methods above with a single native method which does it all itself:
---
SmallInteger>>+ aNumber
    <native>

    (aNumber bitAnd: 1) equal: 1 ifTrue: [ | result |
        result := self >> 1 + (aNumber >>1 ).
        (result bitXor: result << 1) greaterOrEqual: 0 ifTrue: [
            ^ result << 1 bitOr: 1 ]
    ].
    ^ super perform: #+ with: aNumber
    " a perform:[with:..] is a special message for doing a polymorphic send inside a native method"
----

Note that the check that self is a SmallInteger is gone, because it is impossible to call this method for anything other than a SmallInteger instance.

Think about how much simpler it would be to implement methods which use complex structures, or cases where a primitive needs to instantiate some object(s) and is forced to use the SpecialObjectsArray. Also, there is no need for coercion: you can simply do a type check and then inline the code which reads the value(s) from the object's slots. A lot of crappy things will be wiped away.

Not to mention that you can distribute any native code in your package, and don't need to accompany it with a pre-built plugin .dll etc.
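As a cross-check of the native method above, the same steps can be written in plain C. This is a sketch, not VM code: the function name small_int_add and its boolean "fall back" convention are assumptions made here, standing in for the ^ super perform: #+ with: aNumber path. It also relies on arithmetic right shift of negative values, which common compilers provide.

```c
#include <stdbool.h>
#include <stdint.h>

/* Add two tagged SmallIntegers (value << 1 | 1).
   Returns true and stores the tagged sum in *out on success.
   Returns false when the caller should fall back to a full send. */
bool small_int_add(intptr_t self, intptr_t arg, intptr_t *out)
{
    if ((arg & 1) != 1)
        return false;  /* argument is not a SmallInteger */

    /* Untag both operands; the sum of two untagged values cannot overflow a word. */
    intptr_t result = (self >> 1) + (arg >> 1);

    /* If the top two bits differ, the sum no longer fits once shifted for the tag. */
    if ((result ^ (intptr_t)((uintptr_t)result << 1)) < 0)
        return false;

    *out = (intptr_t)(((uintptr_t)result << 1) | 1);
    return true;
}
```

As in the native method, the receiver is not checked: the function is only meaningful when self is already a tagged SmallInteger.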
--
Best regards,
Igor Stasenko AKA sig.