Hi Bryce, thanks for the new release. I have been playing around with it and have some questions.
First, I was surprised to see the suggestion to compile Array>>at: and Array>>at:put:. These methods are already inlined, right? They inherit the implementations in Object, which are both already primitives. Primitives are already "compiled", right? So how does Exupery speed them up? Overall, does it ever make sense to compile a method with just a single send in it? Does Exupery dive into the called method and compile that too? If not, I don't see how it could help unless some inlining was done.

This also led to my other question. These Swiki instructions tell Exupery to compile an inherited method, not an actual method that exists on Array. Since we don't pass the actual CompiledMethod object to Exupery (e.g., Exupery compile: Array>>#at:) but rather a selector and a class, what interpretation of these does Exupery make with respect to inheritance?

Forging onward, I decided to piggyback on the easy approach provided by ExuperyBenchmarks so I can easily see the results of compiling some of my own methods. I have these two methods, which get or set an unsigned integer in a ByteArray. Because of the poor results I get, I want to post their implementations right in the email so you can get an idea.

ByteArray>>maUint: bits at: anInteger
    | answer bytes |
    bits == 64 ifTrue: [ ^ self maUnsigned64At: anInteger + 1 ].
    bits == 56 ifTrue: [ ^ self maUnsigned56At: anInteger + 1 ].
    bits == 48 ifTrue: [ ^ self maUnsigned48At: anInteger + 1 ].
    bits == 40 ifTrue: [ ^ self maUnsigned40At: anInteger + 1 ].
    bits == 32 ifTrue: [ ^self unsignedLongAt: anInteger + 1 bigEndian: false ].
    bits == 16 ifTrue: [ ^self unsignedShortAt: anInteger + 1 bigEndian: false ].
    bits == 8 ifTrue: [ ^self byteAt: anInteger + 1 ].
    bytes _ bits // 8.
    answer _ LargePositiveInteger new: bytes.
    1 to: bytes do:
        [ :digitPosition |
        answer digitAt: digitPosition put: (self at: digitPosition + anInteger) ].
    ^answer normalize

and

ByteArray>>#maUint: bits at: position put: anInteger
    position + 1 to: position + (bits // 8) do:
        [ :pos | self at: pos put: (anInteger digitAt: pos-position) ].
    ^anInteger

Now, when I compiled just these two methods alone, I get:

maUintAtPutBenchmark 5625 compiled 7912 ratio: 0.711
maUintAtBenchmark 2610 compiled 3474 ratio: 0.751
Cumulative Time 27.124 compiled 30.748 ratio 0.882

So, this tells me that compiling *can* actually make things worse. Is there any way for Exupery to detect and prevent this, or is it strictly the user's/developer's responsibility to profile everything for comparison?

Next I tried adding compilation of some of the lower-level methods called by these methods. I first just added

    Exupery compileMethod: #at: class: ByteArray

Normally I wouldn't do this, but since the Swiki suggested compiling Array>>#at: I thought it worth a try. But it caused the stack trace at the end of this e-mail. I tried just a few more experiments, such as SmallInteger>>#digitAt: and Integer>>#bitShift:, but generally couldn't get results above 1.0.

BTW, in a completely separate experiment, one method I tried to compile gave "Unknow bytecode" (it was bytecode 136).

Ok, so obviously I'm a novice at this! This project is exciting, and at least I know Exupery is doing something (compared to my last attempt, where I didn't even know to compile methods), but I hope that I can figure out how to actually make it go *faster*. :) Can you help me with any advice / guidelines for *what* to compile?
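
As a sanity check on the figures above: the ratio column is consistent with the first time divided by the "compiled" time, so anything below 1.0 is a slowdown. That reading is inferred from the numbers, not taken from ExuperyBenchmarks itself. A minimal Python sketch of the arithmetic, in which the function name and the dictionary are hypothetical and the units are whatever ExuperyBenchmarks reports:

```python
def speedup_ratio(interpreted, compiled):
    """Interpreted time over compiled time; below 1.0 means compiling made it slower."""
    return interpreted / compiled


# Figures copied from the benchmark output quoted in the message.
results = {
    "maUintAtPutBenchmark": (5625, 7912),
    "maUintAtBenchmark": (2610, 3474),
    "Cumulative Time": (27.124, 30.748),
}

for name, (interpreted, compiled) in results.items():
    ratio = speedup_ratio(interpreted, compiled)
    verdict = "faster" if ratio > 1.0 else "slower"
    print(f"{name}: ratio {ratio:.3f} ({verdict} when compiled)")
```

All three ratios come out below 1.0, matching the reported 0.711, 0.751 and 0.882.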
Thanks,
Chris

15 November 2006 10:53:18 pm

VM: Win32 - a SmalltalkImage
Image: Squeak3.8 [latest update: #6665]

SecurityManager state:
Restricted: false
FileAccess: true
SocketAccess: true
Working Dir C:\Development\Chris\Development\Squeak
Trusted Dir C:\Development\Chris\Development\Squeak\Chris
Untrusted Dir C:\My Squeak\Chris

IntermediateSimplifier>>primitiveAt:
    Receiver: an IntermediateSimplifier
    Arguments and temporary variables:
        aMedPrimitive: #(#primitive 60 'block6' #(#mem #(#add #(#mem #activeContext) #(...etc...
        methodBytecodes: nil
        index: nil
        resultAddress: nil
        receiver: nil
    Receiver's instance variables:
        source: a MedMethod
        result: a MedMethod
        emitter: an IntermediateEmitter
        currentBlock: (block1 (primitiveReturn #(#primitive 60 'block6' #(#mem #(#add ...etc...
        stack: an OrderedCollection()
        simplifier: an IntermediateSimplifier
        stacksForBlocks: a Dictionary()

IntermediateSimplifier>>visitPrimitive:
    Receiver: an IntermediateSimplifier
    Arguments and temporary variables:
        aMedPrimitive: #(#primitive 60 'block6' #(#mem #(#add #(#mem #activeContext) #(...etc...
    Receiver's instance variables:
        source: a MedMethod
        result: a MedMethod
        emitter: an IntermediateEmitter
        currentBlock: (block1 (primitiveReturn #(#primitive 60 'block6' #(#mem #(#add ...etc...
        stack: an OrderedCollection()
        simplifier: an IntermediateSimplifier
        stacksForBlocks: a Dictionary()

MedPrimitive>>visitWith:
    Receiver: #(#primitive 60 'block6' #(#mem #(#add #(#mem #activeContext) #(#add #(#sal #(#mem #(#add ...etc...
    Arguments and temporary variables:
        aTreeOptimiser: an IntermediateSimplifier
    Receiver's instance variables:
        in: nil
        out: nil
        primitiveName: primitive 60
        arguments: #(#(#mem #(#add #(#mem #activeContext) #(#add #(#sal #(#mem #(#add #...etc...
        failureBlock: (block6 (createContext) (deconvertBoolean falseObj #(#send #(#m...etc...
        receiver: ByteArray

IntermediateSimplifier>>visitPrimitiveReturn:
    Receiver: an IntermediateSimplifier
    Arguments and temporary variables:
        aMedPrimitiveReturn: #(#primitiveReturn 2 #(#primitive 60 'block6' #(#mem #(#ad...etc...
    Receiver's instance variables:
        source: a MedMethod
        result: a MedMethod
        emitter: an IntermediateEmitter
        currentBlock: (block1 (primitiveReturn #(#primitive 60 'block6' #(#mem #(#add ...etc...
        stack: an OrderedCollection()
        simplifier: an IntermediateSimplifier
        stacksForBlocks: a Dictionary()

--- The full stack ---
IntermediateSimplifier>>primitiveAt:
IntermediateSimplifier>>visitPrimitive:
MedPrimitive>>visitWith:
IntermediateSimplifier>>visitPrimitiveReturn:
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
MedPrimitiveReturn>>visitWith:
[] in IntermediateSimplifier(IntermediateCopier)>>simplifyBlock: {[:each | emitter addExpression: (each visitWith: self)]}
OrderedCollection>>do:
MedBlock>>instructionsDo:
IntermediateSimplifier(IntermediateCopier)>>simplifyBlock:
IntermediateSimplifier>>simplifyBlock:
[] in IntermediateSimplifier>>visitMethod: {[:each | self prepareBlockFrom: each. self simplifyBlock: each]}
OrderedCollection>>do:
IntermediateSimplifier>>visitMethod:
MedMethod>>visitWith:
IntermediateSimplifier(IntermediateCopier)>>run
Exupery>>convertIntermediate
Exupery>>run
Exupery class>>compileMethod:inlining:receiver:
Exupery class>>compileMethod:into:forClass:inlining:
Exupery class>>compileMethod:class:inlining:
Exupery class>>compileMethod:class:
ExuperyBenchmarks>>compilemaUintAt
ExuperyBenchmarks>>runBenchmark:compilingWith:
[] in ExuperyBenchmarks>>run {[:each | self runBenchmark: (each at: 1) compilingWith: (each at: 2)]}
Array(SequenceableCollection)>>do:
ExuperyBenchmarks>>run
UndefinedObject>>DoIt
Compiler>>evaluate:in:to:notifying:ifFail:logged:
[] in TextMorphEditor(ParagraphEditor)>>evaluateSelection {[rcvr class evaluatorClass new evaluate: self selectionAsStream in: ctxt...]}
BlockContext>>on:do:
TextMorphEditor(ParagraphEditor)>>evaluateSelection
TextMorphEditor(ParagraphEditor)>>doIt
[] in TextMorphEditor(ParagraphEditor)>>doIt: {[self doIt]}
TextMorphEditor(Controller)>>terminateAndInitializeAround:
TextMorphEditor(ParagraphEditor)>>doIt:
TextMorphEditor(ParagraphEditor)>>dispatchOnCharacter:with:
TextMorphEditor>>dispatchOnCharacter:with:
TextMorphEditor(ParagraphEditor)>>readKeyboard
...etc...
_______________________________________________
Exupery mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery

Hi Chris,

First, there are a few bytecodes that don't compile. The major language feature missing now is cascades. Bytecode 136 is "duplicate top of stack", which is used for cascades. There's also only a handful of primitives implemented. If you're doing something that the interpreter optimises more than Exupery does now, then compiling will slow down execution. That said, 70% of the time is spent inside interpret(), the big interpreter function produced by inlining the interpreter's main loop. For now I'm targeting that 70%.

The easiest way to try to optimise something is to use the following sequence:

    ExuperyProfiler optimise: [your code].

#optimise: runs the code in the block and profiles it. Based on that profile it will try to compile methods that will benefit.

    your code.

Execute your code again to populate the polymorphic inline caches. Exupery uses them to dynamically inline primitives.

    Exupery dynamicallyInline.

#dynamicallyInline runs over all the natively compiled methods in the system, then dynamically inlines any primitives.

Exupery's send optimisations only provide a speed improvement if both sides are compiled. Performance seems identical to the interpreter when calling interpreted code.

The interpreter's main loop includes implementations of a handful of primitives, including #at: and #at:put:, that have their own bytecodes. Exupery optimises these by using dynamic primitive inlining; however, that requires a second compile or explicit inlining instructions. Also, I haven't yet re-implemented all of the primitives that the interpreter optimises. SmallInteger operations are automatically inlined.

Exupery also needs to compile a method once for each receiver. I do this so that I can specialise the method for its receiver. At the moment only #at: and #at:put: are specialised. The advantage is that the code executed is customised to the receiver's shape.
I may allow some methods to be shared between multiple receivers in the future, but for now compiling everything the same way is simpler.

In your example:

ByteArray>>#maUint: bits at: position put: anInteger
    position + 1 to: position + (bits // 8) do:
        [ :pos | self at: pos put: (anInteger digitAt: pos-position) ].
    ^anInteger

First I'd try compiling it using the profiler as above. If I were manually trying to compile it, I would also compile SmallInteger>>digitAt:. ByteArray>>#at:put: cannot be compiled yet, but the interpreter optimises it into the #at:put: bytecode.

When optimising a method, try to compile all the methods it will call while measuring any benefits.

I haven't yet tried to optimise code that uses LargeIntegers heavily. I don't know how such code will perform. There are several options available to optimise them, including compiling calls to primitives into compiled code. Compiling a call to a primitive would let it benefit from Exupery's faster sends between compiled code.

How heavily are LargeIntegers used in Magma?

Bryce
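
For readers who don't read Smalltalk, the quoted maUint:at:put: (and the general loop branch of maUint:at: from the first message) stores the integer least-significant byte first at a zero-based offset. A rough Python model of that byte layout follows. The function names are hypothetical, and this is only a sketch of what the Smalltalk source does, not a statement about how Magma or Exupery work internally:

```python
def ma_uint_at_put(buf, bits, position, value):
    """Model of ByteArray>>maUint:at:put:, writing little-endian at a 0-based offset."""
    for i in range(bits // 8):
        # Smalltalk's digitAt: n answers the n-th byte, least significant first (1-based),
        # so byte i (0-based) lands at 0-based index position + i.
        buf[position + i] = (value >> (8 * i)) & 0xFF
    return value


def ma_uint_at(buf, bits, position):
    """Model of the general (loop) branch of ByteArray>>maUint:at:."""
    answer = 0
    for i in range(bits // 8):
        answer |= buf[position + i] << (8 * i)
    return answer


buf = bytearray(8)
ma_uint_at_put(buf, 24, 2, 0x123456)
print(buf.hex())
print(hex(ma_uint_at(buf, 24, 2)))
```

Each loop iteration is one #at: or #at:put: send plus one digit access, which is why the reply focuses on how those sends and primitives are handled.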