Sista byte code documentation

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Sista byte code documentation

Vanessa Freudenberg
 
Eliot / Clément / Everyone -

What’s the best documentation for the new byte codes?

I found the 2014 paper, and Clèment’s thesis, and I don’t think either is detailed enough to implement them correctly. Just wondering if there’s anything else I could peruse other than the VM source code.

Cheers!
Vanessa


Reply | Threaded
Open this post in threaded view
|

Re: Sista byte code documentation

Clément Béra
 
Hi Vanessa,

There should be a class comment in the EncoderForSistaV1 class describing all the bytecodes. Below is the class comment, I stroke through all the bytecodes you don't need to implement since they are unused, required only for the adaptive optimizer or clean/copying block optimizations. 

It's possible not to implement the FullBlockClosure creation bytecode at first if you run only images with the old block design. Once the FullBlockClosure creation bytecode is there, you also need the "Send To Superclass of Stacked Class Literal Selector", so that a super send can be encoded in a CompiledBlock. FullBlockClosure requires an extra primitive for evaluation. Besides FullBlockClosure, everything should be pretty straightforward.

Best!

EncoderForSistaV1 encodes a bytecode set for Smalltalk that lifts limits on the number of literals and branch distances, and provides extended push integer and push character bytecodes.  The bytecode set also supports creating FullBlockClosures, closures whose method is separate from their home method's.  Bytecodes are ordered by length to make decoding easier.  Bytecodes marked with an * are extensible via a prefix bytecode.

N.B.  Extension bytecodes can only come before extensible bytecodes, and only if valid (one cannot extend a bytecode extensible by Ext A with an Ext B).  An extensible bytecode consumes (and zeros) its extension(s).  Hence the hidden implicit variables holding extensions are always zero except after a valid sequence of extension bytecodes.

EncoderForSistaV1 also includes an extended set of bytecodes for Sista, the Speculative Inlining Smalltalk Architecture, a project by Clément Bera and Eliot Miranda.  Scorch is an optimizer that exists in the Smalltalk image, /not/ in the VM,  and optimizes by substituting normal bytecoded methods by optimized bytecoded methods that may use special bytecodes for which the Cogit can generate faster code.  These bytecodes eliminate overheads such as bounds checks or polymorphic code (indexing Array, ByteArray, String etc).  But the bulk of the optimization performed is in inlining blocks and sends for the common path.  This bytecode set therefore differs from a normal Smalltalk set in providing a set of inlined primitives that do not validate their arguments that the compiler generates only when it can prove that the primitives' arguments are valid.

The basic scheme is that the Cogit generates code containing performance counters.  When these counters trip, a callback into the image is performed, at which point Scorch analyses some portion of the stack, looking at performance data for the methods on the stack, and optimises based on the stack and performance data.  Execution then resumes in the optimized code.

The Sista Cogit (e.g. SistaStackToRegisterMappingCogit) adds counters to conditional branches.  Each branch has an executed and a taken count.  On execution the executed count is decremented and if the count goes below zero the VM sends a message at a special index in the specialObjectsArray (as of writing, conditionalCounterTrippedOn:).  Then if the branch is taken the taken count is decremented.  The two counter values allow the Sista optimizer to collect basic block execution paths and to know what are the "hot" paths through execution that are worth agressively optimizing.  Since conditional branches are about 1/6 as frequent as sends, and since they can be used to determine the hot path through code, they are a better choice to count than, for example, method or block entry.

The VM provides a primitive that fills an Array with the state of the counters, and the state of each linked send in a method.  The optimizer obtains the branch and send data for a method via this primitive.


Instance Variables (inherited)

1 Byte Bytecodes
code (note) binary name
0-15 0000 iiii Push Receiver Variable #iiii
16-31 0001 iiii Push Literal Variable #iiii
32-63 001 iiiii Push Literal #iiiii
64-71 01000 iii Push Temp #iii
72-75 010010 ii Push Temp #ii + 8
76 01001100 Push Receiver
77 01001101 Push true
78 01001110 Push false
79 01001111 Push nil
80 01010000 Push 0
81 01010001 Push 1
* 82 01010010 Push thisContext, (then Extend B = 1 => push thisProcess)
83 01010011 Duplicate Stack Top
84-87 010101 ii UNASSIGNED
88-91 010110 ii Return Receiver/true/false/nil
92 01011100 Return top
93 01011101 BlockReturn nil
* 94 01011110 BlockReturn Top [* return from enclosing block N, N = Extend A, then jump by Ext B ]
* 95 01011111 Nop
96-111 0110 iiii Send Arithmetic Message #iiii (+ - < > <= >= = ~= * / \\ @ bitShift: // bitAnd: bitOr:)
112-119 01110 iii Send Special Message #iii + 0 (at: at:put: size next nextPut: atEnd == class)
120-127 01111 iii Send Special Message #iii + 8 (~~ value value: do: new new: x y)
128-143 1000 iiii Send Literal Selector #iiii With 0 Argument
144-159 1001 iiii Send Literal Selector #iiii With 1 Arguments
160-175 1010 iiii Send Literal Selector #iiii With 2 Arguments
176-183 10110 iii Jump iii + 1 (i.e., 1 through 8)
184-191 10111 iii Pop and Jump 0n True iii +1 (i.e., 1 through 8)
192-199 11000 iii Pop and Jump 0n False iii +1 (i.e., 1 through 8)
200-207 11001 iii Pop and Store Receiver Variable #iii
208-215 11010 iii Pop and Store Temporary Variable #iii
216 11011000 Pop Stack Top
217 11011001 Unconditional trap
218-219 1101101 i UNASSIGNED
220-223 110111 ii UNASSIGNED

2 Byte Bytecodes
* 224 11100000 aaaaaaaa Extend A (Ext A = Ext A prev * 256 + Ext A) A is an unsigned extension.
* 225 11100001 bbbbbbbb Extend B (Ext B = Ext B prev * 256 + Ext B) B is a signed extension.
* 226 11100010 iiiiiiii Push Receiver Variable #iiiiiiii (+ Extend A * 256)
* 227 11100011 iiiiiiii Push Literal Variable #iiiiiiii (+ Extend A * 256)
* 228 11100100 iiiiiiii Push Literal #iiiiiiii (+ Extend A * 256)
229 11100101 iiiiiiii Push Temporary Variable #iiiiiiii
230 11100110 iiiiiiii UNASSIGNED (was pushNClosureTemps)
231 11100111 jkkkkkkk Push (Array new: kkkkkkk) (j = 0)
& Pop kkkkkkk elements into: (Array new: kkkkkkk) (j = 1)
* 232 11101000 iiiiiiii Push Integer #iiiiiiii (+ Extend B * 256, where bbbbbbbb = sddddddd, e.g. -32768 = i=0, d=0, s=1)
* 233 11101001 iiiiiiii Push Character #iiiiiiii (+ Extend B * 256)
** 234 11101010 iiiiijjj Send Literal Selector #iiiii (+ Extend A * 32) with jjj (+ Extend B * 8) Arguments
** 235 (1) 11101011 iiiiijjj ExtendB < 64
ifTrue: [Send To Superclass Literal Selector #iiiii (+ Extend A * 32) with jjj (+ Extend B * 8) Arguments]
ifFalse: [Send To Superclass of Stacked Class Literal Selector #iiiii (+ Extend A * 32) with jjj (+ (Extend B bitAnd: 63) * 8) Arguments]
* 236 11101100 iiiiiiii UNASSIGNED
* 237 11101101 iiiiiiii Jump #iiiiiiii (+ Extend B * 256, where bbbbbbbb = sddddddd, e.g. -32768 = i=0, d=0, s=1)
** 238 11101110 iiiiiiii Pop and Jump 0n True #iiiiiiii (+ Extend B * 256, where Extend B >= 0) (4)
** 239 11101111 iiiiiiii Pop and Jump 0n False #iiiiiiii (+ Extend B * 256, where Extend B >= 0) (4)
** 240 (3) 11110000 iiiiiiii Pop and Store Receiver Variable #iiiiiii (+ Extend A * 256)
** 241 (3) 11110001 iiiiiiii Pop and Store Literal Variable #iiiiiiii (+ Extend A * 256)
242 11110010 iiiiiiii Pop and Store Temporary Variable #iiiiiiii
** 243 (3) 11110011 iiiiiiii Store Receiver Variable #iiiiiii (+ Extend A * 256)
** 244 (3) 11110100 iiiiiiii Store Literal Variable #iiiiiiii (+ Extend A * 256)
245 11110110 iiiiiiii Store Temporary Variable #iiiiiiii
246-247 1111011 i xxxxxxxx UNASSIGNED

3 Byte Bytecodes
** 248 (2) 11111000 iiiiiiii mssjjjjj Call Primitive #iiiiiiii + (jjjjj * 256)
m=1 means inlined primitive, no hard return after execution.
ss defines the unsafe operation set used to encode the operations.
(ss = 0 means sista unsafe operations, ss = 01 means lowcode operations, other numbers are not used)
Lowcode inlined primitives may have extensions.

249 11111001 xxxxxxxx siyyyyyy push Closure Compiled block literal index xxxxxxxx (+ Extend A * 256) numCopied yyyyyy receiverOnStack: s = 1 ignoreOuterContext: i = 1
** 250 11111010 eeiiikkk jjjjjjjj Push Closure Num Copied iii (+ExtA//16*8) Num Args kkk (+ ExtA\\16*8) BlockSize jjjjjjjj (+ExtB*256). ee = num extensions
251 11111011 kkkkkkkk sjjjjjjj Push Temp At kkkkkkkk In Temp Vector At: jjjjjjj, s = 1 implies remote inst var access instead of remote temp vector access
* 252 (3) 11111100 kkkkkkkk sjjjjjjj Store Temp At kkkkkkkk In Temp Vector At: jjjjjjj s = 1 implies remote inst var access instead of remote temp vector access
* 253 (3) 11111101 kkkkkkkk sjjjjjjj Pop and Store Temp At kkkkkkkk In Temp Vector At: jjjjjjj s = 1 implies remote inst var access instead of remote temp vector access
** 254 11111110 kkkkkkkk jjjjjjjj branch If Not Instance Of Behavior/Array Of Behavior literal kkkkkkkk (+ Extend A * 256, where Extend A >= 0) distance jjjjjjjj (+ Extend B * 256, where Extend B >= 0 and <= 127)
** 254 11111110 kkkkkkkk jjjjjjjj branch If Instance Of Behavior/Array Of Behavior literal kkkkkkkk (+ Extend A * 256, where Extend A >= 0) distance jjjjjjjj (+ (Extend B bitAnd: 127) * 256, where Extend B >= 128 and <= 255)

* 255 11111111 xxxxxxxx jjjjjjjj UNASSIGNED

(1) Bytecode 235 is a super send bytecode that starts the lookup in the superclass of some class.  It has two forms, "normal" and "directed". In the normal form, the class is the value of the method's methodClassAssociation which must be the last literal.  In the directed form the class is the class on top of stack.

(2) The Call Primitive Bytecode specifies either a primitive in the primitive table (m=0) or an inlined primitive (m=1). Non-inlined primitives from the primitive table have index (jjjjjjj * 256) + iiiiiiii and return from the method if they succeed.  This bytecode is only valid as the first bytecode of a method.  Inline primitives have index (jjjjjjj * 256) + iiiiiiii, cannot fail, and do not return when they succeed, yielding a result (typically on top of stack after popping their arguments, but possibly in a byte data stack, for example for unboxed floating-point primitives).

(3) ExtB lowest bit implies no store check is needed, ExtB second bit implies the object may be a context, ExtB third bit implies no immutability/read-only check is needed, other bits in the extension are unused.

(4) ExtA = 1 implies no mustBeBoolean trampoline is needed, other bits in the extension are unused


Here is the specification of the Sista unsafe instructions (unsafe operations, set 00). The lowcode set uses external specifications.
We sort the inline primitive operations by arity.  Nullary primitives occupy the 0-999 range. Unary primitives occupy the 1-1999 range, up until 8 args. 8191 instructions can be encoded in each unsafe operation set, instructions from 0 to 7 arguments can have 1000 different instructions each, while 8 args instructions can have 192 different instructions.

Sista defines the following inlined primitives (CallPrimitive iiiiiiii 100jjjjj, n = jjjjjiiiiiiii)
1000 class
1001 pointer numSlots
1002 pointer basicSize
1003 byte8Type format numBytes (includes CompiledMethod)
1004 short16Type format numShorts
1005 word32Type format numWords
1006 doubleWord64Type format numDoubleWords

1010 ensure number of bytes available.
1011 fixed-sized new. (objects with 0 to n inst vars)

1020 identityHash (non-immediate, non-Behavior)
1021 identityHash (SmallInteger)
1022 identityHash (Character)
1023 identityHash (SmallFloat64)
1024 identityHash (Behavior, has hash?)

1030 immediateAsInteger (Character)
1031 immediateAsInteger (SmallFloat64)
1035 immediateAsFloat (Smallinteger)

2000 SmallInteger #+.  Both arguments are SmallIntegers and the result fits in a SmallInteger (* depends on word size)
2001 SmallInteger #-.  Both arguments are SmallIntegers and the result fits in a SmallInteger (* depends on word size)
2002 SmallInteger #*.  Both arguments are SmallIntegers and the result fits in a SmallInteger (* depends on word size)
2003 SmallInteger #/.  Both arguments are SmallIntegers and the result fits in a SmallInteger (* depends on word size)
2004 SmallInteger #//.  Both arguments are SmallIntegers and the result fits in a SmallInteger (* depends on word size)
2005 SmallInteger #\\.  Both arguments are SmallIntegers and the result fits in a SmallInteger (* depends on word size)
2006 SmallInteger #quo:.  Both arguments are SmallIntegers and the result fits in a SmallInteger (* depends on word size)

2011 Variable-sized pointers new (new:). Array, etc.
2012 Variable-sized byte new (new:). ByteArray, ByteString, etc.
2013 Variable-sized 16-bit new (new:). DoubleByteArray, etc.
2014 Variable-sized 32-bit new (new:). Bitmap, FloatArray, etc.
2015 Variable-sized 64-bit new (new:). DoubleWordArray, etc.

2016 SmallInteger #bitAnd:.  Both arguments are SmallIntegers and the result fits in a SmallInteger (* depends on word size)
2017 SmallInteger #bitOr:.  Both arguments are SmallIntegers and the result fits in a SmallInteger (* depends on word size)
2018 SmallInteger #bitXor:.  Both arguments are SmallIntegers and the result fits in a SmallInteger (* depends on word size)
2019 SmallInteger #bitShiftLeft:.  Both arguments are SmallIntegers and the result fits in a SmallInteger (* depends on word size)
2020 SmallInteger #bitShiftRight:.  Both arguments are SmallIntegers and the result fits in a SmallInteger (* depends on word size)

2032 SmallInteger #>.  Both arguments are SmallIntegers
2033 SmallInteger #<.  Both arguments are SmallIntegers
2034 SmallInteger #>=.  Both arguments are SmallIntegers
2035 SmallInteger #<=.  Both arguments are SmallIntegers
2036 SmallInteger #=.  Both arguments are SmallIntegers
2037 SmallInteger #~=.  Both arguments are SmallIntegers

2064 Pointer Object>>at:. The receiver is guaranteed to be a pointer object.  The 0-relative (1-relative?) index is an in-range SmallInteger
2065 Byte Object>>at:. The receiver is guaranteed to be a non-pointer object.  The 0-relative (1-relative?) index is an in-range SmallInteger.  The result is a SmallInteger.
2066 16-bit Word Object>>at:. The receiver is guaranteed to be a non-pointer object.  The 0-relative (1-relative?) index is an in-range SmallInteger.  The result is a SmallInteger.
2067 32-bit DoubleWord Object>>at:. The receiver is guaranteed to be a non-pointer object.  The 0-relative (1-relative?) index is an in-range SmallInteger.  The result is a SmallInteger or a LargePositiveInteger.
2068 64-bit QuadWord Object>>at:. The receiver is guaranteed to be a non-pointer object.  The 0-relative (1-relative?) index is an in-range SmallInteger.  The result is a SmallInteger or a LargePositiveInteger.

The following instructions can have the ExtB check flag (See (3)).
3000 Pointer Object>>at:put:. The receiver is guaranteed to be a pointer object.  The 0-relative (1-relative?) index is an in-range SmallInteger
3001 Byte Object>>at:put:. The receiver is guaranteed to be a non-pointer object.  The 0-relative (1-relative?) index is an in-range SmallInteger.  The argument is a SmallInteger.  The primitive stores the least significant 8 bits.
3002 Word Object>>at:put:. The receiver is guaranteed to be a non-pointer object.  The 0-relative (1-relative?) index is an in-range SmallInteger.  The argument is a SmallInteger.  The primitive stores the least significant 16 bits.
3003 DoubleWord Object>>at:put:. The receiver is guaranteed to be a non-pointer object.  The 0-relative (1-relative?) index is an in-range SmallInteger.  The argument is a SmallInteger.  The primitive stores the least significant 32 bits.
3004 QuadWord Object>>at:put:. The receiver is guaranteed to be a non-pointer object.  The 0-relative (1-relative?) index is an in-range SmallInteger.  The argument is a SmallInteger.  The primitive stores the least significant 64 bits.

3021 Byte Object >> equals:length: The receiver and the arguments are both byte objects and have both the same size (length). The length argument is a smallinteger. Answers true if all fields are equal, false if not. Comparison is bulked to word comparison.

4000 Pointer Object>> fillFrom:to:with: The receiver is a Pointer object. the middle two arguments are smallintegers. Last argument is any object. Fills the object in between the two indexes with last argument. Receiver is guaranteed to be mutable. The pointer accesses are raw (no inst var check). If ExtB is set to 1, no store check is present. Else a single store check is done for the bulk operation. Answers the receiver.

5000 Pointer Object>> replaceFrom:to:with:startingAt: Src and dest are pointer objects. ScrPos, scrLast and destLast are smallintegers. Receiver is guaranteed to be mutable.  Both ranges are in-bounds. The pointer accesses are raw (no inst var check). As for the normal primitive, the copy is linear. Answers the receiver.


Lowcode defines inlined primitives for the range CallPrimitive iiiiiiii 101jjjjj, n = jjjjjiiiiiiii.


On Tue, Jun 30, 2020 at 6:34 AM Vanessa Freudenberg <[hidden email]> wrote:
 
Eliot / Clément / Everyone -

What’s the best documentation for the new byte codes?

I found the 2014 paper, and Clèment’s thesis, and I don’t think either is detailed enough to implement them correctly. Just wondering if there’s anything else I could peruse other than the VM source code.

Cheers!
Vanessa




--
Reply | Threaded
Open this post in threaded view
|

Re: Sista byte code documentation

Vanessa Freudenberg
 
That's extremely helpful, thank you, Clément!

For anyone interested, this is about being able to run the upcoming Pharo release on SqueakJS, specifically for Erik's CodeParadise project:

I don't know yet how much time I can personally invest in the near future, but if you are interested in helping out, let's discuss here.

I myself have been busy helping Dan Ingalls build his Smalltalk Zoo website with examples linked from the just-published HOPL paper:
https://smalltalkzoo.computerhistory.org/papers/EvolutionOfSmalltalk.pdf (this is a version with corrections done after publication)

Vanessa

On Mon, Jun 29, 2020 at 10:50 PM Clément Béra <[hidden email]> wrote:
 
Hi Vanessa,

There should be a class comment in the EncoderForSistaV1 class describing all the bytecodes. Below is the class comment, I stroke through all the bytecodes you don't need to implement since they are unused, required only for the adaptive optimizer or clean/copying block optimizations. 

It's possible not to implement the FullBlockClosure creation bytecode at first if you run only images with the old block design. Once the FullBlockClosure creation bytecode is there, you also need the "Send To Superclass of Stacked Class Literal Selector", so that a super send can be encoded in a CompiledBlock. FullBlockClosure requires an extra primitive for evaluation. Besides FullBlockClosure, everything should be pretty straightforward.

Best!
[... snip ...]

--
Reply | Threaded
Open this post in threaded view
|

Re: Sista byte code documentation

David T. Lewis
 
Way off topic, but I notice that Erik is "ErikOnBike" on GitHube.
I can't help but be reminded of the wonderful Squeak work of Ned Konz
(http://wiki.squeak.org/squeak/1577) who shows up now as "bikeNomad"
on GitHub.

There must be some kind of good karma associated with riding bicycles :-)

Dave


On Tue, Jun 30, 2020 at 01:21:12PM -0700, Vanessa Freudenberg wrote:

>  
> That's extremely helpful, thank you, Cl??ment!
>
> For anyone interested, this is about being able to run the upcoming Pharo
> release on SqueakJS, specifically for Erik's CodeParadise project:
> https://github.com/ErikOnBike/CodeParadise
> https://github.com/codefrau/SqueakJS/issues/108
>
> I don't know yet how much time I can personally invest in the near future,
> but if you are interested in helping out, let's discuss here.
>
> I myself have been busy helping Dan Ingalls build his Smalltalk Zoo website
> with examples linked from the just-published HOPL paper:
> https://smalltalkzoo.computerhistory.org/papers/EvolutionOfSmalltalk.pdf
> (this is a version with corrections done after publication)
>
> Vanessa
>
> On Mon, Jun 29, 2020 at 10:50 PM Cl??ment B??ra <[hidden email]>
> wrote:
>
> >
> > Hi Vanessa,
> >
> > There should be a class comment in the EncoderForSistaV1 class describing
> > all the bytecodes. Below is the class comment, I stroke through all the
> > bytecodes you don't need to implement since they are unused, required only
> > for the adaptive optimizer or clean/copying block optimizations.
> >
> > It's possible not to implement the FullBlockClosure creation bytecode at
> > first if you run only images with the old block design. Once
> > the FullBlockClosure creation bytecode is there, you also need the "Send To
> > Superclass of Stacked Class Literal Selector", so that a super send can be
> > encoded in a CompiledBlock. FullBlockClosure requires an extra primitive
> > for evaluation. Besides FullBlockClosure, everything should be pretty
> > straightforward.
> >
> > Best!
> > [... snip ...]
> >
> > --
> > Cl??ment B??ra
> > https://clementbera.github.io/
> > https://clementbera.wordpress.com/
> >

Reply | Threaded
Open this post in threaded view
|

updated paper (was: Sista byte code documentation)

Jecel Assumpcao Jr
In reply to this post by Vanessa Freudenberg
 
Vanessa Freudenberg wrote on Tue, 30 Jun 2020 13:21:12 -0700
> I don't know yet how much time I can personally invest in the near future, but if you are interested in helping out, let's discuss here.
> I myself have been busy helping Dan Ingalls build his Smalltalk Zoo website with examples linked from the just-published HOPL paper:
> https://smalltalkzoo.computerhistory.org/papers/EvolutionOfSmalltalk.pdf (this is a version with corrections done after publication)

Thanks for the link! I had already read the previously published version
but it is always nice to have the latest.

I didn't know he was going to do any corrections or I would have
commented on:

"In 1984, I left Xerox and joined Apple, hoping to leverage Apple's
existing nonrestrictive license to put Smalltalk on the Macintosh, and
hence out into the world. I worked for several years on this, and
released several versions of Smalltalk-80 that were available for free
through the Apple Programmers and Developers Association (APDA)."

It was actually $70 and not free, but you could argue that was just the
cost of the media (7 floppy disks) and shipping. But the license was for
personal use only by APDA members so if anybody developed anything with
it they couldn't give it or sell it to anybody else. I was not a member
(being from Brazil I couldn't be even if I wanted to) but someone who
was bought version 0.7 (I think) and sent it to me. He was very nervous
that Apple might audit him and find out he had violated the license.

-- Jecel
Reply | Threaded
Open this post in threaded view
|

Re: Sista byte code documentation

Eliot Miranda-2
In reply to this post by Vanessa Freudenberg
 
Hi Vanessa,

   Clément pointed you to good sources.  The most concise guide to implementing them is however in the standard trunk image itself in the "decoding - private - sista v1" category of InstructionStream.
To support multiple bytecode sets CompiledCode maintains a flag bit in its header word that selects one of two bytecode sets, represented as BytecodeEncoder subclasses in the CompiledCode class vars PrimaryBytecodeSetEncoderClass & SecondaryBytecodeSetEncoderClass. A CompiledCode answers the relevant class via CompiledCode>>encoderClass which tests the bit and answers the relevant class.  Then double dispatching is used to invoke the relevant bytecode set decode:

InstructionStream methods for instruction decode
interpretNextInstructionFor: client
"Send to the argument, client, a message that specifies the type of the next instruction."

^self method encoderClass interpretNextInstructionFor: client in: self

EncoderForV3PlusClosures class methods for instruction stream support
interpretNextInstructionFor: aClient in: anInstructionStream
"Double-dispatch through the encoder to select the correct instruction set decoder."
^anInstructionStream interpretNextV3ClosuresInstructionFor: aClient

EncoderForSistaV1 class methods for instruction stream support
interpretNextInstructionFor: aClient in: anInstructionStream
"Double-dispatch through the encoder to select the correct instruction set decoder."
^anInstructionStream interpretNextSistaV1InstructionFor: aClient

which gets us back to the decoders in InstructionStream.  In particular,
InstructionStream methods for decoding - private - sista v1
interpretNextSistaV1InstructionFor: client
"Send to the argument, client, a message that specifies the next instruction."

| byte div16 offset method extA extB savedPC |
method := self method.
"For a table of the bytecode set, see EncoderForSistaV1's class comment."
"consume and compute any extensions first."
extA := extB := 0.
savedPC := pc.
[byte := self method at: pc.
pc := pc + 1.
byte >= 16rE0 and: [byte <= 16rE1]] whileTrue:
[| extByte |
extByte := self method at: pc.
pc := pc + 1.
byte = 16rE0
ifTrue:
[extA := (extA bitShift: 8) + extByte]
ifFalse:
[extB := (extB = 0 and: [extByte > 127])
ifTrue: [extByte - 256]
ifFalse: [(extB bitShift: 8) + extByte]]].
div16 := byte // 16.
offset := byte \\ 16.
"We do an inline quasi-binary search on each of the possible 16 values of div16"
... all the one byte codes elided...
"2 byte and 3 byte codes"
byte < 248 ifTrue:
[^self interpretNext2ByteSistaV1Instruction: byte for: client extA: extA extB: extB startPC: savedPC].
^self interpretNext3ByteSistaV1Instruction: byte for: client extA: extA extB: extB startPC: savedPC


and then all remaining implementation is in Context.

On Mon, Jun 29, 2020 at 9:34 PM Vanessa Freudenberg <[hidden email]> wrote:
 
Eliot / Clément / Everyone -

What’s the best documentation for the new byte codes?

I found the 2014 paper, and Clèment’s thesis, and I don’t think either is detailed enough to implement them correctly. Just wondering if there’s anything else I could peruse other than the VM source code.

Cheers!
Vanessa




--
_,,,^..^,,,_
best, Eliot
Reply | Threaded
Open this post in threaded view
|

Re: Sista byte code documentation

Eliot Miranda-2
 
Hi Vanessa,

  one very important facet of the extensions design is not sufficiently explicit in the InstructionStream methods, or in EncoderForSistaV1's class comment, which says only this:

"N.B.  Extension bytecodes can only come before extensible bytecodes, and only if valid (one cannot extend a bytecode extensible by Ext A with an Ext B).  An extensible bytecode consumes (and zeros) its extension(s).  Hence the hidden implicit variables holding extensions are always zero except after a valid sequence of extension bytecodes."

In a real bytecode interpreter the implications are as follows:
0. an Interpreter should use *static* variables for extA and extB and initialize them to zero.  For example, they *do not* need to be initialized on every method activation.
1. an interpreter can, and should, entirely ignore extA and extB in all bytecodes other than those that take extensions.  So for example the vast majority of one-byte bytecodes do not read extA or extB.
2. It is an error for a code generator to emit extension bytecodes before any bytecode that does not take extensions.  Further, it is an error for a code generator to emit extA before a bytecode that only takes extB extensions and vice verse.
3. it is the responsibility of any bytecode that consumes extensions to reset the extA and/or extB variables to zero after it has consumed their values

Given these constraints extensions are very cheap:
- they are written to only on initialization, when creating extension values via the extension bytecodes, and after being consumed by a bytecode that takes extensions
- they are read only by bytecodes that take extensions, so high dynamic-frequency single-byte bytecodes pay no penalty for the 3extensions design

Note that these constraints are not possible to observe in InstructionStream because there is no where (cheap and convenient) to store the extA and extB values.  Hence the crude implementation in interpretNextSistaV1InstructionFor: 

On Tue, Jun 30, 2020 at 4:29 PM Eliot Miranda <[hidden email]> wrote:
Hi Vanessa,

   Clément pointed you to good sources.  The most concise guide to implementing them is however in the standard trunk image itself in the "decoding - private - sista v1" category of InstructionStream.
To support multiple bytecode sets CompiledCode maintains a flag bit in its header word that selects one of two bytecode sets, represented as BytecodeEncoder subclasses in the CompiledCode class vars PrimaryBytecodeSetEncoderClass & SecondaryBytecodeSetEncoderClass. A CompiledCode answers the relevant class via CompiledCode>>encoderClass which tests the bit and answers the relevant class.  Then double dispatching is used to invoke the relevant bytecode set decode:

InstructionStream methods for instruction decode
interpretNextInstructionFor: client
"Send to the argument, client, a message that specifies the type of the next instruction."

^self method encoderClass interpretNextInstructionFor: client in: self

EncoderForV3PlusClosures class methods for instruction stream support
interpretNextInstructionFor: aClient in: anInstructionStream
"Double-dispatch through the encoder to select the correct instruction set decoder."
^anInstructionStream interpretNextV3ClosuresInstructionFor: aClient

EncoderForSistaV1 class methods for instruction stream support
interpretNextInstructionFor: aClient in: anInstructionStream
"Double-dispatch through the encoder to select the correct instruction set decoder."
^anInstructionStream interpretNextSistaV1InstructionFor: aClient

which gets us back to the decoders in InstructionStream.  In particular,
InstructionStream methods for decoding - private - sista v1
interpretNextSistaV1InstructionFor: client
"Send to the argument, client, a message that specifies the next instruction."

| byte div16 offset method extA extB savedPC |
method := self method.
"For a table of the bytecode set, see EncoderForSistaV1's class comment."
"consume and compute any extensions first."
extA := extB := 0.
savedPC := pc.
[byte := self method at: pc.
pc := pc + 1.
byte >= 16rE0 and: [byte <= 16rE1]] whileTrue:
[| extByte |
extByte := self method at: pc.
pc := pc + 1.
byte = 16rE0
ifTrue:
[extA := (extA bitShift: 8) + extByte]
ifFalse:
[extB := (extB = 0 and: [extByte > 127])
ifTrue: [extByte - 256]
ifFalse: [(extB bitShift: 8) + extByte]]].
div16 := byte // 16.
offset := byte \\ 16.
"We do an inline quasi-binary search on each of the possible 16 values of div16"
... all the one byte codes elided...
"2 byte and 3 byte codes"
byte < 248 ifTrue:
[^self interpretNext2ByteSistaV1Instruction: byte for: client extA: extA extB: extB startPC: savedPC].
^self interpretNext3ByteSistaV1Instruction: byte for: client extA: extA extB: extB startPC: savedPC


and then all remaining implementation is in Context.

On Mon, Jun 29, 2020 at 9:34 PM Vanessa Freudenberg <[hidden email]> wrote:
 
Eliot / Clément / Everyone -

What’s the best documentation for the new byte codes?

I found the 2014 paper, and Clèment’s thesis, and I don’t think either is detailed enough to implement them correctly. Just wondering if there’s anything else I could peruse other than the VM source code.

Cheers!
Vanessa




--
_,,,^..^,,,_
best, Eliot


--
_,,,^..^,,,_
best, Eliot