VM Maker: VMMaker.oscog-eem.2199.mcz

Eliot Miranda uploaded a new version of VMMaker to project VM Maker:
http://source.squeak.org/VMMaker/VMMaker.oscog-eem.2199.mcz

==================== Summary ====================

Name: VMMaker.oscog-eem.2199
Author: eem
Time: 21 April 2017, 4:03:59.073811 pm
UUID: 16141873-43e5-47d4-9f9a-a7c7aab2152a
Ancestors: VMMaker.oscog-EstebanLorenzano.2198

RegisterAllocatingCogit:
recover from overflow in genSpecialSelectorArithmetic correctly (the recovery must include carry/borrow); hence implement AddcCqR & SubbCqR in CogX64Compiler.

Use a much more straight-forwqard algorithm for merging, albeit one that requires that the current simStack is saved and restored around merges since the current simStack is updated t reflect live registers when regsiters are stolen from temporaries to implement the merge.  Hence add scratchBytecodePC to avoid copying teh original simStack to the scratchSimStack more than once in each bytecode (it is used three times in genSecialSelectorComparison for example).

Fix slip in RegisterAllocatingCogit>>ssAllocateRequiredRegMask:upThrough:upThroughNative:; 0 is not NoReg ;-).

Restore SistaCogitClone methods lost in previous commit.
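
The overflow-recovery idiom behind the first note can be seen in the genSpecialSelectorArithmetic diff below; a minimal sketch of the constant-argument add case (names as in that method):

	"Inlined SmallInteger addition of a constant argument.  On overflow we
	 fall through to the full send, but the receiver register must first be
	 restored, and per the note above the undo must include the carry/borrow;
	 hence SubbCq:R: (SBB on x64) rather than a plain SubCq:R:."
	self AddCq: argInt - ConstZero R: destReg.
	jumpContinue := self JumpNoOverflow: 0.
	rcvrReg = destReg ifTrue:
		[self SubbCq: argInt - ConstZero R: rcvrReg].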

=============== Diff against VMMaker.oscog-EstebanLorenzano.2198 ===============

Item was changed:
  SharedPool subclass: #CogRTLOpcodes
  instanceVariableNames: ''
+ classVariableNames: 'AddCqR AddCwR AddRR AddRdRd AddRsRs AddcCqR AddcRR AlignmentNops AndCqR AndCqRR AndCwR AndRR ArithmeticShiftRightCqR ArithmeticShiftRightRR Call CallFull CallR CmpC32R CmpCqR CmpCwR CmpRR CmpRdRd CmpRsRs ConvertRRd ConvertRRs ConvertRdR ConvertRdRs ConvertRsR ConvertRsRd DivRdRd DivRsRs Fill32 FirstJump FirstShortJump Jump JumpAbove JumpAboveOrEqual JumpBelow JumpBelowOrEqual JumpCarry JumpFPEqual JumpFPGreater JumpFPGreaterOrEqual JumpFPLess JumpFPLessOrEqual JumpFPNotEqual JumpFPOrdered JumpFPUnordered JumpFull JumpGreater JumpGreaterOrEqual JumpLess JumpLessOrEqual JumpLong JumpLongNonZero JumpLongZero JumpNegative JumpNoCarry JumpNoOverflow JumpNonNegative JumpNonZero JumpOverflow JumpR JumpZero Label LastJump LastRTLCode Literal LoadEffectiveAddressMwrR LoadEffectiveAddressXowrR LogicalShiftLeftCqR LogicalShiftLeftRR LogicalShiftRightCqR LogicalShiftRightRR MoveA32R MoveAbR MoveAwR MoveC32R MoveC64R MoveCqR MoveCwR MoveM16rR MoveM32rR MoveM32rRs MoveM64rRd MoveM8rR MoveMbrR MoveMs8rR MoveMwrR MoveRA32 MoveRAb MoveRAw MoveRM16r MoveRM32r MoveRM8r MoveRMbr MoveRMwr MoveRR MoveRRd MoveRX16rR MoveRX32rR MoveRXbrR MoveRXowr MoveRXwrR MoveRdM64r MoveRdR MoveRdRd MoveRsM32r MoveRsRs MoveX16rRR MoveX32rRR MoveXbrRR MoveXowrR MoveXwrRR MulRdRd MulRsRs NegateR Nop NotR OrCqR OrCwR OrRR PopR PrefetchAw PushCq PushCw PushR RetN RotateLeftCqR RotateRightCqR SignExtend16RR SignExtend32RR SignExtend8RR SqrtRd SqrtRs Stop SubCqR SubCwR SubRR SubRdRd SubRsRs SubbCqR SubbRR TstCqR XorCqR XorCwR XorRR XorRdRd XorRsRs ZeroExtend16RR ZeroExtend32RR ZeroExtend8RR'
- classVariableNames: 'AddCqR AddCwR AddRR AddRdRd AddRsRs AddcCqR AddcRR AlignmentNops AndCqR AndCqRR AndCwR AndRR ArithmeticShiftRightCqR ArithmeticShiftRightRR Call CallFull CallR CmpC32R CmpCqR CmpCwR CmpRR CmpRdRd CmpRsRs ConvertRRd ConvertRRs ConvertRdR ConvertRdRs ConvertRsR ConvertRsRd DivRdRd DivRsRs Fill32 FirstJump FirstShortJump Jump JumpAbove JumpAboveOrEqual JumpBelow JumpBelowOrEqual JumpCarry JumpFPEqual JumpFPGreater JumpFPGreaterOrEqual JumpFPLess JumpFPLessOrEqual JumpFPNotEqual JumpFPOrdered JumpFPUnordered JumpFull JumpGreater JumpGreaterOrEqual JumpLess JumpLessOrEqual JumpLong JumpLongNonZero JumpLongZero JumpNegative JumpNoCarry JumpNoOverflow JumpNonNegative JumpNonZero JumpOverflow JumpR JumpZero Label LastJump LastRTLCode Literal LoadEffectiveAddressMwrR LoadEffectiveAddressXowrR LogicalShiftLeftCqR LogicalShiftLeftRR LogicalShiftRightCqR LogicalShiftRightRR MoveA32R MoveAbR MoveAwR MoveC32R MoveC64R MoveCqR MoveCwR MoveM16rR MoveM32rR MoveM32rRs MoveM64rRd MoveM8rR MoveMbrR MoveMs8rR MoveMwrR MoveRA32 MoveRAb MoveRAw MoveRM16r MoveRM32r MoveRM8r MoveRMbr MoveRMwr MoveRR MoveRRd MoveRX16rR MoveRX32rR MoveRXbrR MoveRXowr MoveRXwrR MoveRdM64r MoveRdR MoveRdRd MoveRsM32r MoveRsRs MoveX16rRR MoveX32rRR MoveXbrRR MoveXowrR MoveXwrRR MulRdRd MulRsRs NegateR Nop NotR OrCqR OrCwR OrRR PopR PrefetchAw PushCq PushCw PushR RetN RotateLeftCqR RotateRightCqR SignExtend16RR SignExtend32RR SignExtend8RR SqrtRd SqrtRs Stop SubCqR SubCwR SubRR SubRdRd SubRsRs SubbRR TstCqR XorCqR XorCwR XorRR XorRdRd XorRsRs ZeroExtend16RR ZeroExtend32RR ZeroExtend8RR'
  poolDictionaries: ''
  category: 'VMMaker-JIT'!
 
  !CogRTLOpcodes commentStamp: 'eem 12/26/2015 14:00' prior: 0!
  I am a pool for the Register-Transfer-Language to which Cog compiles.  I define unique integer values for all RTL opcodes.  See CogAbstractInstruction for instances of instructions with the opcodes that I define.!

Item was changed:
  ----- Method: CogRTLOpcodes class>>initialize (in category 'class initialization') -----
  initialize
  "Abstract opcodes are a compound of a one word operation specifier and zero or more operand type specifiers.
  The assembler is in Cogit protocol abstract instructions and uses `at&t' syntax, assigning to the register on the
  right. e.g. MoveRR is the Move opcode with two register operand specifiers and defines a move register to
  register instruction from operand 0 to operand 1.  The word and register size is assumed to be either 32-bits
  on a 32-bit architecture or 64-bits on a 64-bit architecture.  The abstract machine is mostly a 2 address machine
  with the odd three address instruction added to better exploit RISCs.
  (self initialize)
  The operand specifiers are
  R - general purpose register
  Rs - single-precision floating-point register
  Rd - double-precision floating-point register
  Cq - a `quick' constant that can be encoded in the minimum space possible.
  Cw - a constant with word size where word is the default operand size for the Smalltalk VM, 32-bits
   for a 32-bit VM, 64-bits for a 64-bit VM.  The generated constant must occupy the default number
   of bits.  This allows e.g. a garbage collector to update the value without invalidating the code.
  C32 - a constant with 32 bit size.  The generated constant must occupy 32 bits.
  C64 - a constant with 64 bit size.  The generated constant must occupy 64 bits.
  Aw - memory word (32-bits for a 32-bit VM, 64-bits for a 64-bit VM) at an absolute address
  Ab - memory byte at an absolute address
  A32 - memory 32-bit halfword at an absolute address
  Mwr - memory word whose address is at a constant offset from an address in a register
  Mbr - memory byte whose address is at a constant offset from an address in a register (zero-extended on read)
  M16r - memory 16-bit halfword whose address is at a constant offset from an address in a register
  M32r - memory 32-bit halfword whose address is at a constant offset from an address in a register
  M64r - memory 64-bit doubleword whose address is at a constant offset from an address in a register
  Xbr - memory byte whose address is r * byte size away from an address in a register
  X16r - memory 16-bit halfword whose address is r * (2 bytes size) away from an address in a register
  X32r - memory 32-bit halfword whose address is r * (4 bytes size) away from an address in a register
  Xwr - memory word whose address is r * word size away from an address in a register
  Xowr - memory word whose address is o + (r * word size) away from an address in a register (scaled indexed)
 
  An alternative would be to decouple opcodes from operands, e.g.
  Move := 1. Add := 2. Sub := 3...
  RegisterOperand := 1. ConstantQuickOperand := 2. ConstantWordOperand := 3...
  But not all combinations make sense and even fewer are used so we stick with the simple compound approach.
 
  The assumption is that comparison and arithmetic instructions set condition codes and that move instructions
  leave the condition codes unaffected.  In particular LoadEffectiveAddressMwrR does not set condition codes
  although it can be used to do arithmetic.  On processors such as MIPS this distinction is invalid; there are no
  condition codes.  So the backend is allowed to collapse operation, branch pairs to internal instruction definitions
  (see sender and implementors of noteFollowingConditionalBranch:).
 
  Not all of the definitions in opcodeDefinitions below are implemented.  In particular we do not implement the
  XowrR scaled index addressing mode since it requires 4 operands.
 
  Not all instructions make sense on all architectures.  MoveRRd and MoveRdR are meaningful only on 64-bit machines.
 
  Note that there are no generic division instructions defined, but a processor may define some.
 
  Branch/Call ranges.  Jump[Cond] can be generated as short as possible.  Call/Jump[Cond]Long must be generated
  in the same number of bytes irrespective of displacement since their targets may be updated, but they need only
  span 16Mb, the maximum size of the code zone.  This allows e.g. ARM to use single-word call and jump instructions
  for most calls and jumps.  CallFull/JumpFull must also be generated in the same number of bytes irrespective of
  displacement for the same reason, but they must be able to span the full (32-bit or 64-bit) address space because
  they are used to call code in the C runtime, which may be distant from the code zone.  CallFull/JumpFull are allowed
  to use the cResultRegister as a scratch if required (e.g. on x64 where there is no direct 64-bit call or jump).
 
  Byte reads.  If the concrete compiler class answers true to byteReadsZeroExtend then byte reads must zero-extend
  the byte read into the destination register.  If not, the other bits of the register should be left undisturbed and the
  Cogit will add an instruction to zero the register as required.  Under no circumstances should byte reads sign-extend.
 
  16-bit (and on 64-bits, 32-bit) reads.  These /are/ expected to always zero-extend."
 
  | opcodeNames refs |
  opcodeNames := #("Noops & Pseudo Ops"
  Label
  Literal "a word-sized literal"
  AlignmentNops
  Fill32 "output four byte's worth of bytes with operand 0"
  Nop
 
  "Control"
  Call "call within the code zone"
  CallFull "call anywhere within the full address space"
  CallR
  RetN
  JumpR "Not a regular jump, i.e. not pc dependent."
  Stop "Halt the processor"
 
  "N.B.  Jumps are contiguous.  Long and Full jumps are contiguous within them.  See FirstJump et al below"
  JumpFull "Jump anywhere within the address space"
  JumpLong "Jump anywhere within the 16mb code zone."
  JumpLongZero "a.k.a. JumpLongEqual"
  JumpLongNonZero "a.k.a. JumpLongNotEqual"
  Jump "short jumps; can be encoded in as few bytes as possible; will not be disturbed by GC or relocation."
  JumpZero "a.k.a. JumpEqual"
  JumpNonZero "a.k.a. JumpNotEqual"
  JumpNegative
  JumpNonNegative
  JumpOverflow
  JumpNoOverflow
  JumpCarry
  JumpNoCarry
  JumpLess "signed"
  JumpGreaterOrEqual
  JumpGreater
  JumpLessOrEqual
  JumpBelow "unsigned"
  JumpAboveOrEqual
  JumpAbove
  JumpBelowOrEqual
 
  JumpFPEqual
  JumpFPNotEqual
  JumpFPLess
  JumpFPLessOrEqual
  JumpFPGreater
  JumpFPGreaterOrEqual
  JumpFPOrdered
  JumpFPUnordered
 
  "Data Movement; destination is always last operand"
  MoveRR
  MoveAwR MoveA32R
  MoveRAw MoveRA32
  MoveAbR
  MoveRAb
  MoveMwrR MoveRMwr MoveXwrRR MoveRXwrR MoveXowrR MoveRXowr
  MoveM8rR MoveMs8rR MoveRM8r
  MoveM16rR MoveRM16r MoveX16rRR MoveRX16rR
  MoveM32rR MoveRM32r MoveX32rRR MoveRX32rR
  MoveMbrR MoveRMbr MoveXbrRR MoveRXbrR
  MoveCqR MoveCwR MoveC32R MoveC64R
  MoveRRd MoveRdR MoveRdRd MoveM64rRd MoveRdM64r
  MoveRsRs MoveM32rRs MoveRsM32r
  PopR PushR PushCq PushCw
  PrefetchAw
 
  "Arithmetic; destination is always last operand except Cmp; CmpXR is SubRX with no update of result"
  LoadEffectiveAddressMwrR LoadEffectiveAddressXowrR "Variants of add/multiply"
  NegateR "2's complement negation"
  NotR
  ArithmeticShiftRightCqR ArithmeticShiftRightRR
  LogicalShiftRightCqR LogicalShiftRightRR
  LogicalShiftLeftCqR LogicalShiftLeftRR
  RotateLeftCqR RotateRightCqR
 
  CmpRR AddRR SubRR AndRR OrRR XorRR
  CmpCqR AddCqR SubCqR AndCqR OrCqR TstCqR XorCqR
  CmpCwR CmpC32R AddCwR SubCwR AndCwR OrCwR XorCwR
+ AddcRR AddcCqR SubbRR SubbCqR
- AddcRR AddcCqR SubbRR
 
  AndCqRR "Three address ops for RISCs; feel free to add and extend"
 
  CmpRdRd AddRdRd SubRdRd MulRdRd DivRdRd SqrtRd XorRdRd
  CmpRsRs AddRsRs SubRsRs MulRsRs DivRsRs SqrtRs XorRsRs
 
  "Conversion"
  ConvertRRd ConvertRdR
  ConvertRsRd ConvertRdRs ConvertRsR ConvertRRs
 
  SignExtend8RR SignExtend16RR SignExtend32RR
  ZeroExtend8RR ZeroExtend16RR ZeroExtend32RR
 
  LastRTLCode).
 
  "Magic auto declaration. Add to the classPool any new variables and nuke any obsolete ones, and assign values"
  "Find the variables directly referenced by this method"
  refs := (thisContext method literals select: [:l| l isVariableBinding and: [classPool includesKey: l key]]) collect:
  [:ea| ea key].
  "Move to Undeclared any opcodes in classPool not in opcodes or this method."
  (classPool keys reject: [:k| (opcodeNames includes: k) or: [refs includes: k]]) do:
  [:k|
  Undeclared declare: k from: classPool].
  "Declare as class variables and number elements of opcodeArray above"
  opcodeNames withIndexDo:
  [:classVarName :value|
  self classPool
  declare: classVarName from: Undeclared;
  at: classVarName put: value].
 
  "For CogAbstractInstruction>>isJump etc..."
  FirstJump := JumpFull.
  LastJump := JumpFPUnordered.
  FirstShortJump := Jump.
 
  "And now initialize the backends; they add their own opcodes and hence these must be reinitialized."
  (Smalltalk classNamed: #CogAbstractInstruction) ifNotNil:
  [:cogAbstractInstruction| cogAbstractInstruction allSubclasses do: [:sc| sc initialize]]!

Item was changed:
  ----- Method: CogRegisterAllocatingSimStackEntry>>reconcilePoppingWith: (in category 'compile abstract instructions') -----
  reconcilePoppingWith: targetEntry
  "Make the state of a targetEntry, a stack entry following a non-inlined special selector
  send, the same as the corresponding entry (the receiver) along the inlined path."
  <var: #targetEntry type: #'SimStackEntry *'>
  | targetReg |
  spilled = targetEntry spilled ifTrue:
+ [self assert: ((self isSameEntryAs: targetEntry)
+ or: [targetEntry spilled not and: [targetEntry registerOrNone ~= NoReg]]).
- [self assert: (targetEntry spilled
- ifTrue: [self isSameEntryAs: targetEntry]
- ifFalse: [targetEntry registerOrNone ~= NoReg]).
  (targetReg := targetEntry registerOrNone) = NoReg ifTrue:
  [^self].
  type caseOf: {
  [SSBaseOffset] -> [cogit MoveMw: offset r: register R: targetReg].
  [SSSpill] -> [cogit MoveMw: offset r: register R: targetReg].
  [SSConstant] -> [cogit genMoveConstant: constant R: targetReg].
  [SSRegister] -> [targetReg ~= register ifTrue:
  [cogit MoveR: register R: targetReg]] }.
  ^self].
  self assert: spilled.
  (targetEntry type ~= SSConstant
  and: [(targetReg := targetEntry registerOrNone) ~= NoReg])
  ifTrue: [cogit PopR: targetReg]
  ifFalse: [cogit AddCq: objectRepresentation wordSize R: SPReg]!

Item was added:
+ ----- Method: CogRegisterAllocatingSimStackEntry>>reconcilePushingWith: (in category 'compile abstract instructions') -----
+ reconcilePushingWith: targetEntry
+ "Make the state of the receiver, a stack entry at the end of a basic block,
+ the same as the corresponding simStackEntry at the target of a preceding
+ jump to the beginning of the next basic block.  Make sure targetEntry
+ reflects the state of the merged simStack; it will be installed as the current
+ entry by restoreSimStackAtMergePoint: in mergeWithFixupIfRequired:.
+
+ Answer if the liveRegister for the targetEntry (if any) should be deassigned;
+ this is because if merging a non-temp with a temp that has a live register we
+ can assign to the register, but must unassign the register from the temp,
+ otherwise the temp will acquire the merged value without an assignment."
+ <var: #targetEntry type: #'SimStackEntry *'>
+ | targetReg |
+ (targetReg := targetEntry registerOrNone) = NoReg ifTrue:
+ [| reg |
+ self assert: targetEntry spilled.
+ (self isSameEntryAs: targetEntry) ifTrue:
+ [self assert: spilled.
+ ^false].
+ (reg := self registerOrNone) = NoReg ifTrue: [reg := TempReg].
+ self storeToReg: reg.
+ spilled
+ ifTrue: [cogit MoveR: reg Mw: targetEntry offset r: targetEntry register]
+ ifFalse: [cogit PushR: reg].
+ ^false].
+ liveRegister ~= NoReg ifTrue:
+ [liveRegister ~= targetReg ifTrue:
+ [cogit MoveR: liveRegister R: targetReg].
+ (spilled and: [targetEntry spilled not]) ifTrue:
+ [cogit AddCq: objectRepresentation wordSize R: SPReg].
+ ^false].
+ spilled
+ ifTrue:
+ [targetEntry spilled ifFalse:
+ [cogit PopR: targetReg. "KISS; generate the least number of instructions..."
+ ^false]]
+ ifFalse:
+ [targetEntry spilled ifTrue:
+ [cogit SubCq: objectRepresentation wordSize R: SPReg]].
+ type caseOf: {
+ [SSBaseOffset] -> [cogit MoveMw: offset r: register R: targetReg].
+ [SSSpill] -> [cogit MoveMw: offset r: register R: targetReg].
+ [SSConstant] -> [cogit genMoveConstant: constant R: targetReg].
+ [SSRegister] -> [register ~= targetReg ifTrue:
+ [cogit MoveR: register R: targetReg]] }.
+ (targetEntry type = SSConstant
+ and: [type ~= SSConstant or: [constant ~= targetEntry constant]]) ifTrue:
+ [targetEntry
+ register: targetReg;
+ type: SSRegister].
+ "If merging a constant with a constant assigned to a register, then the register must be deassigned from any temps."
+ ^targetEntry type = SSConstant
+ "If merging a non-temp with a temp that has a live register we can assign
+ to the register, but must unassign the register from the temp, otherwise
+ the temp will acquire the merged value without an assignment."
+ or: [targetEntry isFrameTempVar and: [(self isSameEntryAs: targetEntry) not]]!
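
A worked illustration of the push reconciliation above (hypothetical entries, sketch only): suppose the current entry is the unspilled constant 1 and the target entry is unspilled with liveRegister Arg0Reg.  The SSConstant case applies and the merge emits

	cogit genMoveConstant: 1 R: Arg0Reg

so that Arg0Reg holds the value at the merge point regardless of which path reaches it; if the target entry is itself a different constant it is then transmogrified into an SSRegister entry for Arg0Reg, and the method answers true so the caller deassigns the register from any temps.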

Item was changed:
  ----- Method: CogX64Compiler>>computeMaximumSize (in category 'generate machine code') -----
(excessive size, no diff calculated)

Item was changed:
  ----- Method: CogX64Compiler>>dispatchConcretize (in category 'generate machine code') -----
  dispatchConcretize
  "Attempt to generate concrete machine code for the instruction at address.
  This is the inner dispatch of concretizeAt: actualAddress which exists only
  to get around the branch size limits in the SqueakV3 (blue book derived)
  bytecode set."
  <returnTypeC: #void>
  opcode >= CDQ ifTrue:
  [^self dispatchConcretizeProcessorSpecific].
  opcode caseOf: {
  "Noops & Pseudo Ops"
  [Label] -> [^self concretizeLabel].
  [AlignmentNops] -> [^self concretizeAlignmentNops].
  [Fill32] -> [^self concretizeFill32].
  [Nop] -> [^self concretizeNop].
  "Control"
  [Call] -> [^self concretizeCall].
  [CallR] -> [^self concretizeCallR].
  [CallFull] -> [^self concretizeCallFull].
  [JumpR] -> [^self concretizeJumpR].
  [JumpFull] -> [^self concretizeJumpFull].
  [JumpLong] -> [^self concretizeJumpLong].
  [JumpLongZero] -> [^self concretizeConditionalJump: 16r4].
  [JumpLongNonZero] -> [^self concretizeConditionalJump: 16r5].
  [Jump] -> [^self concretizeJump].
  "Table B-1 Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture"
  [JumpZero] -> [^self concretizeConditionalJump: 16r4].
  [JumpNonZero] -> [^self concretizeConditionalJump: 16r5].
  [JumpNegative] -> [^self concretizeConditionalJump: 16r8].
  [JumpNonNegative] -> [^self concretizeConditionalJump: 16r9].
  [JumpOverflow] -> [^self concretizeConditionalJump: 16r0].
  [JumpNoOverflow] -> [^self concretizeConditionalJump: 16r1].
  [JumpCarry] -> [^self concretizeConditionalJump: 16r2].
  [JumpNoCarry] -> [^self concretizeConditionalJump: 16r3].
  [JumpLess] -> [^self concretizeConditionalJump: 16rC].
  [JumpGreaterOrEqual] -> [^self concretizeConditionalJump: 16rD].
  [JumpGreater] -> [^self concretizeConditionalJump: 16rF].
  [JumpLessOrEqual] -> [^self concretizeConditionalJump: 16rE].
  [JumpBelow] -> [^self concretizeConditionalJump: 16r2].
  [JumpAboveOrEqual] -> [^self concretizeConditionalJump: 16r3].
  [JumpAbove] -> [^self concretizeConditionalJump: 16r7].
  [JumpBelowOrEqual] -> [^self concretizeConditionalJump: 16r6].
  [JumpFPEqual] -> [^self concretizeConditionalJump: 16r4].
  [JumpFPNotEqual] -> [^self concretizeConditionalJump: 16r5].
  [JumpFPLess] -> [^self concretizeConditionalJump: 16r2].
  [JumpFPGreaterOrEqual] -> [^self concretizeConditionalJump: 16r3].
  [JumpFPGreater] -> [^self concretizeConditionalJump: 16r7].
  [JumpFPLessOrEqual] -> [^self concretizeConditionalJump: 16r6].
  [JumpFPOrdered] -> [^self concretizeConditionalJump: 16rB].
  [JumpFPUnordered] -> [^self concretizeConditionalJump: 16rA].
  [RetN] -> [^self concretizeRetN].
  [Stop] -> [^self concretizeStop].
  "Arithmetic"
  [AddCqR] -> [^self concretizeArithCqRWithRO: 0 raxOpcode: 16r05].
+ [AddcCqR] -> [^self concretizeArithCqRWithRO: 2 raxOpcode: 16r15].
  [AddCwR] -> [^self concretizeArithCwR: 16r03].
  [AddRR] -> [^self concretizeOpRR: 16r03].
  [AddRsRs] -> [^self concretizeSEEOpRsRs: 16r58].
  [AddRdRd] -> [^self concretizeSEE2OpRdRd: 16r58].
  [AndCqR] -> [^self concretizeArithCqRWithRO: 4 raxOpcode: 16r25].
  [AndCwR] -> [^self concretizeArithCwR: 16r23].
  [AndRR] -> [^self concretizeOpRR: 16r23].
  [TstCqR] -> [^self concretizeTstCqR].
  [CmpCqR] -> [^self concretizeArithCqRWithRO: 7 raxOpcode: 16r3D].
  [CmpCwR] -> [^self concretizeArithCwR: 16r39].
  [CmpC32R] -> [^self concretizeCmpC32R].
  [CmpRR] -> [^self concretizeReverseOpRR: 16r39].
  [CmpRdRd] -> [^self concretizeCmpRdRd].
  [CmpRsRs] -> [^self concretizeCmpRsRs].
  [DivRdRd] -> [^self concretizeSEE2OpRdRd: 16r5E].
  [DivRsRs] -> [^self concretizeSEEOpRsRs: 16r5E].
  [MulRdRd] -> [^self concretizeSEE2OpRdRd: 16r59].
  [MulRsRs] -> [^self concretizeSEEOpRsRs: 16r59].
  [OrCqR] -> [^self concretizeArithCqRWithRO: 1 raxOpcode: 16r0D].
  [OrCwR] -> [^self concretizeArithCwR: 16r0B].
  [OrRR] -> [^self concretizeOpRR: 16r0B].
  [SubCqR] -> [^self concretizeArithCqRWithRO: 5 raxOpcode: 16r2D].
+ [SubbCqR] -> [^self concretizeArithCqRWithRO: 3 raxOpcode: 16r1D].
  [SubCwR] -> [^self concretizeArithCwR: 16r2B].
  [SubRR] -> [^self concretizeOpRR: 16r2B].
  [SubRdRd] -> [^self concretizeSEE2OpRdRd: 16r5C].
  [SubRsRs] -> [^self concretizeSEEOpRsRs: 16r5C].
  [SqrtRd] -> [^self concretizeSqrtRd].
  [SqrtRs] -> [^self concretizeSqrtRs].
  [XorCwR] -> [^self concretizeArithCwR: 16r33].
  [XorRR] -> [^self concretizeOpRR: 16r33].
  [XorRdRd] -> [^self concretizeXorRdRd].
  [XorRsRs] -> [^self concretizeXorRsRs].
  [NegateR] -> [^self concretizeNegateR].
  [LoadEffectiveAddressMwrR] -> [^self concretizeLoadEffectiveAddressMwrR].
  [RotateLeftCqR] -> [^self concretizeShiftCqRegOpcode: 0].
  [RotateRightCqR] -> [^self concretizeShiftCqRegOpcode: 1].
  [ArithmeticShiftRightCqR] -> [^self concretizeShiftCqRegOpcode: 7].
  [LogicalShiftRightCqR] -> [^self concretizeShiftCqRegOpcode: 5].
  [LogicalShiftLeftCqR] -> [^self concretizeShiftCqRegOpcode: 4].
  [ArithmeticShiftRightRR] -> [^self concretizeShiftRegRegOpcode: 7].
  [LogicalShiftLeftRR] -> [^self concretizeShiftRegRegOpcode: 4].
  "Data Movement"
  [MoveCqR] -> [^self concretizeMoveCqR].
  [MoveCwR] -> [^self concretizeMoveCwR].
  [MoveC32R] -> [^self concretizeMoveC32R].
  [MoveRR] -> [^self concretizeReverseOpRR: 16r89].
  [MoveAwR] -> [^self concretizeMoveAwR].
  [MoveA32R] -> [^self concretizeMoveA32R].
  [MoveRAw] -> [^self concretizeMoveRAw].
  [MoveRA32] -> [^self concretizeMoveRA32].
  [MoveAbR] -> [^self concretizeMoveAbR].
  [MoveRAb] -> [^self concretizeMoveRAb].
  [MoveMbrR] -> [^self concretizeMoveMbrR].
  [MoveRMbr] -> [^self concretizeMoveRMbr].
  [MoveM8rR] -> [^self concretizeMoveMbrR].
  [MoveRM8r] -> [^self concretizeMoveRMbr].
  [MoveM16rR] -> [^self concretizeMoveM16rR].
  [MoveRM16r] -> [^self concretizeMoveRM16r].
  [MoveM32rR] -> [^self concretizeMoveM32rR].
  [MoveM32rRs] -> [^self concretizeMoveM32rRs].
  [MoveM64rRd] -> [^self concretizeMoveM64rRd].
  [MoveMwrR] -> [^self concretizeMoveMwrR].
  [MoveXbrRR] -> [^self concretizeMoveXbrRR].
  [MoveRXbrR] -> [^self concretizeMoveRXbrR].
  [MoveXwrRR] -> [^self concretizeMoveXwrRR].
  [MoveRXwrR] -> [^self concretizeMoveRXwrR].
  [MoveX32rRR] -> [^self concretizeMoveX32rRR].
  [MoveRX32rR] -> [^self concretizeMoveRX32rR].
  [MoveRMwr] -> [^self concretizeMoveRMwr].
  [MoveRM32r] -> [^self concretizeMoveRM32r].
  [MoveRsM32r] -> [^self concretizeMoveRsM32r].
  [MoveRdM64r] -> [^self concretizeMoveRdM64r].
  [MoveRdR] -> [^self concretizeMoveRdR].
  [MoveRRd] -> [^self concretizeMoveRRd].
  [MoveRdRd] -> [^self concretizeMoveRdRd].
  [MoveRsRs] -> [^self concretizeMoveRsRs].
  [PopR] -> [^self concretizePopR].
  [PushR] -> [^self concretizePushR].
  [PushCq] -> [^self concretizePushCq].
  [PushCw] -> [^self concretizePushCw].
  [PrefetchAw] -> [^self concretizePrefetchAw].
  "Conversion"
  [ConvertRRd] -> [^self concretizeConvertRRd].
  [ConvertRdR] -> [^self concretizeConvertRdR].
  [ConvertRRs] -> [^self concretizeConvertRRs].
  [ConvertRsR] -> [^self concretizeConvertRsR].
  [ConvertRsRd] -> [^self concretizeConvertRsRd].
  [ConvertRdRs] -> [^self concretizeConvertRdRs].
 
  [SignExtend8RR] -> [^self concretizeSignExtend8RR].
  [SignExtend16RR] -> [^self concretizeSignExtend16RR].
  [SignExtend32RR] -> [^self concretizeSignExtend32RR].
 
  [ZeroExtend8RR] -> [^self concretizeZeroExtend8RR].
  [ZeroExtend16RR] -> [^self concretizeZeroExtend16RR].
  [ZeroExtend32RR] -> [^self concretizeZeroExtend32RR].
  }!

Item was added:
+ ----- Method: Cogit>>SubbCq:R: (in category 'abstract instructions') -----
+ SubbCq: quickConstant R: reg
+ <inline: true>
+ <returnTypeC: #'AbstractInstruction *'>
+ ^self gen: SubbCqR quickConstant: quickConstant operand: reg!

Item was changed:
  StackToRegisterMappingCogit subclass: #RegisterAllocatingCogit
+ instanceVariableNames: 'numFixups mergeSimStacksBase nextFixup scratchSimStack scratchSpillBase scratchOptStatus ceSendMustBeBooleanAddTrueLongTrampoline ceSendMustBeBooleanAddFalseLongTrampoline recompileForLoopRegisterAssignments scratchBytecodePC'
- instanceVariableNames: 'numFixups mergeSimStacksBase nextFixup scratchSimStack scratchSpillBase scratchOptStatus ceSendMustBeBooleanAddTrueLongTrampoline ceSendMustBeBooleanAddFalseLongTrampoline recompileForLoopRegisterAssignments'
  classVariableNames: ''
  poolDictionaries: ''
  category: 'VMMaker-JIT'!
 
  !RegisterAllocatingCogit commentStamp: 'eem 2/9/2017 10:40' prior: 0!
  RegisterAllocatingCogit is an optimizing code generator that is specialized for register allocation.
 
  In contrast to StackToRegisterMappingCogit, RegisterAllocatingCogit keeps at each control flow merge point the state of the simulated stack to merge into, and not only an integer fixup. Each branch and jump records the current state of the simulated stack, and each fixup is responsible for merging this state into the saved simulated stack.
 
  Instance Variables
  ceSendMustBeBooleanAddFalseLongTrampoline: <Integer>
  ceSendMustBeBooleanAddTrueLongTrampoline: <Integer>
  mergeSimStacksBase: <Integer>
  nextFixup: <Integer>
  numFixups: <Integer>
  scratchOptStatus: <CogSSOptStatus>
  scratchSimStack: <Array of CogRegisterAllocatingSimStackEntry>
  scratchSpillBase: <Integer>
 
  ceSendMustBeBooleanAddFalseLongTrampoline
  - the must-be-boolean trampoline for long jump false bytecodes (the existing ceSendMustBeBooleanAddFalseTrampoline is used for short branches)
 
  ceSendMustBeBooleanAddTrueLongTrampoline
  - the must-be-boolean trampoline for long jump true bytecodes (the existing ceSendMustBeBooleanAddTrueTrampoline is used for short branches)
 
  mergeSimStacksBase
  - the base address of the alloca'ed memory for merge fixups
 
  nextFixup
  - the index into mergeSimStacksBase from which the next needed mergeSimStack will be allocated
 
  numFixups
  - a conservative (over) estimate of the number of merge fixups needed in a method
 
  scratchOptStatus
  - a scratch variable to hold the state of optStatus while merge code is generated
 
  scratchSimStack
  - a scratch variable to hold the state of simStack while merge code is generated
 
  scratchSpillBase
  - a scratch variable to hold the state of spillBase while merge code is generated!

Item was removed:
- ----- Method: RegisterAllocatingCogit>>assignToTempRegConflictingRegisterIn: (in category 'bytecode generator support') -----
- assignToTempRegConflictingRegisterIn: conflictingRegisterMask
- "Find the stackEntry in simStack whose liveRegister matches conflictingRegisterMask
- and assign it to TempReg."
- self assert: (self isAPowerOfTwo: conflictingRegisterMask).
- 0 to: simStackPtr do:
- [:i|
- (self simStackAt: i) registerMaskOrNone = conflictingRegisterMask ifTrue:
- [(self simStackAt: i)
- storeToReg: TempReg;
- liveRegister: TempReg.
- i+1 to: simStackPtr do:
- [:j|
- self deny: (self simStackAt: i) registerMaskOrNone = conflictingRegisterMask].
- ^self]].
- self error: 'conflict entry not found'!

Item was changed:
  ----- Method: RegisterAllocatingCogit>>compileAbstractInstructionsFrom:through: (in category 'compile abstract instructions') -----
  compileAbstractInstructionsFrom: start through: end
  "Loop over bytecodes, dispatching to the generator for each bytecode, handling fixups in due course.
  Override to provide a development-time only escape for failed merges due to partially implemented
  parallel move.  Override to recompile after a loop requiring a merge is detected."
  ^[| result initialOpcodeIndex initialCounterIndex initialIndexOfIRC |
    compilationPass := 1.
+   scratchBytecodePC := nil.
    initialOpcodeIndex := opcodeIndex.
    initialCounterIndex := self maybeCounterIndex."for SistaCogit"
    literalsManager saveForRecompile.
    NewspeakVM ifTrue:
  [initialIndexOfIRC := indexOfIRC].
    [recompileForLoopRegisterAssignments := false.
     result := super compileAbstractInstructionsFrom: start through: end.
     result = 0 and: [recompileForLoopRegisterAssignments]]
  whileTrue:
  [self assert: compilationPass <= 2.
  self reinitializeAllButBackwardFixupsFrom: start through: end.
  self resetSimStack: start.
  self reinitializeOpcodesFrom: initialOpcodeIndex to: opcodeIndex - 1.
  compilationPass := compilationPass + 1.
  nextFixup := 0.
  opcodeIndex := initialOpcodeIndex.
  self maybeSetCounterIndex: initialCounterIndex. "For SistaCogit"
  literalsManager resetForRecompile.
  NewspeakVM ifTrue:
  [indexOfIRC := initialIndexOfIRC]].
     result]
  on: Notification
  do: [:ex|
  ex tag == #failedMerge ifTrue:
  [coInterpreter transcript
  ensureCr; nextPutAll: 'FAILED MERGE IN ';
  nextPutAll: (coInterpreter nameOfClass: (coInterpreter methodClassOf: methodObj));
  nextPutAll: '>>#'; nextPutAll: (coInterpreter stringOf: (coInterpreter maybeSelectorOfMethod: methodObj));
  flush.
  ^ShouldNotJIT].
  ex pass]!

Item was changed:
  ----- Method: RegisterAllocatingCogit>>copySimStackToScratch: (in category 'bytecode generator support') -----
  copySimStackToScratch: spillBase
  <inline: true>
+ scratchBytecodePC = bytecodePC ifTrue:
+ [^self].
+ scratchBytecodePC := bytecodePC.
  self cCode: [self mem: scratchSimStack cp: simStack y: self simStackSlots * (self sizeof: CogSimStackEntry)]
  inSmalltalk: [0 to: simStackPtr do:
  [:i|
  scratchSimStack at: i put: (simStack at: i) copy]].
  scratchSpillBase := spillBase.
  scratchOptStatus := self cCode: [optStatus] inSmalltalk: [optStatus copy]!
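
The bytecodePC guard added above matters because a single bytecode's generator may save to scratch more than once (the summary notes three call sites in genSpecialSelectorComparison); only the first copy per bytecode may win, otherwise the pristine pre-merge simStack would be overwritten by an already-mutated one.  A minimal illustration:

	self copySimStackToScratch: simSpillBase.	"first call for this bytecodePC: copies"
	self copySimStackToScratch: simSpillBase	"same bytecodePC: returns immediately"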

Item was changed:
  ----- Method: RegisterAllocatingCogit>>deassignRegisterForTempVar:in: (in category 'bytecode generator support') -----
  deassignRegisterForTempVar: targetEntry in: mergeSimStack
  "If merging a non-temp with a temp that has a live register we can assign
  to the register, but must unassign the register from the temp, otherwise
  the temp will acquire the merged value without an assignment.  The targetEntry
  must also be transmogrified into an SSRegister entry, which is done in the caller."
  <var: #targetEntry type: #'SimStackEntry *'>
  <var: #duplicateEntry type: #'SimStackEntry *'>
  <var: #mergeSimStack type: #'SimStackEntry *'>
  <inline: true>
  | reg |
+ self halt.  "Clément and I hope this shouldn't happen as of the new merge code in reconcileRegistersInTempVarsInCurrentSimStackWithThoseIn:"
  reg := targetEntry liveRegister.
  self assert: (reg ~= NoReg and: [targetEntry type = SSConstant or: [targetEntry isFrameTempVar]]).
  targetEntry type = SSConstant
  ifTrue:
  [simStackPtr to: 0 by: -1 do:
  [:j| | duplicateEntry |
  duplicateEntry := self simStack: mergeSimStack at: j.
  (duplicateEntry registerOrNone = reg
   and: [duplicateEntry type = SSBaseOffset or: [duplicateEntry type = SSSpill]]) ifTrue:
  [duplicateEntry liveRegister: NoReg]]]
  ifFalse:
  [simStackPtr to: 0 by: -1 do:
  [:j| | duplicateEntry |
  duplicateEntry := self simStack: mergeSimStack at: j.
  (targetEntry isSameEntryAs: duplicateEntry) ifTrue:
  [j < methodOrBlockNumTemps
  ifTrue: [duplicateEntry liveRegister: NoReg]
  ifFalse: [duplicateEntry type: SSRegister; register: reg]]]]!

Item was changed:
  ----- Method: RegisterAllocatingCogit>>ensureFixupAt: (in category 'bytecode generator support') -----
  ensureFixupAt: targetPC
  "Make sure there's a flagged fixup at the target pc in fixups.
  Initially a fixup's target is just a flag.  Later on it is replaced with a proper instruction.
  Override to generate stack merging code if required."
  | fixup |
  <var: #fixup type: #'BytecodeFixup *'>
  self assert: targetPC > bytecodePC.
  fixup := self fixupAt: targetPC.
  fixup needsFixup
  ifTrue:
  [fixup mergeSimStack
  ifNil: [self setMergeSimStackOf: fixup]
+ ifNotNil:
+ [self copySimStackToScratch: simSpillBase.
+ self mergeCurrentSimStackWith: fixup forwards: true.
+ self restoreSimStackFromScratch]]
- ifNotNil: [self mergeCurrentSimStackWith: fixup forwards: true]]
  ifFalse:
  [self assert: (fixup mergeSimStack isNil or: [compilationPass = 2]).
  self moveVolatileSimStackEntriesToRegisters.
  fixup mergeSimStack
  ifNil: [self setMergeSimStackOf: fixup]
  ifNotNil: [self assert: (self simStack: simStack isIdenticalTo: fixup mergeSimStack)]].
  ^super ensureFixupAt: targetPC!

Item was changed:
  ----- Method: RegisterAllocatingCogit>>ensureRegisterAssignmentsAreAtHeadOfLoop: (in category 'bytecode generator support') -----
  ensureRegisterAssignmentsAreAtHeadOfLoop: target
  "Compiling a loop body will compute a set of live registers.  The backward branch must merge
  with the head of the loop.  So it is preferable to make the register assignments at the end of
  the loop available at the head.  To do this, simply copy the register assignments to the loop
  head's fixup in the first compilation pass and schedule a second compilation pass.  On the
  second pass the merge will occur when encountering the fixup for the loop head, using
  exactly the same code as for a merge at the end of an if."
  | conflictingRegsMask |
  compilationPass > 1 ifTrue:
  ["self deny: (self mergeRequiredToTarget: target mergeSimStack)."
  self assert: (target mergeSimStack isNil or: [self simStack: simStack isIdenticalTo: target mergeSimStack]).
  ^self].
  (self mergeRequiredToTarget: target mergeSimStack) ifFalse:
  [^self].
  "Schedule a recompile and merge the end-of-loop assignments into the head of the loop,
+ replacing any and all register assignments with the state as of the back jump.  Because
+ typically the back jump will be taken much more often than the loop is entered, favouring
+ the assignments here is more efficient than trying to merge."
- giving priority to the assignments at this point, and preserving any other non-conflicting
- assignments."
  recompileForLoopRegisterAssignments := true.
  conflictingRegsMask := self conflictingRegistersBetweenSimStackAnd: target mergeSimStack.
  self deny: (self register: FPReg isInMask: conflictingRegsMask).
  0 to: simStackPtr do:
  [:i| | currentEntry targetEntry |
  currentEntry := self simStack: simStack at: i.
  targetEntry := self simStack: target mergeSimStack at: i.
+ targetEntry liveRegister: currentEntry liveRegister].
- currentEntry liveRegister ~= NoReg
- ifTrue:
- [targetEntry liveRegister: currentEntry liveRegister]
- ifFalse:
- [(targetEntry registerMask anyMask: conflictingRegsMask) ifTrue:
- [targetEntry liveRegister: NoReg]]].
  optStatus isReceiverResultRegLive ifTrue:
  [target isReceiverResultRegSelf: true]!

Item was changed:
  ----- Method: RegisterAllocatingCogit>>genSpecialSelectorArithmetic (in category 'bytecode generators') -----
  genSpecialSelectorArithmetic
  | primDescriptor rcvrIsConst argIsConst rcvrIsInt argIsInt rcvrInt argInt destReg
  jumpNotSmallInts jumpContinue jumpOverflow index rcvrReg argReg regMask |
  <var: #jumpOverflow type: #'AbstractInstruction *'>
  <var: #jumpContinue type: #'AbstractInstruction *'>
  <var: #primDescriptor type: #'BytecodeDescriptor *'>
  <var: #jumpNotSmallInts type: #'AbstractInstruction *'>
  primDescriptor := self generatorAt: byte0.
  argIsInt := (argIsConst := self ssTop type = SSConstant)
  and: [objectMemory isIntegerObject: (argInt := self ssTop constant)].
  rcvrIsInt := ((rcvrIsConst := (self ssValue: 1) type = SSConstant)
   and: [objectMemory isIntegerObject: (rcvrInt := (self ssValue: 1) constant)])
  or: [self mclassIsSmallInteger and: [(self ssValue: 1) isSameEntryAs: (self addressOf: simSelf)]].
 
  (argIsInt and: [rcvrIsInt and: [rcvrIsConst]]) ifTrue:
  [| result |
  rcvrInt := objectMemory integerValueOf: rcvrInt.
  argInt := objectMemory integerValueOf: argInt.
  primDescriptor opcode caseOf: {
  [AddRR] -> [result := rcvrInt + argInt].
  [SubRR] -> [result := rcvrInt - argInt].
  [AndRR] -> [result := rcvrInt bitAnd: argInt].
  [OrRR] -> [result := rcvrInt bitOr: argInt] }.
  (objectMemory isIntegerValue: result) ifTrue:
  ["Must annotate the bytecode for correct pc mapping."
  ^self ssPop: 2; ssPushAnnotatedConstant: (objectMemory integerObjectOf: result)].
  ^self genSpecialSelectorSend].
 
  "If there's any constant involved other than a SmallInteger don't attempt to inline."
  ((rcvrIsConst and: [rcvrIsInt not])
  or: [argIsConst and: [argIsInt not]]) ifTrue:
  [^self genSpecialSelectorSend].
 
  "If we know nothing about the types then better not to inline as the inline cache and
  primitive code is not terribly slow so wasting time on duplicating tag tests is pointless."
  (argIsInt or: [rcvrIsInt]) ifFalse:
  [^self genSpecialSelectorSend].
 
  "Since one or other of the arguments is an integer we can very likely profit from inlining.
  But if the other type is not SmallInteger or if the operation overflows then we will need
  to do a send.  Since we're allocating values in registers we would like to keep those
  registers live on the inlined path and reload registers along the non-inlined send path.
  See reconcileRegisterStateForJoinAfterSpecialSelectorSend below."
  argIsInt
  ifTrue:
  [rcvrReg := self allocateRegForStackEntryAt: 1.
  (self ssValue: 1) popToReg: rcvrReg.
  regMask := self registerMaskFor: rcvrReg]
  ifFalse:
  [self allocateRegForStackTopTwoEntriesInto: [:rTop :rNext| argReg := rTop. rcvrReg := rNext].
  self ssTop popToReg: argReg.
  (self ssValue: 1) popToReg: rcvrReg.
  regMask := self registerMaskFor: rcvrReg and: argReg].
 
  "rcvrReg can be reused for the result iff the receiver is a constant or is an SSRegister that is not used elsewhere."
  destReg := ((rcvrIsInt and: [rcvrIsConst])
  or: [(self ssValue: 1) type = SSRegister
  and: [(self anyReferencesToRegister: rcvrReg inAllButTopNItems: 2) not]])
  ifTrue: [rcvrReg]
  ifFalse: [self allocateRegNotConflictingWith: regMask].
  self ssPop: 2.
  jumpNotSmallInts := (rcvrIsInt and: [argIsInt]) ifFalse:
  [argIsInt
  ifTrue: [objectRepresentation genJumpNotSmallInteger: rcvrReg]
  ifFalse:
  [rcvrIsInt
  ifTrue: [objectRepresentation genJumpNotSmallInteger: argReg]
  ifFalse: [objectRepresentation genJumpNotSmallIntegersIn: rcvrReg and: argReg scratch: TempReg]]].
  rcvrReg ~= destReg ifTrue:
  [self MoveR: rcvrReg R: destReg].
  primDescriptor opcode caseOf: {
  [AddRR] -> [argIsInt
  ifTrue:
  [self AddCq: argInt - ConstZero R: destReg.
  jumpContinue := self JumpNoOverflow: 0.
  "overflow; must undo the damage before doing send"
  rcvrReg = destReg ifTrue:
+ [self SubbCq: argInt - ConstZero R: rcvrReg]]
- [self SubCq: argInt - ConstZero R: rcvrReg]]
  ifFalse:
  [objectRepresentation genRemoveSmallIntegerTagsInScratchReg: destReg.
  self AddR: argReg R: destReg.
  jumpContinue := self JumpNoOverflow: 0.
  "overflow; must undo the damage before doing send"
  destReg = rcvrReg ifTrue:
  [(rcvrIsInt and: [rcvrIsConst])
  ifTrue: [self MoveCq: rcvrInt R: rcvrReg]
  ifFalse:
+ [self SubbR: argReg R: rcvrReg.
- [self SubR: argReg R: rcvrReg.
  objectRepresentation genSetSmallIntegerTagsIn: rcvrReg]]]].
  [SubRR] -> [argIsInt
  ifTrue:
  [self SubCq: argInt - ConstZero R: destReg.
  jumpContinue := self JumpNoOverflow: 0.
  "overflow; must undo the damage before doing send"
  rcvrReg = destReg ifTrue:
+ [self AddcCq: argInt - ConstZero R: rcvrReg]]
- [self AddCq: argInt - ConstZero R: rcvrReg]]
  ifFalse:
  [(self anyReferencesToRegister: argReg inAllButTopNItems: 0)
  ifTrue: "argReg is live; cannot strip tags and continue on no overflow without restoring tags"
  [objectRepresentation genRemoveSmallIntegerTagsInScratchReg: argReg.
  self SubR: argReg R: destReg.
  jumpOverflow := self JumpOverflow: 0.
  "no overflow; must undo the damage before continuing"
  objectRepresentation genSetSmallIntegerTagsIn: argReg.
  jumpContinue := self Jump: 0.
  jumpOverflow jmpTarget: self Label.
  "overflow; must undo the damage before doing send"
  ((rcvrIsInt and: [rcvrIsConst]) or: [destReg ~= rcvrReg]) ifFalse:
+ [self AddcR: argReg R: destReg].
- [self AddR: argReg R: destReg].
  objectRepresentation genSetSmallIntegerTagsIn: argReg]
  ifFalse:
  [objectRepresentation genRemoveSmallIntegerTagsInScratchReg: argReg.
  self SubR: argReg R: destReg.
  jumpContinue := self JumpNoOverflow: 0.
  "overflow; must undo the damage before doing send"
  ((rcvrIsInt and: [rcvrIsConst]) or: [destReg ~= rcvrReg]) ifFalse:
+ [self AddcR: argReg R: rcvrReg].
- [self AddR: argReg R: rcvrReg].
  objectRepresentation genSetSmallIntegerTagsIn: argReg]]].
  [AndRR] -> [argIsInt
  ifTrue: [self AndCq: argInt R: destReg]
  ifFalse: [self AndR: argReg R: destReg].
+ jumpContinue := jumpNotSmallInts ifNotNil: [self Jump: 0]].
- jumpContinue := self Jump: 0].
  [OrRR] -> [argIsInt
  ifTrue: [self OrCq: argInt R: destReg]
  ifFalse: [self OrR: argReg R: destReg].
+ jumpContinue := jumpNotSmallInts ifNotNil: [self Jump: 0]] }.
+ jumpNotSmallInts
+ ifNil: [jumpContinue ifNil: "overflow cannot happen"
+ [self annotateInstructionForBytecode.
+ self ssPushRegister: destReg.
+ ^0]]
+ ifNotNil:
+ [jumpNotSmallInts jmpTarget: self Label].
- jumpContinue := self Jump: 0] }.
- jumpNotSmallInts jmpTarget: self Label.
  self ssPushRegister: destReg.
  self copySimStackToScratch: (simSpillBase min: simStackPtr - 1).
  self ssPop: 1.
  self ssFlushTo: simStackPtr.
  rcvrReg = Arg0Reg
  ifTrue:
  [argReg = ReceiverResultReg
  ifTrue: [self SwapR: Arg0Reg R: ReceiverResultReg Scratch: TempReg. argReg := Arg0Reg]
  ifFalse: [self MoveR: rcvrReg R: ReceiverResultReg].
  rcvrReg := ReceiverResultReg].
  argIsInt
  ifTrue: [self MoveCq: argInt R: Arg0Reg]
  ifFalse: [argReg ~= Arg0Reg ifTrue: [self MoveR: argReg R: Arg0Reg]].
  rcvrReg ~= ReceiverResultReg ifTrue: [self MoveR: rcvrReg R: ReceiverResultReg].
  index := byte0 - self firstSpecialSelectorBytecodeOffset.
  self genMarshalledSend: index negated - 1 numArgs: 1 sendTable: ordinarySendTrampolines.
  self reconcileRegisterStateForJoinAfterSpecialSelectorSend.
  jumpContinue jmpTarget: self Label.
  ^0!

Item was changed:
  ----- Method: RegisterAllocatingCogit>>mergeCurrentSimStackWith:forwards: (in category 'bytecode generator support') -----
  mergeCurrentSimStackWith: fixup forwards: forwards
  "At a merge point the cogit expects the stack to be in the same state as mergeSimStack.
+ mergeSimStack is the state as of some jump forward or backward to this point.  So make
+ simStack agree with mergeSimStack (it is, um, problematic to plant code at the jump).
- mergeSimStack is the state as of some jump forward or backward to this point.  So make simStack agree
- with mergeSimStack (it is, um, problematic to plant code at the jump).
  Values may have to be assigned to registers.  Registers may have to be swapped.
  The state of optStatus must agree.
  Generate code to merge the current simStack with that of the target fixup,
  the goal being to keep as many registers live as possible.  If the merge is forwards
  registers can be deassigned (since registers are always written to temp vars).
  But if backwards, nothing can be deassigned, and the state /must/ reflect the target."
  "self printSimStack; printSimStack: fixup mergeSimStack"
  "abstractOpcodes object copyFrom: startIndex to: opcodeIndex"
  <var: #fixup type: #'BytecodeFixup *'>
+ | startIndex mergeSimStack |
- | startIndex mergeSimStack currentEntry targetEntry writtenToRegisters |
  <var: #mergeSimStack type: #'SimStackEntry *'>
  <var: #targetEntry type: #'SimStackEntry *'>
  <var: #currentEntry type: #'SimStackEntry *'>
  (mergeSimStack := fixup mergeSimStack) ifNil: [^self].
  startIndex := opcodeIndex. "for debugging"
  "Assignments amongst the registers must be made in order to avoid overwriting.
  If necessary exchange registers amongst simStack's entries to resolve any conflicts."
+ self reconcileRegistersInTempVarsInCurrentSimStackWithThoseIn: mergeSimStack.
- self resolveRegisterOrderConflictsBetweenCurrentSimStackAnd: mergeSimStack.
  (self asserta: (self conflictsResolvedBetweenSimStackAnd: mergeSimStack)) ifFalse:
  [Notification new tag: #failedMerge; signal].
- "Compute written to registers.  Perhaps we should use 0 in place of methodOrBlockNumTemps
- but Smalltalk does not assign to arguments."
- writtenToRegisters := 0.
  (self pushForMergeWith: mergeSimStack)
  ifTrue:
  [methodOrBlockNumTemps to: simStackPtr do:
+ [:i| self mergePushingWithEntryInTargetSimStack: mergeSimStack at: i]]
- [:i|
- currentEntry := self simStack: simStack at: i.
- targetEntry := self simStack: mergeSimStack at: i.
- writtenToRegisters := writtenToRegisters bitOr: targetEntry registerMask.
- (currentEntry reconcileForwardsWith: targetEntry) ifTrue:
- [self assert: i >= methodOrBlockNumTemps.
- self deassignRegisterForTempVar: targetEntry in: mergeSimStack.
- targetEntry
- type: SSRegister;
- register: targetEntry liveRegister].
- "Note, we could update the simStack and spillBase here but that is done in restoreSimStackAtMergePoint:
- spilled ifFalse:
- [simSpillBase := i - 1].
- simStack
- at: i
- put: (self
- cCode: [mergeSimStack at: i]
- inSmalltalk: [(mergeSimStack at: i) copy])"]]
  ifFalse:
  [simStackPtr to: methodOrBlockNumTemps by: -1 do:
+ [:i| self mergePoppingWithEntryInTargetSimStack: mergeSimStack at: i]].
+ "Still haven't handled simSpillBase."
+ self assert: (simSpillBase > simStackPtr
+ or: [simSpillBase < (methodOrBlockNumTemps max: 1)
+ or: [(self simStack: mergeSimStack at: simSpillBase - 1) spilled]]).
+ fixup isReceiverResultRegSelf ifTrue:
+ [optStatus isReceiverResultRegLive ifFalse:
+ [self putSelfInReceiverResultReg]]!
- [:i|
- currentEntry := self simStack: simStack at: i.
- targetEntry := self simStack: mergeSimStack at: i.
- writtenToRegisters := writtenToRegisters bitOr: targetEntry registerMask.
- (currentEntry reconcileForwardsWith: targetEntry) ifTrue:
- [self assert: i >= methodOrBlockNumTemps.
- self deassignRegisterForTempVar: targetEntry in: mergeSimStack.
- targetEntry
- type: SSRegister;
- register: targetEntry liveRegister].
- "Note, we could update the simStack and spillBase here but that is done in restoreSimStackAtMergePoint:
- spilled ifFalse:
- [simSpillBase := i - 1].
- simStack
- at: i
- put: (self
- cCode: [mergeSimStack at: i]
- inSmalltalk: [(mergeSimStack at: i) copy])"]].
- "Note that since we've deassigned any conflicts beyond the temps above we need only compare the temps here."
- methodOrBlockNumTemps - 1 to: 0 by: -1 do:
- [:i|
- targetEntry := self simStack: mergeSimStack at: i.
- (targetEntry registerMask noMask: writtenToRegisters) ifTrue:
- [currentEntry := self simStack: simStack at: i.
- writtenToRegisters := writtenToRegisters bitOr: targetEntry registerMask.
- (currentEntry reconcileForwardsWith: targetEntry) ifTrue:
- [self assert: i >= methodOrBlockNumTemps.
- self deassignRegisterForTempVar: targetEntry in: mergeSimStack]]].
- optStatus isReceiverResultRegLive ifFalse:
- [forwards
- ifTrue: "a.k.a. fixup isReceiverResultRegSelf: (fixup isReceiverResultRegSelf and: [optStatus isReceiverResultRegLive])"
- [fixup isReceiverResultRegSelf: false]
- ifFalse:
- [fixup isReceiverResultRegSelf ifTrue:
- [self putSelfInReceiverResultReg]]]!

Item was added:
+ ----- Method: RegisterAllocatingCogit>>mergePoppingWithEntryInTargetSimStack:at: (in category 'bytecode generator support') -----
+ mergePoppingWithEntryInTargetSimStack: mergeSimStack at: i
+ "Merge an intermediate result on currentSimStack with the corresponding one in target's mergeSimStack.
+ Depending on spilledness, the stack may need to be pushed or popped, or simply a register assignment made."
+ | currentEntry targetEntry |
+ <inline: true>
+ currentEntry := self simStack: simStack at: i.
+ targetEntry := self simStack: mergeSimStack at: i.
+ currentEntry reconcilePoppingWith: targetEntry.
+ "Note, we could update the simStack and spillBase here but that is done in restoreSimStackAtMergePoint:
+ spilled ifFalse:
+ [simSpillBase := i - 1].
+ simStack
+ at: i
+ put: (self
+ cCode: [mergeSimStack at: i]
+ inSmalltalk: [(mergeSimStack at: i) copy])"!

Item was added:
+ ----- Method: RegisterAllocatingCogit>>mergePushingWithEntryInTargetSimStack:at: (in category 'bytecode generator support') -----
+ mergePushingWithEntryInTargetSimStack: mergeSimStack at: i
+ "Merge an intermediate result on currentSimStack with the corresponding one in target's mergeSimStack.
+ Depending on spilledness, the stack may need to be pushed or popped, or simply a register assignment made."
+ | currentEntry targetEntry |
+ <inline: true>
+ currentEntry := self simStack: simStack at: i.
+ targetEntry := self simStack: mergeSimStack at: i.
+ (currentEntry reconcilePushingWith: targetEntry) ifTrue:
+ [self assert: i >= methodOrBlockNumTemps.
+ self deassignRegisterForTempVar: targetEntry in: mergeSimStack.
+ targetEntry
+ type: SSRegister;
+ register: targetEntry liveRegister].
+ "Note, we could update the simStack and spillBase here but that is done in restoreSimStackAtMergePoint:
+ spilled ifFalse:
+ [simSpillBase := i - 1].
+ simStack
+ at: i
+ put: (self
+ cCode: [mergeSimStack at: i]
+ inSmalltalk: [(mergeSimStack at: i) copy])"!

Item was changed:
  ----- Method: RegisterAllocatingCogit>>mergeRequiredForJumpTo: (in category 'bytecode generator support') -----
  mergeRequiredForJumpTo: targetPC
  "While this is a multi-pass compiler, no intermediate control-flow graph is built from bytecode and
  there is a monotonically increasing one-to-one relationship between bytecode pcs and machine
  code pcs that map to one another.  Therefore, when jumping forward, any required code to merge
  the state of the current simStack with that at the target must be generated before the jump
  (because at the target the simStack state will be whatever falls through). If only one forward jump
  to the target exists then that jump can simply install its simStack as the required simStack at the
  target and the merge code will be generated just before the target as control falls through.  But if
  there are two or more forward jumps to the target, a situation that occurs given that the
  StackToRegisterMappingCogit follows jump chains, then jumps other than the first must generate
  merge code before jumping.  This poses a problem for conditional branches.  The merge code must
  only be generated along the path that takes the jump.  Therefore this must *not* be generated:
 
  ... merge code ...
  jump cond Ltarget
 
  which incorrectly executes the merge code along both the taken and untaken paths.  Instead
  this must be generated so that the merge code is only executed if the branch is taken.
 
  jump not cond Lcontinue
  ... merge code ...
  jump Ltarget
  Lcontinue:
 
  Note that no merge code is required for code such as self at: (expr ifTrue: [1] ifFalse: [2])
  17 <70> self
  18 <71> pushConstant: true
  19 <99> jumpFalse: 22
  20 <76> pushConstant: 1
  21 <90> jumpTo: 23
  22 <77> pushConstant: 2
  23 <C0> send: at:
  provided that 1 and 2 are assigned to the same target register."
  | fixup |
  (fixup := self fixupAt: targetPC) hasMergeSimStack ifFalse:
  [^false].
+ self assert: (simStackPtr = fixup simStackPtr or: [fixup isBackwardBranchFixup]).
- self assert: simStackPtr = fixup simStackPtr.
  ^self mergeRequiredToTarget: fixup mergeSimStack!

Item was changed:
  ----- Method: RegisterAllocatingCogit>>moveVolatileSimStackEntriesToRegisters (in category 'bytecode generator support') -----
  moveVolatileSimStackEntriesToRegisters
  "When jumping forward to a merge point the stack must be reconcilable with the state that falls through to the merge point.
  We cannot easily arrange that later we add code to the branch, e.g. to spill values.  Instead, any volatile contents must be
+ moved to registers.
+ [In fact, that's not exactly true, consider these two code sequences:
- moved to registers.  [In fact, that's not exactly true, consider these two code sequences:
  self at: (expr ifTrue: [1] ifFalse: [2]) put: a
  self at: 1 put: (expr ifTrue: [a] ifFalse: [b])
  The first one needs 1 saving to a register to reconcile with 2.
+ The second one has 1 on both paths, but we're not clever enough to spot this case yet.
+ First of all, if the constant requires an annotation then it is difficult to deal with.  But if the constant
+ does not require an annotation one way would be for a SimStackEntry for an SSConstant to refer to
+ the loading instruction and then at the merge simply change the loading instruction to a Label if the
+ constant is the same on both branches].
- The second one has 1 on both paths, but we're not clever enough to spot this case yet.]
  Volatile contents are anything not spilled to the stack, because as yet we can only merge registers."
  <inline: true>
  | allocatedRegs |
  <var: #desc type: #'SimStackEntry *'>
  allocatedRegs := self allocatedRegisters.
  (simSpillBase max: 0) to: simStackPtr do:
  [:i| | desc reg |
  desc := self simStackAt: i.
  desc spilled
  ifTrue: [simSpillBase := i]
  ifFalse:
  [desc registerOrNone = NoReg ifTrue:
  [reg := self allocateRegNotConflictingWith: allocatedRegs.
  reg = NoReg
  ifTrue: [self halt] "have to spill"
  ifFalse:
  [desc storeToReg: reg.
  allocatedRegs := allocatedRegs bitOr: (self registerMaskFor: reg)]]]].
  self deny: self duplicateRegisterAssignmentsInTemporaries!

Item was added:
+ ----- Method: RegisterAllocatingCogit>>reconcileRegistersInTempVarsInCurrentSimStackWithThoseIn: (in category 'bytecode generator support') -----
+ reconcileRegistersInTempVarsInCurrentSimStackWithThoseIn: mergeSimStack
+ <var: #mergeSimStack type: #'SimStackEntry *'>
+ 0 to: methodOrBlockNumTemps - 1 do:
+ [ :i | | current target |
+ current := self simStack: simStack at: i.
+ target := self simStack: mergeSimStack at: i.
+ target registerMaskOrNone ~= 0
+ ifTrue:
+ [ target registerMaskOrNone ~= current registerMaskOrNone ifTrue:
+ [ self swap: target with: current at: i]]
+ ifFalse: [current liveRegister: NoReg]].
+ ^0!

Item was removed:
- ----- Method: RegisterAllocatingCogit>>resolveRegisterOrderConflictsBetweenCurrentSimStackAnd: (in category 'bytecode generator support') -----
- resolveRegisterOrderConflictsBetweenCurrentSimStackAnd: mergeSimStack
- <var: #mergeSimStack type: #'SimStackEntry *'>
- "One simple algorithm is to spill everything if there are any conflicts and then pop back.
- But this is terrible :-(  Can we do better? Yes... Consider the following two simStacks
- target: 0: | rA | __ | rB | rC | rD | <- sp
- current: 0: | __ | __ | rD | rA | rC | <- sp
- If we were to assign in a naive order, 0 through sp rA would be overwritten before its value in current[3] is written to rC,
- and rC would be overwritten before its value in current[4] is written to rD.  But if we swap the registers in current so that
- they respect the reverse ordering in target we can assign directly:
- swap current[3] & current[4]
- 0: | __ | __ | rD | rC | rA | <- sp
- now do the assignment in the order target[0] := current[0],  target[1] := current[1], ...  target[4] := current[4],
- i.e. rA := current[0]; rB := rD; (rC := rC); (rD := rD).
-
- So find any conflicts, and if there are any, swap registers in the simStack to resolve them.
- The trivial case of a single conflict is resolved by assigning that conflict to TempReg."
- | conflictingRegsMask |
- conflictingRegsMask := self conflictingRegistersBetweenSimStackAnd: mergeSimStack.
- conflictingRegsMask ~= 0 ifTrue:
- [(self isAPowerOfTwo: conflictingRegsMask) "Multiple conflicts mean we have to sort"
- ifFalse: [self swapCurrentRegistersInMask: conflictingRegsMask accordingToRegisterOrderIn: mergeSimStack]
- ifTrue: [self assignToTempRegConflictingRegisterIn: conflictingRegsMask]].!

Item was changed:
  ----- Method: RegisterAllocatingCogit>>ssAllocateRequiredRegMask:upThrough:upThroughNative: (in category 'simulation stack') -----
  ssAllocateRequiredRegMask: requiredRegsMask upThrough: stackPtr upThroughNative: nativeStackPtr
  "Override to void any required registers in temp vars."
  (requiredRegsMask anyMask: (self registerMaskFor: ReceiverResultReg)) ifTrue:
  [optStatus isReceiverResultRegLive: false.
  optStatus ssEntry liveRegister: NoReg].
  0 to: methodOrBlockNumTemps - 1 do:
  [:i|
  ((self simStackAt: i) registerMask anyMask: requiredRegsMask) ifTrue:
+ [(self simStackAt: i) liveRegister: NoReg]].
- [(self simStackAt: i) liveRegister: 0]].
  super ssAllocateRequiredRegMask: requiredRegsMask upThrough: stackPtr upThroughNative: nativeStackPtr!
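
(The slip mattered because register number 0 is a real register on most backends; liveRegister: 0 wrongly claimed the temp was still cached in that register, whereas NoReg marks no cached register.)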

Item was added:
+ ----- Method: RegisterAllocatingCogit>>swap:with:at: (in category 'bytecode generator support') -----
+ swap: target with: current at: index
+ "Swaps the registers between target and current.
+ target is guaranteed to be in a register; current may not be.
+ If current is in a register, just perform a register swap and update the simStack.
+ If current is not in a register, free the target register and use it.
+ Invariant:
+ items in current's simStack up to index have been resolved with target because we are visiting the stack in order 0 to simStackPtr.
+ Strategy:
+ since the target simStack is valid (it has a unique disposition of temps) we can
+ spill to obtain registers (since once an entry is written to the stack its register, if any, can be freed)
+ pop to assign after fully spilling (if necessary)"
+ | currentLiveRegisters |
+ self assert: target registerMaskOrNone ~= 0.
+ current registerMaskOrNone ~= 0 ifTrue:
+ [ self SwapR: target liveRegister R: current liveRegister Scratch: RISCTempReg.
+  methodOrBlockNumTemps to: simStackPtr do:
+ [:i| | localCurrent |
+ localCurrent := self simStack: simStack at: i.
+ localCurrent liveRegister = current liveRegister
+ ifTrue: [ localCurrent liveRegister: target liveRegister ]
+ ifFalse: [ localCurrent liveRegister = target liveRegister
+ ifTrue: [ localCurrent liveRegister: current liveRegister ] ] ].
+ current liveRegister: target liveRegister.
+ ^ 0 ].
+ 0 to: index - 1 do: [:j | self assert: (self simStack: simStack at: j) liveRegister ~= target liveRegister].
+
+ currentLiveRegisters := self liveRegistersExceptingTopNItems: 0 in: simStack.
+ (self register: target liveRegister isInMask: currentLiveRegisters) ifTrue:
+ [self ssAllocateRequiredReg: target liveRegister].
+ "Now target liveRegister is available. we set it."
+ current storeToReg: target liveRegister.
+ ^0!
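
For example (hypothetical registers rA and rB):

	target t0 in rA, current t0 in rB: a SwapR:R:Scratch: is emitted, and any other simStack entry cached in rA or rB is renamed so the simStack stays consistent with the emitted swap.
	target t0 in rA, current t0 not in a register: if rA is live elsewhere in the current simStack it is freed via ssAllocateRequiredReg:, then current storeToReg: rA loads it.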

Item was removed:
- ----- Method: RegisterAllocatingCogit>>swapCurrentRegistersInMask:accordingToRegisterOrderIn: (in category 'bytecode generator support') -----
- swapCurrentRegistersInMask: conflictingRegsMask accordingToRegisterOrderIn: mergeSimStack
- <var: #mergeSimStack type: #'SimStackEntry *'>
- "Swap liveRegisters in simStack entries according to their order in mergeSimStack so as to avoid
- overwriting live registers when merging simStack into mergeSimStack.  Consider the following two simStacks
- target: 0: | rA | __ | rB | rC | rD | <- sp
- current: 0: | __ | __ | rD | rA | rC | <- sp
- If we were to assign in a naive order, 0 through sp rA would be overwritten before its value in current[3] is written to rC,
- and rC would be overwritten before its value in current[4] is written to rD.  But if we swap the registers in current so that
- they respect the reverse ordering in target we can assign directly:
- swap current[3] & current[4]
- 0: | __ | __ | rD | rC | rA | <- sp
- now do the assignment in the order target[0] := current[0],  target[1] := current[1], ...  target[4] := current[4],
- i.e. rA := current[0]; rB := rD; (rC := rC); (rD := rD).
-
- See https://hal.inria.fr/inria-00435844/file/article-hal.pdf
- Florent Bouchez, Quentin Colombet, Alain Darte, Christophe Guillon, Fabrice Rastello.
- Parallel Copy Motion. SCOPES, ACM, 2010, pp.0. <inria-00435844>
-
- So find any conflicts, and if there are any, swap registers in the simStack to resolve them."
-
- "self printSimStack; printSimStack: mergeSimStack"
-
- "Some processors have a SwapRR but not all.  Write one-size-fits-all code that moves things through TempReg."
- | order n visitedMask ssEntry regA regB |
- <var: 'order' declareC: 'sqInt order[8*BytesPerWord]'>
- <var: 'ssEntry' type: #'SimStackEntry *'>
- self cCode: [self me: order ms: 0 et: (self sizeof: order)]
- inSmalltalk: [order := CArrayAccessor on: (Array new: 8*BytesPerWord withAll: 0)].
- n := 0.
- visitedMask := conflictingRegsMask.
- 0 to: methodOrBlockNumTemps - 1 do:
- [:i|
- ssEntry := self simStack: mergeSimStack at: i.
- (ssEntry registerMaskOrNone anyMask: visitedMask) ifTrue:
- [order at: ssEntry registerOrNone put: (n := n + 1).
- visitedMask := visitedMask - ssEntry registerMaskOrNone]].
- self assert: n >= 1.
- n <= 2 ifTrue: "simple case; here to show me what I have to do in addition to the sort"
- [regA := conflictingRegsMask highBit - 1.
- regB := (conflictingRegsMask - (1 << regA)) highBit - 1.
- self SwapR: regA R: regB Scratch: TempReg.
- 0 to: simStackPtr do:
- [:i|
- ssEntry := self simStack: simStack at: i.
- (ssEntry registerMaskOrNone anyMask: conflictingRegsMask) ifTrue:
- [| reg |
- reg := ssEntry registerOrNone = regA ifTrue: [regB] ifFalse: [regA].
- ssEntry type = SSRegister ifTrue:
- [ssEntry register: reg].
- ssEntry liveRegister: reg]].
- ^self].
-
- self halt!

Item was added:
+ ----- Method: SistaCogitClone class>>methodZoneClass (in category 'accessing class hierarchy') -----
+ methodZoneClass
+ ^SistaMethodZone!

Item was added:
+ ----- Method: SistaCogitClone>>defaultCogCodeSize (in category 'accessing') -----
+ defaultCogCodeSize
+ "Return the default number of bytes to allocate for native code at startup.
+ The actual value can be set via vmParameterAt: and/or a preference in the ini file."
+ <api>
+ ^2 * backEnd getDefaultCogCodeSize!

Item was added:
+ ----- Method: SistaCogitClone>>genAtPutInlinePrimitive: (in category 'inline primitive generators') -----
+ genAtPutInlinePrimitive: prim
+ "Unary inline primitives."
+ "SistaV1: 248 11111000 iiiiiiii mjjjjjjj Call Primitive #iiiiiiii + (jjjjjjj * 256) m=1 means inlined primitive, no hard return after execution.
+ See EncoderForSistaV1's class comment and StackInterpreter>>#trinaryInlinePrimitive:"
+ | ra1 ra2 rr adjust needsStoreCheck |
+ "The store check requires rr to be ReceiverResultReg"
+ needsStoreCheck := (objectRepresentation isUnannotatableConstant: self ssTop) not.
+ self
+ allocateRegForStackTopThreeEntriesInto: [:rTop :rNext :rThird | ra2 := rTop. ra1 := rNext. rr := rThird ]
+ thirdIsReceiver: (prim = 0 and: [ needsStoreCheck ]).
+ self assert: (rr ~= ra1 and: [rr ~= ra2 and: [ra1 ~= ra2]]).
+ self ssTop popToReg: ra2.
+ self ssPop: 1.
+ self ssTop popToReg: ra1.
+ self ssPop: 1.
+ self ssTop popToReg: rr.
+ self ssPop: 1.
+ objectRepresentation genConvertSmallIntegerToIntegerInReg: ra1.
+ "Now: ra is the variable object, rr is long, TempReg holds the value to store."
+ self flag: #TODO. "This is not really working as the immutability and store check needs to be present. "
+ prim caseOf: {
+ "0 - 1 pointerAt:put: and byteAt:Put:"
+ [0] -> [ adjust := (objectMemory baseHeaderSize >> objectMemory shiftForWord) - 1. "shift by baseHeaderSize and then move from 1 relative to zero relative"
+ adjust ~= 0 ifTrue: [ self AddCq: adjust R: ra1. ].
+ self MoveR: ra2 Xwr: ra1 R: rr.
+ "I added needsStoreCheck so if you initialize an array with a Smi such as 0 or a boolean you don't need the store check"
+ needsStoreCheck ifTrue:
+ [ self assert: needsFrame.
+ objectRepresentation genStoreCheckReceiverReg: rr valueReg: ra2 scratchReg: TempReg inFrame: true] ].
+ [1] -> [ objectRepresentation genConvertSmallIntegerToIntegerInReg: ra2.
+ adjust := objectMemory baseHeaderSize - 1. "shift by baseHeaderSize and then move from 1 relative to zero relative"
+ self AddCq: adjust R: ra1.
+ self MoveR: ra2 Xbr: ra1 R: rr.
+ objectRepresentation genConvertIntegerToSmallIntegerInReg: ra2. ].
+ }
+ otherwise: [^EncounteredUnknownBytecode].
+ self ssPushRegister: ra2.
+ ^0!
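
Worked example of the index arithmetic above, assuming 64-bit Spur (wordSize = 8, baseHeaderSize = 8, shiftForWord = 3):

	pointerAt:put: (case 0): adjust = (8 >> 3) - 1 = 0, so no AddCq: is emitted and a one-relative index k addresses the word at byte offset k * 8, i.e. the slot just past the 8-byte header.
	byteAt:put: (case 1): adjust = 8 - 1 = 7, so index k addresses the byte at offset k + 7, again just past the header.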

Item was added:
+ ----- Method: SistaCogitClone>>genByteEqualsInlinePrimitive: (in category 'inline primitive generators') -----
+ genByteEqualsInlinePrimitive: prim
+
+ "3021 Byte Object >> equals:length:
+ The receiver and the first argument are byte objects of the same size (length in bytes).
+ The length argument is a SmallInteger.
+ Answers true if all fields are equal, false if not.
+ Comparison is bulked to word comparison."
+
+ "Overview:
+ 1. The primitive is called like this: [byteObj1 equals: byteObj2 length: length].
+   In the worst case we use 5 registers including TempReg
+ and we produce a loop bulk comparing words.
+ 2. The common case is a comparison against a constant: [byteString = 'foo'].
+ which produces in Scorch [byteString equals: 'foo' length: 3].
+ We try to generate fast code for this case with 3 heuristics:
+ - specific fast code if len is a constant
+ - unroll the loop if len < 2 * wordSize
+ - compile-time reads if str1 or str2 is a constant and loop is unrolled.
+ We use 3 registers including TempReg in the common case.
+ We could use 1 less reg if the loop is unrolled, the instr is followed by a branch
+ AND one operand is a constant, but this is complicated enough.
+ 3. We ignore the case where all operands are constants
+ (We assume Scorch simplifies it; it works but is not optimised)"
+
+ | str1Reg str2Reg lenReg extraReg jmp jmp2 needjmpZeroSize needLoop unroll jmpZeroSize instr lenCst mask |
+ <var: #jmp type: #'AbstractInstruction *'>
+ <var: #instr type: #'AbstractInstruction *'>
+ <var: #jmp2 type: #'AbstractInstruction *'>
+ <var: #jmpZeroSize type: #'AbstractInstruction *'>
+
+ "--- quick path for empty string---"
+ "This path does not allocate registers and right shift on negative int later in the code.
+ Normally this is resolved by Scorch but we keep it for correctness and consistency"
+ self ssTop type = SSConstant ifTrue:
+ [ lenCst := objectMemory integerValueOf: self ssTop constant.
+  lenCst = 0 ifTrue: [ self ssPop: 3. self ssPushConstant: objectMemory trueObject. ^ 0 ] ].
+
+ "--- Allocating & loading registers --- "
+ needLoop := (self ssTop type = SSConstant and: [ lenCst <= (objectMemory wordSize * 2) ]) not.
+ unroll := needLoop not and: [lenCst > objectMemory wordSize ].
+ needLoop
+ ifTrue:
+ [ str1Reg := self allocateRegForStackEntryAt: 1 notConflictingWith: self emptyRegisterMask.
+  str2Reg := self allocateRegForStackEntryAt: 2 notConflictingWith: (self registerMaskFor: str1Reg).
+  lenReg := self allocateRegForStackEntryAt: 0 notConflictingWith: (self registerMaskFor: str1Reg and: str2Reg).
+  (self ssValue: 1) popToReg: str1Reg.
+  (self ssValue: 2) popToReg: str2Reg.
+  extraReg := self allocateRegNotConflictingWith: (self registerMaskFor: str1Reg and: str2Reg and: lenReg)]
+ ifFalse:
+ [ mask := self emptyRegisterMask.
+  (self ssValue: 1) type = SSConstant ifFalse:
+ [ str1Reg := self allocateRegForStackEntryAt: 1 notConflictingWith: mask.
+  (self ssValue: 1) popToReg: str1Reg.
+  mask := mask bitOr: (self registerMaskFor: str1Reg) ].
+  (self ssValue: 2) type = SSConstant ifFalse:
+ [ str2Reg := self allocateRegForStackEntryAt: 2 notConflictingWith: mask.
+  (self ssValue: 2) popToReg: str2Reg.
+  mask := mask bitOr: (self registerMaskFor: str2Reg) ].
+  extraReg := self allocateRegNotConflictingWith: mask].
+
+ "--- Loading LenReg (or statically resolving it) --- "
+ "LenReg is loaded with (lenInBytes + objectMemory baseHeaderSize - 1 >> shiftForWord)
+ LenReg is the index for the last word to compare with MoveXwr:r:R:.
+ The loop iterates from LenReg to first word of ByteObj"
+ self ssTop type = SSConstant
+ ifTrue: "common case, str = 'foo'. We can precompute lenReg."
+ [ lenCst := lenCst + objectMemory baseHeaderSize - 1 >> objectMemory shiftForWord.
+  needLoop ifTrue: [self MoveCq: lenCst R: lenReg ].
+  needjmpZeroSize := false]
+ ifFalse: "uncommon case, str = str2. lenReg in word computed at runtime."
+ [ self ssTop popToReg: lenReg.
+  objectRepresentation genConvertSmallIntegerToIntegerInReg: lenReg.
+  self CmpCq: 0 R: lenReg.
+  jmpZeroSize := self JumpZero: 0.
+  needjmpZeroSize := true.
+  self AddCq: objectMemory baseHeaderSize - 1 R: lenReg.
+  self ArithmeticShiftRightCq: objectMemory shiftForWord R: lenReg ].
+
+ "--- Comparing the strings --- "
+ "LenReg has the index of the last word to read (unless no loop).
+ We decrement it to adjust -1 (0 in 64 bits) while comparing"
+ needLoop
+ ifTrue:
+ [instr := self MoveXwr: lenReg R: str1Reg R: extraReg.
+ self MoveXwr: lenReg R: str2Reg R: TempReg.
+ self CmpR: extraReg R: TempReg.
+ jmp := self JumpNonZero: 0. "then string are not equal (jmp target)"
+ self AddCq: -1 R: lenReg.
+ self CmpCq: (objectMemory baseHeaderSize >> objectMemory shiftForWord) - 1 R: lenReg. "first word of ByteObj, stop looping."
+ self JumpNonZero: instr]
+ ifFalse: "Common case, only 1 or 2 word to check: no lenReg allocation, cst micro optimisations"
+ [self genByteEqualsInlinePrimitiveCmp: str1Reg with: str2Reg scratch1: extraReg scratch2: TempReg field: 0.
+ jmp := self JumpNonZero: 0. "then string are not equal (jmp target)"
+ unroll ifTrue: "unrolling more than twice generates more instructions than the loop, so we don't do it"
+ [self genByteEqualsInlinePrimitiveCmp: str1Reg with: str2Reg scratch1: extraReg scratch2: TempReg field: 1.
+ jmp2 := self JumpNonZero: 0. "then string are not equal (jmp target)"]].
+ needjmpZeroSize ifTrue: [ jmpZeroSize jmpTarget: self Label ].
+ "fall through, strings are equal"
+
+ "--- Pushing the result or pipelining a branch --- "
+ self ssPop: 3.
+ self genByteEqualsInlinePrimitiveResult: jmp returnReg: extraReg.
+ unroll ifTrue: [jmp2 jmpTarget: jmp getJmpTarget].
+ ^0!
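
Worked example, assuming 64-bit Spur (wordSize = 8, baseHeaderSize = 8, shiftForWord = 3) and the common shape [byteString equals: 'foo' length: 3]:

	lenCst = 3, so needLoop = (3 <= 16) not = false and unroll = (3 > 8) = false;
	a single genByteEqualsInlinePrimitiveCmp:with:scratch1:scratch2:field: at field 0 compares the first data word of each string (byte offset 8), and the resulting JumpNonZero: is handed to genByteEqualsInlinePrimitiveResult:returnReg:.
	Had a loop been needed, lenCst would have been rewritten to (3 + 8 - 1) >> 3 = 1, the word index of the last data word to compare.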

Item was added:
+ ----- Method: SistaCogitClone>>genByteEqualsInlinePrimitiveCmp:with:scratch1:scratch2:field: (in category 'inline primitive generators') -----
+ genByteEqualsInlinePrimitiveCmp: str1Reg with: str2Reg scratch1: scratch1Reg scratch2: scratch2Reg field: index
+ | shift |
+ <inline: true>
+ shift := objectMemory baseHeaderSize + (index * objectMemory wordSize).
+ (self ssValue: 1) type = SSConstant
+ ifTrue: [self MoveCq: (objectMemory fetchPointer: index ofObject: (self ssValue: 1) constant) R: scratch1Reg]
+ ifFalse: [self MoveMw: shift r: str1Reg R: scratch1Reg].
+ (self ssValue: 2) type = SSConstant
+ ifTrue: [self MoveCq: (objectMemory fetchPointer: index ofObject: (self ssValue: 2) constant) R: scratch2Reg]
+ ifFalse: [self MoveMw: shift r: str2Reg R: scratch2Reg].
+ self CmpR: scratch1Reg R: scratch2Reg.!

Item was added:
+ ----- Method: SistaCogitClone>>genByteEqualsInlinePrimitiveResult:returnReg: (in category 'inline primitive generators') -----
+ genByteEqualsInlinePrimitiveResult: jmp returnReg: reg
+ "Byte equal is falling through if the result is true, or jumping using jmp if the result is false.
+ The method is required to set the jump target of jmp.
+ We look ahead for a branch and pipeline the jumps if possible..
+ ReturnReg is used only if not followed immediately by a branch."
+ | branchDescriptor nextPC postBranchPC targetBytecodePC localJump canElide |
+ <var: #localJump type: #'AbstractInstruction *'>
+ <var: #branchDescriptor type: #'BytecodeDescriptor *'>
+ self extractMaybeBranchDescriptorInto: [ :descr :next :postBranch :target |
+ branchDescriptor := descr. nextPC := next. postBranchPC := postBranch. targetBytecodePC := target ].
+
+ "Case 1 - not followed by a branch"
+ (branchDescriptor isBranchTrue or: [branchDescriptor isBranchFalse])
+ ifFalse:
+ [self genMoveTrueR: reg.
+ localJump := self Jump: 0.
+ jmp jmpTarget: (self genMoveFalseR: reg).
+ localJump jmpTarget: self Label.
+ self ssPushRegister: reg.
+ ^ 0].
+
+ "Case 2 - followed by a branch"
+ (self fixupAt: nextPC) notAFixup
+ ifTrue: "The next instruction is dead.  we can skip it."
+ [deadCode := true.
+ self ensureFixupAt: targetBytecodePC.
+ self ensureFixupAt: postBranchPC ]
+ ifFalse:
+ [self ssPushConstant: objectMemory trueObject]. "dummy value"
+ "We can only elide the jump if the pc after nextPC is the same as postBranchPC.
+ Branch following means it may not be."
+ self nextDescriptorExtensionsAndNextPCInto:
+ [:iguana1 :iguana2 :iguana3 :followingPC| nextPC := followingPC].
+ canElide := deadCode and: [nextPC = postBranchPC].
+ branchDescriptor isBranchTrue
+ ifTrue:
+ [ self Jump: (self ensureNonMergeFixupAt: targetBytecodePC).
+  canElide
+ ifFalse: [ jmp jmpTarget: (self ensureNonMergeFixupAt: postBranchPC) ]
+ ifTrue: [ jmp jmpTarget: self Label ] ]
+ ifFalse: [ canElide ifFalse: [ self Jump: (self ensureNonMergeFixupAt: postBranchPC).
+ jmp jmpTarget: (self ensureNonMergeFixupAt: targetBytecodePC) ] ].
+ ^0!
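
For instance (hypothetical source), for [(s = 'ab') ifTrue: [...] ifFalse: [...]] case 2 applies: the JumpNonZero: from the comparison is retargeted at one arm's fixup and a Jump: to the other arm is emitted (or elided when the code is dead), so no boolean is ever materialised; reg is written only in case 1, via genMoveTrueR: and genMoveFalseR:.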

Item was added:
+ ----- Method: SistaCogitClone>>genQuaternaryInlinePrimitive: (in category 'inline primitive generators') -----
+ genQuaternaryInlinePrimitive: prim
+ "Quaternary inline primitives."
+ "SistaV1: 248 11111000 iiiiiiii mjjjjjjj Call Primitive #iiiiiiii + (jjjjjjj * 256) m=1 means inlined primitive, no hard return after execution.
+ See EncoderForSistaV1's class comment and StackInterpreter>>#quaternaryInlinePrimitive:"
+ | needStoreCheck sourceReg stopReg objReg adjust jmp cmp isStartCst isStopCst startCst stopCst iteratorReg |
+ <var: #jmp type: #'AbstractInstruction *'>
+ <var: #cmp type: #'AbstractInstruction *'>
+ prim = 0 ifFalse: [^EncounteredUnknownBytecode].
+
+ "4000 Pointer Object>> fillFrom:to:with: The receiver is a Pointer object. the middle two arguments are smallintegers. Last argument is any object. Fills the object in between the two indexes with last argument. Receiver is guaranteed to be mutable. The pointer accesses are raw (no inst var check). If ExtB is set to 1, no store check is present. Else a single store check is done for the bulk operation. Answers the receiver."
+ needStoreCheck := self sistaNeedsStoreCheck.
+ extB := numExtB := 0.
+
+ "Allocate reg for src, objToStore, iterator and stop."
+ sourceReg := needStoreCheck
+ ifTrue: [ self ssAllocateRequiredReg: ReceiverResultReg.
+ self voidReceiverResultRegContainsSelf.
+ ReceiverResultReg ]
+ ifFalse: [ self allocateRegForStackEntryAt: 3 notConflictingWith: self emptyRegisterMask ].
+ (self ssValue: 3) popToReg: sourceReg.
+ objReg := self allocateRegForStackEntryAt: 0 notConflictingWith: (self registerMaskFor: sourceReg).
+ self ssTop popToReg: objReg.
+
+ "Set up iterator to first index to write and stop to last index to write"
+ adjust := (objectMemory baseHeaderSize >> objectMemory shiftForWord) - 1. "shift by baseHeaderSize and then move from 1 relative to zero relative"
+ isStartCst := (self ssValue: 2) type = SSConstant.
+ isStopCst := (self ssValue: 1) type = SSConstant.
+ isStartCst ifTrue: [startCst := adjust + (objectMemory integerValueOf: (self ssValue: 2) constant)].
+ isStopCst ifTrue: [stopCst := adjust + (objectMemory integerValueOf: (self ssValue: 1) constant)].
+
+ (isStartCst
+ and: [isStopCst
+ and: [stopCst - startCst < 7 ]]) "The other path generates at least 7 instructions"
+ ifTrue: ["unroll"
+ startCst
+ to: stopCst
+ do: [ :i | self MoveMw: i r: sourceReg R: objReg ] ]
+ ifFalse: ["loop"
+ stopReg := self allocateRegNotConflictingWith: (self registerMaskFor: sourceReg and: objReg).
+ iteratorReg := self allocateRegNotConflictingWith: (self registerMaskFor: sourceReg and: objReg and: stopReg).
+ isStartCst
+ ifTrue: [ self MoveCq: startCst R: iteratorReg ]
+ ifFalse: [ (self ssValue: 2) popToReg: iteratorReg.
+ adjust ~= 0 ifTrue: [ self AddCq: adjust R: iteratorReg ] ].
+ isStopCst
+ ifTrue: [ self MoveCq: stopCst R: stopReg ]
+ ifFalse: [ (self ssValue: 1) popToReg: stopReg.
+ adjust ~= 0 ifTrue: [ self AddCq: adjust R: stopReg ] ].
+ cmp := self CmpR: stopReg R: iteratorReg.
+ jmp := self JumpAbove: 0.
+ self MoveR: objReg Xwr: iteratorReg R: sourceReg.
+ self AddCq: 1 R: iteratorReg.
+ self Jump: cmp.
+ jmp jmpTarget: self Label].
+
+ needStoreCheck ifTrue: [objectRepresentation genStoreCheckReceiverReg: sourceReg valueReg: objReg scratchReg: TempReg inFrame: true].
+
+ self ssPop: 4.
+ self ssPushRegister: sourceReg.
+ ^0!
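
Worked example, assuming 64-bit Spur and the hypothetical Scorch output [arr fillFrom: 1 to: 4 with: x]:

	adjust = (8 >> 3) - 1 = 0, startCst = 1, stopCst = 4;
	stopCst - startCst = 3 < 7, so the fill is unrolled into four stores rather than the compare/store/increment/jump loop, which costs at least 7 instructions.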

Item was added:
+ ----- Method: SistaCogitClone>>genQuinaryInlinePrimitive: (in category 'inline primitive generators') -----
+ genQuinaryInlinePrimitive: prim
+ "SistaV1: 248 11111000 iiiiiiii mjjjjjjj Call Primitive #iiiiiiii + (jjjjjjj * 256) m=1 means inlined primitive, no hard return after execution.
+ See EncoderForSistaV1's class comment and StackInterpreter>>#quaternaryInlinePrimitive:"
+ ^EncounteredUnknownBytecode!

Item was changed:
  ----- Method: SistaCogitClone>>genUnaryInlinePrimitive: (in category 'inline primitive generators') -----
  genUnaryInlinePrimitive: prim
  "Unary inline primitives."
  "SistaV1: 248 11111000 iiiiiiii mjjjjjjj Call Primitive #iiiiiiii + (jjjjjjj * 256) m=1 means inlined primitive, no hard return after execution.
  See EncoderForSistaV1's class comment and StackInterpreter>>#unaryInlinePrimitive:"
  | rcvrReg resultReg |
  rcvrReg := self allocateRegForStackEntryAt: 0.
  resultReg := self allocateRegNotConflictingWith: (self registerMaskFor: rcvrReg).
  prim
  caseOf: {
  "00 unchecked class"
  [1] -> "01 unchecked pointer numSlots"
  [self ssTop popToReg: rcvrReg.
  self ssPop: 1.
  objectRepresentation
  genGetNumSlotsOf: rcvrReg into: resultReg;
  genConvertIntegerToSmallIntegerInReg: resultReg].
  "02 unchecked pointer basicSize"
  [3] -> "03 unchecked byte numBytes"
  [self ssTop popToReg: rcvrReg.
  self ssPop: 1.
  objectRepresentation
  genGetNumBytesOf: rcvrReg into: resultReg;
  genConvertIntegerToSmallIntegerInReg: resultReg].
  "04 unchecked short16Type format numShorts"
  "05 unchecked word32Type format numWords"
  "06 unchecked doubleWord64Type format numDoubleWords"
  [11] -> "11 unchecked fixed pointer basicNew"
  [self ssTop type ~= SSConstant ifTrue:
  [^EncounteredUnknownBytecode].
  (objectRepresentation
+ genGetInstanceOfFixedClass: self ssTop constant
- genGetInstanceOf: self ssTop constant
  into: resultReg
  initializingIf: self extBSpecifiesInitializeInstance) ~= 0 ifTrue:
  [^ShouldNotJIT]. "e.g. bad class"
  self ssPop: 1] .
  [20] -> "20 identityHash"
  [objectRepresentation genGetIdentityHash: rcvrReg resultReg: resultReg.
  self ssPop: 1] .
  "21 identityHash (SmallInteger)"
  "22 identityHash (Character)"
  "23 identityHash (SmallFloat64)"
  "24 identityHash (Behavior)"
  "30 immediateAsInteger (Character)
  31 immediateAsInteger (SmallFloat64)
  35 immediateAsFloat  (SmallInteger) "
  [30] ->
  [self ssTop popToReg: resultReg.
  objectRepresentation genConvertCharacterToSmallIntegerInReg: resultReg.
  self ssPop: 1].
  [35] ->
  [self assert: self processorHasDoublePrecisionFloatingPointSupport.
  self MoveR: rcvrReg R: TempReg.
  self genConvertSmallIntegerToIntegerInReg: TempReg.
  self ConvertR: TempReg Rd: DPFPReg0.
  self flag: #TODO. "Should never fail"
  self
  genAllocFloatValue: DPFPReg0
  into: resultReg
  scratchReg: TempReg
  scratchReg: NoReg. "scratch2 for V3 only"]
   }
 
  otherwise:
  [^EncounteredUnknownBytecode].
  extB := 0.
  numExtB := 0.
  self ssPushRegister: resultReg.
  ^0!

Item was added:
+ ----- Method: SistaCogitClone>>isTrapAt: (in category 'simulation only') -----
+ isTrapAt: retpc
+ "For stack depth checking."
+ <doNotGenerate>
+ ^(backEnd isCallPrecedingReturnPC: retpc)
+ and: [(backEnd callTargetFromReturnAddress: retpc) = ceTrapTrampoline]!

Item was added:
+ ----- Method: SistaCogitClone>>setCogCodeZoneThreshold: (in category 'accessing') -----
+ setCogCodeZoneThreshold: threshold
+ <doNotGenerate>
+ ^methodZone setCogCodeZoneThreshold: threshold!

Item was added:
+ ----- Method: StackToRegisterMappingCogit>>printRegisterMask:on: (in category 'debug printing') -----
+ printRegisterMask: registerMask on: aStream
+ | first |
+ aStream nextPut: ${.
+ registerMask = 0
+ ifTrue:
+ [aStream nextPutAll: 'NoReg']
+ ifFalse:
+ [first := true.
+ 0 to: 31 do:
+ [:reg|
+ (registerMask anyMask: 1 << reg) ifTrue:
+ [first ifFalse: [aStream space].
+ first := false.
+ aStream nextPutAll: (backEnd nameForRegister: reg)]]].
+ aStream nextPut: $}; flush!
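
For example (register names are backend-specific, via nameForRegister:), printing the mask (self registerMaskFor: FPReg and: SPReg) might yield {FP SP}, while an empty mask prints {NoReg}.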

Item was changed:
  ----- Method: StackToRegisterMappingCogit>>ssAllocateRequiredRegMask:upThrough:upThroughNative: (in category 'simulation stack') -----
  ssAllocateRequiredRegMask: requiredRegsMask upThrough: stackPtr upThroughNative: nativeStackPtr
  | lastRequired lastRequiredNative liveRegs |
  lastRequired := -1.
  lastRequiredNative := -1.
  "compute live regs while noting the last occurrence of required regs.
  If these are not free we must spill from simSpillBase to last occurrence.
  Note we are conservative here; we could allocate FPReg in frameless methods."
  liveRegs := self registerMaskFor: FPReg and: SPReg.
  (simSpillBase max: 0) to: stackPtr do:
  [:i|
  liveRegs := liveRegs bitOr: (self simStackAt: i) registerMask.
  ((self simStackAt: i) registerMask bitAnd: requiredRegsMask) ~= 0 ifTrue:
  [lastRequired := i]].
  LowcodeVM ifTrue:
  [(simNativeSpillBase max: 0) to: nativeStackPtr do:
  [:i|
  liveRegs := liveRegs bitOr: (self simNativeStackAt: i) nativeRegisterMask.
  ((self simNativeStackAt: i) nativeRegisterMask anyMask: requiredRegsMask) ifTrue:
  [lastRequiredNative := i]]].
  "If any of requiredRegsMask are live we must spill."
  (liveRegs anyMask: requiredRegsMask) ifTrue:
  [self ssFlushTo: lastRequired nativeFlushTo: lastRequiredNative.
+ self deny: (self liveRegisters anyMask: requiredRegsMask)]!
- self assert: (self liveRegisters bitAnd: requiredRegsMask) = 0]!