VM Maker: VMMaker.oscog-eem.762.mcz


commits-2
 
Eliot Miranda uploaded a new version of VMMaker to project VM Maker:
http://source.squeak.org/VMMaker/VMMaker.oscog-eem.762.mcz

==================== Summary ====================

Name: VMMaker.oscog-eem.762
Author: eem
Time: 7 June 2014, 7:09:38.813 am
UUID: ae2ff2b3-86a9-45a6-bab6-b009d27b1d99
Ancestors: VMMaker.oscog-tpr.761

Move BICCqR from CogRTLOpcodes to CogARMCompiler.
It's currently ARM-specific and there's a mechanism for
handling processor-specific opcodes.

Revert the ceCheckForInterrupts hack.  The pre-hack code
is correct.

Streamline genPrimReturnEnterCogCodeEnilopmart:.

But most importantly, fix CogRTLOpcodes class>>initialize to
automatically initialize the CogAbstractInstruction subclasses
after initializing its opcodes, so that the processor-specific
opcodes are correctly initialized.  If they are not, the simulator
breaks in a visually interesting way (read: most things continue
to work).
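
To see why the backends have to be reinitialized, here is a rough workspace sketch of the numbering scheme used in the diff below. It is not VMMaker code: the short opcode lists, the `number` block and the `Foo` opcode are made up for illustration. The only things taken from the diff are that generic opcodes are numbered from 1 with withIndexDo: and that backend opcodes are assigned value + LastRTLCode - 1.

```smalltalk
"Workspace sketch only; the opcode lists and Foo are hypothetical stand-ins."
| number rtl lastRTLCode arm staleLDMFD freshLDMFD |

"Number the names consecutively, starting at base + 1."
number := [:names :base | | pool |
	pool := Dictionary new.
	names withIndexDo: [:name :index | pool at: name put: index + base].
	pool].

"Generic opcodes are numbered from 1, as in CogRTLOpcodes class>>initialize."
rtl := number value: #(Label Nop Call RetN LastRTLCode) value: 0.
lastRTLCode := rtl at: #LastRTLCode.

"Backend opcodes continue from LastRTLCode (value + LastRTLCode - 1)."
arm := number value: #(LDMFD STMFD BICCqR) value: lastRTLCode - 1.
staleLDMFD := arm at: #LDMFD.

"Change the generic list (here a hypothetical Foo is added) and renumber only the pool."
rtl := number value: #(Label Nop Call RetN Foo LastRTLCode) value: 0.
lastRTLCode := rtl at: #LastRTLCode.

"The old backend value now coincides with a generic opcode..."
staleLDMFD = (rtl at: #Foo).

"...until the backend is renumbered against the new LastRTLCode."
arm := number value: #(LDMFD STMFD BICCqR) value: lastRTLCode - 1.
freshLDMFD := arm at: #LDMFD.
{staleLDMFD. freshLDMFD. lastRTLCode}
```

Inspecting the last expression shows the stale and recomputed values side by side. The point is only that any change to the generic opcode list moves LastRTLCode, so every backend's processor-specific opcodes have to be recomputed afterwards.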

=============== Diff against VMMaker.oscog-tpr.761 ===============

Item was changed:
  CogAbstractInstruction subclass: #CogARMCompiler
  instanceVariableNames: ''
+ classVariableNames: 'AL BICCqR CArg0Reg CArg1Reg CArg2Reg CArg3Reg CC CS EQ GE GT HI LDMFD LE LR LS LT MI NE PC PL R0 R1 R10 R11 R12 R2 R3 R4 R5 R6 R7 R8 R9 RISCTempReg SP STMFD VC VS'
- classVariableNames: 'AL CArg0Reg CArg1Reg CArg2Reg CArg3Reg CC CS EQ GE GT HI LDMFD LE LR LS LT MI NE PC PL R0 R1 R10 R11 R12 R2 R3 R4 R5 R6 R7 R8 R9 RISCTempReg SP STMFD VC VS'
  poolDictionaries: ''
  category: 'VMMaker-JIT'!
 
  !CogARMCompiler commentStamp: 'lw 8/23/2012 19:38' prior: 0!
  I generate ARM instructions from CogAbstractInstructions.  For reference see
  http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.set.architecture/index.html
 
  The Architecture Reference Manual used is that of version 5, which includes some version 6 instructions. Of those, only pld is used(for PrefetchAw).
 
  This class does not take any special action to flush the instruction cache on instruction-modification.!

Item was changed:
  ----- Method: CogARMCompiler class>>initialize (in category 'class initialization') -----
  initialize
 
  "Initialize various ARM instruction-related constants."
  "CogARMCompiler initialize"
 
+ | specificOpcodes refs |
- | specificOpcodes refs conditionCodes |
  super initialize.
  self ~~ CogARMCompiler ifTrue: [^self].
 
  R0 := 0.
  R1 := 1.
  R2 := 2.
  R3 := 3.
  R4 := 4.
  R5 := 5.
  R6 := 6.
  R7 := 7.
  R8 := 8.
  R9 := 9.
  R10 := 10.
  R11 := 11.
  R12 := 12..
  SP := 13..
  LR := 14.
  PC := 15.
 
  CArg0Reg := 0.
  CArg1Reg := 1.
  CArg2Reg := 2.
  CArg3Reg := 3.
 
  RISCTempReg := R10.
 
  "Condition Codes. Note that cc=16rF is NOT ALLOWED as a condition; it specifies an extension instruction. See e.g.ARM_ARM v5 DDI01001.pdf A3.2.1"
+ EQ := 0.
+ NE := 1.
+ CS := 2.
+ CC := 3.
+ MI := 4.
+ PL := 5.
+ VS := 6.
+ VC := 7.
+ HI := 8.
+ LS := 9.
+ GE := 10.
+ LT := 11.
+ GT := 12.
+ LE := 13.
+ AL := 14.
- conditionCodes := #(EQ NE CS CC MI PL VS VC HI LS GE LT GT LE AL).
- "this is a somewhat odd way to set class variables - were they originally globals?"
- conditionCodes withIndexDo: [ :classVarName :value |
- self classPool
- declare: classVarName from: Undeclared;
- at: classVarName put: value - 1].
 
  "Specific instructions"
  LastRTLCode isNil ifTrue:
  [CogRTLOpcodes initialize].
+ specificOpcodes := #(LDMFD STMFD BICCqR).
- specificOpcodes := #(LDMFD STMFD).
  refs := (thisContext method literals select: [:l| l isVariableBinding and: [classPool includesKey: l key]]) collect:
  [:ea| ea key].
+ (classPool keys reject: [:k| (specificOpcodes includes: k) or: [refs includes: k]]) do:
- (classPool keys reject: [:k| (specificOpcodes includes: k) or: [refs includes: k] or: [conditionCodes includes: k]]) do:
  [:k|
  Undeclared declare: k from: classPool].
  specificOpcodes withIndexDo:
  [:classVarName :value|
  self classPool
  declare: classVarName from: Undeclared;
  at: classVarName put: value + LastRTLCode - 1]!

Item was added:
+ ----- Method: CogARMCompiler class>>specificOpcodes (in category 'class initialization') -----
+ specificOpcodes
+ "Answer the processor-specific opcodes for this class.
+ They're all in an Array literal in the initialize method."
+ ^(self class >> #initialize) literals detect: [:l| l isArray and: [l includes: #LDMFD]]!

Item was changed:
  ----- Method: CogARMCompiler>>concretizeRetN (in category 'generate machine code - concretize') -----
  concretizeRetN
  "Will get inlined into concretizeAt: switch."
  <inline: true>
  | offset |
  offset := operands at: 0.
  offset = 0 ifTrue:
  [self machineCodeAt: 0 put: (self mov: PC rn: LR). "pop {pc}"
  ^machineCodeSize := 4].
  self assert: offset < 32. "We have an 8 bit immediate. If needed, we could rotate it less than 30 bit."
  "add sp, sp, #n, ROR (15<<2) <- ie shift left 2 to convert words to bytes"
  self machineCodeAt: 0 put: (self add: SP rn: SP imm: offset ror: 30).
  self machineCodeAt: 4 put: (self mov: PC rn: LR).  "pop {pc}"
  ^machineCodeSize := 8!

Item was changed:
  ----- Method: CogARMCompiler>>genAlignCStackSavingRegisters:numArgs:wordAlignment: (in category 'abi') -----
  genAlignCStackSavingRegisters: saveRegs numArgs: numArgs wordAlignment: alignment
+ "ARM needs 8 byte stack alignment but it's hard to be sure where the stack is at this
+ point due to the complexities of whether we push the return address or not.  So do
+ a simple bitAnd to effectively round-down the SP - except the vagaries of the ARM
+ instruction set means we actually need a BIC sp, sp, $7"
+
+ cogit gen: BICCqR operand: 2r111 operand: SPReg.
+ ^0!
- "ARM needs 8 byte stack alignment but it's hard to be sure where the stack is at this point due to the cpmplexities of whether we push the return address or not. So do a simple bitAnd to effectively round-down the SP - except the vagaries of the ARM instruction set means we actually need a BIC sp, sp, $7"
- cogit BICCq: 2r111 R: SPReg.
- ^ 0!

Item was changed:
  SharedPool subclass: #CogRTLOpcodes
  instanceVariableNames: ''
+ classVariableNames: 'AddCqR AddCwR AddRR AddRdRd AlignmentNops AndCqR AndCwR AndRR Arg0Reg Arg1Reg ArithmeticShiftRightCqR ArithmeticShiftRightRR Call ClassReg CmpCqR CmpCwR CmpRR CmpRdRd ConvertRRd DPFPReg0 DPFPReg1 DPFPReg2 DPFPReg3 DPFPReg4 DPFPReg5 DPFPReg6 DPFPReg7 DivRdRd FPReg Fill16 Fill32 Fill8 FillBytesFrom FillFromWord FirstJump FirstShortJump GPRegMax GPRegMin Jump JumpAbove JumpAboveOrEqual JumpBelow JumpBelowOrEqual JumpCarry JumpFPEqual JumpFPGreater JumpFPGreaterOrEqual JumpFPLess JumpFPLessOrEqual JumpFPNotEqual JumpFPOrdered JumpFPUnordered JumpGreater JumpGreaterOrEqual JumpLess JumpLessOrEqual JumpLong JumpLongNonZero JumpLongZero JumpNegative JumpNoCarry JumpNoOverflow JumpNonNegative JumpNonZero JumpOverflow JumpR JumpZero Label LastJump LastRTLCode LinkReg LoadEffectiveAddressMwrR LoadEffectiveAddressXowrR LogicalShiftLeftCqR LogicalShiftLeftRR LogicalShiftRightCqR LogicalShiftRightRR MoveAbR MoveAwR MoveC32R MoveC64R MoveCqR MoveCwR MoveM16rR MoveM32rR MoveM64rRd MoveMbrR MoveMwrR MoveRAw MoveRM16r MoveRM32r MoveRMbr MoveRMwr MoveRR MoveRX16rR MoveRX32rR MoveRXbrR MoveRXowr MoveRXwrR MoveRdM64r MoveRdRd MoveX16rRR MoveX32rRR MoveXbrRR MoveXowrR MoveXwrRR MulCqR MulCwR MulRR MulRdRd NegateR Nop OrCqR OrCwR OrRR PopR PrefetchAw PushCw PushR ReceiverResultReg RetN SPReg SendNumArgsReg SqrtRd SubCqR SubCwR SubRR SubRdRd TempReg XorCqR XorCwR XorRR'
- classVariableNames: 'AddCqR AddCwR AddRR AddRdRd AlignmentNops AndCqR AndCwR AndRR Arg0Reg Arg1Reg ArithmeticShiftRightCqR ArithmeticShiftRightRR BICCqR Call ClassReg CmpCqR CmpCwR CmpRR CmpRdRd ConvertRRd DPFPReg0 DPFPReg1 DPFPReg2 DPFPReg3 DPFPReg4 DPFPReg5 DPFPReg6 DPFPReg7 DivRdRd FPReg Fill16 Fill32 Fill8 FillBytesFrom FillFromWord FirstJump FirstShortJump GPRegMax GPRegMin Jump JumpAbove JumpAboveOrEqual JumpBelow JumpBelowOrEqual JumpCarry JumpFPEqual JumpFPGreater JumpFPGreaterOrEqual JumpFPLess JumpFPLessOrEqual JumpFPNotEqual JumpFPOrdered JumpFPUnordered JumpGreater JumpGreaterOrEqual JumpLess JumpLessOrEqual JumpLong JumpLongNonZero JumpLongZero JumpNegative JumpNoCarry JumpNoOverflow JumpNonNegative JumpNonZero JumpOverflow JumpR JumpZero Label LastJump LastRTLCode LinkReg LoadEffectiveAddressMwrR LoadEffectiveAddressXowrR LogicalShiftLeftCqR LogicalShiftLeftRR LogicalShiftRightCqR LogicalShiftRightRR MoveAbR MoveAwR MoveC32R MoveC64R MoveCqR MoveCwR MoveM16rR MoveM32rR MoveM64rRd MoveMbrR MoveMwrR MoveRAw MoveRM16r MoveRM32r MoveRMbr MoveRMwr MoveRR MoveRX16rR MoveRX32rR MoveRXbrR MoveRXowr MoveRXwrR MoveRdM64r MoveRdRd MoveX16rRR MoveX32rRR MoveXbrRR MoveXowrR MoveXwrRR MulCqR MulCwR MulRR MulRdRd NegateR Nop OrCqR OrCwR OrRR PopR PrefetchAw PushCw PushR ReceiverResultReg RetN SPReg SendNumArgsReg SqrtRd SubCqR SubCwR SubRR SubRdRd TempReg XorCqR XorCwR XorRR'
  poolDictionaries: ''
  category: 'VMMaker-JIT'!
 
  !CogRTLOpcodes commentStamp: '<historical>' prior: 0!
  I am a pool for the Register-Transfer-Language to which Cog compiles.  I define unique integer values for all RTL opcodes and abstract registers.  See CogAbstractInstruction for instances of instructions with the opcodes that I define.!

Item was changed:
  ----- Method: CogRTLOpcodes class>>initialize (in category 'class initialization') -----
  initialize
  "Abstract opcodes are a compound of a one word operation specifier and zero or more operand type specifiers.
  e.g. MoveRR is the Move opcode with two register operand specifiers and defines a move register to
  register instruction from operand 0 to operand 1.  The word and register size is assumed to be either 32-bits on
  a 32-bit architecture or 64-bits on a 64-bit architecture.  
  The operand specifiers are
  R - general purpose register
  Rd - double-precision floating-point register
  Cq - a quick constant that can be encoded in the minimum space possible.
  Cw - a constant with word size where word is the default operand size for the Smalltalk VM, 32-bits
   for a 32-bit VM, 64-bits for a 64-bit VM.  The generated constant must occupy the default number
   of bits.  This allows e.g. a garbage collector to update the value without invalidating the code.
  C32 - a constant with 32 bit size.  The generated constant must occupy 32 bits.
  C64 - a constant with 64 bit size.  The generated constant must occupy 64 bits.
  Aw - memory word with an absolute address
  Ab - memory byte with an absolute address
  Mwr - memory word whose address is at a constant offset from an address in a register
  Mbr - memory byte whose address is at a constant offset from an address in a register
  M16r - memory 16-bit halfword whose address is at a constant offset from an address in a register
  M32r - memory 32-bit halfword whose address is at a constant offset from an address in a register
  M64r - memory 64-bit doubleword whose address is at a constant offset from an address in a register
  XbrR - memory word whose address is r * byte size away from an address in a register
  X16rR - memory word whose address is r * (2 bytes size) away from an address in a register
  XwrR - memory word whose address is r * word size away from an address in a register
  XowrR - memory word whose address is (r * word size) + o away from an address in a register (scaled indexed)
 
  An alternative would be to decouple opcodes from operands, e.g.
  Move := 1. Add := 2. Sub := 3...
  RegisterOperand := 1. ConstantQuickOperand := 2. ConstantWordOperand := 3...
  But not all combinations make sense and even fewer are used so we stick with the simple compound approach.
 
  The assumption is that comparison and arithmetic instructions set condition codes and that move instructions
  leave the condition codes unaffected.  In particular LoadEffectiveAddressMwrR does not set condition codes
  although it can be used to do arithmetic.
 
  Not all of the definitions in opcodeDefinitions below are implemented.  In particular we do not implement the
  XowrR scaled index addressing mode since it requires 4 operands.
 
  Note that there are no generic division instructions defined, but a processor may define some."
 
- "CogRTLOpcodes initialize.
- CogAbstractInstruction allSubclasses do: [:sc| sc initialize]"
-
  | opcodeNames refs |
  FPReg := -1.
  SPReg := -2.
  ReceiverResultReg := GPRegMax := -3.
  TempReg := -4.
  ClassReg := -5.
  SendNumArgsReg := -6.
  Arg0Reg := -7.
  Arg1Reg := GPRegMin := -8.
 
  DPFPReg0 := -9.
  DPFPReg1 := -10.
  DPFPReg2 := -11.
  DPFPReg3 := -12.
  DPFPReg4 := -13.
  DPFPReg5 := -14.
  DPFPReg6 := -15.
  DPFPReg7 := -16.
 
  LinkReg := -17.
 
  opcodeNames := #("Noops & Pseudo Ops"
  Label
  AlignmentNops
  FillBytesFrom "output operand 0's worth of bytes from the address in operand 1"
  Fill8 "output a byte's worth of bytes with operand 0"
  Fill16 "output two byte's worth of bytes with operand 0"
  Fill32 "output four byte's worth of bytes with operand 0"
  FillFromWord "output BytesPerWord's worth of bytes with operand 0 + operand 1"
  Nop
 
  "Control"
  Call
  RetN
  JumpR "Not a regular jump, i.e. not pc dependent."
 
  "N.B.  Jumps are contiguous.  Long jumps are contigiuous within them.  See FirstJump et al below"
  JumpLong
  JumpLongZero "a.k.a. JumpLongEqual"
  JumpLongNonZero "a.k.a. JumpLongNotEqual"
  Jump
  JumpZero "a.k.a. JumpEqual"
  JumpNonZero "a.k.a. JumpNotEqual"
  JumpNegative
  JumpNonNegative
  JumpOverflow
  JumpNoOverflow
  JumpCarry
  JumpNoCarry
  JumpLess "signed"
  JumpGreaterOrEqual
  JumpGreater
  JumpLessOrEqual
  JumpBelow "unsigned"
  JumpAboveOrEqual
  JumpAbove
  JumpBelowOrEqual
 
  JumpFPEqual
  JumpFPNotEqual
  JumpFPLess
  JumpFPLessOrEqual
  JumpFPGreater
  JumpFPGreaterOrEqual
  JumpFPOrdered
  JumpFPUnordered
 
  "Data Movement; destination is always last operand"
  MoveRR
  MoveAwR
  MoveRAw
  MoveAbR
  MoveMwrR MoveRMwr MoveXwrRR MoveRXwrR MoveXowrR MoveRXowr
  MoveM16rR MoveRM16r MoveX16rRR MoveRX16rR
  MoveM32rR MoveRM32r MoveX32rRR MoveRX32rR
  MoveMbrR MoveRMbr MoveXbrRR MoveRXbrR
  MoveCqR MoveCwR MoveC32R MoveC64R
  MoveRdRd MoveM64rRd MoveRdM64r
  PopR PushR PushCw
  PrefetchAw
 
  "Arithmetic; destination is always last operand except Cmp; CmpXR is SubRX with no update of result"
  LoadEffectiveAddressMwrR LoadEffectiveAddressXowrR "Variants of add/multiply"
  NegateR "2's complement negation"
  ArithmeticShiftRightCqR ArithmeticShiftRightRR
  LogicalShiftRightCqR LogicalShiftRightRR
  LogicalShiftLeftCqR LogicalShiftLeftRR
 
  CmpRR AddRR SubRR AndRR OrRR XorRR MulRR
+ CmpCqR AddCqR SubCqR AndCqR OrCqR XorCqR MulCqR
- CmpCqR AddCqR SubCqR AndCqR BICCqR "tpr - ARM only" OrCqR XorCqR MulCqR
  CmpCwR AddCwR SubCwR AndCwR OrCwR XorCwR MulCwR
 
  CmpRdRd AddRdRd SubRdRd MulRdRd DivRdRd SqrtRd
 
  "Conversion"
  ConvertRRd
 
  LastRTLCode).
 
  "Magic auto declaration. Add to the classPool any new variables and nuke any obsolete ones, and assign values"
  "Find the variables directly referenced by this method"
  refs := (thisContext method literals select: [:l| l isVariableBinding and: [classPool includesKey: l key]]) collect:
  [:ea| ea key].
  "Move to Undeclared any opcodes in classPool not in opcodes or this method."
  (classPool keys reject: [:k| (opcodeNames includes: k) or: [refs includes: k]]) do:
  [:k|
  Undeclared declare: k from: classPool].
  "Declare as class variables and number elements of opcodeArray above"
  opcodeNames withIndexDo:
  [:classVarName :value|
  self classPool
  declare: classVarName from: Undeclared;
  at: classVarName put: value].
 
  "For CogAbstractInstruction>>isJump etc..."
  FirstJump := JumpLong.
  LastJump := JumpFPUnordered.
+ FirstShortJump := Jump.
+
+ "And now initialize the backends; they add their own opcodes and hence these must be reinitialized."
+ (Smalltalk classNamed: #CogAbstractInstruction) ifNotNil:
+ [:cogAbstractInstruction| cogAbstractInstruction allSubclasses do: [:sc| sc initialize]]!
- FirstShortJump := Jump!

Item was removed:
- ----- Method: Cogit>>BICCq:R: (in category 'abstract instructions') -----
- BICCq: quickConstant R: reg
- <inline: true>
- <returnTypeC: #'AbstractInstruction *'>
- ^self gen: BICCqR operand: quickConstant operand: reg!

Item was changed:
  ----- Method: Cogit>>compileTrampolineFor:callJumpBar:numArgs:arg:arg:arg:arg:saveRegs:resultReg: (in category 'initialization') -----
  compileTrampolineFor: aRoutine callJumpBar: callJumpBar "<Boolean>" numArgs: numArgs arg: regOrConst0 arg: regOrConst1 arg: regOrConst2 arg: regOrConst3 saveRegs: saveRegs resultReg: resultRegOrNil
  "Generate a trampoline with up to four arguments.  Generate either a call or a jump to aRoutine
  as requested by callJumpBar.  If generating a call and resultRegOrNil is non-zero pass the C result
  back in resultRegOrNil.
  Hack: a negative value indicates an abstract register, a non-negative value indicates a constant."
  <var: #aRoutine type: #'void *'>
  <inline: false>
  "If on a RISC processor the return address needs to be pushed to the
  stack so that the interpreter sees the same stack layout as on CISC."
+ backEnd hasLinkRegister ifTrue:
+ [self PushR: LinkReg].
- "tpr evil test hack for stack imbalance problem"
- aRoutine ~=  #ceCheckForInterrupts ifTrue:[backEnd hasLinkRegister ifTrue:
- [self PushR: LinkReg]].
  self genSmalltalkToCStackSwitch.
  cStackAlignment > BytesPerWord ifTrue:
  [backEnd
  genAlignCStackSavingRegisters: saveRegs
  numArgs: numArgs
  wordAlignment: cStackAlignment / BytesPerWord].
  saveRegs ifTrue:
  [callJumpBar ifFalse:
  [self error: 'why save registers when you''re not going to return?'].
  backEnd genSaveRegisters].
  numArgs > 0 ifTrue:
  [numArgs > 1 ifTrue:
  [numArgs > 2 ifTrue:
  [numArgs > 3 ifTrue:
  [regOrConst3 < 0
  ifTrue: [backEnd genPassReg: regOrConst3 asArgument: 3]
  ifFalse: [backEnd genPassConst: regOrConst3 asArgument: 3]].
  regOrConst2 < 0
  ifTrue: [backEnd genPassReg: regOrConst2 asArgument: 2]
  ifFalse: [backEnd genPassConst: regOrConst2 asArgument: 2]].
  regOrConst1 < 0
  ifTrue: [backEnd genPassReg: regOrConst1 asArgument: 1]
  ifFalse: [backEnd genPassConst: regOrConst1 asArgument: 1]].
  regOrConst0 < 0
  ifTrue: [backEnd genPassReg: regOrConst0 asArgument: 0]
  ifFalse: [backEnd genPassConst: regOrConst0 asArgument: 0]].
  self gen: (callJumpBar ifTrue: [Call] ifFalse: [Jump])
  operand: (self cCode: [aRoutine asUnsignedInteger]
    inSmalltalk: [self simulatedTrampolineFor: aRoutine]).
  callJumpBar ifTrue:
  [resultRegOrNil ifNotNil:
  [backEnd genWriteCResultIntoReg: resultRegOrNil].
  saveRegs ifTrue:
  [numArgs > 0 ifTrue:
  [backEnd genRemoveNArgsFromStack: numArgs].
  resultRegOrNil
  ifNotNil: [backEnd genRestoreRegsExcept: resultRegOrNil]
  ifNil: [backEnd genRestoreRegs]].
  backEnd genLoadStackPointers.
  backEnd hasLinkRegister ifTrue:
  [self PopR: LinkReg].
  self RetN: 0]!

Item was changed:
  ----- Method: Cogit>>genNonLocalReturnTrampoline (in category 'initialization') -----
  genNonLocalReturnTrampoline
  opcodeIndex := 0.
+ "write the return address to the coInterpreter instructionPointerAddress;
+ CISCs will have pushed it on the stack, so pop it first; RISCs will have it in
+ their link register so just write it directly."
- "write the return address to the coInterpreter instructionPointerAddress; IA32 will have pushed it on the stack, so pop it first; ARM will have it in LR so just write it"
  backEnd hasLinkRegister
  ifTrue:
  [self MoveR: LinkReg Aw: coInterpreter instructionPointerAddress]
  ifFalse:
  [self PopR: TempReg. "instruction pointer"
  self MoveR: TempReg Aw: coInterpreter instructionPointerAddress].
  ^self genTrampolineFor: #ceNonLocalReturn:
  called: 'ceNonLocalReturnTrampoline'
  callJumpBar: true
  numArgs: 1
  arg: ReceiverResultReg
  arg: nil
  arg: nil
  arg: nil
  saveRegs: false
  resultReg: nil
  appendOpcodes: true!

Item was changed:
  ----- Method: SimpleStackBasedCogit>>genPrimReturnEnterCogCodeEnilopmart: (in category 'initialization') -----
  genPrimReturnEnterCogCodeEnilopmart: profiling
  "Generate the substitute return code for an external or FFI primitive call.
  On success simply return, extracting numArgs from newMethod.
  On primitive failure call ceActivateFailingPrimitiveMethod: newMethod."
  | jmpSample continuePostSample jmpFail |
  <var: #jmpSample type: #'AbstractInstruction *'>
  <var: #continuePostSample type: #'AbstractInstruction *'>
  <var: #jmpFail type: #'AbstractInstruction *'>
  opcodeIndex := 0.
 
  profiling ifTrue:
  ["Test nextProfileTick for being non-zero and call checkProfileTick: if so.
   N.B. nextProfileTick is 64-bits so 32-bit systems need to test both halves."
  BytesPerWord = 4
  ifTrue:
  [self MoveAw: coInterpreter nextProfileTickAddress R: TempReg.
  self MoveAw: coInterpreter nextProfileTickAddress + BytesPerWord R: ClassReg.
  self OrR: TempReg R: ClassReg]
  ifFalse:
  [self MoveAw: coInterpreter nextProfileTickAddress R: TempReg.
  self CmpCq: 0 R: TempReg].
  "If set, jump to record sample call."
  jmpSample := self JumpNonZero: 0.
  continuePostSample := self Label].
 
  "Test primitive failure"
  self MoveAw: coInterpreter primFailCodeAddress R: TempReg.
  self flag: 'ask concrete code gen if move sets condition codes?'.
  self CmpCq: 0 R: TempReg.
  jmpFail := self JumpNonZero: 0.
 
  "Switch back to the Smalltalk stack.  Stack better be in either of these two states:
  success: stackPointer -> result (was receiver)
+ arg1
+ ...
+ argN
+ return pc
+ failure: receiver
+ arg1
+ ...
- arg1
- ...
- argN
- return pc
- failure: receiver
- arg1
- ...
  stackPointer -> argN
+ return pc
- return pc
  We push the instructionPointer to reestablish the return pc in the success case,
  but leave it to ceActivateFailingPrimitiveMethod: to do so in the failure case."
 
+ backEnd hasLinkRegister
+ ifTrue:
+ [backEnd genLoadStackPointers. "Switch back to Smalltalk stack."
+ self MoveMw: 0 r: SPReg R: ReceiverResultReg. "Fetch result from stack"
+ self MoveAw: coInterpreter instructionPointerAddress R: LinkReg. "Get and restore ret pc"
+ self RetN: BytesPerWord] "Return, popping result from stack"
+ ifFalse:
+ [self MoveAw: coInterpreter instructionPointerAddress R: ClassReg. "Get return pc"
+ backEnd genLoadStackPointers. "Switch back to Smalltalk stack."
+ self MoveMw: 0 r: SPReg R: ReceiverResultReg. "Fetch result from stack"
+ self MoveR: ClassReg Mw: 0 r: SPReg. "Restore return pc"
+ self RetN: 0]. "Return, popping result from stack"
- self MoveAw: coInterpreter instructionPointerAddress R: ClassReg.
- backEnd genLoadStackPointers.
- self PushR: ClassReg. "Restore return pc"
- "Fetch result from stack"
- self MoveMw: BytesPerWord r: SPReg R: ReceiverResultReg.
- self flag: 'currently caller pushes result'.
- self RetN: BytesPerWord.
 
  "Primitive failed.  Invoke C code to build the frame and continue."
  jmpFail jmpTarget: (self MoveAw: coInterpreter newMethodAddress R: SendNumArgsReg).
  "Reload sp with CStackPointer; easier than popping args of checkProfileTick."
  self MoveAw: self cStackPointerAddress R: SPReg.
  cStackAlignment > BytesPerWord ifTrue:
  [backEnd
  genAlignCStackSavingRegisters: false
  numArgs: 1
  wordAlignment: cStackAlignment / BytesPerWord].
  backEnd genPassReg: SendNumArgsReg asArgument: 0.
  self CallRT: (self cCode: '(unsigned long)ceActivateFailingPrimitiveMethod'
  inSmalltalk: [self simulatedTrampolineFor: #ceActivateFailingPrimitiveMethod:]).
 
  profiling ifTrue:
  ["Call ceCheckProfileTick: to record sample and then continue.
   newMethod should be up-to-date."
  jmpSample jmpTarget: self Label.
  self CallRT: (self cCode: '(unsigned long)ceCheckProfileTick'
  inSmalltalk: [self simulatedTrampolineFor: #ceCheckProfileTick]).
  self Jump: continuePostSample]!