VM Maker: VMMaker.oscog-eem.2039.mcz

Eliot Miranda uploaded a new version of VMMaker to project VM Maker:
http://source.squeak.org/VMMaker/VMMaker.oscog-eem.2039.mcz

==================== Summary ====================

Name: VMMaker.oscog-eem.2039
Author: eem
Time: 14 December 2016, 3:19:42.124426 pm
UUID: 78dde0e7-e826-47ad-b2ef-4a8aff88b963
Ancestors: VMMaker.oscog-nice.2038

ThreadedFFIPlugin (the FFI plugin):
Fix callout failures for callouts that invoke callbacks.  The VM checks that interpreter primitive calls (including external calls) have correctly-balanced stacks on return.  The ThreadedFFIPlugin (and indeed the old FFIPlugin) would use argumentCount to compute the stack depth on return, but a callback can alter argumentCount to arbitrary values, and callback return can't easily determine the value of argumentCount to restore (especially in a threaded context).  So the plugin must remember the argumentCount before it calls out and use this cached value when restoring the stack.  To this end, remove all uses of methodReturnValue: in the support code for the callouts and use pop:thenPush: in ffiCall:ArgArrayOrNil:NumArgs: to update the stack correctly.
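
In outline (a sketch only, with a hypothetical callout helper; the real logic is spread across the callout support routines invoked from ffiCall:ArgArrayOrNil:NumArgs:):

	ffiCall: externalFunction ArgArrayOrNil: argArrayOrNil NumArgs: nArgs
		"Sketch: cache the argument count before calling out; a callback run during
		 the callout may set argumentCount to an arbitrary value."
		| savedNumArgs result |
		savedNumArgs := nArgs.
		result := self doCallout: externalFunction.	"hypothetical helper for the actual call"
		"Cut back receiver plus the cached number of arguments and push the result,
		 rather than relying on argumentCount via methodReturnValue:."
		interpreterProxy pop: savedNumArgs + 1 thenPush: result.
		^0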

Streamline the callouts in the plugin to eliminate an extra test on the return type being float.  Neaten the code for (dis)owning the VM.

StackInterpreter (helping resolve the callback issue above):
Refactor handleStackOverflow into checkForStackOverflow and handleStackOverflow for better inlining and readability.
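
The shape of the split, as it appears in the diff below: the cheap test is inlined at every use, and only the slow path remains a real function (handleStackOverflow is now marked <inline: #never>).

	checkForStackOverflow
		<inline: true>
		stackPointer < stackPage realStackLimit ifTrue:
			[self handleStackOverflow]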

Fix a bug in checkStackPointerIndexForFrame: for the current frame that caused many false assert-fails during callbacks.

Nuke the recently-added callback debugging machinery now that the problem is fixed.

Slang:
Don't output methods marked <inline: #always>, which are used to allow limited use of literal blocks in Slang; such methods must never be generated as C functions.
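
For illustration, a hypothetical method of this kind (selector and body invented; any Slang method that takes a literal block and is marked <inline: #always> is treated the same way, and with this change CCodeGenerator>>shouldGenerateMethod: answers false for it):

	sum: count of: aBlock
		"Hypothetical: Slang can only cope with the literal block argument because
		 every send of this method is inlined, so no C function is ever emitted for it."
		<inline: #always>
		| s |
		s := 0.
		1 to: count do: [:i | s := s + (aBlock value: i)].
		^s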

SistaRegisterAllocatingCogit:
Eliminate ssFlushTo:'s in several methods.

SistaCogit:
Use genCallMustBeBooleanFor: in the counter-trip-only jump generator.
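
Judging from the expanded form it replaces (see the removed lines in genCounterTripOnlyJumpIf:to: below), genCallMustBeBooleanFor: presumably amounts to something like this sketch (its actual definition is not part of this diff):

	genCallMustBeBooleanFor: boolean
		"Call the trampoline that pushes the boolean the branch expected."
		^self CallRT: (boolean = objectMemory falseObject
				ifTrue: [ceSendMustBeBooleanAddFalseTrampoline]
				ifFalse: [ceSendMustBeBooleanAddTrueTrampoline])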

Cogit:
Remove a couple of unused variable type declarations and variables.

=============== Diff against VMMaker.oscog-nice.2038 ===============

Item was changed:
  ----- Method: CCodeGenerator>>shouldGenerateMethod: (in category 'utilities') -----
  shouldGenerateMethod: aTMethod
  ^(self isBuiltinSelector: aTMethod selector)
  ifTrue: [requiredSelectors includes: aTMethod selector]
+ ifFalse: [aTMethod inline ~~ #always]!
- ifFalse: [true]!

Item was added:
+ ----- Method: CogAbstractInstruction>>usesTempRegForAbsoluteLoads (in category 'testing') -----
+ usesTempRegForAbsoluteLoads
+ "Answer if TempReg is used in absolute memory loads (as it is on x64).  By default answer false, allowing subclasses to override."
+ <inline: true>
+ ^false!

Item was changed:
  ----- Method: CogObjectRepresentationForSpur>>mixed:branchIf:instanceOfBehaviors:target: (in category 'sista support') -----
  mixed: numNonImmediates branchIf: reg instanceOfBehaviors: arrayObj target: targetFixUp
  | jmpImmediate numCases classObj index jmp |
- <var: #label type: #'AbstractInstruction *'>
  <var: #jmpImmediate type: #'AbstractInstruction *'>
- <var: #jmp type: #'AbstractInstruction *'>
  <var: #targetFixUp type: #'AbstractInstruction *'>
+ <var: #jmp type: #'AbstractInstruction *'>
  numCases := objectMemory numSlotsOf: arrayObj.
  jmpImmediate := self genJumpImmediate: reg.
 
  "Rcvr is non immediate"
  self genGetClassIndexOfNonImm: reg into: TempReg.
  index := 0.
  0 to: numCases - 1 do:
  [:i|
  classObj := objectMemory fetchPointer: i ofObject: arrayObj.
  (objectMemory isImmediateClass: classObj) ifFalse: [
  self genCmpClassIndex: (objectMemory classTagForClass: classObj) R: TempReg.
  cogit JumpZero: targetFixUp.
  index := index + 1 ] ].
  jmp := cogit Jump: 0.
 
  "Rcvr is immediate"
  jmpImmediate jmpTarget: cogit Label.
  numCases - numNonImmediates "num Immediates allowed"
  caseOf:
  {[ 1 ] -> [ "1 immediate needs to jump. Find it and jump."
  0 to: numCases - 1 do:
  [ :j |
  classObj := objectMemory fetchPointer: j ofObject: arrayObj.
  (objectMemory isImmediateClass: classObj) ifTrue: [
  self branchIf: reg hasImmediateTag: (objectMemory classTagForClass: classObj) target: targetFixUp ] ] ] .
  [ 2 ] -> [ "2 immediates needs to jump"
  self branch2CasesIf: reg instanceOfBehaviors: arrayObj target: targetFixUp ] .
  [ 3 ] -> [ "all 3 needs to jump" self Jump: targetFixUp ] }.
 
  jmp jmpTarget: self Label.
 
  ^ 0
  !

Item was changed:
  ----- Method: CogObjectRepresentationForSpur>>noneImmediateBranchIf:instanceOfBehaviors:target: (in category 'sista support') -----
  noneImmediateBranchIf: reg instanceOfBehaviors: arrayObj target: targetFixUp
  "All classes in arrayObj are not immediate"
  | classObj jmp |
  <var: #targetFixUp type: #'AbstractInstruction *'>
- <var: #label type: #'AbstractInstruction *'>
  <var: #jmp type: #'AbstractInstruction *'>
  jmp := self genJumpImmediate: reg.
  self genGetClassIndexOfNonImm: reg into: TempReg.
  0 to: (objectMemory numSlotsOf: arrayObj) - 1 do:
  [:i|
  classObj := objectMemory fetchPointer: i ofObject: arrayObj.
  self genCmpClassIndex: (objectMemory classTagForClass: classObj) R: TempReg.
  cogit JumpZero: targetFixUp ].
  jmp jmpTarget: self Label.
  ^0!

Item was added:
+ ----- Method: CogX64Compiler>>usesTempRegForAbsoluteLoads (in category 'testing') -----
+ usesTempRegForAbsoluteLoads
+ "Answer if TempReg is used in absolute memory loads (as it is on x64).  By default answer false, allowing subclasses to override."
+ <inline: true>
+ ^true!

Item was changed:
  ----- Method: RegisterAllocatingCogit>>genSpecialSelectorComparison (in category 'bytecode generators') -----
  genSpecialSelectorComparison
  | nextPC postBranchPC targetBytecodePC primDescriptor branchDescriptor
+  rcvrIsInt argIsInt argInt jumpNotSmallInts inlineCAB index rcvrReg argReg |
-  rcvrIsInt argIsInt argInt jumpNotSmallInts inlineCAB index rcvrReg argReg regMask |
  <var: #primDescriptor type: #'BytecodeDescriptor *'>
  <var: #branchDescriptor type: #'BytecodeDescriptor *'>
  <var: #jumpNotSmallInts type: #'AbstractInstruction *'>
  primDescriptor := self generatorAt: byte0.
  argIsInt := self ssTop type = SSConstant
  and: [objectMemory isIntegerObject: (argInt := self ssTop constant)].
  rcvrIsInt := (self ssValue: 1) type = SSConstant
  and: [objectMemory isIntegerObject: (self ssValue: 1) constant].
 
  (argIsInt and: [rcvrIsInt]) ifTrue:
  [^ self genStaticallyResolvedSpecialSelectorComparison].
 
  self extractMaybeBranchDescriptorInto: [ :descr :next :postBranch :target |
  branchDescriptor := descr. nextPC := next. postBranchPC := postBranch. targetBytecodePC := target ].
 
  "Only interested in inlining if followed by a conditional branch."
  inlineCAB := branchDescriptor isBranchTrue or: [branchDescriptor isBranchFalse].
  "Further, only interested in inlining = and ~= if there's a SmallInteger constant involved.
  The relational operators successfully statically predict SmallIntegers; the equality operators do not."
  (inlineCAB and: [primDescriptor opcode = JumpZero or: [primDescriptor opcode = JumpNonZero]]) ifTrue:
  [inlineCAB := argIsInt or: [rcvrIsInt]].
  inlineCAB ifFalse:
  [^self genSpecialSelectorSend].
 
  "In-line the comparison and the jump, but if the types are not SmallInteger then we will need
  to do a send and fall through to the following conditional branch.  Since we're allocating values
  in registers we would like to keep those registers live on the inlined path and reload registers
  along the non-inlined send path.  The merge logic at the branch destinations handles this."
  argIsInt
  ifTrue:
  [rcvrReg := self allocateRegForStackEntryAt: 1.
  (self ssValue: 1) popToReg: rcvrReg.
+ self MoveR: rcvrReg R: TempReg]
- self MoveR: rcvrReg R: TempReg.
- regMask := self registerMaskFor: rcvrReg]
  ifFalse:
  [self allocateRegForStackTopTwoEntriesInto: [:rTop :rNext| argReg := rTop. rcvrReg := rNext].
  rcvrReg = Arg0Reg ifTrue:
  [rcvrReg := argReg. argReg := Arg0Reg].
  self ssTop popToReg: argReg.
  (self ssValue: 1) popToReg: rcvrReg.
+ self MoveR: argReg R: TempReg].
- self MoveR: argReg R: TempReg.
- regMask := self registerMaskFor: rcvrReg and: argReg].
  self ssPop: 2.
  jumpNotSmallInts := (argIsInt or: [rcvrIsInt])
  ifTrue: [objectRepresentation genJumpNotSmallIntegerInScratchReg: TempReg]
  ifFalse: [objectRepresentation genJumpNotSmallIntegersIn: rcvrReg andScratch: TempReg scratch: ClassReg].
  argIsInt
  ifTrue: [self CmpCq: argInt R: rcvrReg]
  ifFalse: [self CmpR: argReg R: rcvrReg].
  "Cmp is weird/backwards so invert the comparison.  Further since there is a following conditional
  jump bytecode define non-merge fixups and leave the cond bytecode to set the mergeness."
  self genConditionalBranch: (branchDescriptor isBranchTrue
  ifTrue: [primDescriptor opcode]
  ifFalse: [self inverseBranchFor: primDescriptor opcode])
  operand: (self ensureFixupAt: targetBytecodePC - initialPC) asUnsignedInteger.
  self Jump: (self ensureFixupAt: postBranchPC - initialPC).
  jumpNotSmallInts jmpTarget: self Label.
  self ssFlushTo: simStackPtr.
  self deny: rcvrReg = Arg0Reg.
  argIsInt
  ifTrue: [self MoveCq: argInt R: Arg0Reg]
  ifFalse: [argReg ~= Arg0Reg ifTrue: [self MoveR: argReg R: Arg0Reg]].
  rcvrReg ~= ReceiverResultReg ifTrue: [self MoveR: rcvrReg R: ReceiverResultReg].
  index := byte0 - self firstSpecialSelectorBytecodeOffset.
+ ^self genMarshalledSend: index negated - 1 numArgs: 1 sendTable: ordinarySendTrampolines!
- self genMarshalledSend: index negated - 1 numArgs: 1 sendTable: ordinarySendTrampolines.
- ^0!

Item was changed:
  ----- Method: SistaCogit>>genCounterTripOnlyJumpIf:to: (in category 'bytecode generator support') -----
  genCounterTripOnlyJumpIf: boolean to: targetBytecodePC
  "Specific version if the branch is only reached while falling through if the counter trips after an inlined #== branch. We do not regenerate the counter logic in this case to avoid 24 bytes instructions."
 
  <var: #ok type: #'AbstractInstruction *'>
  <var: #mustBeBooleanTrampoline type: #'AbstractInstruction *'>
 
  | ok mustBeBooleanTrampoline |
 
  extA := 0.
 
  self ssFlushTo: simStackPtr - 1.
 
  self ssTop popToReg: TempReg.
 
  self ssPop: 1.
 
  counterIndex := counterIndex + 1. "counters are increased / decreased in the inlined branch"
 
  "We need SendNumArgsReg because of the mustBeBooleanTrampoline"
  self ssAllocateRequiredReg: SendNumArgsReg.
  self MoveCq: 1 R: SendNumArgsReg.
 
  "The first time this is reached, it calls necessarily the counter trip for the trampoline because SendNumArgsReg is non zero"
+ mustBeBooleanTrampoline := self genCallMustBeBooleanFor: boolean.
- mustBeBooleanTrampoline := self CallRT: (boolean == objectMemory falseObject
- ifTrue: [ceSendMustBeBooleanAddFalseTrampoline]
- ifFalse: [ceSendMustBeBooleanAddTrueTrampoline]).
-
  self annotateBytecode: self Label.
 
  "Cunning trick by LPD.  If true and false are contiguous subtract the smaller.
  Correct result is either 0 or the distance between them.  If result is not 0 or
  their distance send mustBeBoolean."
  self assert: (objectMemory objectAfter: objectMemory falseObject) = objectMemory trueObject.
  self genSubConstant: boolean R: TempReg.
  self JumpZero: (self ensureFixupAt: targetBytecodePC - initialPC).
 
+ self CmpCq: (boolean = objectMemory falseObject
- self CmpCq: (boolean == objectMemory falseObject
  ifTrue: [objectMemory trueObject - objectMemory falseObject]
  ifFalse: [objectMemory falseObject - objectMemory trueObject])
  R: TempReg.
 
  ok := self JumpZero: 0.
  self MoveCq: 0 R: SendNumArgsReg. "if counterReg is 0 this is a mustBeBoolean, not a counter trip."
 
  self Jump: mustBeBooleanTrampoline.
 
  ok jmpTarget: self Label.
  ^0!

Item was changed:
  ----- Method: SistaCogit>>genJumpIf:to: (in category 'bytecode generator support') -----
  genJumpIf: boolean to: targetBytecodePC
  "The heart of performance counting in Sista.  Conditional branches are 6 times less
  frequent than sends and can provide basic block frequencies (send counters can't).
  Each conditional has a 32-bit counter split into an upper 16 bits counting executions
  and a lower half counting untaken executions of the branch.  Executing the branch
  decrements the upper half, tripping if the count goes negative.  Not taking the branch
  decrements the lower half.  N.B. We *do not* eliminate dead branches (true ifTrue:/true ifFalse:)
  so that scanning for send and branch data is simplified and that branch data is correct."
  <inline: false>
  | ok counterAddress countTripped retry nextPC nextDescriptor desc |
  <var: #ok type: #'AbstractInstruction *'>
  <var: #desc type: #'CogSimStackEntry *'>
  <var: #retry type: #'AbstractInstruction *'>
  <var: #countTripped type: #'AbstractInstruction *'>
  <var: #nextDescriptor type: #'BytecodeDescriptor *'>
 
  "In optimized code we don't generate counters to improve performance"
  (coInterpreter isOptimizedMethod: methodObj) ifTrue: [ ^ super genJumpIf: boolean to: targetBytecodePC ].
 
  "If the branch is reached only for the counter trip trampoline
  (typically, var1 == var2 ifTrue: falls through to the branch only for the trampoline)
  we generate a specific path to drastically reduce the number of machine instructions"
  branchReachedOnlyForCounterTrip ifTrue:
  [ branchReachedOnlyForCounterTrip := false.
  ^ self genCounterTripOnlyJumpIf: boolean to: targetBytecodePC ].
 
  "We detect and: / or:, if found, we don't generate the counters to avoid pathological counter slow down"
+ boolean = objectMemory falseObject ifTrue:
- boolean == objectMemory falseObject ifTrue:
  [ nextPC := bytecodePC + (self generatorAt: byte0) numBytes.
   nextDescriptor := self generatorAt: (objectMemory fetchByte: nextPC ofObject: methodObj) + bytecodeSetOffset.
   nextDescriptor generator ==  #genPushConstantTrueBytecode ifTrue: [ ^ super genJumpIf: boolean to: targetBytecodePC ].
   nextDescriptor := self generatorAt: (objectMemory fetchByte: targetBytecodePC ofObject: methodObj) + bytecodeSetOffset.
+  nextDescriptor generator == #genPushConstantFalseBytecode ifTrue: [ ^ super genJumpIf: boolean to: targetBytecodePC ].  ].
-  nextDescriptor generator ==  #genPushConstantFalseBytecode ifTrue: [ ^ super genJumpIf: boolean to: targetBytecodePC ].  ].
 
  extA := 0. "We ignore the noMustBeBoolean flag. It should not be present in methods with counters, and if it is we don't care."
 
  "We don't generate counters on branches on true/false, the basicblock usage can be inferred"
  desc := self ssTop.
  (desc type == SSConstant
  and: [desc constant = objectMemory trueObject or: [desc constant = objectMemory falseObject]]) ifTrue:
  [ ^ super genJumpIf: boolean to: targetBytecodePC ].
 
  self ssFlushTo: simStackPtr - 1.
  desc popToReg: TempReg.
  self ssPop: 1.
 
  "We need SendNumArgsReg because of the mustBeBooleanTrampoline"
  self ssAllocateRequiredReg: SendNumArgsReg.
 
  retry := self Label.
  self
  genExecutionCountLogicInto: [ :cAddress :countTripBranch |
  counterAddress := cAddress.
  countTripped := countTripBranch ]
  counterReg: SendNumArgsReg.
  counterIndex := counterIndex + 1.
 
  "Cunning trick by LPD.  If true and false are contiguous subtract the smaller.
  Correct result is either 0 or the distance between them.  If result is not 0 or
  their distance send mustBeBoolean."
  self assert: (objectMemory objectAfter: objectMemory falseObject) = objectMemory trueObject.
  self genSubConstant: boolean R: TempReg.
  self JumpZero: (self ensureFixupAt: targetBytecodePC - initialPC).
 
  self genFallsThroughCountLogicCounterReg: SendNumArgsReg counterAddress: counterAddress.
 
  self CmpCq: (boolean = objectMemory falseObject
  ifTrue: [objectMemory trueObject - objectMemory falseObject]
  ifFalse: [objectMemory falseObject - objectMemory trueObject])
  R: TempReg.
  ok := self JumpZero: 0.
  self MoveCq: 0 R: SendNumArgsReg. "if counterReg is 0 this is a mustBeBoolean, not a counter trip."
 
  countTripped jmpTarget: (self genCallMustBeBooleanFor: boolean).
 
  "If we're in an image which hasn't got the Sista code loaded then the ceCounterTripped:
  trampoline will return directly to machine code, returning the boolean.  So the code should
  jump back to the retry point. The trampoline makes sure that TempReg has been reloaded."
+
+ "Clément: For some reason if I write self annotateBytecode: (self Jump: retry) the annotation is not at the correct place."
+ "Eliot: Annotations apply the the address following an instruction, and the annotation must be for the return address
+ of the call (since this is the address the run-time sees), so it must be on a label before the jump, not after the jump."
+ self annotateBytecode: self Label.
- self annotateBytecode: self Label. "For some reason if I write self annotateBytecode: (self Jump: retry) the annotation is not at the correct place."
  self Jump: retry.
 
  ok jmpTarget: self Label.
  ^0!

Item was removed:
- ----- Method: SistaCogitClone>>genCounterTripOnlyJumpIf:to: (in category 'bytecode generator support') -----
- genCounterTripOnlyJumpIf: boolean to: targetBytecodePC
- "Specific version if the branch is only reached while falling through if the counter trips after an inlined #== branch. We do not regenerate the counter logic in this case to avoid 24 bytes instructions."
-
- <var: #ok type: #'AbstractInstruction *'>
- <var: #mustBeBooleanTrampoline type: #'AbstractInstruction *'>
-
- | ok mustBeBooleanTrampoline |
-
- extA := 0.
-
- self ssFlushTo: simStackPtr - 1.
-
- self ssTop popToReg: TempReg.
-
- self ssPop: 1.
-
- counterIndex := counterIndex + 1. "counters are increased / decreased in the inlined branch"
-
- "We need SendNumArgsReg because of the mustBeBooleanTrampoline"
- self ssAllocateRequiredReg: SendNumArgsReg.
- self MoveCq: 1 R: SendNumArgsReg.
-
- "The first time this is reached, it calls necessarily the counter trip for the trampoline because SendNumArgsReg is non zero"
- mustBeBooleanTrampoline := self CallRT: (boolean == objectMemory falseObject
- ifTrue: [ceSendMustBeBooleanAddFalseTrampoline]
- ifFalse: [ceSendMustBeBooleanAddTrueTrampoline]).
-
- self annotateBytecode: self Label.
-
- "Cunning trick by LPD.  If true and false are contiguous subtract the smaller.
- Correct result is either 0 or the distance between them.  If result is not 0 or
- their distance send mustBeBoolean."
- self assert: (objectMemory objectAfter: objectMemory falseObject) = objectMemory trueObject.
- self genSubConstant: boolean R: TempReg.
- self JumpZero: (self ensureFixupAt: targetBytecodePC - initialPC).
-
- self CmpCq: (boolean == objectMemory falseObject
- ifTrue: [objectMemory trueObject - objectMemory falseObject]
- ifFalse: [objectMemory falseObject - objectMemory trueObject])
- R: TempReg.
-
- ok := self JumpZero: 0.
- self MoveCq: 0 R: SendNumArgsReg. "if counterReg is 0 this is a mustBeBoolean, not a counter trip."
-
- self Jump: mustBeBooleanTrampoline.
-
- ok jmpTarget: self Label.
- ^0!

Item was removed:
- ----- Method: SistaCogitClone>>genJumpIf:to: (in category 'bytecode generator support') -----
- genJumpIf: boolean to: targetBytecodePC
- "The heart of performance counting in Sista.  Conditional branches are 6 times less
- frequent than sends and can provide basic block frequencies (send counters can't).
- Each conditional has a 32-bit counter split into an upper 16 bits counting executions
- and a lower half counting untaken executions of the branch.  Executing the branch
- decrements the upper half, tripping if the count goes negative.  Not taking the branch
- decrements the lower half.  N.B. We *do not* eliminate dead branches (true ifTrue:/true ifFalse:)
- so that scanning for send and branch data is simplified and that branch data is correct."
- <inline: false>
- | ok counterAddress countTripped retry nextPC nextDescriptor desc |
- <var: #ok type: #'AbstractInstruction *'>
- <var: #desc type: #'CogSimStackEntry *'>
- <var: #retry type: #'AbstractInstruction *'>
- <var: #countTripped type: #'AbstractInstruction *'>
- <var: #nextDescriptor type: #'BytecodeDescriptor *'>
-
- "In optimized code we don't generate counters to improve performance"
- (coInterpreter isOptimizedMethod: methodObj) ifTrue: [ ^ super genJumpIf: boolean to: targetBytecodePC ].
-
- "If the branch is reached only for the counter trip trampoline
- (typically, var1 == var2 ifTrue: falls through to the branch only for the trampoline)
- we generate a specific path to drastically reduce the number of machine instructions"
- branchReachedOnlyForCounterTrip ifTrue:
- [ branchReachedOnlyForCounterTrip := false.
- ^ self genCounterTripOnlyJumpIf: boolean to: targetBytecodePC ].
-
- "We detect and: / or:, if found, we don't generate the counters to avoid pathological counter slow down"
- boolean == objectMemory falseObject ifTrue:
- [ nextPC := bytecodePC + (self generatorAt: byte0) numBytes.
-  nextDescriptor := self generatorAt: (objectMemory fetchByte: nextPC ofObject: methodObj) + bytecodeSetOffset.
-  nextDescriptor generator ==  #genPushConstantTrueBytecode ifTrue: [ ^ super genJumpIf: boolean to: targetBytecodePC ].
-  nextDescriptor := self generatorAt: (objectMemory fetchByte: targetBytecodePC ofObject: methodObj) + bytecodeSetOffset.
-  nextDescriptor generator ==  #genPushConstantFalseBytecode ifTrue: [ ^ super genJumpIf: boolean to: targetBytecodePC ].  ].
-
- extA := 0. "We ignore the noMustBeBoolean flag. It should not be present in methods with counters, and if it is we don't care."
-
- "We don't generate counters on branches on true/false, the basicblock usage can be inferred"
- desc := self ssTop.
- (desc type == SSConstant
- and: [desc constant = objectMemory trueObject or: [desc constant = objectMemory falseObject]]) ifTrue:
- [ ^ super genJumpIf: boolean to: targetBytecodePC ].
-
- self ssFlushTo: simStackPtr - 1.
- desc popToReg: TempReg.
- self ssPop: 1.
-
- "We need SendNumArgsReg because of the mustBeBooleanTrampoline"
- self ssAllocateRequiredReg: SendNumArgsReg.
-
- retry := self Label.
- self
- genExecutionCountLogicInto: [ :cAddress :countTripBranch |
- counterAddress := cAddress.
- countTripped := countTripBranch ]
- counterReg: SendNumArgsReg.
- counterIndex := counterIndex + 1.
-
- "Cunning trick by LPD.  If true and false are contiguous subtract the smaller.
- Correct result is either 0 or the distance between them.  If result is not 0 or
- their distance send mustBeBoolean."
- self assert: (objectMemory objectAfter: objectMemory falseObject) = objectMemory trueObject.
- self genSubConstant: boolean R: TempReg.
- self JumpZero: (self ensureFixupAt: targetBytecodePC - initialPC).
-
- self genFallsThroughCountLogicCounterReg: SendNumArgsReg counterAddress: counterAddress.
-
- self CmpCq: (boolean = objectMemory falseObject
- ifTrue: [objectMemory trueObject - objectMemory falseObject]
- ifFalse: [objectMemory falseObject - objectMemory trueObject])
- R: TempReg.
- ok := self JumpZero: 0.
- self MoveCq: 0 R: SendNumArgsReg. "if counterReg is 0 this is a mustBeBoolean, not a counter trip."
-
- countTripped jmpTarget: (self genCallMustBeBooleanFor: boolean).
-
- "If we're in an image which hasn't got the Sista code loaded then the ceCounterTripped:
- trampoline will return directly to machine code, returning the boolean.  So the code should
- jump back to the retry point. The trampoline makes sure that TempReg has been reloaded."
- self annotateBytecode: self Label. "For some reason if I write self annotateBytecode: (self Jump: retry) the annotation is not at the correct place."
- self Jump: retry.
-
- ok jmpTarget: self Label.
- ^0!

Item was removed:
- ----- Method: SistaCogitClone>>genSpecialSelectorComparison (in category 'bytecode generators') -----
- genSpecialSelectorComparison
- "Override to count inlined branches if followed by a conditional branch.
- We borrow the following conditional branch's counter and when about to
- inline the comparison we decrement the counter (without writing it back)
- and if it trips simply abort the inlining, falling back to the normal send which
- will then continue to the conditional branch which will trip and enter the abort."
- | nextPC postBranchPC targetBytecodePC primDescriptor branchDescriptor
-  rcvrIsInt argIsInt argInt jumpNotSmallInts inlineCAB
-  counterAddress countTripped counterReg index |
- <var: #countTripped type: #'AbstractInstruction *'>
- <var: #primDescriptor type: #'BytecodeDescriptor *'>
- <var: #jumpNotSmallInts type: #'AbstractInstruction *'>
- <var: #branchDescriptor type: #'BytecodeDescriptor *'>
-
- (coInterpreter isOptimizedMethod: methodObj) ifTrue: [ ^ self genSpecialSelectorComparisonWithoutCounters ].
-
- self ssFlushTo: simStackPtr - 2.
- primDescriptor := self generatorAt: byte0.
- argIsInt := self ssTop type = SSConstant
- and: [objectMemory isIntegerObject: (argInt := self ssTop constant)].
- rcvrIsInt := (self ssValue: 1) type = SSConstant
- and: [objectMemory isIntegerObject: (self ssValue: 1) constant].
-
- "short-cut the jump if operands are SmallInteger constants."
- (argIsInt and: [rcvrIsInt]) ifTrue:
- [^ self genStaticallyResolvedSpecialSelectorComparison].
-
- self extractMaybeBranchDescriptorInto: [ :descr :next :postBranch :target |
- branchDescriptor := descr. nextPC := next. postBranchPC := postBranch. targetBytecodePC := target ].
-
- "Only interested in inlining if followed by a conditional branch."
- inlineCAB := branchDescriptor isBranchTrue or: [branchDescriptor isBranchFalse].
- "Further, only interested in inlining = and ~= if there's a SmallInteger constant involved.
- The relational operators successfully statically predict SmallIntegers; the equality operators do not."
- (inlineCAB and: [primDescriptor opcode = JumpZero or: [primDescriptor opcode = JumpNonZero]]) ifTrue:
- [inlineCAB := argIsInt or: [rcvrIsInt]].
- inlineCAB ifFalse:
- [^self genSpecialSelectorSend].
-
- argIsInt
- ifTrue:
- [(self ssValue: 1) popToReg: ReceiverResultReg.
- self ssPop: 2.
- self MoveR: ReceiverResultReg R: TempReg]
- ifFalse:
- [self marshallSendArguments: 1.
- self MoveR: Arg0Reg R: TempReg].
- jumpNotSmallInts := (argIsInt or: [rcvrIsInt])
- ifTrue: [objectRepresentation genJumpNotSmallIntegerInScratchReg: TempReg]
- ifFalse: [objectRepresentation genJumpNotSmallIntegersIn: ReceiverResultReg andScratch: TempReg scratch: ClassReg].
-
- counterReg := self allocateRegNotConflictingWith: (self registerMaskFor: ReceiverResultReg and: Arg0Reg).
- self
- genExecutionCountLogicInto: [ :cAddress :countTripBranch |
- counterAddress := cAddress.
- countTripped := countTripBranch ]
- counterReg: counterReg.
-
- argIsInt
- ifTrue: [self CmpCq: argInt R: ReceiverResultReg]
- ifFalse: [self CmpR: Arg0Reg R: ReceiverResultReg].
- "Cmp is weird/backwards so invert the comparison.  Further since there is a following conditional
- jump bytecode define non-merge fixups and leave the cond bytecode to set the mergeness."
- self genConditionalBranch: (branchDescriptor isBranchTrue
- ifTrue: [primDescriptor opcode]
- ifFalse: [self inverseBranchFor: primDescriptor opcode])
- operand: (self ensureNonMergeFixupAt: targetBytecodePC - initialPC) asUnsignedInteger.
-
- self genFallsThroughCountLogicCounterReg: counterReg counterAddress: counterAddress.
-
- self Jump: (self ensureNonMergeFixupAt: postBranchPC - initialPC).
- countTripped jmpTarget: (jumpNotSmallInts jmpTarget: self Label).
-
- argIsInt ifTrue:
- [self MoveCq: argInt R: Arg0Reg].
- index := byte0 - self firstSpecialSelectorBytecodeOffset.
- ^self genMarshalledSend: index negated - 1 numArgs: 1 sendTable: ordinarySendTrampolines!

Item was added:
+ ----- Method: SistaRegisterAllocatingCogit class>>implementation:isNewerThan: (in category 'clone maintennance') -----
+ implementation: aMethodReference isNewerThan: bMethodReference
+ ^(self dateAndTimeFrom: aMethodReference timeStamp) > (self dateAndTimeFrom: bMethodReference timeStamp)!

Item was added:
+ ----- Method: SistaRegisterAllocatingCogit>>genCounterTripOnlyJumpIf:to: (in category 'bytecode generator support') -----
+ genCounterTripOnlyJumpIf: boolean to: targetBytecodePC
+ "Specific version if the branch is only reached while falling through if the counter trips after an inlined #== branch. We do not regenerate the counter logic in this case to avoid 24 bytes instructions."
+
+ <var: #ok type: #'AbstractInstruction *'>
+ <var: #mustBeBooleanTrampoline type: #'AbstractInstruction *'>
+
+ | ok mustBeBooleanTrampoline |
+
+ extA := 0.
+
+ self ssTop popToReg: TempReg.
+ self ssPop: 1.
+
+ counterIndex := counterIndex + 1. "counters are increased / decreased in the inlined branch"
+
+ "We need SendNumArgsReg because of the mustBeBooleanTrampoline"
+ self ssAllocateRequiredReg: SendNumArgsReg.
+ self MoveCq: 1 R: SendNumArgsReg.
+
+ "The first time this is reached, it calls necessarily the counter trip for the trampoline because SendNumArgsReg is non zero"
+ mustBeBooleanTrampoline := self genCallMustBeBooleanFor: boolean.
+
+ self annotateBytecode: self Label.
+
+ "Cunning trick by LPD.  If true and false are contiguous subtract the smaller.
+ Correct result is either 0 or the distance between them.  If result is not 0 or
+ their distance send mustBeBoolean."
+ self assert: (objectMemory objectAfter: objectMemory falseObject) = objectMemory trueObject.
+ self genSubConstant: boolean R: TempReg.
+ self JumpZero: (self ensureFixupAt: targetBytecodePC - initialPC).
+
+ self CmpCq: (boolean = objectMemory falseObject
+ ifTrue: [objectMemory trueObject - objectMemory falseObject]
+ ifFalse: [objectMemory falseObject - objectMemory trueObject])
+ R: TempReg.
+
+ ok := self JumpZero: 0.
+ self MoveCq: 0 R: SendNumArgsReg. "if counterReg is 0 this is a mustBeBoolean, not a counter trip."
+
+ self Jump: mustBeBooleanTrampoline.
+
+ ok jmpTarget: self Label.
+ ^0!

Item was added:
+ ----- Method: SistaRegisterAllocatingCogit>>genForwardersInlinedIdenticalOrNotIf: (in category 'bytecode generators') -----
+ genForwardersInlinedIdenticalOrNotIf: orNot
+ "Override to count inlined branches if followed by a conditional branch.
+ We borrow the following conditional branch's counter and when about to
+ inline the comparison we decrement the counter (without writing it back)
+ and if it trips simply abort the inlining, falling back to the normal send which
+ will then continue to the conditional branch which will trip and enter the abort."
+ | nextPC postBranchPC targetBytecodePC branchDescriptor counterReg fixup jumpEqual jumpNotEqual
+  counterAddress countTripped unforwardArg unforwardRcvr argReg rcvrReg regMask |
+ <var: #fixup type: #'BytecodeFixup *'>
+ <var: #countTripped type: #'AbstractInstruction *'>
+ <var: #label type: #'AbstractInstruction *'>
+ <var: #branchDescriptor type: #'BytecodeDescriptor *'>
+ <var: #jumpEqual type: #'AbstractInstruction *'>
+ <var: #jumpNotEqual type: #'AbstractInstruction *'>
+
+ ((coInterpreter isOptimizedMethod: methodObj) or: [needsFrame not]) ifTrue:
+ [^super genForwardersInlinedIdenticalOrNotIf: orNot].
+
+ regMask := 0.
+
+ self extractMaybeBranchDescriptorInto: [ :descr :next :postBranch :target |
+ branchDescriptor := descr. nextPC := next. postBranchPC := postBranch. targetBytecodePC := target ].
+
+ unforwardRcvr := (objectRepresentation isUnannotatableConstant: (self ssValue: 1)) not.
+ unforwardArg := (objectRepresentation isUnannotatableConstant: self ssTop) not.
+
+ "If an operand is an annotable constant, it may be forwarded, so we need to store it into a
+ register so the forwarder check can jump back to the comparison after unforwarding the constant.
+ However, if one of the operand is an unnanotable constant, does not allocate a register for it
+ (machine code will use operations on constants)."
+ rcvrReg:= argReg := NoReg.
+ self
+ allocateEqualsEqualsRegistersArgNeedsReg: unforwardArg
+ rcvrNeedsReg: unforwardRcvr
+ into: [ :rcvr :arg | rcvrReg:= rcvr. argReg := arg ].
+
+ argReg ~= NoReg ifTrue: [ regMask := self registerMaskFor: argReg ].
+ rcvrReg ~= NoReg ifTrue: [ regMask := regMask bitOr: (self registerMaskFor: rcvrReg) ].
+
+ "Only interested in inlining if followed by a conditional branch."
+ (branchDescriptor isBranchTrue or: [branchDescriptor isBranchFalse]) ifFalse:
+ [^ self
+ genIdenticalNoBranchArgIsConstant: unforwardArg not
+ rcvrIsConstant: unforwardRcvr not
+ argReg: argReg
+ rcvrReg: rcvrReg
+ orNotIf: orNot].
+
+ unforwardArg ifTrue: [ objectRepresentation genEnsureOopInRegNotForwarded: argReg scratchReg: TempReg ].
+ unforwardRcvr ifTrue: [ objectRepresentation genEnsureOopInRegNotForwarded: rcvrReg scratchReg: TempReg ].
+
+ counterReg := self allocateRegNotConflictingWith: regMask.
+ self
+ genExecutionCountLogicInto: [ :cAddress :countTripBranch |
+ counterAddress := cAddress.
+ countTripped := countTripBranch ]
+ counterReg: counterReg.
+
+ self assert: (unforwardArg or: [ unforwardRcvr ]).
+ self genCmpArgIsConstant: unforwardArg not rcvrIsConstant: unforwardRcvr not argReg: argReg rcvrReg: rcvrReg.
+ self ssPop: 2.
+
+ "We could use (branchDescriptor isBranchTrue xor: orNot) to simplify this."
+ orNot
+ ifFalse: [branchDescriptor isBranchTrue
+ ifTrue:
+ [ fixup := (self ensureNonMergeFixupAt: postBranchPC - initialPC) asUnsignedInteger.
+ self JumpZero:  (self ensureNonMergeFixupAt: targetBytecodePC - initialPC) asUnsignedInteger ]
+ ifFalse: "branchDescriptor is branchFalse"
+ [ fixup := (self ensureNonMergeFixupAt: targetBytecodePC - initialPC) asUnsignedInteger.
+ self JumpZero: (self ensureNonMergeFixupAt: postBranchPC - initialPC) asUnsignedInteger ]]
+ ifTrue: [branchDescriptor isBranchTrue
+ ifFalse: "branchDescriptor is branchFalse"
+ [ fixup := (self ensureNonMergeFixupAt: postBranchPC - initialPC) asUnsignedInteger.
+ self JumpZero:  (self ensureNonMergeFixupAt: targetBytecodePC - initialPC) asUnsignedInteger ]
+ ifTrue:
+ [ fixup := (self ensureNonMergeFixupAt: targetBytecodePC - initialPC) asUnsignedInteger.
+ self JumpZero: (self ensureNonMergeFixupAt: postBranchPC - initialPC) asUnsignedInteger ]].
+
+ self genFallsThroughCountLogicCounterReg: counterReg counterAddress: counterAddress.
+ self Jump: fixup.
+
+ countTripped jmpTarget: self Label.
+
+ "inlined version of #== ignoring the branchDescriptor if the counter trips to have normal state for the optimizer"
+ self ssPop: -2.
+ self genCmpArgIsConstant: unforwardArg not rcvrIsConstant: unforwardRcvr not argReg: argReg rcvrReg: rcvrReg.
+ self ssPop: 2.
+
+ "This code necessarily directly falls through the jumpIf: code which pops the top of the stack into TempReg.
+ We therefore directly assign the result to TempReg to save one move instruction"
+ jumpEqual := orNot ifFalse: [self JumpZero: 0] ifTrue: [self JumpNonZero: 0].
+ self genMoveFalseR: TempReg.
+ jumpNotEqual := self Jump: 0.
+ jumpEqual jmpTarget: (self genMoveTrueR: TempReg).
+ jumpNotEqual jmpTarget: self Label.
+ self ssPushRegister: TempReg.
+
+ (self fixupAt: nextPC - initialPC) notAFixup ifTrue: [ branchReachedOnlyForCounterTrip := true ].
+
+ ^ 0!

Item was added:
+ ----- Method: SistaRegisterAllocatingCogit>>genJumpIf:to: (in category 'bytecode generator support') -----
+ genJumpIf: boolean to: targetBytecodePC
+ "The heart of performance counting in Sista.  Conditional branches are 6 times less
+ frequent than sends and can provide basic block frequencies (send counters can't).
+ Each conditional has a 32-bit counter split into an upper 16 bits counting executions
+ and a lower half counting untaken executions of the branch.  Executing the branch
+ decrements the upper half, tripping if the count goes negative.  Not taking the branch
+ decrements the lower half.  N.B. We *do not* eliminate dead branches (true ifTrue:/true ifFalse:)
+ so that scanning for send and branch data is simplified and that branch data is correct."
+ <inline: false>
+ | ok counterAddress countTripped retry nextPC nextDescriptor desc reg |
+ <var: #ok type: #'AbstractInstruction *'>
+ <var: #desc type: #'CogSimStackEntry *'>
+ <var: #retry type: #'AbstractInstruction *'>
+ <var: #countTripped type: #'AbstractInstruction *'>
+ <var: #nextDescriptor type: #'BytecodeDescriptor *'>
+
+ "In optimized code we don't generate counters to improve performance"
+ (coInterpreter isOptimizedMethod: methodObj) ifTrue:
+ [^super genJumpIf: boolean to: targetBytecodePC].
+
+ "If the branch is reached only for the counter trip trampoline
+ (typically, var1 == var2 ifTrue: falls through to the branch only for the trampoline)
+ we generate a specific path to drastically reduce the number of machine instructions"
+ branchReachedOnlyForCounterTrip ifTrue:
+ [branchReachedOnlyForCounterTrip := false.
+ ^self genCounterTripOnlyJumpIf: boolean to: targetBytecodePC].
+
+ "We detect and: / or:, if found, we don't generate the counters to avoid pathological counter slow down"
+ boolean = objectMemory falseObject ifTrue:
+ [ nextPC := bytecodePC + (self generatorAt: byte0) numBytes.
+  nextDescriptor := self generatorAt: (objectMemory fetchByte: nextPC ofObject: methodObj) + bytecodeSetOffset.
+  nextDescriptor generator ==  #genPushConstantTrueBytecode ifTrue: [ ^ super genJumpIf: boolean to: targetBytecodePC ].
+  nextDescriptor := self generatorAt: (objectMemory fetchByte: targetBytecodePC ofObject: methodObj) + bytecodeSetOffset.
+  nextDescriptor generator == #genPushConstantFalseBytecode ifTrue: [ ^ super genJumpIf: boolean to: targetBytecodePC ]. ].
+
+ extA := 0. "We ignore the noMustBeBoolean flag. It should not be present in methods with counters, and if it is we don't care."
+
+ "We don't generate counters on branches on true/false, the basicblock usage can be inferred"
+ desc := self ssTop.
+ (desc type == SSConstant
+ and: [desc constant = objectMemory trueObject or: [desc constant = objectMemory falseObject]]) ifTrue:
+ [ ^ super genJumpIf: boolean to: targetBytecodePC ].
+
+ self flag: 'Because of the restriction on x64 that absolute loads must target %rax, it would perhaps be a better choice to use TempReg (%rax) for the counter reg and SendNumArgsReg for the boolean.'.
+ "try and use the top entry's register if ant, but only if it can be destroyed."
+ reg := (desc type ~= SSRegister
+ or: [(self anyReferencesToRegister: desc register inAllButTopNItems: 0)
+ or: [(desc register = ReceiverResultReg and: [optStatus isReceiverResultRegLive])]])
+ ifTrue: [TempReg]
+ ifFalse: [desc register].
+ desc popToReg: reg.
+ self ssPop: 1.
+
+ "We need SendNumArgsReg because of the mustBeBooleanTrampoline"
+ self ssAllocateRequiredReg: SendNumArgsReg.
+
+ retry := self Label.
+ self
+ genExecutionCountLogicInto: [ :cAddress :countTripBranch |
+ counterAddress := cAddress.
+ countTripped := countTripBranch ]
+ counterReg: SendNumArgsReg.
+ counterIndex := counterIndex + 1.
+
+ "Cunning trick by LPD.  If true and false are contiguous subtract the smaller.
+ Correct result is either 0 or the distance between them.  If result is not 0 or
+ their distance send mustBeBoolean."
+ self assert: (objectMemory objectAfter: objectMemory falseObject) = objectMemory trueObject.
+ self genSubConstant: boolean R: reg.
+ self JumpZero: (self ensureFixupAt: targetBytecodePC - initialPC).
+
+ self genFallsThroughCountLogicCounterReg: SendNumArgsReg counterAddress: counterAddress.
+
+ self CmpCq: (boolean = objectMemory falseObject
+ ifTrue: [objectMemory trueObject - objectMemory falseObject]
+ ifFalse: [objectMemory falseObject - objectMemory trueObject])
+ R: reg.
+ ok := self JumpZero: 0.
+ self MoveCq: 0 R: SendNumArgsReg. "if counterReg is 0 this is a mustBeBoolean, not a counter trip."
+ reg ~= TempReg ifTrue:
+ [self MoveR: reg R: TempReg].
+ countTripped jmpTarget: self Label.
+ self copySimStackToScratch: simSpillBase.
+ self ssFlushTo: simStackPtr.
+ self genCallMustBeBooleanFor: boolean.
+
+ "If we're in an image which hasn't got the Sista code loaded then the ceCounterTripped: trampoline
+ will return directly to machine code, returning the boolean.  So the code should jump back to the
+ retry point. The trampoline preserves register state when taking the ceCounterTripped: path."
+ "Clément: For some reason if I write self annotateBytecode: (self Jump: retry) the annotation is not at the correct place."
+ "Eliot: Annotations apply the the address following an instruction, and the annotation must be for the return address
+ of the call (since this is the address the run-time sees), so it must be on a label before the jump, not after the jump."
+ self annotateBytecode: self Label.
+ simSpillBase ~= scratchSpillBase ifTrue:
+ [self assert: simSpillBase > scratchSpillBase.
+ self AddCq: simSpillBase - scratchSpillBase * objectMemory wordSize R: SPReg].
+ self Jump: retry.
+
+ ok jmpTarget: self Label.
+ self restoreSimStackFromScratch.
+ ^0!

Item was added:
+ ----- Method: SistaRegisterAllocatingCogit>>genSpecialSelectorComparison (in category 'bytecode generators') -----
+ genSpecialSelectorComparison
+ "Override to count inlined branches if followed by a conditional branch.
+ We borrow the following conditional branch's counter and when about to
+ inline the comparison we decrement the counter (without writing it back)
+ and if it trips simply abort the inlining, falling back to the normal send which
+ will then continue to the conditional branch which will trip and enter the abort."
+ | nextPC postBranchPC targetBytecodePC primDescriptor branchDescriptor
+  rcvrIsInt argIsInt argInt jumpNotSmallInts inlineCAB
+  counterAddress countTripped counterReg index rcvrReg argReg |
+ <var: #countTripped type: #'AbstractInstruction *'>
+ <var: #primDescriptor type: #'BytecodeDescriptor *'>
+ <var: #jumpNotSmallInts type: #'AbstractInstruction *'>
+ <var: #branchDescriptor type: #'BytecodeDescriptor *'>
+
+ (coInterpreter isOptimizedMethod: methodObj) ifTrue:
+ [^self genSpecialSelectorComparisonWithoutCounters].
+
+ primDescriptor := self generatorAt: byte0.
+ argIsInt := self ssTop type = SSConstant
+ and: [objectMemory isIntegerObject: (argInt := self ssTop constant)].
+ rcvrIsInt := (self ssValue: 1) type = SSConstant
+ and: [objectMemory isIntegerObject: (self ssValue: 1) constant].
+
+ "short-cut the jump if operands are SmallInteger constants."
+ (argIsInt and: [rcvrIsInt]) ifTrue:
+ [^ self genStaticallyResolvedSpecialSelectorComparison].
+
+ self extractMaybeBranchDescriptorInto: [ :descr :next :postBranch :target |
+ branchDescriptor := descr. nextPC := next. postBranchPC := postBranch. targetBytecodePC := target ].
+
+ "Only interested in inlining if followed by a conditional branch."
+ inlineCAB := branchDescriptor isBranchTrue or: [branchDescriptor isBranchFalse].
+ "Further, only interested in inlining = and ~= if there's a SmallInteger constant involved.
+ The relational operators successfully statically predict SmallIntegers; the equality operators do not."
+ (inlineCAB and: [primDescriptor opcode = JumpZero or: [primDescriptor opcode = JumpNonZero]]) ifTrue:
+ [inlineCAB := argIsInt or: [rcvrIsInt]].
+ inlineCAB ifFalse:
+ [^self genSpecialSelectorSend].
+
+ "In-line the comparison and the jump, but if the types are not SmallInteger then we will need
+ to do a send and fall through to the following conditional branch.  Since we're allocating values
+ in registers we would like to keep those registers live on the inlined path and reload registers
+ along the non-inlined send path.  The merge logic at the branch destinations handles this."
+ argIsInt
+ ifTrue:
+ [rcvrReg := self allocateRegForStackEntryAt: 1.
+ (self ssValue: 1) popToReg: rcvrReg.
+ self MoveR: rcvrReg R: TempReg.
+ counterReg := self allocateRegNotConflictingWith: (self registerMaskFor: rcvrReg)]
+ ifFalse:
+ [self allocateRegForStackTopTwoEntriesInto: [:rTop :rNext| argReg := rTop. rcvrReg := rNext].
+ rcvrReg = Arg0Reg ifTrue:
+ [rcvrReg := argReg. argReg := Arg0Reg].
+ self ssTop popToReg: argReg.
+ (self ssValue: 1) popToReg: rcvrReg.
+ self MoveR: argReg R: TempReg.
+ counterReg := self allocateRegNotConflictingWith: (self registerMaskFor: rcvrReg and: argReg)].
+ jumpNotSmallInts := (argIsInt or: [rcvrIsInt])
+ ifTrue: [objectRepresentation genJumpNotSmallIntegerInScratchReg: TempReg]
+ ifFalse: [objectRepresentation genJumpNotSmallIntegersIn: rcvrReg andScratch: TempReg scratch: ClassReg].
+
+ self
+ genExecutionCountLogicInto: [ :cAddress :countTripBranch |
+ counterAddress := cAddress.
+ countTripped := countTripBranch ]
+ counterReg: counterReg.
+
+ argIsInt
+ ifTrue: [self CmpCq: argInt R: rcvrReg]
+ ifFalse: [self CmpR: argReg R: rcvrReg].
+ "Cmp is weird/backwards so invert the comparison.  Further since there is a following conditional
+ jump bytecode define non-merge fixups and leave the cond bytecode to set the mergeness."
+ self genConditionalBranch: (branchDescriptor isBranchTrue
+ ifTrue: [primDescriptor opcode]
+ ifFalse: [self inverseBranchFor: primDescriptor opcode])
+ operand: (self ensureFixupAt: targetBytecodePC - initialPC) asUnsignedInteger.
+
+ self genFallsThroughCountLogicCounterReg: counterReg counterAddress: counterAddress.
+
+ self Jump: (self ensureFixupAt: postBranchPC - initialPC).
+ countTripped jmpTarget: (jumpNotSmallInts jmpTarget: self Label).
+
+ self ssFlushTo: simStackPtr.
+ self deny: rcvrReg = Arg0Reg.
+ argIsInt
+ ifTrue: [self MoveCq: argInt R: Arg0Reg]
+ ifFalse: [argReg ~= Arg0Reg ifTrue: [self MoveR: argReg R: Arg0Reg]].
+ rcvrReg ~= ReceiverResultReg ifTrue: [self MoveR: rcvrReg R: ReceiverResultReg].
+ index := byte0 - self firstSpecialSelectorBytecodeOffset.
+ ^self genMarshalledSend: index negated - 1 numArgs: 1 sendTable: ordinarySendTrampolines!

Item was changed:
  InterpreterPrimitives subclass: #StackInterpreter
+ instanceVariableNames: 'currentBytecode bytecodeSetSelector localFP localIP localSP stackLimit stackPage stackPages method instructionPointer stackPointer framePointer localReturnValue localAbsentReceiver localAbsentReceiverOrZero extA extB primitiveFunctionPointer methodCache nsMethodCache atCache lkupClassTag lkupClass methodDictLinearSearchLimit highestRunnableProcessPriority nextWakeupUsecs nextPollUsecs inIOProcessEvents interruptKeycode interruptPending savedWindowSize imageHeaderFlags fullScreenFlag deferDisplayUpdates pendingFinalizationSignals extraVMMemory interpreterProxy showSurfaceFn primitiveTable primitiveAccessorDepthTable externalPrimitiveTable externalPrimitiveTableFirstFreeIndex overflowedPage extraFramesToMoveOnOverflow globalSessionID jmpBuf jmpDepth suspendedCallbacks suspendedMethods numStackPages desiredNumStackPages desiredEdenBytes classNameIndex thisClassIndex metaclassNumSlots interruptCheckChain suppressHeartbeatFlag breakSelector breakSelectorLength longRunningPrimitiveCheckMethod longRunningPrimitiveCheckSemaphore longRunningPrimitiveStartUsecs longRunningPrimitiveStopUsecs longRunningPrimitiveGCUsecs longRunningPrimitiveCheckSequenceNumber longRunningPrimitiveSignalUndelivered checkAllocFiller tempOop tempOop2 tempOop3 theUnknownShort the2ndUnknownShort imageFloatsBigEndian maxExtSemTabSizeSet lastMethodCacheProbeWrite gcSemaphoreIndex classByteArrayCompactIndex checkedPluginName statForceInterruptCheck statStackOverflow statStackPageDivorce statCheckForEvents statProcessSwitch statIOProcessEvents statPendingFinalizationSignals nativeSP nativeStackPointer lowcodeCalloutState shadowCallStackPointer'
- instanceVariableNames: 'currentBytecode bytecodeSetSelector localFP localIP localSP stackLimit stackPage stackPages method instructionPointer stackPointer framePointer localReturnValue localAbsentReceiver localAbsentReceiverOrZero extA extB primitiveFunctionPointer methodCache nsMethodCache atCache lkupClassTag lkupClass methodDictLinearSearchLimit highestRunnableProcessPriority nextWakeupUsecs nextPollUsecs inIOProcessEvents interruptKeycode interruptPending savedWindowSize imageHeaderFlags fullScreenFlag deferDisplayUpdates pendingFinalizationSignals extraVMMemory interpreterProxy showSurfaceFn primitiveTable primitiveAccessorDepthTable externalPrimitiveTable externalPrimitiveTableFirstFreeIndex overflowedPage extraFramesToMoveOnOverflow globalSessionID jmpBuf jmpDepth suspendedCallbacks suspendedMethods numStackPages desiredNumStackPages desiredEdenBytes classNameIndex thisClassIndex metaclassNumSlots interruptCheckChain suppressHeartbeatFlag breakSelector breakSelectorLength longRunningPrimitiveCheckMethod longRunningPrimitiveCheckSemaphore longRunningPrimitiveStartUsecs longRunningPrimitiveStopUsecs longRunningPrimitiveGCUsecs longRunningPrimitiveCheckSequenceNumber longRunningPrimitiveSignalUndelivered checkAllocFiller tempOop tempOop2 tempOop3 theUnknownShort the2ndUnknownShort imageFloatsBigEndian maxExtSemTabSizeSet lastMethodCacheProbeWrite gcSemaphoreIndex classByteArrayCompactIndex checkedPluginName statForceInterruptCheck statStackOverflow statStackPageDivorce statCheckForEvents statProcessSwitch statIOProcessEvents statPendingFinalizationSignals nativeSP nativeStackPointer lowcodeCalloutState shadowCallStackPointer debugCallbackPath callbackReturnSP'
  classVariableNames: 'AccessModifierPrivate AccessModifierProtected AccessModifierPublic AltBytecodeEncoderClassName AltLongStoreBytecode AlternateHeaderHasPrimFlag AlternateHeaderIsOptimizedFlag AlternateHeaderNumLiteralsMask AtCacheFixedFields AtCacheFmt AtCacheMask AtCacheOop AtCacheSize AtCacheTotalSize AtPutBase BytecodeEncoderClassName BytecodeTable CacheProbeMax DirBadPath DirEntryFound DirNoMoreEntries DumpStackOnLowSpace EnclosingMixinIndex EnclosingObjectIndex EnforceAccessControl FailImbalancedPrimitives LongStoreBytecode MaxExternalPrimitiveTableSize MaxJumpBuf MaxPrimitiveIndex MaxQuickPrimitiveIndex MethodHeaderArgCountShift MethodHeaderFlagBitPosition MethodHeaderTempCountShift MixinIndex PrimNumberDoExternalCall PrimNumberExternalCall PrimNumberFFICall PrimitiveExternalCallIndex PrimitiveTable StackPageReachedButUntraced StackPageTraceInvalid StackPageTraced StackPageUnreached V3PrimitiveBitsMask'
  poolDictionaries: 'VMBasicConstants VMBytecodeConstants VMMethodCacheConstants VMObjectIndices VMSpurObjectRepresentationConstants VMSqueakClassIndices VMStackFrameOffsets'
  category: 'VMMaker-Interpreter'!
 
  !StackInterpreter commentStamp: 'eem 12/5/2014 11:32' prior: 0!
  This class is a complete implementation of the Smalltalk-80 virtual machine, derived originally from the Blue Book specification but quite different in some areas.  This VM supports Closures but *not* old-style BlockContexts.
 
  It has been modernized with 32-bit pointers, better management of Contexts (see next item), and attention to variable use that allows the CCodeGenerator (qv) to keep, eg, the instruction pointer and stack pointer in registers as well as keeping most simple variables in a global array that seems to improve performance for most platforms.
 
  The VM does not use Contexts directly.  Instead Contexts serve as proxies for a more conventional stack format that is invisible to the image.  There is considerable explanation at http://www.mirandabanda.org/cogblog/2009/01/14/under-cover-contexts-and-the-big-frame-up.  The VM maintains a fixed-size stack zone divided into pages, each page being capable of holding several method/block activations.  A send establishes a new frame in the current stack page, a return returns to the previous frame.  This eliminates allocation/deallocation of contexts and the moving of receiver and arguments from caller to callee on each send/return.  Contexts are created lazily when an activation needs a context (creating a block, explicit use of thisContext, access to sender when sender is a frame, or linking of stack pages together).  Contexts are either conventional and heap-resident ("single") or "married" and serve as proxies for their corresponding frame or "widowed", meaning that their spouse frame has been returned from (died).  A married context is specially marked (more details in the code) and refers to its frame.  Likewise a married frame is specially marked and refers to its context.
 
  In addition to SmallInteger arithmetic and Floats, the VM supports logic on 32-bit PositiveLargeIntegers, thus allowing it to simulate itself much more effectively than would otherwise be the case.
 
  StackInterpreter and subclasses support multiple memory managers.  Currently there are two.  NewMemoryManager is a slightly refined version of ObjectMemory, and is the memory manager and garbage collector for the original Squeak object representation as described in "Back to the Future The Story of Squeak, A Practical Smalltalk Written in Itself", see http://ftp.squeak.org/docs/OOPSLA.Squeak.html.  Spur is a faster, more regular object representation that is designed for more performance and functionality, and to have a common header format for both 32-bit and 64-bit versions.  You can read about it in SpurMemoryManager's class comment.  There is also a video of a presentation at ESUG 2014 (https://www.youtube.com/watch?v=k0nBNS1aHZ4), along with slides (http://www.slideshare.net/esug/spur-a-new-object-representation-for-cog?related=1).!

Item was changed:
  ----- Method: StackInterpreter class>>mustBeGlobal: (in category 'translation') -----
  mustBeGlobal: var
  "Answer if a variable must be global and exported.  Used for inst vars that are accessed from VM support code."
 
  ^(super mustBeGlobal: var)
    or: [(self objectMemoryClass mustBeGlobal: var)
    or: [(#('interpreterProxy' 'interpreterVersion' 'inIOProcessEvents'
  'deferDisplayUpdates' 'extraVMMemory' 'showSurfaceFn'
  'desiredNumStackPages' 'desiredEdenBytes'
  'breakSelector' 'breakSelectorLength' 'sendTrace' 'checkAllocFiller' 'checkedPluginName'
+ 'suppressHeartbeatFlag') includes: var)
- 'suppressHeartbeatFlag' 'debugCallbackPath' 'callbackReturnSP') includes: var)
    or: [ "This allows slow machines to define bytecodeSetSelector as 0
  to avoid the interpretation overhead."
  MULTIPLEBYTECODESETS not and: [var = 'bytecodeSetSelector']]]]!

Item was added:
+ ----- Method: StackInterpreter>>checkForStackOverflow (in category 'message sending') -----
+ checkForStackOverflow
+ "Check for stack overflow, moving frames to another stack if so."
+ <inline: true>
+ "After checkForInterrupts another event check may have been forced, setting both
+ stackLimit and stackPage stackLimit to all ones.  So here we must check against
+ the real stackLimit, not the effective stackLimit."
+ stackPointer < stackPage realStackLimit ifTrue:
+ [self handleStackOverflow]!

Item was changed:
  ----- Method: StackInterpreter>>checkStackPointerIndexForFrame: (in category 'frame access') -----
  checkStackPointerIndexForFrame: theFP
  "Version of stackPointerIndexForFrame: that does not depend on writing back head frame pointers.
  Used for assertion checking. Safe only in external primitives (framePointer valid).
  Answer the 0-based index rel to the given frame.
  (This is what stackPointer used to be before conversion to pointer)"
  "In the StackInterpreter stacks grow down."
  | thePage theSP |
  <inline: false>
  <var: #theFP type: #'char *'>
  <var: #thePage type: #'StackPage *'>
  <var: #theSP type: #'char *'>
+ theFP = framePointer ifTrue:
+ [^self stackPointerIndexForFrame: theFP WithSP: stackPointer].
  thePage := stackPages stackPageFor: theFP.
  theSP := self findSPOrNilOf: theFP
  on: thePage
  startingFrom: (thePage = stackPage ifTrue: [framePointer] ifFalse: [thePage headFP]).
  ^self stackPointerIndexForFrame: theFP WithSP: theSP!

Item was changed:
  ----- Method: StackInterpreter>>handleStackOverflow (in category 'message sending') -----
  handleStackOverflow
+ "Check for stack overflow, moving frames to another stack if so.
+ This should *only* be sent from checkForStackOverflow."
+ <inline: #never>
- "Check for stack overflow, moving frames to another stack if so."
  | newPage theFP callerFP overflowLimitAddress overflowCount |
  <var: #newPage type: #'StackPage *'>
  <var: #theFP type: #'char *'>
  <var: #callerFP type: #'char *'>
  <var: #overflowLimitAddress type: #'char *'>
 
+ self assert: stackPointer < stackPage realStackLimit.
- "After checkForInterrupts another event check may have been forced, setting both
- stackLimit and stackPage stackLimit to all ones.  So here we must check against
- the real stackLimit, not the effective stackLimit."
- stackPointer < stackPage realStackLimit ifFalse:
- [^self].
 
  self maybeTraceStackOverflow.
  statStackOverflow := statStackOverflow + 1.
 
  "The stack has overflowed this page.  If the system is executing some recursive algorithm,
  e.g. fibonacci, then the system could thrash overflowing the stack if the call soon returns
  back to the current page.  To avoid thrashing, since overflow is quite slow, we can move
  more than one frame.  The idea is to record which page has overflowed, and the first
  time it overflows move one frame, the second time two frames, and so on.  We move no
  more frames than would leave the page half occupied."
  theFP := framePointer.
  stackPage = overflowedPage
  ifTrue:
  [overflowLimitAddress := stackPage baseAddress - stackPages overflowLimit.
  overflowCount := extraFramesToMoveOnOverflow := extraFramesToMoveOnOverflow + 1.
  [(overflowCount := overflowCount - 1) >= 0
    and: [(callerFP := self frameCallerFP: theFP) < overflowLimitAddress
    and: [(self isBaseFrame: callerFP) not]]] whileTrue:
  [theFP := callerFP]]
  ifFalse:
  [overflowedPage := stackPage.
  extraFramesToMoveOnOverflow := 0].
 
  self ensureCallerContext: theFP.
  newPage := stackPages newStackPage.
  self moveFramesIn: stackPage through: theFP toPage: newPage.
  self setStackPageAndLimit: newPage.
  framePointer := stackPage headFP.
  stackPointer := stackPage headSP.
  self isCog
  ifFalse: "To overflow the stack this must be a new frame, but in Cog base frames are married."
  [self assert: (self frameHasContext: framePointer) not.
  self assert: (self validInstructionPointer: instructionPointer + 1
  inMethod: method
  framePointer: framePointer)]
  ifTrue:
  [self assert: (self validInstructionPointer: instructionPointer + 1
  inFrame: framePointer).
  self assert: ((self frameHasContext: framePointer) not
  or: [objectMemory isContext: (self frameContext: framePointer)])]!

Item was changed:
  ----- Method: StackInterpreter>>handleStackOverflowOrEventAllowContextSwitch: (in category 'message sending') -----
  handleStackOverflowOrEventAllowContextSwitch: mayContextSwitch
  "The stackPointer is below the stackLimit.  This is either because of a
  stack overflow or the setting of stackLimit to indicate a possible interrupt.
  Check for interrupts and stackOverflow and deal with each appropriately.
  Answer if a context switch occurred."
  | switched |
  <inline: false>
  "If the stackLimit differs from the realStackLimit then the stackLimit
  has been set to indicate an event or interrupt that needs servicing."
  stackLimit = stackPage realStackLimit
  ifTrue:
  [self externalWriteBackHeadFramePointers.
  switched := false]
  ifFalse: [switched := self checkForEventsMayContextSwitch: mayContextSwitch].
 
+ self checkForStackOverflow.
- self handleStackOverflow.
  ^switched!

Item was changed:
  ----- Method: StackInterpreter>>returnAs:ThroughCallback:Context: (in category 'callback support') -----
  returnAs: returnTypeOop ThroughCallback: vmCallbackContext Context: callbackMethodContext
  "callbackMethodContext is an activation of invokeCallback:[stack:registers:jmpbuf:].
  Its sender is the VM's state prior to the callback.  Reestablish that state (via longjmp),
  and mark callbackMethodContext as dead."
  <export: true>
  <var: #vmCallbackContext type: #'VMCallbackContext *'>
  | calloutMethodContext theFP thePage |
  <var: #theFP type: #'char *'>
  <var: #thePage type: #'StackPage *'>
- debugCallbackPath := 0.
  ((self isIntegerObject: returnTypeOop)
  and: [self isLiveContext: callbackMethodContext]) ifFalse:
+ [^false].
- [debugCallbackPath := 1.
- ^false].
  calloutMethodContext := self externalInstVar: SenderIndex ofContext: callbackMethodContext.
  (self isLiveContext: calloutMethodContext) ifFalse:
+ [^false].
- [debugCallbackPath := 2.
- ^false].
- debugCallbackPath := 4.
  "We're about to leave this stack page; must save the current frame's instructionPointer."
  self push: instructionPointer.
  self externalWriteBackHeadFramePointers.
  "Mark callbackMethodContext as dead; the common case is that it is the current frame.
  We go the extra mile for the debugger."
  (self isSingleContext: callbackMethodContext)
+ ifTrue: [self markContextAsDead: callbackMethodContext]
- ifTrue: [self markContextAsDead: callbackMethodContext. debugCallbackPath := debugCallbackPath bitOr: 8]
  ifFalse:
+ [theFP := self frameOfMarriedContext: callbackMethodContext.
- [debugCallbackPath := debugCallbackPath bitOr: 16.
- theFP := self frameOfMarriedContext: callbackMethodContext.
  framePointer = theFP "common case"
  ifTrue:
+ [(self isBaseFrame: theFP) ifFalse: "calloutMethodContext is immediately below on the same page.  Make it current."
+ [instructionPointer := (self frameCallerSavedIP: framePointer) asUnsignedInteger.
+ stackPointer := framePointer + (self frameStackedReceiverOffset: framePointer) + objectMemory wordSize.
+ framePointer := self frameCallerFP: framePointer.
+ self setMethod: (self frameMethodObject: framePointer).
+ self restoreCStackStateForCallbackContext: vmCallbackContext.
+ "N.B. siglongjmp is defines as _longjmp on non-win32 platforms.
+  This matches the use of _setjmp in ia32abicc.c."
+ self siglong: vmCallbackContext trampoline jmp: (self integerValueOf: returnTypeOop).
+ ^true].
+ stackPages freeStackPage: stackPage]
- [debugCallbackPath := debugCallbackPath bitOr: 32.
- (self isBaseFrame: theFP)
- ifTrue: [stackPages freeStackPage: stackPage. debugCallbackPath := debugCallbackPath bitOr: 64]
- ifFalse: "calloutMethodContext is immediately below on the same page.  Make it current."
- [debugCallbackPath := debugCallbackPath bitOr: 128.
- instructionPointer := (self frameCallerSavedIP: framePointer) asUnsignedInteger.
- stackPointer := framePointer + (self frameStackedReceiverOffset: framePointer) + objectMemory wordSize.
- framePointer := self frameCallerFP: framePointer.
- self setMethod: (self frameMethodObject: framePointer).
- self restoreCStackStateForCallbackContext: vmCallbackContext.
- callbackReturnSP := stackPointer.
- "N.B. siglongjmp is defines as _longjmp on non-win32 platforms.
-  This matches the use of _setjmp in ia32abicc.c."
- self siglong: vmCallbackContext trampoline jmp: (self integerValueOf: returnTypeOop).
- ^true]]
  ifFalse:
+ [self externalDivorceFrame: theFP andContext: callbackMethodContext.
- [debugCallbackPath := debugCallbackPath bitOr: 256.
- self externalDivorceFrame: theFP andContext: callbackMethodContext.
  self markContextAsDead: callbackMethodContext]].
  "Make the calloutMethodContext the active frame.  The case where calloutMethodContext
  is immediately below callbackMethodContext on the same page is handled above."
  (self isStillMarriedContext: calloutMethodContext)
  ifTrue:
+ [theFP := self frameOfMarriedContext: calloutMethodContext.
- [debugCallbackPath := debugCallbackPath bitOr: 512.
- theFP := self frameOfMarriedContext: calloutMethodContext.
  thePage := stackPages stackPageFor: theFP.
  "findSPOf:on: points to the word beneath the instructionPointer, but
   there is no instructionPointer on the top frame of the current page."
  self assert: thePage ~= stackPage.
+ stackPointer := thePage headFP = theFP
+ ifTrue: [thePage headSP]
+ ifFalse: [(self findSPOf: theFP on: thePage) - objectMemory wordSize].
- stackPointer := (self findSPOf: theFP on: thePage) - objectMemory wordSize.
  framePointer := theFP]
  ifFalse:
+ [thePage := self makeBaseFrameFor: calloutMethodContext.
- [debugCallbackPath := debugCallbackPath bitOr: 1024.
- thePage := self makeBaseFrameFor: calloutMethodContext.
  framePointer := thePage headFP.
  stackPointer := thePage headSP].
  instructionPointer := self popStack.
  self setMethod: (objectMemory fetchPointer: MethodIndex ofObject: calloutMethodContext).
  self setStackPageAndLimit: thePage.
  self restoreCStackStateForCallbackContext: vmCallbackContext.
+ "N.B. siglongjmp is defines as _longjmp on non-win32 platforms.
- debugCallbackPath := debugCallbackPath bitOr: 2048.
- callbackReturnSP := stackPointer.
- "N.B. siglongjmp is defines as _longjmp on non-win32 platforms.
   This matches the use of _setjmp in ia32abicc.c."
  self siglong: vmCallbackContext trampoline jmp: (self integerValueOf: returnTypeOop).
  "NOTREACHED"
  ^true!

Item was changed:
  ----- Method: StackInterpreter>>sendInvokeCallback:Stack:Registers:Jmpbuf: (in category 'callback support') -----
  sendInvokeCallback: thunkPtr Stack: stackPtr Registers: regsPtr Jmpbuf: jmpBufPtr
  "Send the 4 argument callback message invokeCallback:stack:registers:jmpbuf:
  to Alien class with the supplied args.  The arguments are raw C addresses
  and are converted to integer objects on the way."
  <export: true>
  | classTag |
  classTag := self fetchClassTagOfNonImm: (self splObj: ClassAlien).
  messageSelector := self splObj: SelectorInvokeCallback.
  argumentCount := 4.
  (self lookupInMethodCacheSel: messageSelector classTag: classTag) ifFalse:
  [(self lookupOrdinaryNoMNUEtcInClass: (objectMemory classForClassTag: classTag)) ~= 0 ifTrue:
  [^false]].
  ((self argumentCountOf: newMethod) = 4
  and: [primitiveFunctionPointer = 0]) ifFalse:
  [^false].
  self push: (self splObj: ClassAlien). "receiver"
  self push: (self positiveMachineIntegerFor: thunkPtr).
  self push: (self positiveMachineIntegerFor: stackPtr).
  self push: (self positiveMachineIntegerFor: regsPtr).
  self push: (self positiveMachineIntegerFor: jmpBufPtr).
  self ifAppropriateCompileToNativeCode: newMethod selector: messageSelector.
  self justActivateNewMethod.
  (self isMachineCodeFrame: framePointer) ifFalse:
  [self maybeFlagMethodAsInterpreted: newMethod].
  self externalWriteBackHeadFramePointers.
+ self checkForStackOverflow.
- self handleStackOverflow.
  self enterSmalltalkExecutiveFromCallback.
  "not reached"
  ^true!

Item was changed:
  ----- Method: StackInterpreter>>sendInvokeCallbackContext: (in category 'callback support') -----
  sendInvokeCallbackContext: vmCallbackContext
  "Send the calllback message to Alien class with the supplied arg(s).  Use either the
  1 arg invokeCallbackContext: or the 4 arg invokeCallback:stack:registers:jmpbuf:
  message, depending on what selector is installed in the specialObjectsArray.
  Note that if invoking the legacy invokeCallback:stack:registers:jmpbuf: we pass the
  vmCallbackContext as the jmpbuf argument (see reestablishContextPriorToCallback:).
  The arguments are raw C addresses and are converted to integer objects on the way."
  <export: true>
  <var: #vmCallbackContext type: #'VMCallbackContext *'>
  | classTag |
  classTag := self fetchClassTagOfNonImm: (self splObj: ClassAlien).
  messageSelector := self splObj: SelectorInvokeCallback.
  (self lookupInMethodCacheSel: messageSelector classTag: classTag) ifFalse:
  [(self lookupOrdinaryNoMNUEtcInClass: (objectMemory classForClassTag: classTag)) ~= 0 ifTrue:
  [^false]].
  primitiveFunctionPointer ~= 0 ifTrue:
  [^false].
  self saveCStackStateForCallbackContext: vmCallbackContext.
  self push: (self splObj: ClassAlien). "receiver"
  (self argumentCountOf: newMethod) = 4 ifTrue:
  [self push: (self positiveMachineIntegerFor: vmCallbackContext thunkp asUnsignedInteger).
  self push: (self positiveMachineIntegerFor: vmCallbackContext stackp asUnsignedInteger).
  self push: (self positiveMachineIntegerFor: vmCallbackContext intregargsp asUnsignedInteger)].
  self push: (self positiveMachineIntegerFor: vmCallbackContext asUnsignedInteger).
  self ifAppropriateCompileToNativeCode: newMethod selector: messageSelector.
  self justActivateNewMethod.
  (self isMachineCodeFrame: framePointer) ifFalse:
  [self maybeFlagMethodAsInterpreted: newMethod].
  self externalWriteBackHeadFramePointers.
+ self checkForStackOverflow.
- self handleStackOverflow.
  self enterSmalltalkExecutiveFromCallback.
  "not reached"
  ^true!

Item was changed:
  ----- Method: ThreadedARMFFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
  ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
  <var: #procAddr type: #'void *'>
  <var: #calloutState type: #'CalloutState *'>
  <var: #loadFloatRegs declareC: 'extern void loadFloatRegs(double, double, double, double, double, double, double, double)'>
  "Go out, call this guy and create the return value.  This *must* be inlined because of
  the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
+ | myThreadIndex atomicType floatRet intRet loadFloatRegs |
- | myThreadIndex atomicType floatRet intRet loadFloatRegs oop |
  <var: #floatRet type: #double>
  <var: #intRet type: #usqLong>
  <inline: true>
  self cCode: '' inSmalltalk: [loadFloatRegs := #used. loadFloatRegs class].
+ self maybeDisownVM: calloutState threadIndexInto: [:threadIndex| myThreadIndex := threadIndex].
- self cppIf: COGMTVM ifTrue:
- [(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
- [myThreadIndex := interpreterProxy disownVM: 0]].
 
  calloutState floatRegisterIndex > 0 ifTrue:
  [self
  load: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 0)) to: 'double *') at: 0)
  Flo: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 2)) to: 'double *') at: 0)
  a: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 4)) to: 'double *') at: 0)
  t: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 6)) to: 'double *') at: 0)
  R: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 8)) to: 'double *') at: 0)
  e: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 10)) to: 'double *') at: 0)
  g: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 12)) to: 'double *') at: 0)
  s: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 14)) to: 'double *') at: 0)].
 
  (self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
  [self setsp: calloutState argVector].
 
  atomicType := self atomicTypeOf: calloutState ffiRetHeader.
+ (atomicType >> 1) = (FFITypeSingleFloat >> 1) ifTrue:
+ [atomicType = FFITypeSingleFloat
+ ifTrue:
+ [floatRet := self
+ dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
+ with: (calloutState integerRegisters at: 0)
+ with: (calloutState integerRegisters at: 1)
+ with: (calloutState integerRegisters at: 2)
+ with: (calloutState integerRegisters at: 3)]
+ ifFalse: "atomicType = FFITypeDoubleFloat"
+ [floatRet := self
+ dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
+ with: (calloutState integerRegisters at: 0)
+ with: (calloutState integerRegisters at: 1)
+ with: (calloutState integerRegisters at: 2)
+ with: (calloutState integerRegisters at: 3)].
+
+ "undo any callee argument pops because it may confuse stack management with the alloca."
+ (self isCalleePopsConvention: calloutState callFlags) ifTrue:
+ [self setsp: calloutState argVector].
+ self maybeOwnVM: calloutState threadIndex: myThreadIndex.
+
+ ^interpreterProxy floatObjectOf: floatRet].
+
+ intRet := self
- (atomicType >> 1) = (FFITypeSingleFloat >> 1)
- ifTrue:
- [atomicType = FFITypeSingleFloat
- ifTrue:
- [floatRet := self
- dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
- with: (calloutState integerRegisters at: 0)
- with: (calloutState integerRegisters at: 1)
- with: (calloutState integerRegisters at: 2)
- with: (calloutState integerRegisters at: 3)]
- ifFalse: "atomicType = FFITypeDoubleFloat"
- [floatRet := self
- dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
- with: (calloutState integerRegisters at: 0)
- with: (calloutState integerRegisters at: 1)
- with: (calloutState integerRegisters at: 2)
- with: (calloutState integerRegisters at: 3)]]
- ifFalse:
- [intRet := self
  dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'usqIntptr_t (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
  with: (calloutState integerRegisters at: 0)
  with: (calloutState integerRegisters at: 1)
  with: (calloutState integerRegisters at: 2)
+ with: (calloutState integerRegisters at: 3).
+
- with: (calloutState integerRegisters at: 3)].
  "undo any callee argument pops because it may confuse stack management with the alloca."
  (self isCalleePopsConvention: calloutState callFlags) ifTrue:
  [self setsp: calloutState argVector].
+ self maybeOwnVM: calloutState threadIndex: myThreadIndex.
 
- self cppIf: COGMTVM ifTrue:
- [(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
- [interpreterProxy ownVM: myThreadIndex]].
-
  (calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
  ["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
  'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
+ (calloutState ffiRetHeader anyMask: FFIFlagPointer) ifTrue:
+ [^self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
+ ^self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
- (calloutState ffiRetHeader anyMask: FFIFlagPointer)
- ifTrue:
- [oop := self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState]
- ifFalse:
- [oop := self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
- ^oop].
 
+ ^self ffiCreateIntegralResultOop: intRet ofAtomicType: atomicType in: calloutState!
- (atomicType >> 1) = (FFITypeSingleFloat >> 1)
- ifTrue:
- [oop := interpreterProxy floatObjectOf: floatRet]
- ifFalse:
- [oop := self ffiCreateIntegralResultOop: intRet
- ofAtomicType: atomicType
- in: calloutState].
- ^interpreterProxy methodReturnValue: oop!

Item was changed:
  ----- Method: ThreadedARMFFIPlugin>>ffiReturnStruct:ofType:in: (in category 'callout support') -----
  ffiReturnStruct: longLongRet ofType: ffiRetType in: calloutState
  <var: #longLongRet type: #usqLong>
  <var: #calloutState type: #'CalloutState *'>
  "Create a structure return value from an external function call.  The value has been stored in
  alloca'ed space pointed to by the calloutState or in the return value."
  | retOop retClass oop |
  <inline: true>
  retClass := interpreterProxy fetchPointer: 1 ofObject: ffiRetType.
  retOop := interpreterProxy instantiateClass: retClass indexableSize: 0.
  self remapOop: retOop
  in: [oop := interpreterProxy
  instantiateClass: interpreterProxy classByteArray
  indexableSize: calloutState structReturnSize].
  self mem: (interpreterProxy firstIndexableField: oop)
  cp: ((self returnStructInRegisters: calloutState structReturnSize)
  ifTrue: [self addressOf: longLongRet]
  ifFalse: [calloutState limit])
  y: calloutState structReturnSize.
  interpreterProxy storePointer: 0 ofObject: retOop withValue: oop.
+ ^retOop!
- ^interpreterProxy methodReturnValue: retOop!

Item was changed:
  ----- Method: ThreadedFFIPlugin>>ffiCall:ArgArrayOrNil:NumArgs: (in category 'callout support') -----
  ffiCall: externalFunction ArgArrayOrNil: argArrayOrNil NumArgs: nArgs
  "Generic callout. Does the actual work.  If argArrayOrNil is nil it takes args from the stack
  and the spec from the method.  If argArrayOrNil is not nil it takes args from argArrayOrNil
  and the spec from the receiver."
+ | flags argTypeArray address argType oop argSpec argClass err theCalloutState calloutState requiredStackSize stackSize allocation result primNumArgs |
- | flags argTypeArray address argType oop argSpec argClass err theCalloutState calloutState requiredStackSize stackSize allocation result |
  <inline: true>
  <var: #theCalloutState type: #'CalloutState'>
  <var: #calloutState type: #'CalloutState *'>
  <var: #allocation type: #'char *'>
 
+ primNumArgs := interpreterProxy methodArgumentCount.
  (interpreterProxy is: externalFunction KindOfClass: interpreterProxy classExternalFunction) ifFalse:
  [^self ffiFail: FFIErrorNotFunction].
  "Load and check the values in the externalFunction before we call out"
  flags := interpreterProxy fetchInteger: ExternalFunctionFlagsIndex ofObject: externalFunction.
  interpreterProxy failed ifTrue:
  [^self ffiFail: FFIErrorBadArgs].
 
  "This must come early for compatibility with the old FFIPlugin.  Image-level code
  may assume the function pointer is loaded eagerly.  Thanks to Nicolas Cellier."
  address := self ffiLoadCalloutAddress: externalFunction.
  interpreterProxy failed ifTrue:
  [^0 "error code already set by ffiLoadCalloutAddress:"].
 
  argTypeArray := interpreterProxy fetchPointer: ExternalFunctionArgTypesIndex ofObject: externalFunction.
  "must be array of arg types"
  ((interpreterProxy isArray: argTypeArray)
  and: [(interpreterProxy slotSizeOf: argTypeArray) = (nArgs + 1)]) ifFalse:
  [^self ffiFail: FFIErrorBadArgs].
  "check if the calling convention is supported"
  self cppIf: COGMTVM
  ifTrue:
  [(self ffiSupportsCallingConvention: (flags bitAnd: FFICallTypesMask)) ifFalse:
  [^self ffiFail: FFIErrorCallType]]
  ifFalse: "not masking causes threaded calls to fail, which is as they should if the plugin is not threaded."
  [(self ffiSupportsCallingConvention: flags) ifFalse:
  [^self ffiFail: FFIErrorCallType]].
 
  requiredStackSize := self externalFunctionHasStackSizeSlot
  ifTrue: [interpreterProxy
  fetchInteger: ExternalFunctionStackSizeIndex
  ofObject: externalFunction]
  ifFalse: [-1].
  interpreterProxy failed ifTrue:
  [^interpreterProxy primitiveFailFor: (argArrayOrNil isNil
  ifTrue: [PrimErrBadMethod]
  ifFalse: [PrimErrBadReceiver])].
  stackSize := requiredStackSize < 0 ifTrue: [DefaultMaxStackSize] ifFalse: [requiredStackSize].
  self cCode: [] inSmalltalk: [theCalloutState := self class calloutStateClass new].
  calloutState := self addressOf: theCalloutState.
  self cCode: [self me: calloutState ms: 0 et: (self sizeof: #CalloutState)].
  calloutState callFlags: flags.
  "Fetch return type and args"
  argType := interpreterProxy fetchPointer: 0 ofObject: argTypeArray.
  argSpec := interpreterProxy fetchPointer: 0 ofObject: argType.
  argClass := interpreterProxy fetchPointer: 1 ofObject: argType.
  (err := self ffiCheckReturn: argSpec With: argClass in: calloutState) ~= 0 ifTrue:
  [^self ffiFail: err]. "cannot return"
  "alloca the outgoing stack frame, leaving room for marshalling args, and including space for the return struct, if any.
  Additional space reserved for saving register args, as mandated by the Win64 X64 or PPC ABIs, will be managed by the call itself"
  allocation := self alloca: stackSize + calloutState structReturnSize + self cStackAlignment.
  self mustAlignStack ifTrue:
+ [allocation := self cCoerce: (allocation asUnsignedIntegerPtr bitClear: self cStackAlignment - 1) to: #'char *'].
- [allocation := self cCoerce: (allocation asUnsignedIntegerPtr bitClear: self cStackAlignment - 1)
- to: #'char *'].
  calloutState
  argVector: allocation;
  currentArg: allocation;
  limit: allocation + stackSize.
  (calloutState structReturnSize > 0
  and: [self nonRegisterStructReturnIsViaImplicitFirstArgument
  and: [(self returnStructInRegisters: calloutState structReturnSize) not]]) ifTrue:
  [err := self ffiPushPointer: calloutState limit in: calloutState.
  err ~= 0 ifTrue:
  [self cleanupCalloutState: calloutState.
  self cppIf: COGMTVM ifTrue:
  [err = PrimErrObjectMayMove negated ifTrue:
  [^PrimErrObjectMayMove]]. "N.B. Do not fail if object may move because caller will GC and retry."
  ^self ffiFail: err]].
  1 to: nArgs do:
  [:i|
  argType := interpreterProxy fetchPointer: i ofObject: argTypeArray.
  argSpec := interpreterProxy fetchPointer: 0 ofObject: argType.
  argClass := interpreterProxy fetchPointer: 1 ofObject: argType.
  oop := argArrayOrNil isNil
  ifTrue: [interpreterProxy stackValue: nArgs - i]
  ifFalse: [interpreterProxy fetchPointer: i - 1 ofObject: argArrayOrNil].
  err := self ffiArgument: oop Spec: argSpec Class: argClass in: calloutState.
  err ~= 0 ifTrue:
  [self cleanupCalloutState: calloutState.
  self cppIf: COGMTVM ifTrue:
  [err = PrimErrObjectMayMove negated ifTrue:
  [^PrimErrObjectMayMove]]. "N.B. Do not fail if object may move because caller will GC and retry."
  ^self ffiFail: err]]. "coercion failed or out of stack space"
  "Failures must be reported back from ffiArgument:Spec:Class:in:.
  Should not fail from here on in."
  self assert: interpreterProxy failed not.
  self ffiLogCallout: externalFunction.
  (requiredStackSize < 0
  and: [self externalFunctionHasStackSizeSlot]) ifTrue:
  [stackSize := calloutState currentArg - calloutState argVector.
  interpreterProxy storeInteger: ExternalFunctionStackSizeIndex ofObject: externalFunction withValue: stackSize].
  "Go out and call this guy"
  result := self ffiCalloutTo: address SpecOnStack: argArrayOrNil notNil in: calloutState.
  self cleanupCalloutState: calloutState.
+ "Can not safely use argumentCount (via e.g. methodReturnValue:) since it may have been changed by a callback."
+ interpreterProxy pop: primNumArgs + 1 thenPush: result.
  ^result!
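
For illustration only, a minimal sketch (not part of this commit) of the return discipline this method now uses. The selector examplePrimitiveCallout and the variables address and calloutState are hypothetical stand-ins for the state set up above; the point is that the stack is rebalanced with the cached primNumArgs rather than with argumentCount, which a callback executed during the callout may have changed.

examplePrimitiveCallout
	"Hypothetical sketch only: cache the argument count before calling out,
	 then pop the receiver plus the cached number of arguments and push the result."
	| primNumArgs result |
	primNumArgs := interpreterProxy methodArgumentCount.
	result := self ffiCalloutTo: address SpecOnStack: false in: calloutState.
	interpreterProxy pop: primNumArgs + 1 thenPush: result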

Item was changed:
  ----- Method: ThreadedFFIPlugin>>ffiReturnCStringFrom: (in category 'callout support') -----
  ffiReturnCStringFrom: cPointer
  "Create a Smalltalk string from a zero terminated C string"
  | strLen strOop cString strPtr |
  <var: #cString type: #'char *'>
  <var: #strPtr type: #'char *'>
+ cPointer ifNil: [^interpreterProxy nilObject]. "nil always returns as nil"
+ cString := self cCoerce: cPointer to: #'char *'.
- cPointer = nil ifTrue:[
- ^interpreterProxy methodReturnValue: interpreterProxy nilObject]. "nil always returns as nil"
- cString := self cCoerce: cPointer to:'char *'.
  strLen := 0.
  [(cString at: strLen) = 0] whileFalse:[strLen := strLen+1].
  strOop := interpreterProxy
  instantiateClass: interpreterProxy classString
  indexableSize: strLen.
  strPtr := interpreterProxy firstIndexableField: strOop.
+ 0 to: strLen-1 do: [:i| strPtr at: i put: (cString at: i)].
+ ^strOop!
- 0 to: strLen-1 do:[:i| strPtr at: i put: (cString at: i)].
- ^interpreterProxy methodReturnValue: strOop!

Item was changed:
  ----- Method: ThreadedFFIPlugin>>ffiReturnPointer:ofType:in: (in category 'callout support') -----
  ffiReturnPointer: retVal ofType: retType in: calloutState
  <var: #calloutState type: #'CalloutState *'>
  <var: #retVal type: #usqLong>
  "Generic callout support. Create a pointer return value from an external function call"
  | retClass atomicType retOop oop ptr classOop |
  <var: #ptr type: #'sqInt *'>
  retClass := interpreterProxy fetchPointer: 1 ofObject: retType.
  retClass = interpreterProxy nilObject ifTrue:
  ["Create ExternalData upon return"
  atomicType := self atomicTypeOf: calloutState ffiRetHeader.
  (atomicType >> 1) = (FFITypeSignedChar >> 1) ifTrue: "String return"
+ [retOop := self ffiReturnCStringFrom: (self cCoerceSimple: retVal to: #usqInt).
+ ^retOop].
- [^self ffiReturnCStringFrom: (self cCoerceSimple: retVal to: #usqInt)].
  "generate external data"
  self remapOop: retType in:
  [oop := interpreterProxy
  instantiateClass: interpreterProxy classExternalAddress
  indexableSize: BytesPerWord.
  ptr := interpreterProxy firstIndexableField: oop.
  ptr at: 0 put: (self cCoerceSimple: retVal to: #sqInt).
  self remapOop: oop in:
  [retOop := interpreterProxy
  instantiateClass: interpreterProxy classExternalData
  indexableSize: 0].
  interpreterProxy storePointer: 0 ofObject: retOop withValue: oop].
  interpreterProxy storePointer: 1 ofObject: retOop withValue: retType.
+ ^retOop].
- ^interpreterProxy methodReturnValue: retOop].
  "non-atomic pointer return"
  classOop := (calloutState ffiRetHeader anyMask: FFIFlagStructure)
  ifTrue:[interpreterProxy classByteArray]
  ifFalse:[interpreterProxy classExternalAddress].
  self remapOop: retClass in:
  [oop := interpreterProxy
  instantiateClass: classOop
  indexableSize: BytesPerWord].
  ptr := interpreterProxy firstIndexableField: oop.
  ptr at: 0 put: (self cCoerceSimple: retVal to: #sqInt).
  self remapOop: oop in:
  [retOop := interpreterProxy instantiateClass: retClass indexableSize: 0].
  interpreterProxy storePointer: 0 ofObject: retOop withValue: oop.
+ ^retOop!
- ^interpreterProxy methodReturnValue: retOop!

Item was added:
+ ----- Method: ThreadedFFIPlugin>>maybeDisownVM:threadIndexInto: (in category 'primitive support') -----
+ maybeDisownVM: calloutState threadIndexInto: aBlock
+ <var: #calloutState type: #'CalloutState *'>
+ <inline: #always>
+ self cppIf: COGMTVM
+ ifTrue:
+ [(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
+ [aBlock value: (interpreterProxy disownVM: 0)]]!

Item was added:
+ ----- Method: ThreadedFFIPlugin>>maybeOwnVM:threadIndex: (in category 'primitive support') -----
+ maybeOwnVM: calloutState threadIndex: myThreadIndex
+ <var: #calloutState type: #'CalloutState *'>
+ <inline: #always>
+ self cppIf: COGMTVM ifTrue:
+ [(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
+ [interpreterProxy ownVM: myThreadIndex]]!
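
For illustration only, a minimal sketch (not part of this commit) of how the two helpers above bracket a callout, as in the streamlined ffiCalloutTo:SpecOnStack:in: methods. The selector exampleBracketedCallout:in: is hypothetical; under COGMTVM the VM is disowned for threaded calls before the foreign call and re-owned afterwards, while in a non-threaded build both sends expand to nothing.

exampleBracketedCallout: procAddr in: calloutState
	"Hypothetical sketch only: relinquish the VM (threaded COGMTVM builds only),
	 make the foreign call, then reacquire the VM with the remembered thread index."
	| myThreadIndex result |
	self maybeDisownVM: calloutState threadIndexInto: [:threadIndex| myThreadIndex := threadIndex].
	result := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'usqLong (*)()').
	self maybeOwnVM: calloutState threadIndex: myThreadIndex.
	^result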

Item was changed:
  ----- Method: ThreadedFFIPlugin>>mustAlignStack (in category 'marshalling') -----
  mustAlignStack
  "Many ABIs mandate a particular stack alignment greater than the natural word size.
  If so, this macro will answer true.  See class-side preambleCCode."
  <cmacro: '() MUST_ALIGN_STACK'>
+ ^false!
- ^0!

Item was changed:
  ----- Method: ThreadedFFIPlugin>>primitiveLoadSymbolFromModule (in category 'primitives') -----
  primitiveLoadSymbolFromModule
  "Attempt to find the address of a symbol in a loaded library.
  loadSymbol: aSymbol fromModule: moduleName
  <primitive: 'primitiveLoadSymbolFromModule' error: errorCode module: 'SqueakFFIPrims'>
  "
  <export: true>
 
  | symbol module moduleHandle address oop ptr |
 
  <var: #address type: #'void *'>
  <var: #ptr type: #'void **'>
 
  interpreterProxy methodArgumentCount = 2 ifFalse:
  [^interpreterProxy primitiveFailFor: PrimErrBadNumArgs].
 
  module := interpreterProxy stackValue: 0.
  symbol := interpreterProxy stackValue: 1.
 
+ moduleHandle := module ~= interpreterProxy nilObject ifTrue:
+ [self ffiLoadCalloutModule: module].
- module ~= interpreterProxy nilObject
- ifTrue: [ moduleHandle := self ffiLoadCalloutModule: module ]
- ifFalse: [ moduleHandle := nil ].
  interpreterProxy failed ifTrue:
  [^interpreterProxy primitiveFailFor: PrimErrNotFound].
  address := interpreterProxy
  ioLoadSymbol: (self cCoerce: (interpreterProxy firstIndexableField: symbol) to: #sqInt)
  OfLength: (interpreterProxy byteSizeOf: symbol)
  FromModule: moduleHandle.
  (interpreterProxy failed
  or: [address = 0]) ifTrue:
  [^interpreterProxy primitiveFailFor: PrimErrNotFound].
 
  oop := interpreterProxy
  instantiateClass: interpreterProxy classExternalAddress
  indexableSize: (self sizeof: #'void *').
  ptr := interpreterProxy firstIndexableField: oop.
  ptr at: 0 put: address.
 
  ^interpreterProxy methodReturnValue: oop!

Item was changed:
  ----- Method: ThreadedIA32FFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
  ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
  <var: #procAddr type: #'void *'>
  <var: #calloutState type: #'CalloutState *'>
  "Go out, call this guy and create the return value.  This *must* be inlined because of
  the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
+ | myThreadIndex atomicType floatRet intRet |
- | myThreadIndex atomicType floatRet intRet oop |
  <var: #floatRet type: #double>
  <var: #intRet type: #usqLong>
  <inline: true>
+ self maybeDisownVM: calloutState threadIndexInto: [:threadIndex| myThreadIndex := threadIndex].
- self cppIf: COGMTVM ifTrue:
- [(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
- [myThreadIndex := interpreterProxy disownVM: 0]].
 
  (self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
  [self setsp: calloutState argVector].
 
  atomicType := self atomicTypeOf: calloutState ffiRetHeader.
+ (atomicType >> 1) = (FFITypeSingleFloat >> 1) ifTrue:
+ [floatRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)()').
+
+ "undo any callee argument pops because it may confuse stack management with the alloca."
+ (self isCalleePopsConvention: calloutState callFlags) ifTrue:
+ [self setsp: calloutState argVector].
+ self maybeOwnVM: calloutState threadIndex: myThreadIndex.
+
+ ^interpreterProxy floatObjectOf: floatRet].
+
+ intRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'usqLong (*)()').
+
- (atomicType >> 1) = (FFITypeSingleFloat >> 1)
- ifTrue:
- [floatRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)()')]
- ifFalse:
- [intRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'usqLong (*)()')].
  "undo any callee argument pops because it may confuse stack management with the alloca."
  (self isCalleePopsConvention: calloutState callFlags) ifTrue:
  [self setsp: calloutState argVector].
+ self maybeOwnVM: calloutState threadIndex: myThreadIndex.
 
- self cppIf: COGMTVM ifTrue:
- [(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
- [interpreterProxy ownVM: myThreadIndex]].
-
  (calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
  ["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
  'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
+ (calloutState ffiRetHeader anyMask: FFIFlagPointer) ifTrue:
+ [^self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
+ ^self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
- (calloutState ffiRetHeader anyMask: FFIFlagPointer)
- ifTrue:
- [oop := self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState]
- ifFalse:
- [oop := self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
- ^oop].
 
+ ^self ffiCreateIntegralResultOop: intRet ofAtomicType: atomicType in: calloutState!
- (atomicType >> 1) = (FFITypeSingleFloat >> 1)
- ifTrue:
- [oop := interpreterProxy floatObjectOf: floatRet]
- ifFalse:
- [oop := self ffiCreateIntegralResultOop: intRet
- ofAtomicType: atomicType
- in: calloutState].
- ^interpreterProxy methodReturnValue: oop!

Item was changed:
  ----- Method: ThreadedIA32FFIPlugin>>ffiReturnStruct:ofType:in: (in category 'callout support') -----
  ffiReturnStruct: longLongRet ofType: ffiRetType in: calloutState
  <var: #longLongRet type: #usqLong>
  <var: #calloutState type: #'CalloutState *'>
  "Create a structure return value from an external function call.  The value as been stored in
  alloca'ed space pointed to by the calloutState."
  | retOop retClass oop |
  <inline: true>
  retClass := interpreterProxy fetchPointer: 1 ofObject: ffiRetType.
  retOop := interpreterProxy instantiateClass: retClass indexableSize: 0.
  self remapOop: retOop
  in: [oop := interpreterProxy
  instantiateClass: interpreterProxy classByteArray
  indexableSize: calloutState structReturnSize].
+ self mem: (interpreterProxy firstIndexableField: oop)
+ cp: ((self returnStructInRegisters: calloutState structReturnSize)
+ ifTrue: [(self addressOf: longLongRet) asVoidPointer]
+ ifFalse: [calloutState limit])
+ y: calloutState structReturnSize.
- (self returnStructInRegisters: calloutState structReturnSize)
- ifTrue:
- [self mem: (interpreterProxy firstIndexableField: oop) cp: (self addressOf: longLongRet) y: calloutState structReturnSize]
- ifFalse:
- [self mem: (interpreterProxy firstIndexableField: oop) cp: calloutState limit y: calloutState structReturnSize].
  interpreterProxy storePointer: 0 ofObject: retOop withValue: oop.
+ ^retOop!
- ^interpreterProxy methodReturnValue: retOop!

Item was changed:
  ----- Method: ThreadedX64SysVFFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
  ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
  <var: #procAddr type: #'void *'>
  <var: #calloutState type: #'CalloutState *'>
  <var: #loadFloatRegs declareC: 'extern void loadFloatRegs(double, double, double, double, double, double, double, double)'>
  "Go out, call this guy and create the return value.  This *must* be inlined because of
  the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
+ | myThreadIndex atomicType floatRet intRet loadFloatRegs |
- | myThreadIndex atomicType floatRet intRet loadFloatRegs oop |
  <var: #floatRet type: #double>
  <var: #intRet type: 'SixteenByteReturn'>
  <inline: true>
  self cCode: '' inSmalltalk: [loadFloatRegs := #used. loadFloatRegs class].
+ self maybeDisownVM: calloutState threadIndexInto: [:threadIndex| myThreadIndex := threadIndex].
- self cppIf: COGMTVM ifTrue:
- [(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
- [myThreadIndex := interpreterProxy disownVM: 0]].
 
  calloutState floatRegisterIndex > 0 ifTrue:
  [self
  load: (calloutState floatRegisters at: 0)
  Flo: (calloutState floatRegisters at: 1)
  a: (calloutState floatRegisters at: 2)
  t: (calloutState floatRegisters at: 3)
  R: (calloutState floatRegisters at: 4)
  e: (calloutState floatRegisters at: 5)
  g: (calloutState floatRegisters at: 6)
  s: (calloutState floatRegisters at: 7)].
 
  (self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
  [self setsp: calloutState argVector].
 
  atomicType := self atomicTypeOf: calloutState ffiRetHeader.
+ (atomicType >> 1) = (FFITypeSingleFloat >> 1) ifTrue:
+ [atomicType = FFITypeSingleFloat
+ ifTrue:
+ [floatRet := self
+ dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
+ with: (calloutState integerRegisters at: 0)
+ with: (calloutState integerRegisters at: 1)
+ with: (calloutState integerRegisters at: 2)
+ with: (calloutState integerRegisters at: 3)
+ with: (calloutState integerRegisters at: 4)
+ with: (calloutState integerRegisters at: 5)]
+ ifFalse: "atomicType = FFITypeDoubleFloat"
+ [floatRet := self
+ dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
+ with: (calloutState integerRegisters at: 0)
+ with: (calloutState integerRegisters at: 1)
+ with: (calloutState integerRegisters at: 2)
+ with: (calloutState integerRegisters at: 3)
+ with: (calloutState integerRegisters at: 4)
+ with: (calloutState integerRegisters at: 5)].
+
+ self maybeOwnVM: calloutState threadIndex: myThreadIndex.
+
+ ^interpreterProxy floatObjectOf: floatRet].
+
+ intRet := self
- (atomicType >> 1) = (FFITypeSingleFloat >> 1)
- ifTrue:
- [atomicType = FFITypeSingleFloat
- ifTrue:
- [floatRet := self
- dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
- with: (calloutState integerRegisters at: 0)
- with: (calloutState integerRegisters at: 1)
- with: (calloutState integerRegisters at: 2)
- with: (calloutState integerRegisters at: 3)
- with: (calloutState integerRegisters at: 4)
- with: (calloutState integerRegisters at: 5)]
- ifFalse: "atomicType = FFITypeDoubleFloat"
- [floatRet := self
- dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
- with: (calloutState integerRegisters at: 0)
- with: (calloutState integerRegisters at: 1)
- with: (calloutState integerRegisters at: 2)
- with: (calloutState integerRegisters at: 3)
- with: (calloutState integerRegisters at: 4)
- with: (calloutState integerRegisters at: 5)]]
- ifFalse:
- [intRet := self
  dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'SixteenByteReturn (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
  with: (calloutState integerRegisters at: 0)
  with: (calloutState integerRegisters at: 1)
  with: (calloutState integerRegisters at: 2)
  with: (calloutState integerRegisters at: 3)
  with: (calloutState integerRegisters at: 4)
+ with: (calloutState integerRegisters at: 5).
- with: (calloutState integerRegisters at: 5)].
- "undo any callee argument pops because it may confuse stack management with the alloca."
- (self isCalleePopsConvention: calloutState callFlags) ifTrue:
- [self setsp: calloutState argVector].
 
+ self maybeOwnVM: calloutState threadIndex: myThreadIndex.
- self cppIf: COGMTVM ifTrue:
- [(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
- [interpreterProxy ownVM: myThreadIndex]].
 
  (calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
  ["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
  'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
+ (calloutState ffiRetHeader anyMask: FFIFlagPointer) ifTrue:
+ [^self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
+ ^self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
- (calloutState ffiRetHeader anyMask: FFIFlagPointer)
- ifTrue:
- [oop := self ffiReturnPointer: intRet a ofType: (self ffiReturnType: specOnStack) in: calloutState]
- ifFalse:
- [oop := self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
- ^oop].
 
+ ^self ffiCreateIntegralResultOop: intRet ofAtomicType: atomicType in: calloutState!
- (atomicType >> 1) = (FFITypeSingleFloat >> 1)
- ifTrue:
- [oop := interpreterProxy floatObjectOf: floatRet]
- ifFalse:
- [oop := self ffiCreateIntegralResultOop: intRet a
- ofAtomicType: atomicType
- in: calloutState].
- ^interpreterProxy methodReturnValue: oop!

Item was changed:
  ----- Method: ThreadedX64SysVFFIPlugin>>ffiReturnStruct:ofType:in: (in category 'callout support') -----
  ffiReturnStruct: sixteenByteRet ofType: ffiRetType in: calloutState
  <var: #sixteenByteRet type: 'SixteenByteReturn'>
  <var: #calloutState type: #'CalloutState *'>
  "Create a structure return value from an external function call.  The value has been stored in
  alloca'ed space pointed to by the calloutState or in the return value."
  | retOop retClass oop |
  <inline: true>
  retClass := interpreterProxy fetchPointer: 1 ofObject: ffiRetType.
  retOop := interpreterProxy instantiateClass: retClass indexableSize: 0.
  self remapOop: retOop
  in: [oop := interpreterProxy
  instantiateClass: interpreterProxy classByteArray
  indexableSize: calloutState structReturnSize].
  self mem: (interpreterProxy firstIndexableField: oop)
  cp: ((self returnStructInRegisters: calloutState structReturnSize)
  ifTrue: [(self addressOf: sixteenByteRet) asVoidPointer]
  ifFalse: [calloutState limit])
  y: calloutState structReturnSize.
  interpreterProxy storePointer: 0 ofObject: retOop withValue: oop.
+ ^retOop!
- ^interpreterProxy methodReturnValue: retOop!

Item was changed:
  ----- Method: ThreadedX64Win64FFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
  ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
  <var: #procAddr type: #'void *'>
  <var: #calloutState type: #'CalloutState *'>
  <var: #loadFloatRegs declareC: 'extern void loadFloatRegs(double, double, double, double)'>
  "Go out, call this guy and create the return value.  This *must* be inlined because of
  the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
+ | myThreadIndex atomicType floatRet intRet loadFloatRegs |
- | myThreadIndex atomicType floatRet intRet loadFloatRegs oop |
  <var: #floatRet type: #double>
  <var: #intRet type: #usqLong>
  <inline: true>
  self cCode: '' inSmalltalk: [loadFloatRegs := #used. loadFloatRegs class].
+ self maybeDisownVM: calloutState threadIndexInto: [:threadIndex| myThreadIndex := threadIndex].
- self cppIf: COGMTVM ifTrue:
- [(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
- [myThreadIndex := interpreterProxy disownVM: 0]].
 
  calloutState floatRegisterSignature > 0 ifTrue:
  [self
  load: (calloutState floatRegisters at: 0)
  Flo: (calloutState floatRegisters at: 1)
  at: (calloutState floatRegisters at: 2)
  Re: (calloutState floatRegisters at: 3)
  gs: (calloutState floatRegisters at: 4)].
 
  (self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
  [self setsp: calloutState argVector].
+
-
  atomicType := self atomicTypeOf: calloutState ffiRetHeader.
+ (atomicType >> 1) = (FFITypeSingleFloat >> 1) ifTrue:
+ [atomicType = FFITypeSingleFloat
+ ifTrue:
+ [floatRet := self
+ dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
+ with: (calloutState integerRegisters at: 0)
+ with: (calloutState integerRegisters at: 1)
+ with: (calloutState integerRegisters at: 2)
+ with: (calloutState integerRegisters at: 3)]
+ ifFalse: "atomicType = FFITypeDoubleFloat"
+ [floatRet := self
+ dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
+ with: (calloutState integerRegisters at: 0)
+ with: (calloutState integerRegisters at: 1)
+ with: (calloutState integerRegisters at: 2)
+ with: (calloutState integerRegisters at: 3)].
+
+ "undo any callee argument pops because it may confuse stack management with the alloca."
+ (self isCalleePopsConvention: calloutState callFlags) ifTrue:
+ [self setsp: calloutState argVector].
+ self maybeOwnVM: calloutState threadIndex: myThreadIndex.
+
+ ^interpreterProxy floatObjectOf: floatRet].
+
+ intRet := self
- (atomicType >> 1) = (FFITypeSingleFloat >> 1)
- ifTrue:
- [atomicType = FFITypeSingleFloat
- ifTrue:
- [floatRet := self
- dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
- with: (calloutState integerRegisters at: 0)
- with: (calloutState integerRegisters at: 1)
- with: (calloutState integerRegisters at: 2)
- with: (calloutState integerRegisters at: 3)]
- ifFalse: "atomicType = FFITypeDoubleFloat"
- [floatRet := self
- dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
- with: (calloutState integerRegisters at: 0)
- with: (calloutState integerRegisters at: 1)
- with: (calloutState integerRegisters at: 2)
- with: (calloutState integerRegisters at: 3)]]
- ifFalse:
- [intRet := self
  dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'usqIntptr_t (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
  with: (calloutState integerRegisters at: 0)
  with: (calloutState integerRegisters at: 1)
  with: (calloutState integerRegisters at: 2)
+ with: (calloutState integerRegisters at: 3).
+
- with: (calloutState integerRegisters at: 3)].
  "undo any callee argument pops because it may confuse stack management with the alloca."
  (self isCalleePopsConvention: calloutState callFlags) ifTrue:
  [self setsp: calloutState argVector].
+ self maybeOwnVM: calloutState threadIndex: myThreadIndex.
 
- self cppIf: COGMTVM ifTrue:
- [(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
- [interpreterProxy ownVM: myThreadIndex]].
-
  (calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
  ["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
  'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
+ (calloutState ffiRetHeader anyMask: FFIFlagPointer) ifTrue:
+ [^self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
+ ^self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
- (calloutState ffiRetHeader anyMask: FFIFlagPointer)
- ifTrue:
- [oop := self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState]
- ifFalse:
- [oop := self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
- ^oop].
 
+ ^self ffiCreateIntegralResultOop: intRet ofAtomicType: atomicType in: calloutState!
- (atomicType >> 1) = (FFITypeSingleFloat >> 1)
- ifTrue:
- [oop := interpreterProxy floatObjectOf: floatRet]
- ifFalse:
- [oop := self ffiCreateIntegralResultOop: intRet
- ofAtomicType: atomicType
- in: calloutState].
- ^interpreterProxy methodReturnValue: oop!

Item was changed:
  ----- Method: ThreadedX64Win64FFIPlugin>>ffiReturnStruct:ofType:in: (in category 'callout support') -----
  ffiReturnStruct: intRet ofType: ffiRetType in: calloutState
  <var: #intRet type: #usqLong>
  <var: #calloutState type: #'CalloutState *'>
  "Create a structure return value from an external function call.  The value has been stored in
  alloca'ed space pointed to by the calloutState or in the return value."
  | retOop retClass oop |
  <inline: true>
  retClass := interpreterProxy fetchPointer: 1 ofObject: ffiRetType.
  retOop := interpreterProxy instantiateClass: retClass indexableSize: 0.
  self remapOop: retOop
  in: [oop := interpreterProxy
  instantiateClass: interpreterProxy classByteArray
  indexableSize: calloutState structReturnSize].
  self mem: (interpreterProxy firstIndexableField: oop)
  cp: ((self returnStructInRegisters: calloutState structReturnSize)
  ifTrue: [self addressOf: intRet]
  ifFalse: [calloutState limit])
  y: calloutState structReturnSize.
  interpreterProxy storePointer: 0 ofObject: retOop withValue: oop.
+ ^retOop!
- ^interpreterProxy methodReturnValue: retOop!