VM Maker Inbox: VMMaker.oscog-nice.2542.mcz

commits-2
 
Nicolas Cellier uploaded a new version of VMMaker to project VM Maker Inbox:
http://source.squeak.org/VMMakerInbox/VMMaker.oscog-nice.2542.mcz

==================== Summary ====================

Name: VMMaker.oscog-nice.2542
Author: nice
Time: 27 August 2019, 5:35:02.074497 am
UUID: 0dffed15-ed80-45dd-a042-24e3521665a3
Ancestors: VMMaker.oscog-nice.2541

Provide a new primitiveHighBit and the corresponding JIT code generation in response to https://github.com/OpenSmalltalk/opensmalltalk-vm/issues/418.

Some comments:
I have created a new RTL opcode, ClzRR (an equivalent exists in gcc).
I have also added an IA32-specific BSR opcode.

So it might be necessary to evaluate:
CogRTLOpcodes initialize.
CogX64Compiler initialize.
CogIA32Compiler initialize.
and to recompile the whole package if MC ever fails to get the load order right.

I have provided a concretizeClzRR for ARM and MIPS, but I did not test them at all, so they definitely require validation.
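
As a quick sanity check on the untested ARM encoding, the instruction word can be computed by hand in a workspace. This is only a sketch, not part of the package: it assumes AL is the ARM "always" condition code 16rE, and the register numbers (r0 as destination, r1 as source) are chosen arbitrarily for illustration.

```smalltalk
"Workspace sketch. Assumption: AL = 16rE (ARM 'always' condition)."
| al dest mask word |
al := 16rE.
dest := 0.	"r0, arbitrary choice"
mask := 1.	"r1, arbitrary choice"
"Same arithmetic as concretizeClzRR: cond 0001 0110 1111 Rd 1111 0001 Rm"
word := (al << 28) + 16r16F0F10 + (dest << 12) + mask.
word printStringBase: 16	"'E16F0F11', the encoding of clz r0, r1"
```

Comparing a few such words against the output of an assembler or disassembler would be one cheap way to validate the ARM path before running it on hardware.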

On Intel architectures, the CLZ instruction (LZCNT) is not available before Haswell, so it is mandatory to check its availability via CPUID; see generateCheckLZCNT, ceCheckLZCNTFunction & co.
If CLZ is not available on the CPU, a good fallback with BSR is easy (perhaps a bit slower).
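
The feature test and the relation between the two instructions can be illustrated in a workspace. This is a sketch under stated assumptions: ecx is a made-up sample value standing in for what the CPUID stub returns, and highBit is used to model both instructions on a 32-bit value. As far as I know, on processors without LZCNT the F3 prefix is ignored and the same bytes execute as a plain BSR, so a missing check would give wrong answers silently rather than fault.

```smalltalk
"Workspace sketch. ecx is a hypothetical sample, not a real CPUID result."
| ecx hasLZCNT x bsr lzcnt |
ecx := 16r121.	"sample value with bit 5 set"
hasLZCNT := (ecx bitAnd: 1 << 5) ~= 0.	"same test as hasLZCNTInstructions"

x := 16r12345.	"any non-zero 32-bit value"
bsr := x highBit - 1.	"0-based index of the highest set bit, as BSR answers"
lzcnt := 32 - x highBit.	"leading zero count, as a 32-bit LZCNT answers"
bsr + lzcnt = 31	"true for every non-zero 32-bit x"
```

Since the two results always sum to 31 on 32 bits, either instruction can stand in for the other at the cost of one extra arithmetic step.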

Another problem is that Intel LZCNT/BSR operate on 64 bits only for R0-R7, but ReceiverResultReg is R9 on WIN64, which is unfortunate...

In order to handle these different paths, the code generation is delegated to the backEnd.

There are some bit tricks that work well with a single tag bit, and on Spur64 we shift the 3 tag bits so as to use the same tricks.
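
The single-tag-bit tricks can be checked in a workspace. This is a sketch only: it assumes 32-bit words and that a SmallInteger oop is (value << 1) + 1, and it models the machine instructions with highBit rather than running any generated code. The Spur64 shift is not modelled here.

```smalltalk
"Workspace sketch for 32-bit words and a single tag bit.
 Assumption: a SmallInteger oop is (value << 1) + 1."
| wordBits |
wordBits := 32.
(0 to: 16r3FFFFFFF by: 16r10001) allSatisfy: [:value | | oop lz |
	oop := (value << 1) + 1.
	lz := wordBits - oop highBit.	"leading zero count of the tagged word"
	"CLZ path: wordBits - lz - 1, done as a single XOR because lz is in 0..31"
	(lz bitXor: wordBits - 1) = value highBit
		"BSR path: the tag bit already supplies the +1 for a 1-based rank"
		and: [oop highBit - 1 = value highBit]]
```

The XOR works because lz never exceeds wordBits - 1, so subtracting it from wordBits - 1 only flips bits. The tag bit also guarantees the input is never zero, which is the case where BSR leaves its destination undefined.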

No provision has been made for simulation; this is a TODO.

I have used free slot 575 for the numbered primitiveHighBit.
A bad side effect is that we have to generate 575 primitives in the Spur32 case instead of 222 previously...
Worse, it seems that this makes it a slow primitive instead of a quick one; see MaxQuickPrimitiveIndex.
This choice should be reviewed!

=============== Diff against VMMaker.oscog-nice.2541 ===============

Item was changed:
  ----- Method: CogARMCompiler>>computeMaximumSize (in category 'generate machine code') -----
  computeMaximumSize
  "Because we don't use Thumb, each ARM instruction has 4 bytes. Many
  abstract opcodes need more than one instruction. Instructions that refer
  to constants and/or literals depend on literals being stored in-line or out-of-line.
 
  N.B.  The ^N forms are to get around the bytecode compiler's long branch
  limits which are exceeded when each case jumps around the otherwise."
 
  opcode
  caseOf: {
  "Noops & Pseudo Ops"
  [Label] -> [^0].
  [Literal] -> [^4].
  [AlignmentNops] -> [^(operands at: 0) - 4].
  [Fill32] -> [^4].
  [Nop] -> [^4].
  "Control"
  [Call] -> [^4].
  [CallFull] -> [^self literalLoadInstructionBytes + 4].
  [JumpR] -> [^4].
  [Jump] -> [^4].
  [JumpFull] -> [^self literalLoadInstructionBytes + 4].
  [JumpLong] -> [^4].
  [JumpZero] -> [^4].
  [JumpNonZero] -> [^4].
  [JumpNegative] -> [^4].
  [JumpNonNegative] -> [^4].
  [JumpOverflow] -> [^4].
  [JumpNoOverflow] -> [^4].
  [JumpCarry] -> [^4].
  [JumpNoCarry] -> [^4].
  [JumpLess] -> [^4].
  [JumpGreaterOrEqual] -> [^4].
  [JumpGreater] -> [^4].
  [JumpLessOrEqual] -> [^4].
  [JumpBelow] -> [^4].
  [JumpAboveOrEqual] -> [^4].
  [JumpAbove] -> [^4].
  [JumpBelowOrEqual] -> [^4].
  [JumpLongZero] -> [^4].
  [JumpLongNonZero] -> [^4].
  [JumpFPEqual] -> [^8].
  [JumpFPNotEqual] -> [^8].
  [JumpFPLess] -> [^8].
  [JumpFPGreaterOrEqual]-> [^8].
  [JumpFPGreater] -> [^8].
  [JumpFPLessOrEqual] -> [^8].
  [JumpFPOrdered] -> [^8].
  [JumpFPUnordered] -> [^8].
  [RetN] -> [^(operands at: 0) = 0 ifTrue: [4] ifFalse: [8]].
  [Stop] -> [^4].
 
  "Arithmetic"
  [AddCqR] -> [^self rotateable8bitSignedImmediate: (operands at: 0)
  ifTrue: [:r :i :n| 4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  [AndCqR] -> [^self rotateable8bitBitwiseImmediate: (operands at: 0)
  ifTrue: [:r :i :n| 4]
  ifFalse:
  [self literalLoadInstructionBytes = 4
  ifTrue: [8]
  ifFalse:
  [1 << (operands at: 0) highBit = ((operands at: 0) + 1)
  ifTrue: [8]
  ifFalse: [self literalLoadInstructionBytes + 4]]]].
  [AndCqRR] -> [^self rotateable8bitBitwiseImmediate: (operands at: 0)
  ifTrue: [:r :i :n| 4]
  ifFalse:
  [self literalLoadInstructionBytes = 4
  ifTrue: [8]
  ifFalse:
  [1 << (operands at: 0) highBit = ((operands at: 0) + 1)
  ifTrue: [8]
  ifFalse: [self literalLoadInstructionBytes + 4]]]].
  [CmpCqR] -> [^self rotateable8bitSignedImmediate: (operands at: 0)
  ifTrue: [:r :i :n| 4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  [OrCqR] -> [^self rotateable8bitImmediate: (operands at: 0)
  ifTrue: [:r :i| 4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  [SubCqR] -> [^self rotateable8bitSignedImmediate: (operands at: 0)
  ifTrue: [:r :i :n| 4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  [TstCqR] -> [^self rotateable8bitImmediate: (operands at: 0)
  ifTrue: [:r :i| 4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  [XorCqR] -> [^self rotateable8bitBitwiseImmediate: (operands at: 0)
  ifTrue: [:r :i :n| 4]
  ifFalse:
  [self literalLoadInstructionBytes = 4
  ifTrue: [8]
  ifFalse:
  [1 << (operands at: 0) highBit = ((operands at: 0) + 1)
  ifTrue: [8]
  ifFalse: [self literalLoadInstructionBytes + 4]]]].
  [AddCwR] -> [^self literalLoadInstructionBytes + 4].
  [AndCwR] -> [^self literalLoadInstructionBytes + 4].
  [CmpCwR] -> [^self literalLoadInstructionBytes + 4].
  [OrCwR] -> [^self literalLoadInstructionBytes + 4].
  [SubCwR] -> [^self literalLoadInstructionBytes + 4].
  [XorCwR] -> [^self literalLoadInstructionBytes + 4].
  [AddRR] -> [^4].
  [AndRR] -> [^4].
  [CmpRR] -> [^4].
  [OrRR] -> [^4].
  [XorRR] -> [^4].
  [SubRR] -> [^4].
  [NegateR] -> [^4].
  [LoadEffectiveAddressMwrR]
  -> [^self rotateable8bitImmediate: (operands at: 0)
  ifTrue: [:r :i| 4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
 
  [LogicalShiftLeftCqR] -> [^4].
  [LogicalShiftRightCqR] -> [^4].
  [ArithmeticShiftRightCqR] -> [^4].
  [LogicalShiftLeftRR] -> [^4].
  [LogicalShiftRightRR] -> [^4].
  [ArithmeticShiftRightRR] -> [^4].
  [AddRdRd] -> [^4].
  [CmpRdRd] -> [^4].
  [SubRdRd] -> [^4].
  [MulRdRd] -> [^4].
  [DivRdRd] -> [^4].
  [SqrtRd] -> [^4].
+ [ClzRR] -> [^4].
  "ARM Specific Arithmetic"
  [SMULL] -> [^4].
  [MSR] -> [^4].
  [CMPSMULL] -> [^4]. "special compare for genMulR:R: usage"
  "ARM Specific Data Movement"
  [PopLDM] -> [^4].
  [PushSTM] -> [^4].
  "Data Movement"
  [MoveCqR] -> [^self literalLoadInstructionBytes = 4
  ifTrue: [self literalLoadInstructionBytes]
  ifFalse:
  [self rotateable8bitBitwiseImmediate: (operands at: 0)
  ifTrue: [:r :i :n| 4]
  ifFalse: [self literalLoadInstructionBytes]]].
  [MoveCwR] -> [^self literalLoadInstructionBytes = 4
  ifTrue: [self literalLoadInstructionBytes]
  ifFalse:
  [(self inCurrentCompilation: (operands at: 0))
  ifTrue: [4]
  ifFalse: [self literalLoadInstructionBytes]]].
  [MoveRR] -> [^4].
  [MoveRdRd] -> [^4].
  [MoveAwR] -> [^(self isAddressRelativeToVarBase: (operands at: 0))
  ifTrue: [4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  [MoveRAw] -> [^(self isAddressRelativeToVarBase: (operands at: 1))
  ifTrue: [4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  [MoveAbR] -> [^(self isAddressRelativeToVarBase: (operands at: 0))
  ifTrue: [4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  [MoveRAb] -> [^(self isAddressRelativeToVarBase: (operands at: 1))
  ifTrue: [4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  [MoveRMwr] -> [^self is12BitValue: (operands at: 1)
  ifTrue: [:u :i| 4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  [MoveRdM64r] -> [^self literalLoadInstructionBytes + 4].
  [MoveMbrR] -> [^self is12BitValue: (operands at: 0)
  ifTrue: [:u :i| 4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  [MoveRMbr] -> [^self is12BitValue: (operands at: 1)
  ifTrue: [:u :i| 4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  [MoveRM16r] -> [^self is12BitValue: (operands at: 1)
  ifTrue: [:u :i| 4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  [MoveM16rR] -> [^self rotateable8bitImmediate: (operands at: 0)
  ifTrue: [:r :i| 4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  [MoveM64rRd] -> [^self literalLoadInstructionBytes + 4].
  [MoveMwrR] -> [^self is12BitValue: (operands at: 0)
  ifTrue: [:u :i| 4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  [MoveXbrRR] -> [^4].
  [MoveRXbrR] -> [^4].
  [MoveXwrRR] -> [^4].
  [MoveRXwrR] -> [^4].
  [PopR] -> [^4].
  [PushR] -> [^4].
  [PushCw] -> [^self literalLoadInstructionBytes = 4
  ifTrue: [self literalLoadInstructionBytes + 4]
  ifFalse:
  [(self inCurrentCompilation: (operands at: 0))
  ifTrue: [8]
  ifFalse:
  [self rotateable8bitBitwiseImmediate: (operands at: 0)
  ifTrue: [:r :i :n| 8]
  ifFalse: [self literalLoadInstructionBytes + 4]]]].
  [PushCq] -> [^self literalLoadInstructionBytes = 4
  ifTrue: [self literalLoadInstructionBytes + 4]
  ifFalse:
  [self rotateable8bitBitwiseImmediate: (operands at: 0)
  ifTrue: [:r :i :n| 8]
  ifFalse: [self literalLoadInstructionBytes + 4]]].
  [PrefetchAw] -> [^(self isAddressRelativeToVarBase: (operands at: 0))
  ifTrue: [4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  "Conversion"
  [ConvertRRd] -> [^8].
  }.
  ^0 "to keep C compiler quiet"
  !

Item was added:
+ ----- Method: CogARMCompiler>>concretizeClzRR (in category 'generate machine code - concretize') -----
+ concretizeClzRR
+ "Count leading zeros
+ First operand is output (dest)
+ Second operand is input (mask)"
+ "v5 CLZ cond 0001 0110 SBO Rd SBO 0001 Rm
+ That is hexa 16rE16FxF1x for cond = AL"
+ "v7 CLZ 11111-010-1-011 Rm(4bits) 1111 Rd(4bits) 1000 Rm(4bits)
+ That is hexa 16rFABxFx8x"
+ <inline: true>
+ | dest mask |
+ dest := operands at: 0.
+ mask := operands at: 1.
+ self machineCodeAt: 0 put:
+ AL << 28 + 16r16F0F10
+ + (dest << 12) + mask.
+ ^machineCodeSize := 4!

Item was changed:
  ----- Method: CogARMCompiler>>dispatchConcretize (in category 'generate machine code') -----
  dispatchConcretize
  "Attempt to generate concrete machine code for the instruction at address.
  This is the inner dispatch of concretizeAt: actualAddress which exists only
  to get around the branch size limits in the SqueakV3 (blue book derived)
  bytecode set."
  <returnTypeC: #void>
  conditionOrNil ifNotNil:
  [self concretizeConditionalInstruction.
  ^self].
 
  opcode caseOf: {
  "Noops & Pseudo Ops"
  [Label] -> [^self concretizeLabel].
  [Literal] -> [^self concretizeLiteral].
  [AlignmentNops] -> [^self concretizeAlignmentNops].
  [Fill32] -> [^self concretizeFill32].
  [Nop] -> [^self concretizeNop].
  "Control"
  [Call] -> [^self concretizeCall]. "call code within code space"
  [CallFull] -> [^self concretizeCallFull]. "call code anywhere in address space"
  [JumpR] -> [^self concretizeJumpR].
  [JumpFull] -> [^self concretizeJumpFull]."jump within address space"
  [JumpLong] -> [^self concretizeConditionalJump: AL]."jumps witihn code space"
  [JumpLongZero] -> [^self concretizeConditionalJump: EQ].
  [JumpLongNonZero] -> [^self concretizeConditionalJump: NE].
  [Jump] -> [^self concretizeConditionalJump: AL].
  [JumpZero] -> [^self concretizeConditionalJump: EQ].
  [JumpNonZero] -> [^self concretizeConditionalJump: NE].
  [JumpNegative] -> [^self concretizeConditionalJump: MI].
  [JumpNonNegative] -> [^self concretizeConditionalJump: PL].
  [JumpOverflow] -> [^self concretizeConditionalJump: VS].
  [JumpNoOverflow] -> [^self concretizeConditionalJump: VC].
  [JumpCarry] -> [^self concretizeConditionalJump: CS].
  [JumpNoCarry] -> [^self concretizeConditionalJump: CC].
  [JumpLess] -> [^self concretizeConditionalJump: LT].
  [JumpGreaterOrEqual] -> [^self concretizeConditionalJump: GE].
  [JumpGreater] -> [^self concretizeConditionalJump: GT].
  [JumpLessOrEqual] -> [^self concretizeConditionalJump: LE].
  [JumpBelow] -> [^self concretizeConditionalJump: CC]. "unsigned lower"
  [JumpAboveOrEqual] -> [^self concretizeConditionalJump: CS]. "unsigned greater or equal"
  [JumpAbove] -> [^self concretizeConditionalJump: HI].
  [JumpBelowOrEqual] -> [^self concretizeConditionalJump: LS].
  [JumpFPEqual] -> [^self concretizeFPConditionalJump: EQ].
  [JumpFPNotEqual] -> [^self concretizeFPConditionalJump: NE].
  [JumpFPLess] -> [^self concretizeFPConditionalJump: LT].
  [JumpFPGreaterOrEqual] -> [^self concretizeFPConditionalJump: GE].
  [JumpFPGreater] -> [^self concretizeFPConditionalJump: GT].
  [JumpFPLessOrEqual] -> [^self concretizeFPConditionalJump: LE].
  [JumpFPOrdered] -> [^self concretizeFPConditionalJump: VC].
  [JumpFPUnordered] -> [^self concretizeFPConditionalJump: VS].
  [RetN] -> [^self concretizeRetN].
  [Stop] -> [^self concretizeStop].
  "Arithmetic"
  [AddCqR] -> [^self concretizeNegateableDataOperationCqR: AddOpcode].
  [AndCqR] -> [^self concretizeInvertibleDataOperationCqR: AndOpcode].
  [AndCqRR] -> [^self concretizeAndCqRR].
  [CmpCqR] -> [^self concretizeNegateableDataOperationCqR: CmpOpcode].
  [OrCqR] -> [^self concretizeDataOperationCqR: OrOpcode].
  [SubCqR] -> [^self concretizeSubCqR].
  [TstCqR] -> [^self concretizeTstCqR].
  [XorCqR] -> [^self concretizeInvertibleDataOperationCqR: XorOpcode].
  [AddCwR] -> [^self concretizeDataOperationCwR: AddOpcode].
  [AndCwR] -> [^self concretizeDataOperationCwR: AndOpcode].
  [CmpCwR] -> [^self concretizeDataOperationCwR: CmpOpcode].
  [OrCwR] -> [^self concretizeDataOperationCwR: OrOpcode].
  [SubCwR] -> [^self concretizeDataOperationCwR: SubOpcode].
  [XorCwR] -> [^self concretizeDataOperationCwR: XorOpcode].
  [AddRR] -> [^self concretizeDataOperationRR: AddOpcode].
  [AndRR] -> [^self concretizeDataOperationRR: AndOpcode].
  [CmpRR] -> [^self concretizeDataOperationRR: CmpOpcode].
  [OrRR] -> [^self concretizeDataOperationRR: OrOpcode].
  [SubRR] -> [^self concretizeDataOperationRR: SubOpcode].
  [XorRR] -> [^self concretizeDataOperationRR: XorOpcode].
  [AddRdRd] -> [^self concretizeAddRdRd].
  [CmpRdRd] -> [^self concretizeCmpRdRd].
  [DivRdRd] -> [^self concretizeDivRdRd].
  [MulRdRd] -> [^self concretizeMulRdRd].
  [SubRdRd] -> [^self concretizeSubRdRd].
  [SqrtRd] -> [^self concretizeSqrtRd].
  [NegateR] -> [^self concretizeNegateR].
  [LoadEffectiveAddressMwrR] -> [^self concretizeLoadEffectiveAddressMwrR].
  [ArithmeticShiftRightCqR] -> [^self concretizeArithmeticShiftRightCqR].
  [LogicalShiftRightCqR] -> [^self concretizeLogicalShiftRightCqR].
  [LogicalShiftLeftCqR] -> [^self concretizeLogicalShiftLeftCqR].
  [ArithmeticShiftRightRR] -> [^self concretizeArithmeticShiftRightRR].
  [LogicalShiftLeftRR] -> [^self concretizeLogicalShiftLeftRR].
  [LogicalShiftRightRR] -> [^self concretizeLogicalShiftRightRR].
+ [ClzRR] -> [^self concretizeClzRR].
  "ARM Specific Arithmetic"
  [SMULL] -> [^self concretizeSMULL] .
  [CMPSMULL] -> [^self concretizeCMPSMULL].
  [MSR] -> [^self concretizeMSR].
  "ARM Specific Data Movement"
  [PopLDM] -> [^self concretizePushOrPopMultipleRegisters: false].
  [PushSTM] -> [^self concretizePushOrPopMultipleRegisters: true].
  "Data Movement"
  [MoveCqR] -> [^self concretizeMoveCqR].
  [MoveCwR] -> [^self concretizeMoveCwR].
  [MoveRR] -> [^self concretizeMoveRR].
  [MoveAwR] -> [^self concretizeMoveAwR].
  [MoveRAw] -> [^self concretizeMoveRAw].
  [MoveAbR] -> [^self concretizeMoveAbR].
    [MoveRAb] -> [^self concretizeMoveRAb].
  [MoveMbrR] -> [^self concretizeMoveMbrR].
  [MoveRMbr] -> [^self concretizeMoveRMbr].
  [MoveRM16r] -> [^self concretizeMoveRM16r].
  [MoveM16rR] -> [^self concretizeMoveM16rR].
  [MoveM64rRd] -> [^self concretizeMoveM64rRd].
  [MoveMwrR] -> [^self concretizeMoveMwrR].
  [MoveXbrRR] -> [^self concretizeMoveXbrRR].
  [MoveRXbrR] -> [^self concretizeMoveRXbrR].
  [MoveXwrRR] -> [^self concretizeMoveXwrRR].
  [MoveRXwrR] -> [^self concretizeMoveRXwrR].
  [MoveRMwr] -> [^self concretizeMoveRMwr].
  [MoveRdM64r] -> [^self concretizeMoveRdM64r].
  [PopR] -> [^self concretizePopR].
  [PushR] -> [^self concretizePushR].
  [PushCq] -> [^self concretizePushCq].
  [PushCw] -> [^self concretizePushCw].
  [PrefetchAw] -> [^self concretizePrefetchAw].
  "Conversion"
  [ConvertRRd] -> [^self concretizeConvertRRd]}!

Item was added:
+ ----- Method: CogARMCompiler>>hasLZCNTInstructions (in category 'testing') -----
+ hasLZCNTInstructions
+ "Answer if the processor has a LZCNT (leading zero count) instruction"
+ <inline: true>
+ ^true "CLZ"!

Item was changed:
  ----- Method: CogARMCompiler>>setsConditionCodesFor: (in category 'testing') -----
  setsConditionCodesFor: aConditionalJumpOpcode
  <inline: false> "to save Slang from having to be a real compiler (it can't inline switches that return)"
  "Answer if the receiver's opcode sets the condition codes correctly for the given conditional jump opcode.
  ARM has to check carefully since the V flag is not affected by non-comparison instructions"
  ^opcode caseOf:
  { [ArithmeticShiftRightCqR] -> [self shiftSetsConditionCodesFor: aConditionalJumpOpcode].
  [ArithmeticShiftRightRR] -> [self shiftSetsConditionCodesFor: aConditionalJumpOpcode].
  [LogicalShiftLeftCqR] -> [self shiftSetsConditionCodesFor: aConditionalJumpOpcode].
  [LogicalShiftLeftRR] -> [self shiftSetsConditionCodesFor: aConditionalJumpOpcode].
+ [XorRR] -> [true].
+ [ClzRR] -> [false]
- [XorRR] -> [true]
  }
  otherwise: [self halt: 'unhandled opcode in setsConditionCodesFor:'. false]!

Item was added:
+ ----- Method: CogAbstractInstruction>>genHighBitAlternativeIn:ofSmallIntegerOopWithSingleTagBit: (in category 'abstract instructions') -----
+ genHighBitAlternativeIn: destReg ofSmallIntegerOopWithSingleTagBit: srcReg
+ "Use an alternative - if any for generating highBit of SmallInteger oop.
+ Default implementation is no-op - there is no universal alternative to CLZ.
+ Some target architecture might offer more..."
+ <inline: true>
+ <returnTypeC: #'AbstractInstruction *'>
+ ^0!

Item was added:
+ ----- Method: CogAbstractInstruction>>genHighBitClzIn:ofSmallIntegerOopWithSingleTagBit: (in category 'abstract instructions') -----
+ genHighBitClzIn: destReg ofSmallIntegerOopWithSingleTagBit: srcReg
+ "Use CLZ instruction for generating highBit of SmallInteger oop.
+ It is sender responsibility to make sure that such CLZ instruction is available."
+ <inline: true>
+ | jumpNegativeReceiver |
+ <var: #jumpNegativeReceiver type: #'AbstractInstruction *'>
+ cogit ClzR: destReg R: srcReg.
+ (cogit lastOpcode setsConditionCodesFor: JumpZero) ifFalse:
+ [cogit CmpCq: 0 R: destReg]. "N.B. FLAGS := destReg - 0"
+ jumpNegativeReceiver := cogit JumpZero: 0.
+ "Note the nice bit trick below:
+ highBit_1based_of_small_int_value = (BytesPerWord * 8) - leadingZeroCout_of_oop - 1 toAccountForTagBit.
+ This is like 2 complements (- reg - 1) on (BytesPerWord * 8) log2 bits, or exactly a bit invert operation..."
+ cogit XorCw: BytesPerWord * 8 - 1 R: destReg.
+ ^jumpNegativeReceiver!

Item was added:
+ ----- Method: CogAbstractInstruction>>hasLZCNTInstructions (in category 'testing') -----
+ hasLZCNTInstructions
+ "Answer if the processor has a LZCNT (leading zero count) instruction"
+ <inline: true>
+ ^self subclassResponsibility!

Item was added:
+ ----- Method: CogAbstractInstruction>>numCheckLZCNTOpcodes (in category 'initialization') -----
+ numCheckLZCNTOpcodes
+ "If the priocessor has a feature check facility answer the number
+ of opcodes required to compile an accessor for the feature."
+ ^0!

Item was changed:
  CogAbstractInstruction subclass: #CogIA32Compiler
  instanceVariableNames: ''
+ classVariableNames: 'BSR CDQ CLD CMPXCHGAwR CMPXCHGMwrR CPUID EAX EBP EBX ECX EDI EDX ESI ESP FSTPD FSTPS IDIVR IMULRR LFENCE LOCK MFENCE MOVSB MOVSD ModReg ModRegInd ModRegIndDisp32 ModRegIndSIB ModRegRegDisp32 ModRegRegDisp8 REP SFENCE SIB1 SIB2 SIB4 SIB8 XCHGAwR XCHGMwrR XCHGRR XMM0L XMM1L XMM2L XMM3L XMM4L XMM5L XMM6L XMM7L'
- classVariableNames: 'CDQ CLD CMPXCHGAwR CMPXCHGMwrR CPUID EAX EBP EBX ECX EDI EDX ESI ESP FSTPD FSTPS IDIVR IMULRR LFENCE LOCK MFENCE MOVSB MOVSD ModReg ModRegInd ModRegIndDisp32 ModRegIndSIB ModRegRegDisp32 ModRegRegDisp8 REP SFENCE SIB1 SIB2 SIB4 SIB8 XCHGAwR XCHGMwrR XCHGRR XMM0L XMM1L XMM2L XMM3L XMM4L XMM5L XMM6L XMM7L'
  poolDictionaries: ''
  category: 'VMMaker-JIT'!
 
  !CogIA32Compiler commentStamp: 'eem 9/14/2015 17:13' prior: 0!
  I generate IA32 (x86) instructions from CogAbstractInstructions.  For reference see
  1. IA-32 Intel® Architecture Software Developer's Manual Volume 2A: Instruction Set Reference, A-M
  2. IA-32 Intel® Architecture Software Developer's Manual Volume 2A: Instruction Set Reference, N-Z
  http://www.intel.com/products/processor/manuals/
  (® is supposed to be the Unicode "registered  sign".
 
  This class does not take any special action to flush the instruction cache on instruction-modification, trusting that Intel and AMD processors correctly invalidate the instruction cache via snooping.  According to the manuals, this will work on systems where code and data have the same virtual address.  The CogICacheFlushingIA32Compiler subclass exists to use the CPUID instruction to serialize instruction-modification for systems with code and data at different virtual addresses.!

Item was changed:
  ----- Method: CogIA32Compiler class>>initialize (in category 'class initialization') -----
  initialize
  "Initialize various IA32/x86 instruction-related constants.
  [1] IA-32 Intel® Architecture Software Developer's Manual Volume 2A: Instruction Set Reference, A-M"
 
  "CogIA32Compiler initialize"
 
  self ~~ CogIA32Compiler ifTrue: [^self].
 
  "N.B. EAX ECX and EDX are caller-save (scratch) registers.
  EBX ESI and EDI are callee-save; see concreteRegisterFor:"
  EAX := 0.
  ECX := 1.  "Were they completely mad or simply sadistic?"
  EDX := 2.
  EBX := 3.
  ESP := 4.
  EBP := 5.
  ESI := 6.
  EDI := 7.
 
  XMM0L := 0.
  XMM1L := 1.
  XMM2L := 2.
  XMM3L := 3.
  XMM4L := 4.
  XMM5L := 5.
  XMM6L := 6.
  XMM7L := 7.
 
  "Mod R/M Mod fields.  See [1] Sec 2.4, 2.5 & 2.6 & Table 2-2"
  ModRegInd := 0.
  ModRegIndSIB := 4.
  ModRegIndDisp32 := 5.
  ModRegRegDisp8 := 1.
  ModRegRegDisp32 := 2.
  ModReg := 3.
 
  "SIB Scaled Index modes.  See [1] Sec 2.4, 2.5 & 2.6 & Table 2-3"
  SIB1 := 0.
  SIB2 := 1.
  SIB4 := 2.
  SIB8 := 3.
 
  "Specific instructions"
  self
+ initializeSpecificOpcodes: #(CDQ IDIVR IMULRR CPUID LFENCE MFENCE SFENCE LOCK CMPXCHGAwR CMPXCHGMwrR XCHGAwR XCHGMwrR XCHGRR FSTPS FSTPD CLD REP MOVSB MOVSD BSR)
- initializeSpecificOpcodes: #(CDQ IDIVR IMULRR CPUID LFENCE MFENCE SFENCE LOCK CMPXCHGAwR CMPXCHGMwrR XCHGAwR XCHGMwrR XCHGRR FSTPS FSTPD CLD REP MOVSB MOVSD)
  in: thisContext method!

Item was changed:
  ----- Method: CogIA32Compiler>>computeMaximumSize (in category 'generate machine code') -----
(excessive size, no diff calculated)

Item was added:
+ ----- Method: CogIA32Compiler>>concretizeBSR (in category 'generate machine code') -----
+ concretizeBSR
+ "Bit Scan Reverse
+ First operand is output register (dest)
+ Second operand is input register (mask)"
+ "BSR"
+ <inline: true>
+ | dest mask |
+ dest := operands at: 0.
+ mask := operands at: 1.
+ machineCode
+ at: 0 put: 16r0F;
+ at: 1 put: 16rBD;
+ at: 2 put: (self mod: ModReg RM: dest RO: mask).
+ ^machineCodeSize := 3!

Item was added:
+ ----- Method: CogIA32Compiler>>concretizeClzRR (in category 'generate machine code') -----
+ concretizeClzRR
+ "Count leading zeros
+ First operand is output (dest)
+ Second operand is input (mask)"
+ "LZCNT"
+ <inline: true>
+ | dest mask |
+ dest := operands at: 0.
+ mask := operands at: 1.
+ machineCode
+ at: 0 put: 16rF3;
+ at: 1 put: 16r0F;
+ at: 2 put: 16rBD;
+ at: 3 put: (self mod: ModReg RM: dest RO: mask).
+ ^machineCodeSize := 4!

Item was changed:
  ----- Method: CogIA32Compiler>>dispatchConcretize (in category 'generate machine code') -----
  dispatchConcretize
  "Attempt to generate concrete machine code for the instruction at address.
  This is the inner dispatch of concretizeAt: actualAddress which exists only
  to get around the branch size limits in the SqueakV3 (blue book derived)
  bytecode set."
  <returnTypeC: #void>
  opcode >= CDQ ifTrue:
  [^self dispatchConcretizeProcessorSpecific].
  opcode caseOf: {
  "Noops & Pseudo Ops"
  [Label] -> [^self concretizeLabel].
  [AlignmentNops] -> [^self concretizeAlignmentNops].
  [Fill32] -> [^self concretizeFill32].
  [Nop] -> [^self concretizeNop].
  "Control"
  [Call] -> [^self concretizeCall].
  [CallR] -> [^self concretizeCallR].
  [CallFull] -> [^self concretizeCall].
  [JumpR] -> [^self concretizeJumpR].
  [JumpFull] -> [^self concretizeJumpLong].
  [JumpLong] -> [^self concretizeJumpLong].
  [JumpLongZero] -> [^self concretizeConditionalJump: 16r4].
  [JumpLongNonZero] -> [^self concretizeConditionalJump: 16r5].
  [Jump] -> [^self concretizeJump].
  "Table B-1 Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture"
  [JumpZero] -> [^self concretizeConditionalJump: 16r4].
  [JumpNonZero] -> [^self concretizeConditionalJump: 16r5].
  [JumpNegative] -> [^self concretizeConditionalJump: 16r8].
  [JumpNonNegative] -> [^self concretizeConditionalJump: 16r9].
  [JumpOverflow] -> [^self concretizeConditionalJump: 16r0].
  [JumpNoOverflow] -> [^self concretizeConditionalJump: 16r1].
  [JumpCarry] -> [^self concretizeConditionalJump: 16r2].
  [JumpNoCarry] -> [^self concretizeConditionalJump: 16r3].
  [JumpLess] -> [^self concretizeConditionalJump: 16rC].
  [JumpGreaterOrEqual] -> [^self concretizeConditionalJump: 16rD].
  [JumpGreater] -> [^self concretizeConditionalJump: 16rF].
  [JumpLessOrEqual] -> [^self concretizeConditionalJump: 16rE].
  [JumpBelow] -> [^self concretizeConditionalJump: 16r2].
  [JumpAboveOrEqual] -> [^self concretizeConditionalJump: 16r3].
  [JumpAbove] -> [^self concretizeConditionalJump: 16r7].
  [JumpBelowOrEqual] -> [^self concretizeConditionalJump: 16r6].
  [JumpFPEqual] -> [^self concretizeConditionalJump: 16r4].
  [JumpFPNotEqual] -> [^self concretizeConditionalJump: 16r5].
  [JumpFPLess] -> [^self concretizeConditionalJump: 16r2].
  [JumpFPGreaterOrEqual] -> [^self concretizeConditionalJump: 16r3].
  [JumpFPGreater] -> [^self concretizeConditionalJump: 16r7].
  [JumpFPLessOrEqual] -> [^self concretizeConditionalJump: 16r6].
  [JumpFPOrdered] -> [^self concretizeConditionalJump: 16rB].
  [JumpFPUnordered] -> [^self concretizeConditionalJump: 16rA].
  [RetN] -> [^self concretizeRetN].
  [Stop] -> [^self concretizeStop].
  "Arithmetic"
  [AddCqR] -> [^self concretizeAddCqR].
  [AddCwR] -> [^self concretizeAddCwR].
  [AddRR] -> [^self concretizeOpRR: 16r03].
  [AddcRR] -> [^self concretizeAddcRR].
  [AddcCqR] -> [^self concretizeAddcCqR].
  [AddRdRd] -> [^self concretizeSEE2OpRdRd: 16r58].
  [AddRsRs] -> [^self concretizeSEEOpRsRs: 16r58].
  [AndCqR] -> [^self concretizeAndCqR].
  [AndCwR] -> [^self concretizeAndCwR].
  [AndRR] -> [^self concretizeOpRR: 16r23].
  [TstCqR] -> [^self concretizeTstCqR].
  [CmpCqR] -> [^self concretizeCmpCqR].
  [CmpCwR] -> [^self concretizeCmpCwR].
  [CmpRR] -> [^self concretizeReverseOpRR: 16r39].
  [CmpRdRd] -> [^self concretizeCmpRdRd].
  [CmpRsRs] -> [^self concretizeCmpRsRs].
  [DivRdRd] -> [^self concretizeSEE2OpRdRd: 16r5E].
  [DivRsRs] -> [^self concretizeSEEOpRsRs: 16r5E].
  [MulRdRd] -> [^self concretizeSEE2OpRdRd: 16r59].
  [MulRsRs] -> [^self concretizeSEEOpRsRs: 16r59].
  [OrCqR] -> [^self concretizeOrCqR].
  [OrCwR] -> [^self concretizeOrCwR].
  [OrRR] -> [^self concretizeOpRR: 16r0B].
  [SubCqR] -> [^self concretizeSubCqR].
  [SubCwR] -> [^self concretizeSubCwR].
  [SubRR] -> [^self concretizeOpRR: 16r2B].
  [SubbRR] -> [^self concretizeSubbRR].
  [SubRdRd] -> [^self concretizeSEE2OpRdRd: 16r5C].
  [SubRsRs] -> [^self concretizeSEEOpRsRs: 16r5C].
  [SqrtRd] -> [^self concretizeSqrtRd].
  [SqrtRs] -> [^self concretizeSqrtRs].
  [XorCwR] -> [^self concretizeXorCwR].
  [XorRR] -> [^self concretizeOpRR: 16r33].
  [XorRdRd] -> [^self concretizeXorRdRd].
  [XorRsRs] -> [^self concretizeXorRsRs].
  [NegateR] -> [^self concretizeNegateR].
  [NotR] -> [^self concretizeNotR].
  [LoadEffectiveAddressMwrR] -> [^self concretizeLoadEffectiveAddressMwrR].
  [ArithmeticShiftRightCqR] -> [^self concretizeArithmeticShiftRightCqR].
  [LogicalShiftRightCqR] -> [^self concretizeLogicalShiftRightCqR].
  [LogicalShiftLeftCqR] -> [^self concretizeLogicalShiftLeftCqR].
  [ArithmeticShiftRightRR] -> [^self concretizeArithmeticShiftRightRR].
  [LogicalShiftLeftRR] -> [^self concretizeLogicalShiftLeftRR].
+ [ClzRR] -> [^self concretizeClzRR].
  "Data Movement"
  [MoveCqR] -> [^self concretizeMoveCqR].
  [MoveCwR] -> [^self concretizeMoveCwR].
  [MoveRR] -> [^self concretizeReverseOpRR: 16r89].
  [MoveRdRd] -> [^self concretizeMoveRdRd].
  [MoveRsRs] -> [^self concretizeMoveRsRs].
  [MoveAwR] -> [^self concretizeMoveAwR].
  [MoveRAw] -> [^self concretizeMoveRAw].
  [MoveAbR] -> [^self concretizeMoveAbR].
  [MoveRAb] -> [^self concretizeMoveRAb].
  [MoveMbrR] -> [^self concretizeMoveMbrR].
  [MoveRMbr] -> [^self concretizeMoveRMbr].
  [MoveRM8r] -> [^self concretizeMoveRMbr].
  [MoveM8rR] -> [^self concretizeMoveM8rR].
  [MoveM16rR] -> [^self concretizeMoveM16rR].
  [MoveRM16r] -> [^self concretizeMoveRM16r].
  [MoveM32rR] -> [^self concretizeMoveMwrR].
  [MoveRM32r] -> [^self concretizeMoveRMwr].
  [MoveM32rRs] -> [^self concretizeMoveM32rRs].
  [MoveRsM32r] -> [^self concretizeMoveRsM32r].
  [MoveM64rRd] -> [^self concretizeMoveM64rRd].
  [MoveMwrR] -> [^self concretizeMoveMwrR].
  [MoveXbrRR] -> [^self concretizeMoveXbrRR].
  [MoveRXbrR] -> [^self concretizeMoveRXbrR].
  [MoveXwrRR] -> [^self concretizeMoveXwrRR].
  [MoveRXwrR] -> [^self concretizeMoveRXwrR].
  [MoveRMwr] -> [^self concretizeMoveRMwr].
  [MoveRdM64r] -> [^self concretizeMoveRdM64r].
  [PopR] -> [^self concretizePopR].
  [PushR] -> [^self concretizePushR].
  [PushCq] -> [^self concretizePushCq].
  [PushCw] -> [^self concretizePushCw].
  [PrefetchAw] -> [^self concretizePrefetchAw].
  "Conversion"
  [ConvertRRd] -> [^self concretizeConvertRRd].
  [ConvertRdR] -> [^self concretizeConvertRdR].
 
  [ConvertRsRd] -> [^self concretizeConvertRsRd].
  [ConvertRdRs] -> [^self concretizeConvertRdRs].
  [ConvertRsR] -> [^self concretizeConvertRsR].
  [ConvertRRs] -> [^self concretizeConvertRRs].
 
  [SignExtend8RR] -> [^self concretizeSignExtend8RR].
  [SignExtend16RR] -> [^self concretizeSignExtend16RR].
 
  [ZeroExtend8RR] -> [^self concretizeZeroExtend8RR].
  [ZeroExtend16RR] -> [^self concretizeZeroExtend16RR].}!

Item was changed:
  ----- Method: CogIA32Compiler>>dispatchConcretizeProcessorSpecific (in category 'generate machine code') -----
  dispatchConcretizeProcessorSpecific
  "Attempt to generate concrete machine code for the instruction at address.
  This is part of the inner dispatch of concretizeAt: actualAddress which exists only
  to get around the number of literals limits in the SqueakV3 (blue book derived)
  bytecode set."
  <returnTypeC: #void>
  opcode caseOf: {
  "Specific Control/Data Movement"
  [CDQ] -> [^self concretizeCDQ].
  [IDIVR] -> [^self concretizeIDIVR].
  [IMULRR] -> [^self concretizeMulRR].
  [CPUID] -> [^self concretizeCPUID].
  [CMPXCHGAwR] -> [^self concretizeCMPXCHGAwR].
  [CMPXCHGMwrR] -> [^self concretizeCMPXCHGMwrR].
  [LFENCE] -> [^self concretizeFENCE: 5].
  [MFENCE] -> [^self concretizeFENCE: 6].
  [SFENCE] -> [^self concretizeFENCE: 7].
  [LOCK] -> [^self concretizeLOCK].
  [XCHGAwR] -> [^self concretizeXCHGAwR].
  [XCHGMwrR] -> [^self concretizeXCHGMwrR].
  [XCHGRR] -> [^self concretizeXCHGRR].
  [FSTPS] -> [^self concretizeFSTPS].
  [FSTPD] -> [^self concretizeFSTPD].
  [REP] -> [^self concretizeREP].
  [CLD] -> [^self concretizeCLD].
  [MOVSB] -> [^self concretizeMOVSB].
  [MOVSD] -> [^self concretizeMOVSD].
+ [BSR] -> [^self concretizeBSR].
  }!

Item was added:
+ ----- Method: CogIA32Compiler>>genHighBitAlternativeIn:ofSmallIntegerOopWithSingleTagBit: (in category 'abstract instructions') -----
+ genHighBitAlternativeIn: destReg ofSmallIntegerOopWithSingleTagBit: srcReg
+ "When CLZ is not available, we can use BSR (Bit Scan Reverse) instruction"
+ <inline: true>
+ | jumpNegativeReceiver |
+ <var: #jumpNegativeReceiver type: #'AbstractInstruction *'>
+ <returnTypeC: #'AbstractInstruction *'>
+ "The primitive must fail if receiver is negative"
+ (cogit lastOpcode setsConditionCodesFor: JumpNegative) ifFalse:
+ [cogit CmpCq: 0 R: srcReg]. "N.B. FLAGS := srcReg - 0"
+ jumpNegativeReceiver := cogit JumpNegative: 0.
+ cogit gen: BSR operand: destReg operand: srcReg.
+ "Theoretically we should handle the case where srcReg is zero, but we need not: it is never zero, thanks to the tag bit."
+ "Also thanks to the tag bit, the +1 needed to get a 1-based rank instead of a 0-based rank is unnecessary, so we are done."
+ ^jumpNegativeReceiver!
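
The comments in genHighBitAlternativeIn:ofSmallIntegerOopWithSingleTagBit: rely on a small identity: for a non-negative value tagged with a single low bit, the 0-based BSR index of the tagged oop equals the 1-based highBit of the value. The following is only a model of that reasoning in Python, not VM code; `bsr`, `tag_small_integer` and `high_bit_via_bsr` are hypothetical names, and the encoding (value shifted left by one, low bit set) is an assumption about the single-tag-bit layout.

```python
def bsr(x: int) -> int:
    """Model of x86 BSR: 0-based index of the most significant set bit (x > 0)."""
    assert x > 0, "BSR result is undefined for a zero source"
    return x.bit_length() - 1


def tag_small_integer(value: int) -> int:
    """Assumed single-tag-bit encoding: value shifted left by one, tag bit set."""
    return (value << 1) | 1


def high_bit_via_bsr(value: int) -> int:
    """1-based rank of the highest set bit of a non-negative value (0 for zero)."""
    assert value >= 0, "the primitive fails for negative receivers"
    # The tag bit guarantees a non-zero operand and shifts every bit up by one,
    # so the 0-based index of the oop is already the 1-based rank of the value.
    return bsr(tag_small_integer(value))
```

In this model a receiver of zero gives a tagged oop of 1, whose BSR index is 0, which matches the comment that no special case is needed for zero.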

Item was added:
+ ----- Method: CogIA32Compiler>>generateCheckLZCNT (in category 'feature detection') -----
+ generateCheckLZCNT
+ "Check whether the Leading Zero Count (LZCNT) instruction is present;
+ cf. the MSVC builtin __lzcnt documentation.
+ The result is bit 5 of the return value (in EAX)."
+ cogit
+ PushR: EDX;
+ PushR: ECX;
+ PushR: EBX;
+ MoveCq: 16r80000001 R: EAX;
+ gen: CPUID;
+ MoveR: ECX R: EAX;
+ PopR: EBX;
+ PopR: ECX;
+ PopR: EDX;
+ RetN: 0!

Item was added:
+ ----- Method: CogIA32Compiler>>hasLZCNTInstructions (in category 'testing') -----
+ hasLZCNTInstructions
+ "Answer if we support LZCNT"
+ <inline: true>
+ ^(cogit ceCheckLZCNT bitAnd: (1 << 5)) ~= 0!
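
For readers less familiar with the feature test: generateCheckLZCNT copies ECX from CPUID leaf 16r80000001 into EAX, and hasLZCNTInstructions then tests bit 5 of that word. A minimal model of the test in Python follows; `has_lzcnt` is a hypothetical name and the sample ECX words are made up for illustration, not read from any real CPU.

```python
LZCNT_BIT = 1 << 5  # bit 5 of ECX for CPUID leaf 0x80000001, as in the diff


def has_lzcnt(ecx: int) -> bool:
    """Answer whether the LZCNT feature bit is set in the given ECX word."""
    return (ecx & LZCNT_BIT) != 0


# Made-up ECX words: the first has bit 5 set, the second does not.
sample_with_lzcnt = 0x121
sample_without_lzcnt = 0x101
```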

Item was added:
+ ----- Method: CogIA32Compiler>>numCheckLZCNTOpcodes (in category 'feature detection') -----
+ numCheckLZCNTOpcodes
+ "Answer the number of opcodes required to compile the CPUID call to extract the extended features information."
+ ^11!

Item was changed:
  ----- Method: CogIA32Compiler>>setsConditionCodesFor: (in category 'testing') -----
  setsConditionCodesFor: aConditionalJumpOpcode
  <inline: false> "to save Slang from having to be a real compiler (it can't inline switches that return)"
  "Answer if the receiver's opcode sets the condition codes correctly for the given conditional jump opcode."
  ^opcode caseOf:
  { [ArithmeticShiftRightCqR] -> [self shiftSetsConditionCodesFor: aConditionalJumpOpcode].
  [ArithmeticShiftRightRR] -> [self shiftSetsConditionCodesFor: aConditionalJumpOpcode].
  [LogicalShiftLeftCqR] -> [self shiftSetsConditionCodesFor: aConditionalJumpOpcode].
+ [LogicalShiftLeftRR] -> [self shiftSetsConditionCodesFor: aConditionalJumpOpcode].
+ [XorRR] -> [true].
+ [ClzRR] -> [aConditionalJumpOpcode = JumpZero or: [aConditionalJumpOpcode = JumpNonZero or: [aConditionalJumpOpcode = JumpCarry or: [aConditionalJumpOpcode = JumpNoCarry "carry flag is set if input is zero"]]]]
- [LogicalShiftLeftRR] -> [self shiftSetsConditionCodesFor: aConditionalJumpOpcode].
- [XorRR] -> [true]
  }
  otherwise: [self halt: 'unhandled opcode in setsConditionCodesFor:'. false]!

Item was added:
+ ----- Method: CogMIPSELCompiler>>clzR:R:R: (in category 'encoding - arithmetic') -----
+ clzR: destReg R: leftReg R: rightReg
+ "NOTE: the right reg (rt) MUST equal dest reg (rd) or behavior is undefined"
+ ^self rtype: SPECIAL rs: leftReg rt: rightReg rd: destReg sa: 0 funct: 2r100000 "CLZ"!

Item was changed:
  ----- Method: CogMIPSELCompiler>>computeMaximumSize (in category 'generate machine code') -----
  computeMaximumSize
  "Each MIPS instruction has 4 bytes. Many abstract opcodes need more than one
  instruction. Instructions that refer to constants and/or literals depend on literals
  being stored in-line or out-of-line.
 
  N.B.  The ^N forms are to get around the bytecode compiler's long branch
  limits which are exceeded when each case jumps around the otherwise."
 
  opcode
  caseOf: {
  [BrEqualRR] -> [^8].
  [BrNotEqualRR] -> [^8].
  [BrUnsignedLessRR] -> [^12].
  [BrUnsignedLessEqualRR] -> [^12].
  [BrUnsignedGreaterRR] -> [^12].
  [BrUnsignedGreaterEqualRR] -> [^12].
  [BrSignedLessRR] -> [^12].
  [BrSignedLessEqualRR] -> [^12].
  [BrSignedGreaterRR] -> [^12].
  [BrSignedGreaterEqualRR] -> [^12].
  [BrLongEqualRR] -> [^16].
  [BrLongNotEqualRR] -> [^16].
  [MulRR] -> [^4].
  [DivRR] -> [^4].
  [MoveLowR] -> [^4].
  [MoveHighR] -> [^4].
 
  "Noops & Pseudo Ops"
  [Label] -> [^0].
  [Literal] -> [^4].
  [AlignmentNops] -> [^(operands at: 0) - 4].
  [Fill32] -> [^4].
  [Nop] -> [^4].
  "Control"
  [Call] -> [^self literalLoadInstructionBytes + 8].
  [CallFull] -> [^self literalLoadInstructionBytes + 8].
  [JumpR] -> [^8].
  [Jump] -> [^8].
  [JumpFull] -> [^self literalLoadInstructionBytes + 8].
  [JumpLong] -> [^self literalLoadInstructionBytes + 8].
  [JumpZero] -> [^8].
  [JumpNonZero] -> [^8].
  [JumpNegative] -> [^8].
  [JumpNonNegative] -> [^8].
  [JumpOverflow] -> [^8].
  [JumpNoOverflow] -> [^8].
  [JumpCarry] -> [^8].
  [JumpNoCarry] -> [^8].
  [JumpLess] -> [^8].
  [JumpGreaterOrEqual] -> [^8].
  [JumpGreater] -> [^8].
  [JumpLessOrEqual] -> [^8].
  [JumpBelow] -> [^8].
  [JumpAboveOrEqual] -> [^8].
  [JumpAbove] -> [^8].
  [JumpBelowOrEqual] -> [^8].
  [JumpLongZero] -> [^self literalLoadInstructionBytes + 8].
  [JumpLongNonZero] -> [^self literalLoadInstructionBytes + 8].
  [JumpFPEqual] -> [^8].
  [JumpFPNotEqual] -> [^8].
  [JumpFPLess] -> [^8].
  [JumpFPGreaterOrEqual]-> [^8].
  [JumpFPGreater] -> [^8].
  [JumpFPLessOrEqual] -> [^8].
  [JumpFPOrdered] -> [^8].
  [JumpFPUnordered] -> [^8].
  [RetN] -> [^8].
  [Stop] -> [^4].
 
  "Arithmetic"
  [AddCqR] -> [^12].
  [AndCqR] -> [^16].
  [AndCqRR] -> [^12].
  [CmpCqR] -> [^28].
  [OrCqR] -> [^12].
  [SubCqR] -> [^12].
  [TstCqR] -> [^12].
  [XorCqR] -> [^12].
  [AddCwR] -> [^12].
  [AndCwR] -> [^12].
  [CmpCwR] -> [^28].
  [OrCwR] -> [^12].
  [SubCwR] -> [^12].
  [XorCwR] -> [^12].
  [AddRR] -> [^4].
  [AndRR] -> [^4].
  [CmpRR] -> [^20].
  [OrRR] -> [^4].
  [XorRR] -> [^4].
  [SubRR] -> [^4].
  [NegateR] -> [^4].
  [LoadEffectiveAddressMwrR] -> [^12].
  [LogicalShiftLeftCqR] -> [^4].
  [LogicalShiftRightCqR] -> [^4].
  [ArithmeticShiftRightCqR] -> [^4].
  [LogicalShiftLeftRR] -> [^4].
  [LogicalShiftRightRR] -> [^4].
  [ArithmeticShiftRightRR] -> [^4].
  [AddRdRd] -> [^4].
  [CmpRdRd] -> [^4].
  [SubRdRd] -> [^4].
  [MulRdRd] -> [^4].
  [DivRdRd] -> [^4].
  [SqrtRd] -> [^4].
  [AddCheckOverflowCqR] -> [^28].
  [AddCheckOverflowRR] -> [^20].
  [SubCheckOverflowCqR] -> [^28].
  [SubCheckOverflowRR] -> [^20].
  [MulCheckOverflowRR] -> [^20].
+ [ClzRR] -> [^4].
  "Data Movement"
  [MoveCqR] -> [^8 "or 4"].
  [MoveCwR] -> [^8].
  [MoveRR] -> [^4].
  [MoveRdRd] -> [^4].
  [MoveAwR] -> [^(self isAddressRelativeToVarBase: (operands at: 0))
  ifTrue: [4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  [MoveRAw] -> [^(self isAddressRelativeToVarBase: (operands at: 1))
  ifTrue: [4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  [MoveAbR] -> [^(self isAddressRelativeToVarBase: (operands at: 0))
  ifTrue: [4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  [MoveRAb] -> [^(self isAddressRelativeToVarBase: (operands at: 1))
  ifTrue: [4]
  ifFalse: [self literalLoadInstructionBytes + 4]].
  [MoveRMwr] -> [^16].
  [MoveRdM64r] -> [^self literalLoadInstructionBytes + 4].
  [MoveMbrR] -> [^4].
  [MoveRMbr] -> [^4].
  [MoveM16rR] -> [^4].
  [MoveRM16r] -> [^4].
  [MoveM64rRd] -> [^self literalLoadInstructionBytes + 4].
  [MoveMwrR] -> [^16].
  [MoveXbrRR] -> [^8].
  [MoveRXbrR] -> [^8].
  [MoveXwrRR] -> [^12].
  [MoveRXwrR] -> [^12].
  [PopR] -> [^8].
  [PushR] -> [^8].
  [PushCw] -> [^16].
  [PushCq] -> [^16].
  [PrefetchAw] -> [^12].
  "Conversion"
  [ConvertRRd] -> [^8].
  }.
  ^0 "to keep C compiler quiet"
  !

Item was added:
+ ----- Method: CogMIPSELCompiler>>concretizeClzRR (in category 'generate machine code - concretize') -----
+ concretizeClzRR
+ | destReg leftReg rightReg |
+ destReg := rightReg := operands at: 0.
+ leftReg := operands at: 1.
+ self machineCodeAt: 0 put: (self clzR: destReg R: leftReg R:  rightReg).
+ ^machineCodeSize := 4!

Item was changed:
  ----- Method: CogMIPSELCompiler>>dispatchConcretize (in category 'generate machine code') -----
  dispatchConcretize
  "Attempt to generate concrete machine code for the instruction at address.
  This is the inner dispatch of concretizeAt: actualAddress which exists only
  to get around the branch size limits in the SqueakV3 (blue book derived)
  bytecode set."
  <returnTypeC: #void>
  opcode caseOf: {
  [BrEqualRR] -> [^self concretizeBrEqualRR].
  [BrNotEqualRR] -> [^self concretizeBrNotEqualRR].
  [BrUnsignedLessRR] -> [^self concretizeBrUnsignedLessRR].
  [BrUnsignedLessEqualRR] -> [^self concretizeBrUnsignedLessEqualRR].
  [BrUnsignedGreaterRR] -> [^self concretizeBrUnsignedGreaterRR].
  [BrUnsignedGreaterEqualRR] -> [^self concretizeBrUnsignedGreaterEqualRR].
  [BrSignedLessRR] -> [^self concretizeBrSignedLessRR].
  [BrSignedLessEqualRR] -> [^self concretizeBrSignedLessEqualRR].
  [BrSignedGreaterRR] -> [^self concretizeBrSignedGreaterRR].
  [BrSignedGreaterEqualRR] -> [^self concretizeBrSignedGreaterEqualRR].
  [BrLongEqualRR] -> [^self concretizeBrLongEqualRR].
  [BrLongNotEqualRR] -> [^self concretizeBrLongNotEqualRR].
  [MulRR] -> [^self concretizeUnimplemented].
  [DivRR] -> [^self concretizeDivRR].
  [MoveLowR] -> [^self concretizeMoveLowR].
  [MoveHighR] -> [^self concretizeMoveHighR].
 
 
  "Noops & Pseudo Ops"
  [Label] -> [^self concretizeLabel].
  [AlignmentNops] -> [^self concretizeAlignmentNops].
  [Fill32] -> [^self concretizeFill32].
  [Nop] -> [^self concretizeNop].
  "Control"
  [Call] -> [^self concretizeCall]. "call code within code space"
  [CallFull] -> [^self concretizeCallFull]. "call code anywhere in address space"
  [JumpR] -> [^self concretizeJumpR].
  [JumpFull] -> [^self concretizeJumpFull]."jump within address space"
  [JumpLong] -> [^self concretizeJumpLong]."jumps within code space"
  [JumpLongZero] -> [^self concretizeJumpLongZero].
  [JumpLongNonZero] -> [^self concretizeJumpLongNonZero].
  [Jump] -> [^self concretizeJump].
  [JumpZero] -> [^self concretizeJumpZero].
  [JumpNonZero] -> [^self concretizeJumpNonZero].
  [JumpNegative] -> [^self concretizeUnimplemented].
  [JumpNonNegative] -> [^self concretizeUnimplemented].
  [JumpOverflow] -> [^self concretizeJumpOverflow].
  [JumpNoOverflow] -> [^self concretizeJumpNoOverflow].
  [JumpCarry] -> [^self concretizeUnimplemented].
  [JumpNoCarry] -> [^self concretizeUnimplemented].
  [JumpLess] -> [^self concretizeJumpSignedLessThan].
  [JumpGreaterOrEqual] -> [^self concretizeJumpSignedGreaterEqual].
  [JumpGreater] -> [^self concretizeJumpSignedGreaterThan].
  [JumpLessOrEqual] -> [^self concretizeJumpSignedLessEqual].
  [JumpBelow] -> [^self concretizeJumpUnsignedLessThan].
  [JumpAboveOrEqual] -> [^self concretizeJumpUnsignedGreaterEqual].
  [JumpAbove] -> [^self concretizeJumpUnsignedGreaterThan].
  [JumpBelowOrEqual] -> [^self concretizeJumpUnsignedLessEqual].
  [JumpFPEqual] -> [^self concretizeUnimplemented].
  [JumpFPNotEqual] -> [^self concretizeUnimplemented].
  [JumpFPLess] -> [^self concretizeUnimplemented].
  [JumpFPGreaterOrEqual] -> [^self concretizeUnimplemented].
  [JumpFPGreater] -> [^self concretizeUnimplemented].
  [JumpFPLessOrEqual] -> [^self concretizeUnimplemented].
  [JumpFPOrdered] -> [^self concretizeUnimplemented].
  [JumpFPUnordered] -> [^self concretizeUnimplemented].
  [RetN] -> [^self concretizeRetN].
  [Stop] -> [^self concretizeStop].
  "Arithmetic"
  [AddCqR] -> [^self concretizeAddCqR].
  [AndCqR] -> [^self concretizeAndCqR].
  [AndCqRR] -> [^self concretizeAndCqRR].
  [CmpCqR] -> [^self concretizeCmpCqR].
  [OrCqR] -> [^self concretizeOrCqR].
  [SubCqR] -> [^self concretizeSubCqR].
  [TstCqR] -> [^self concretizeTstCqR].
  [XorCqR] -> [^self concretizeUnimplemented].
  [AddCwR] -> [^self concretizeAddCwR].
  [AndCwR] -> [^self concretizeAndCwR].
  [CmpCwR] -> [^self concretizeCmpCwR].
  [OrCwR] -> [^self concretizeOrCwR].
  [SubCwR] -> [^self concretizeSubCwR].
  [XorCwR] -> [^self concretizeXorCwR].
  [AddRR] -> [^self concretizeAddRR].
  [AndRR] -> [^self concretizeAndRR].
  [CmpRR] -> [^self concretizeCmpRR].
  [OrRR] -> [^self concretizeOrRR].
  [SubRR] -> [^self concretizeSubRR].
  [XorRR] -> [^self concretizeXorRR].
  [AddRdRd] -> [^self concretizeUnimplemented].
  [CmpRdRd] -> [^self concretizeUnimplemented].
  [DivRdRd] -> [^self concretizeUnimplemented].
  [MulRdRd] -> [^self concretizeUnimplemented].
  [SubRdRd] -> [^self concretizeUnimplemented].
  [SqrtRd] -> [^self concretizeUnimplemented].
  [NegateR] -> [^self concretizeNegateR].
  [LoadEffectiveAddressMwrR] -> [^self concretizeLoadEffectiveAddressMwrR].
  [ArithmeticShiftRightCqR] -> [^self concretizeArithmeticShiftRightCqR].
  [LogicalShiftRightCqR] -> [^self concretizeLogicalShiftRightCqR].
  [LogicalShiftLeftCqR] -> [^self concretizeLogicalShiftLeftCqR].
  [ArithmeticShiftRightRR] -> [^self concretizeArithmeticShiftRightRR].
  [LogicalShiftLeftRR] -> [^self concretizeLogicalShiftLeftRR].
  [LogicalShiftRightRR] -> [^self concretizeLogicalShiftRightRR].
+ [ClzRR] -> [^self concretizeClzRR].
  "Data Movement"
  [MoveCqR] -> [^self concretizeMoveCqR].
  [MoveCwR] -> [^self concretizeMoveCwR].
  [MoveRR] -> [^self concretizeMoveRR].
  [MoveAwR] -> [^self concretizeMoveAwR].
  [MoveRAw] -> [^self concretizeMoveRAw].
  [MoveAbR] -> [^self concretizeMoveAbR].
  [MoveRAb] -> [^self concretizeMoveRAb].
  [MoveMbrR] -> [^self concretizeMoveMbrR].
  [MoveRMbr] -> [^self concretizeUnimplemented].
  [MoveM16rR] -> [^self concretizeMoveM16rR].
  [MoveRM16r] -> [^self concretizeMoveRM16r].
  [MoveM64rRd] -> [^self concretizeUnimplemented].
  [MoveMwrR] -> [^self concretizeMoveMwrR].
  [MoveXbrRR] -> [^self concretizeMoveXbrRR].
  [MoveRXbrR] -> [^self concretizeMoveRXbrR].
  [MoveXwrRR] -> [^self concretizeMoveXwrRR].
  [MoveRXwrR] -> [^self concretizeMoveRXwrR].
  [MoveRMwr] -> [^self concretizeMoveRMwr].
  [MoveRdM64r] -> [^self concretizeUnimplemented].
  [PopR] -> [^self concretizePopR].
  [PushR] -> [^self concretizePushR].
  [PushCq] -> [^self concretizePushCq].
  [PushCw] -> [^self concretizePushCw].
  [PrefetchAw] -> [^self concretizePrefetchAw].
  [AddCheckOverflowCqR] -> [^self concretizeAddCheckOverflowCqR].
  [AddCheckOverflowRR] -> [^self concretizeAddCheckOverflowRR].
  [SubCheckOverflowCqR] -> [^self concretizeSubCheckOverflowCqR].
  [SubCheckOverflowRR] -> [^self concretizeSubCheckOverflowRR].
  [MulCheckOverflowRR] -> [^self concretizeMulCheckOverflowRR].
  "Conversion"
  [ConvertRRd] -> [^self concretizeUnimplemented]}!

Item was added:
+ ----- Method: CogMIPSELCompiler>>hasLZCNTInstructions (in category 'testing') -----
+ hasLZCNTInstructions
+ "Answer if the processor has a LZCNT (leading zero count) instruction"
+ <inline: true>
+ ^true "CLZ"!

Item was changed:
  ----- Method: CogMIPSELCompiler>>noteFollowingConditionalBranch: (in category 'abstract instructions') -----
  noteFollowingConditionalBranch: branch
  "Support for processors without condition codes, such as the MIPS.
  Answer the branch opcode.  Modify the receiver and the branch to
  implement a suitable conditional branch that doesn't depend on
  condition codes being set by the receiver."
  <returnTypeC: #'AbstractInstruction *'>
  <var: #branch type: #'AbstractInstruction *'>
  | newBranchLeft newBranchOpcode newBranchRight |
 
  ((branch opcode = JumpOverflow) or: [branch opcode = JumpNoOverflow])
  ifTrue: [^self noteFollowingOverflowBranch: branch].
 
  newBranchOpcode := branch opcode caseOf: {
  [JumpZero] -> [BrEqualRR].
  [JumpNonZero] -> [BrNotEqualRR].
  [JumpBelow] -> [BrUnsignedLessRR].
  [JumpBelowOrEqual] -> [BrUnsignedLessEqualRR].
  [JumpAbove] -> [BrUnsignedGreaterRR].
  [JumpAboveOrEqual] -> [BrUnsignedGreaterEqualRR].
  [JumpLess] -> [BrSignedLessRR].
  [JumpLessOrEqual] -> [BrSignedLessEqualRR].
  [JumpGreater] -> [BrSignedGreaterRR].
  [JumpGreaterOrEqual] -> [BrSignedGreaterEqualRR].
  [JumpLongZero] -> [BrLongEqualRR].
  [JumpLongNonZero] -> [BrLongNotEqualRR].
 
  [JumpNegative] -> [BrSignedLessRR].
  } otherwise: [self unreachable. 0].
 
  opcode caseOf: {
  [BrEqualRR] -> ["I.e., two jumps after a compare."
  newBranchLeft := operands at: 1.
  newBranchRight := operands at: 2].
  [BrUnsignedLessRR] -> ["I.e., two jumps after a compare."
  newBranchLeft := operands at: 1.
  newBranchRight := operands at: 2].
 
  [CmpRR] -> [newBranchLeft := operands at: 1.
  newBranchRight := operands at: 0.
  opcode := Label].
  [CmpCqR] -> [newBranchLeft := operands at: 1.
  newBranchRight := AT.
  opcode := MoveCqR.
  operands at: 1 put: AT].
  [CmpCwR] -> [newBranchLeft := operands at: 1.
  newBranchRight := AT.
  opcode := MoveCwR.
  operands at: 1 put: AT].
  [TstCqR] -> [newBranchLeft := Cmp.
  newBranchRight := ZR].
  [AndCqR] -> [newBranchLeft := operands at: 1.
  newBranchRight := ZR].
  [AndCqRR] -> [newBranchLeft := operands at: 2.
  newBranchRight := ZR].
  [OrRR] -> [newBranchLeft := operands at: 1.
  newBranchRight := ZR].
  [XorRR] -> [newBranchLeft := operands at: 1.
  newBranchRight := ZR].
  [SubCwR] -> [newBranchLeft := operands at: 1.
  newBranchRight := ZR].
  [SubCqR] -> [newBranchLeft := operands at: 1.
  newBranchRight := ZR].
  [ArithmeticShiftRightCqR] -> [newBranchLeft := operands at: 1.
  newBranchRight := ZR].
+ [ClzRR] -> [newBranchLeft := operands at: 0. "we test if the destination register is zero"
+ newBranchRight := ZR].
  } otherwise: [self unreachable].
 
  branch rewriteOpcode: newBranchOpcode with: newBranchLeft with: newBranchRight.
  ^branch!

Item was changed:
  ----- Method: CogMIPSELCompiler>>setsConditionCodesFor: (in category 'testing') -----
  setsConditionCodesFor: aConditionalJumpOpcode
  <inline: false>
  "Not really, but we can merge this in noteFollowingConditionalBranch:."
  opcode = XorRR ifTrue: [^true].
  opcode = ArithmeticShiftRightCqR ifTrue: [^true].
+ (opcode = ClzRR and: [aConditionalJumpOpcode = JumpZero]) ifTrue: [^true].
  self unreachable.
  ^false!

Item was added:
+ ----- Method: CogObjectRepresentation>>genPrimitiveHighBit (in category 'primitive generators') -----
+ genPrimitiveHighBit
+ | jumpNegativeReceiver |
+ <var: #jumpNegativeReceiver type: #'AbstractInstruction *'>
+ "remove excess tag bits from the receiver oop"
+ self numSmallIntegerTagBits > 1
+ ifTrue:
+ [cogit OrCq: 1 << self numSmallIntegerTagBits - 1 R: ReceiverResultReg.
+ cogit ArithmeticShiftRightCq: self numSmallIntegerTagBits - 1 R: ReceiverResultReg].
+ "and use the abstract cogit facility for case of single tag-bit"
+ jumpNegativeReceiver := cogit genHighBitIn: ReceiverResultReg ofSmallIntegerOopWithSingleTagBit: ReceiverResultReg.
+ "Jump is NULL if above operation is not implemented, else return the result"
+ jumpNegativeReceiver = 0
+ ifFalse:
+ [cogit genConvertIntegerToSmallIntegerInReg: ReceiverResultReg.
+ cogit genPrimReturn.
+ jumpNegativeReceiver jmpTarget: cogit Label].
+ ^CompletePrimitive!
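
The two instructions emitted when numSmallIntegerTagBits > 1 turn a 3-tag-bit oop into the single-tag-bit form expected by genHighBitIn:ofSmallIntegerOopWithSingleTagBit:. Note that in Smalltalk `1 << self numSmallIntegerTagBits - 1` evaluates left to right, i.e. as (1 << n) - 1. Here is a sketch of the arithmetic in Python, where the parentheses must be written explicitly; the function names are hypothetical and the tag pattern 001 is an assumption (the OR makes the actual pattern irrelevant).

```python
NUM_TAG_BITS = 3  # the Spur64 case mentioned in the commit message


def tag_with_three_bits(value: int) -> int:
    """Assumed 3-tag-bit encoding: value shifted left by three, tag pattern 001."""
    return (value << NUM_TAG_BITS) | 1


def reduce_to_single_tag_bit(oop: int) -> int:
    """Model of the OrCq / ArithmeticShiftRightCq pair in genPrimitiveHighBit."""
    mask = (1 << NUM_TAG_BITS) - 1      # Smalltalk: 1 << n - 1
    # Setting all tag bits first guarantees the bit that survives the shift is 1.
    return (oop | mask) >> (NUM_TAG_BITS - 1)  # Python >> is an arithmetic shift
```

Under these assumptions the result is (value << 1) | 1 for any value, negative ones included, so the sign test and the bit scan behave as in the single-tag-bit case.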

Item was added:
+ ----- Method: CogObjectRepresentation>>genPrimitiveHighBitgenPrimitiveHighBit (in category 'primitive generators') -----
+ genPrimitiveHighBitgenPrimitiveHighBit
+ | jumpNegativeReceiver |
+ <var: #jumpNegativeReceiver type: #'AbstractInstruction *'>
+ "remove excess tag bits from the receiver oop"
+ self numSmallIntegerTagBits > 1
+ ifTrue:
+ [cogit OrCw: 1 << self numSmallIntegerTagBits - 1 R: ReceiverResultReg.
+ cogit ArithmeticShiftRightCq: self numSmallIntegerTagBits - 1 R: ReceiverResultReg].
+ "and use the abstract cogit facility for case of single tag-bit"
+ jumpNegativeReceiver := cogit genHighBitIn: ReceiverResultReg ofSmallIntegerOopWithSingleTagBit: ReceiverResultReg.
+ "The jump instruction is NULL when the backend has no jitted implementation: fall back to the normal primitive"
+ jumpNegativeReceiver = 0 ifTrue: [^CompletePrimitive].
+ cogit genPrimReturn.
+ jumpNegativeReceiver jmpTarget: cogit Label.
+ ^UnimplementedPrimitive!

Item was added:
+ ----- Method: CogObjectRepresentationFor64BitSpur>>genPrimitiveHighBit (in category 'primitive generators') -----
+ genPrimitiveHighBit
+ "Implementation notes: same as super, but CLZ/BSR only work on 64bits for registers R0-R7 on Intel X64.
+ Normally, this should be backEnd dependent, but for now we have a single 64bits target..."
+ | jumpNegativeReceiver reg |
+ <var: #jumpNegativeReceiver type: #'AbstractInstruction *'>
+ "remove excess tag bits from the receiver oop"
+
+ ReceiverResultReg > 7
+ ifTrue: [cogit MoveR: ReceiverResultReg R: (reg := TempReg)]
+ ifFalse: [reg := ReceiverResultReg].
+ self numSmallIntegerTagBits > 1
+ ifTrue:
+ [cogit OrCw: 1 << self numSmallIntegerTagBits - 1 R: reg.
+ cogit ArithmeticShiftRightCq: self numSmallIntegerTagBits - 1 R: reg].
+ "and use the abstract cogit facility for case of single tag-bit"
+ jumpNegativeReceiver := cogit genHighBitIn: reg ofSmallIntegerOopWithSingleTagBit: reg.
+ "Jump is NULL if above operation is not implemented, else return the result"
+ jumpNegativeReceiver = 0
+ ifFalse:
+ [ReceiverResultReg > 7
+ ifTrue: [cogit MoveR: reg R: ReceiverResultReg].
+ cogit genConvertIntegerToSmallIntegerInReg: ReceiverResultReg.
+ cogit genPrimReturn.
+ jumpNegativeReceiver jmpTarget: cogit Label].
+ ^CompletePrimitive!

Item was changed:
  SharedPool subclass: #CogRTLOpcodes
  instanceVariableNames: ''
+ classVariableNames: 'AddCqR AddCwR AddRR AddRdRd AddRsRs AddcCqR AddcRR AlignmentNops AndCqR AndCqRR AndCwR AndRR ArithmeticShiftRightCqR ArithmeticShiftRightRR Call CallFull CallR ClzRR CmpC32R CmpCqR CmpCwR CmpRR CmpRdRd CmpRsRs ConvertRRd ConvertRRs ConvertRdR ConvertRdRs ConvertRsR ConvertRsRd DivRdRd DivRsRs Fill32 FirstJump FirstShortJump Jump JumpAbove JumpAboveOrEqual JumpBelow JumpBelowOrEqual JumpCarry JumpFPEqual JumpFPGreater JumpFPGreaterOrEqual JumpFPLess JumpFPLessOrEqual JumpFPNotEqual JumpFPOrdered JumpFPUnordered JumpFull JumpGreater JumpGreaterOrEqual JumpLess JumpLessOrEqual JumpLong JumpLongNonZero JumpLongZero JumpNegative JumpNoCarry JumpNoOverflow JumpNonNegative JumpNonZero JumpOverflow JumpR JumpZero Label LastJump LastRTLCode Literal LoadEffectiveAddressMwrR LogicalShiftLeftCqR LogicalShiftLeftRR LogicalShiftRightCqR LogicalShiftRightRR MoveA32R MoveAbR MoveAwR MoveC32R MoveCqR MoveCwR MoveM16rR MoveM32rR MoveM32rRs MoveM64rRd MoveM8rR MoveMbrR MoveMs8rR
  MoveMwrR MoveRA32 MoveRAb MoveRAw MoveRM16r MoveRM32r MoveRM8r MoveRMbr MoveRMwr MoveRR MoveRRd MoveRX16rR MoveRX32rR MoveRXbrR MoveRXwrR MoveRdM64r MoveRdR MoveRdRd MoveRsM32r MoveRsRs MoveX16rRR MoveX32rRR MoveXbrRR MoveXwrRR MulRdRd MulRsRs NegateR Nop NotR OrCqR OrCwR OrRR PopR PrefetchAw PushCq PushCw PushR RetN RotateLeftCqR RotateRightCqR SignExtend16RR SignExtend32RR SignExtend8RR SqrtRd SqrtRs Stop SubCqR SubCwR SubRR SubRdRd SubRsRs SubbCqR SubbRR TstCqR XorCqR XorCwR XorRR XorRdRd XorRsRs ZeroExtend16RR ZeroExtend32RR ZeroExtend8RR'
- classVariableNames: 'AddCqR AddCwR AddRR AddRdRd AddRsRs AddcCqR AddcRR AlignmentNops AndCqR AndCqRR AndCwR AndRR ArithmeticShiftRightCqR ArithmeticShiftRightRR Call CallFull CallR CmpC32R CmpCqR CmpCwR CmpRR CmpRdRd CmpRsRs ConvertRRd ConvertRRs ConvertRdR ConvertRdRs ConvertRsR ConvertRsRd DivRdRd DivRsRs Fill32 FirstJump FirstShortJump Jump JumpAbove JumpAboveOrEqual JumpBelow JumpBelowOrEqual JumpCarry JumpFPEqual JumpFPGreater JumpFPGreaterOrEqual JumpFPLess JumpFPLessOrEqual JumpFPNotEqual JumpFPOrdered JumpFPUnordered JumpFull JumpGreater JumpGreaterOrEqual JumpLess JumpLessOrEqual JumpLong JumpLongNonZero JumpLongZero JumpNegative JumpNoCarry JumpNoOverflow JumpNonNegative JumpNonZero JumpOverflow JumpR JumpZero Label LastJump LastRTLCode Literal LoadEffectiveAddressMwrR LogicalShiftLeftCqR LogicalShiftLeftRR LogicalShiftRightCqR LogicalShiftRightRR MoveA32R MoveAbR MoveAwR MoveC32R MoveCqR MoveCwR MoveM16rR MoveM32rR MoveM32rRs MoveM64rRd MoveM8rR MoveMbrR MoveMs8rR
  MoveMwrR MoveRA32 MoveRAb MoveRAw MoveRM16r MoveRM32r MoveRM8r MoveRMbr MoveRMwr MoveRR MoveRRd MoveRX16rR MoveRX32rR MoveRXbrR MoveRXwrR MoveRdM64r MoveRdR MoveRdRd MoveRsM32r MoveRsRs MoveX16rRR MoveX32rRR MoveXbrRR MoveXwrRR MulRdRd MulRsRs NegateR Nop NotR OrCqR OrCwR OrRR PopR PrefetchAw PushCq PushCw PushR RetN RotateLeftCqR RotateRightCqR SignExtend16RR SignExtend32RR SignExtend8RR SqrtRd SqrtRs Stop SubCqR SubCwR SubRR SubRdRd SubRsRs SubbCqR SubbRR TstCqR XorCqR XorCwR XorRR XorRdRd XorRsRs ZeroExtend16RR ZeroExtend32RR ZeroExtend8RR'
  poolDictionaries: ''
  category: 'VMMaker-JIT'!
 
  !CogRTLOpcodes commentStamp: 'eem 12/26/2015 14:00' prior: 0!
  I am a pool for the Register-Transfer-Language to which Cog compiles.  I define unique integer values for all RTL opcodes.  See CogAbstractInstruction for instances of instructions with the opcodes that I define.!

Item was changed:
  ----- Method: CogRTLOpcodes class>>initialize (in category 'class initialization') -----
  initialize
  "Abstract opcodes are a compound of a one word operation specifier and zero or more operand type specifiers.
  The assembler is in Cogit protocol abstract instructions and uses `at&t' syntax, assigning to the register on the
  right. e.g. MoveRR is the Move opcode with two register operand specifiers and defines a move register to
  register instruction from operand 0 to operand 1.  The word and register size is assumed to be either 32-bits
  on a 32-bit architecture or 64-bits on a 64-bit architecture.  The abstract machine is mostly a 2 address machine
  with the odd three address instruction added to better exploit RISCs.
  (self initialize)
  The operand specifiers are
  R - general purpose register
  Rs - single-precision floating-point register
  Rd - double-precision floating-point register
  Cq - a `quick' constant that can be encoded in the minimum space possible.
  Cw - a constant with word size where word is the default operand size for the Smalltalk VM, 32-bits
   for a 32-bit VM, 64-bits for a 64-bit VM.  The generated constant must occupy the default number
   of bits.  This allows e.g. a garbage collector to update the value without invalidating the code.
  C32 - a constant with 32 bit size.  The generated constant must occupy 32 bits.
  C64 - a constant with 64 bit size.  The generated constant must occupy 64 bits.
  Aw - memory word (32-bits for a 32-bit VM, 64-bits for a 64-bit VM) at an absolute address
  Ab - memory byte at an absolute address
  A32 - memory 32-bit halfword at an absolute address
  Mwr - memory word whose address is at a constant offset from an address in a register
  Mbr - memory byte whose address is at a constant offset from an address in a register (zero-extended on read)
  M16r - memory 16-bit halfword whose address is at a constant offset from an address in a register
  M32r - memory 32-bit halfword whose address is at a constant offset from an address in a register
  M64r - memory 64-bit doubleword whose address is at a constant offset from an address in a register
  Xbr - memory byte whose address is r * byte size away from an address in a register
  X16r - memory 16-bit halfword whose address is r * (2 bytes size) away from an address in a register
  X32r - memory 32-bit halfword whose address is r * (4 bytes size) away from an address in a register (64-bit ISAs only)
  Xwr - memory word whose address is r * word size away from an address in a register
  Xowr - memory word whose address is o + (r * word size) away from an address in a register (scaled indexed)
 
  An alternative would be to decouple opcodes from operands, e.g.
  Move := 1. Add := 2. Sub := 3...
  RegisterOperand := 1. ConstantQuickOperand := 2. ConstantWordOperand := 3...
  But not all combinations make sense and even fewer are used so we stick with the simple compound approach.
 
  The assumption is that comparison and arithmetic instructions set condition codes and that move instructions
  leave the condition codes unaffected.  In particular LoadEffectiveAddressMwrR does not set condition codes
  although it can be used to do arithmetic.  On processors such as MIPS this distinction is invalid; there are no
  condition codes.  So the backend is allowed to collapse operation, branch pairs to internal instruction definitions
  (see senders and implementors of noteFollowingConditionalBranch:).
 
  Not all of the definitions in opcodeDefinitions below are implemented.  In particular we do not implement the
  XowrR scaled index addressing mode since it requires 4 operands.
 
  Not all instructions make sense on all architectures.  MoveRRd and MoveRdR are meaningful only on 64-bit machines.
 
  Note that there are no generic division instructions defined, but a processor may define some.
 
  Branch/Call ranges.  Jump[Cond] can be generated as short as possible.  Call/Jump[Cond]Long must be generated
  in the same number of bytes irrespective of displacement since their targets may be updated, but they need only
  span 16Mb, the maximum size of the code zone.  This allows e.g. ARM to use single-word call and jump instructions
  for most calls and jumps.  CallFull/JumpFull must also be generated in the same number of bytes irrespective of
  displacement for the same reason, but they must be able to span the full (32-bit or 64-bit) address space because
  they are used to call code in the C runtime, which may be distant from the code zone.  CallFull/JumpFull are allowed
  to use the cResultRegister as a scratch if required (e.g. on x64 where there is no direct 64-bit call or jump).
 
  Byte reads.  If the concrete compiler class answers true to byteReadsZeroExtend then byte reads must zero-extend
  the byte read into the destination register.  If not, the other bits of the register should be left undisturbed and the
  Cogit will add an instruction to zero the register as required.  Under no circumstances should byte reads sign-extend.
 
  16-bit (and on 64-bits, 32-bit) reads.  These /are/ expected to always zero-extend."
 
  | opcodeNames refs |
  opcodeNames := #("Noops & Pseudo Ops"
  Label
  Literal "a word-sized literal"
  AlignmentNops
  Fill32 "output four byte's worth of bytes with operand 0"
  Nop
 
  "Control"
  Call "call within the code zone"
  CallFull "call anywhere within the full address space"
  CallR
  RetN
  JumpR "Not a regular jump, i.e. not pc dependent."
  Stop "Halt the processor"
 
  "N.B.  Jumps are contiguous.  Long and Full jumps are contiguous within them.  See FirstJump et al below"
  JumpFull "Jump anywhere within the address space"
  JumpLong "Jump anywhere within the 16mb code zone."
  JumpLongZero "a.k.a. JumpLongEqual"
  JumpLongNonZero "a.k.a. JumpLongNotEqual"
  Jump "short jumps; can be encoded in as few bytes as possible; will not be disturbed by GC or relocation."
  JumpZero "a.k.a. JumpEqual"
  JumpNonZero "a.k.a. JumpNotEqual"
  JumpNegative
  JumpNonNegative
  JumpOverflow
  JumpNoOverflow
  JumpCarry
  JumpNoCarry
  JumpLess "signed"
  JumpGreaterOrEqual
  JumpGreater
  JumpLessOrEqual
  JumpBelow "unsigned"
  JumpAboveOrEqual
  JumpAbove
  JumpBelowOrEqual
 
  JumpFPEqual
  JumpFPNotEqual
  JumpFPLess
  JumpFPLessOrEqual
  JumpFPGreater
  JumpFPGreaterOrEqual
  JumpFPOrdered
  JumpFPUnordered
 
  "Data Movement; destination is always last operand"
  MoveRR
  MoveAwR MoveA32R
  MoveRAw MoveRA32
  MoveAbR
  MoveRAb
  MoveMwrR MoveRMwr MoveXwrRR MoveRXwrR "MoveXowrR MoveRXowr""Unused"
  MoveM8rR MoveMs8rR MoveRM8r
  MoveM16rR MoveRM16r MoveX16rRR MoveRX16rR
  MoveM32rR MoveRM32r MoveX32rRR MoveRX32rR
  MoveMbrR MoveRMbr MoveXbrRR MoveRXbrR
  MoveCqR MoveCwR MoveC32R "MoveC64R""Not used"
  MoveRRd MoveRdR MoveRdRd MoveM64rRd MoveRdM64r
  MoveRsRs MoveM32rRs MoveRsM32r
  PopR PushR PushCq PushCw
  PrefetchAw
 
  "Arithmetic; destination is always last operand except Cmp; CmpXR is SubRX with no update of result"
  LoadEffectiveAddressMwrR "LoadEffectiveAddressXowrR" "Variants of add/multiply"
  NegateR "2's complement negation"
  NotR
  ArithmeticShiftRightCqR ArithmeticShiftRightRR
  LogicalShiftRightCqR LogicalShiftRightRR
  LogicalShiftLeftCqR LogicalShiftLeftRR
  RotateLeftCqR RotateRightCqR
 
  CmpRR AddRR SubRR AndRR OrRR XorRR
  CmpCqR AddCqR SubCqR AndCqR OrCqR TstCqR XorCqR
  CmpCwR CmpC32R AddCwR SubCwR AndCwR OrCwR XorCwR
  AddcRR AddcCqR SubbRR SubbCqR
 
  AndCqRR "Three address ops for RISCs; feel free to add and extend"
 
  CmpRdRd AddRdRd SubRdRd MulRdRd DivRdRd SqrtRd XorRdRd
  CmpRsRs AddRsRs SubRsRs MulRsRs DivRsRs SqrtRs XorRsRs
 
  "Conversion"
  ConvertRRd ConvertRdR
  ConvertRsRd ConvertRdRs ConvertRsR ConvertRRs
 
  SignExtend8RR SignExtend16RR SignExtend32RR
  ZeroExtend8RR ZeroExtend16RR ZeroExtend32RR
 
+ "Advanced bit manipulation (arithmetic)"
+ ClzRR
+
  LastRTLCode).
 
  "Magic auto declaration. Add to the classPool any new variables and nuke any obsolete ones, and assign values"
  "Find the variables directly referenced by this method"
  refs := (thisContext method literals select: [:l| l isVariableBinding and: [classPool includesKey: l key]]) collect:
  [:ea| ea key].
  "Move to Undeclared any opcodes in classPool not in opcodes or this method."
  (classPool keys reject: [:k| (opcodeNames includes: k) or: [refs includes: k]]) do:
  [:k|
  Undeclared declare: k from: classPool].
  "Declare as class variables and number elements of opcodeArray above"
  opcodeNames withIndexDo:
  [:classVarName :value|
  self classPool
  declare: classVarName from: Undeclared;
  at: classVarName put: value].
 
  "For CogAbstractInstruction>>isJump etc..."
  FirstJump := JumpFull.
  LastJump := JumpFPUnordered.
  FirstShortJump := Jump.
 
  "And now initialize the backends; they add their own opcodes and hence these must be reinitialized."
  (Smalltalk classNamed: #CogAbstractInstruction) ifNotNil:
  [:cogAbstractInstruction| cogAbstractInstruction allSubclasses do: [:sc| sc initialize]]!
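
The numbering scheme above (opcodes numbered in list order, with the jumps kept contiguous so that FirstJump, LastJump and FirstShortJump can delimit ranges) can be sketched in C. This is only an illustration under stated assumptions: the opcode list is heavily abbreviated, and the helper names is_jump and is_long_jump are hypothetical, not the VM's generated code.

```c
#include <assert.h>

/* Abbreviated, illustrative opcode list; numbering starts at 1 as withIndexDo: does. */
enum rtl_opcode {
    Label = 1, Nop, Call,
    JumpFull, JumpLong,              /* long/full jumps come first */
    Jump, JumpZero, JumpFPUnordered, /* short jumps follow */
    MoveRR, ClzRR, LastRTLCode
};

/* Range markers, mirroring the assignments at the end of the method. */
enum { FirstJump = JumpFull, LastJump = JumpFPUnordered, FirstShortJump = Jump };

/* Hypothetical helper: a jump is any opcode inside the contiguous jump range. */
static int is_jump(int opcode)
{
    assert(opcode >= Label && opcode <= LastRTLCode);
    return opcode >= FirstJump && opcode <= LastJump;
}

/* Hypothetical helper: long jumps are the jumps that precede FirstShortJump. */
static int is_long_jump(int opcode)
{
    return opcode >= FirstJump && opcode < FirstShortJump;
}
```

Because membership is a range test, a new non-jump opcode such as ClzRR has to be added outside the jump block, as the diff does by placing it just before LastRTLCode.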

Item was changed:
  CogAbstractInstruction subclass: #CogX64Compiler
  instanceVariableNames: ''
+ classVariableNames: 'BSR CDQ CLD CMPXCHGAwR CMPXCHGMwrR CPUID IDIVR IMULRR LFENCE LOCK MFENCE MOVSB MOVSQ ModReg ModRegInd ModRegIndDisp32 ModRegIndSIB ModRegRegDisp32 ModRegRegDisp8 R10 R11 R12 R13 R14 R15 R8 R9 RAX RBP RBX RCX RDI RDX REP RSI RSP SFENCE SIB1 SIB2 SIB4 SIB8 SysV XCHGAwR XCHGMwrR XCHGRR XMM0L XMM10L XMM11L XMM12L XMM13L XMM14L XMM15L XMM1L XMM2L XMM3L XMM4L XMM5L XMM6L XMM7L XMM8L XMM9L'
- classVariableNames: 'CDQ CLD CMPXCHGAwR CMPXCHGMwrR CPUID IDIVR IMULRR LFENCE LOCK MFENCE MOVSB MOVSQ ModReg ModRegInd ModRegIndDisp32 ModRegIndSIB ModRegRegDisp32 ModRegRegDisp8 R10 R11 R12 R13 R14 R15 R8 R9 RAX RBP RBX RCX RDI RDX REP RSI RSP SFENCE SIB1 SIB2 SIB4 SIB8 SysV XCHGAwR XCHGMwrR XCHGRR XMM0L XMM10L XMM11L XMM12L XMM13L XMM14L XMM15L XMM1L XMM2L XMM3L XMM4L XMM5L XMM6L XMM7L XMM8L XMM9L'
  poolDictionaries: ''
  category: 'VMMaker-JIT'!
 
  !CogX64Compiler commentStamp: 'eem 9/14/2015 17:12' prior: 0!
  I generate x64 (x86-64) instructions from CogAbstractInstructions.  For reference see
  1. IA-32 Intel® Architecture Software Developer's Manual Volume 2A: Instruction Set Reference, A-M
  2. IA-32 Intel® Architecture Software Developer's Manual Volume 2B: Instruction Set Reference, N-Z
  http://www.intel.com/products/processor/manuals/
  or
  AMD64 Architecture Programmer's Manual Volume 3: General-Purpose and System Instructions
  AMD64 Architecture Programmer's Manual Volume 4: 128-bit Media Instructions
  AMD64 Architecture Programmer's Manual Volume 5: 64-bit Media and x87 Floating Point Instructions
  http://developer.amd.com/resources/documentation-articles/developer-guides-manuals/
  (® is supposed to be the Unicode "registered  sign").!

Item was changed:
  ----- Method: CogX64Compiler class>>initialize (in category 'class initialization') -----
  initialize
  "Initialize various x64 instruction-related constants.
  [1] IA-32 Intel® Architecture Software Developer's Manual Volume 2A: Instruction Set Reference, A-M"
 
  "CogX64Compiler initialize"
 
  self ~~ CogX64Compiler ifTrue: [^self].
 
  (InitializationOptions ifNil: [Dictionary new])
  at: #ABI
  ifPresent: [:abi| SysV := abi asUppercase ~= #WIN64 and: [abi asUppercase ~= #'_WIN64']]
  ifAbsent: [SysV := true]. "Default ABI; set to true for SysV, false for WIN64/_WIN64"
 
  RAX := 0.
  RCX := 1.  "Were they completely mad or simply sadistic?"
  RDX := 2.
  RBX := 3.
  RSP := 4.
  RBP := 5.
  RSI := 6.
  RDI := 7.
  R8 := 8.
  R9 := 9.
  R10 := 10.
  R11 := 11.
  R12 := 12.
  R13 := 13.
  R14 := 14.
  R15 := 15.
 
  XMM0L := 0.
  XMM1L := 1.
  XMM2L := 2.
  XMM3L := 3.
  XMM4L := 4.
  XMM5L := 5.
  XMM6L := 6.
  XMM7L := 7.
  XMM8L := 8.
  XMM9L := 9.
  XMM10L := 10.
  XMM11L := 11.
  XMM12L := 12.
  XMM13L := 13.
  XMM14L := 14.
  XMM15L := 15.
 
  "Mod R/M Mod fields.  See [1] Sec 2.4, 2.5 & 2.6 & Table 2-2"
  ModRegInd := 0.
  ModRegIndSIB := 4.
  ModRegIndDisp32 := 5.
  ModRegRegDisp8 := 1.
  ModRegRegDisp32 := 2.
  ModReg := 3.
 
  "SIB Scaled Index modes.  See [1] Sec 2.4, 2.5 & 2.6 & Table 2-3"
  SIB1 := 0.
  SIB2 := 1.
  SIB4 := 2.
  SIB8 := 3.
 
  "Specific instructions"
  self
+ initializeSpecificOpcodes: #(CDQ IDIVR IMULRR CPUID LFENCE MFENCE SFENCE LOCK CMPXCHGAwR CMPXCHGMwrR XCHGAwR XCHGMwrR XCHGRR CLD REP MOVSB MOVSQ BSR)
- initializeSpecificOpcodes: #(CDQ IDIVR IMULRR CPUID LFENCE MFENCE SFENCE LOCK CMPXCHGAwR CMPXCHGMwrR XCHGAwR XCHGMwrR XCHGRR CLD REP MOVSB MOVSQ)
  in: thisContext method!

Item was changed:
  ----- Method: CogX64Compiler>>computeMaximumSize (in category 'generate machine code') -----
(excessive size, no diff calculated)

Item was added:
+ ----- Method: CogX64Compiler>>concretizeBSR (in category 'generate machine code') -----
+ concretizeBSR
+ "Bit Scan Reverse
+ First operand is output register (dest)
+ Second operand is input register (mask)"
+ "BSR"
+ <inline: true>
+ | dest mask |
+ dest := operands at: 0.
+ mask := operands at: 1.
+ (dest <= 7 and: [mask <= 7])
+ ifTrue: [machineCode at: 0 put: (self rexw: true r: 0 x: 0 b: 0)]
+ ifFalse: ["Beware: operation is on 32 bits for R8-R15" machineCode at: 0 put: (self rexw: false r: 0 x: 0 b: 0)].
+
+ machineCode
+ at: 1 put: 16r0F;
+ at: 2 put: 16rBD;
+ at: 3 put: (self mod: ModReg RM: dest RO: mask).
+ ^machineCodeSize := 4!

Item was added:
+ ----- Method: CogX64Compiler>>concretizeCPUID (in category 'generate machine code') -----
+ concretizeCPUID
+ <inline: true>
+ machineCode
+ at: 0 put: 16r0F;
+ at: 1 put: 16rA2.
+ ^machineCodeSize := 2!

Item was added:
+ ----- Method: CogX64Compiler>>concretizeClzRR (in category 'generate machine code') -----
+ concretizeClzRR
+ "Count leading zeros
+ First operand is output (dest)
+ Second operand is input (mask)"
+ "LZCNT"
+ <inline: true>
+ | dest mask |
+ dest := operands at: 0.
+ mask := operands at: 1.
+ machineCode
+ at: 0 put: 16rF3.
+ (dest <= 7 and: [mask <= 7])
+ ifTrue: [machineCode at: 1 put: (self rexw: true r: 0 x: 0 b: 0)]
+ ifFalse: [machineCode at: 1 put: (self rexw: false r: 0 x: 0 b: 0)].
+
+ machineCode
+ at: 2 put: 16r0F;
+ at: 3 put: 16rBD;
+ at: 4 put: (self mod: ModReg RM: dest RO: mask).
+ ^machineCodeSize := 5!
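
For reference, here is a minimal C sketch of how the Intel manuals lay out the bytes for 64-bit BSR and LZCNT with register operands. It is not a transcription of the two methods above: it assumes the manuals' operand order (destination in ModRM.reg, source in ModRM.r/m) and sets REX.R and REX.B from the register numbers. All helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* REX prefix is 0100WRXB: W selects 64-bit operand size,
   R extends ModRM.reg and B extends ModRM.r/m (X is unused here). */
static uint8_t rex_prefix(int w, unsigned reg, unsigned rm)
{
    return (uint8_t)(0x40 | (w ? 0x08 : 0x00)
                          | (((reg >> 3) & 1) << 2)
                          | ((rm >> 3) & 1));
}

/* ModRM byte for register-direct addressing (mod = 3). */
static uint8_t modrm_reg_direct(unsigned reg, unsigned rm)
{
    return (uint8_t)(0xC0 | ((reg & 7) << 3) | (rm & 7));
}

/* Encode "BSR dest, src" (REX.W 0F BD /r); returns the byte count. */
static size_t encode_bsr64(uint8_t *buf, unsigned dest, unsigned src)
{
    assert(dest < 16 && src < 16);
    buf[0] = rex_prefix(1, dest, src);
    buf[1] = 0x0F;
    buf[2] = 0xBD;
    buf[3] = modrm_reg_direct(dest, src);
    return 4;
}

/* Encode "LZCNT dest, src" (F3 REX.W 0F BD /r); the F3 prefix precedes REX. */
static size_t encode_lzcnt64(uint8_t *buf, unsigned dest, unsigned src)
{
    assert(dest < 16 && src < 16);
    buf[0] = 0xF3;
    buf[1] = rex_prefix(1, dest, src);
    buf[2] = 0x0F;
    buf[3] = 0xBD;
    buf[4] = modrm_reg_direct(dest, src);
    return 5;
}
```

With this layout, "BSR RAX, RCX" comes out as 48 0F BD C1, and the two instructions differ only by the leading F3 byte.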

Item was changed:
  ----- Method: CogX64Compiler>>dispatchConcretize (in category 'generate machine code') -----
  dispatchConcretize
  "Attempt to generate concrete machine code for the instruction at address.
  This is the inner dispatch of concretizeAt: actualAddress which exists only
  to get around the branch size limits in the SqueakV3 (blue book derived)
  bytecode set."
  <returnTypeC: #void>
  opcode >= CDQ ifTrue:
  [^self dispatchConcretizeProcessorSpecific].
  opcode caseOf: {
  "Noops & Pseudo Ops"
  [Label] -> [^self concretizeLabel].
  [AlignmentNops] -> [^self concretizeAlignmentNops].
  [Fill32] -> [^self concretizeFill32].
  [Nop] -> [^self concretizeNop].
  "Control"
  [Call] -> [^self concretizeCall].
  [CallR] -> [^self concretizeCallR].
  [CallFull] -> [^self concretizeCallFull].
  [JumpR] -> [^self concretizeJumpR].
  [JumpFull] -> [^self concretizeJumpFull].
  [JumpLong] -> [^self concretizeJumpLong].
  [JumpLongZero] -> [^self concretizeConditionalJump: 16r4].
  [JumpLongNonZero] -> [^self concretizeConditionalJump: 16r5].
  [Jump] -> [^self concretizeJump].
  "Table B-1 Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture"
  [JumpZero] -> [^self concretizeConditionalJump: 16r4].
  [JumpNonZero] -> [^self concretizeConditionalJump: 16r5].
  [JumpNegative] -> [^self concretizeConditionalJump: 16r8].
  [JumpNonNegative] -> [^self concretizeConditionalJump: 16r9].
  [JumpOverflow] -> [^self concretizeConditionalJump: 16r0].
  [JumpNoOverflow] -> [^self concretizeConditionalJump: 16r1].
  [JumpCarry] -> [^self concretizeConditionalJump: 16r2].
  [JumpNoCarry] -> [^self concretizeConditionalJump: 16r3].
  [JumpLess] -> [^self concretizeConditionalJump: 16rC].
  [JumpGreaterOrEqual] -> [^self concretizeConditionalJump: 16rD].
  [JumpGreater] -> [^self concretizeConditionalJump: 16rF].
  [JumpLessOrEqual] -> [^self concretizeConditionalJump: 16rE].
  [JumpBelow] -> [^self concretizeConditionalJump: 16r2].
  [JumpAboveOrEqual] -> [^self concretizeConditionalJump: 16r3].
  [JumpAbove] -> [^self concretizeConditionalJump: 16r7].
  [JumpBelowOrEqual] -> [^self concretizeConditionalJump: 16r6].
  [JumpFPEqual] -> [^self concretizeConditionalJump: 16r4].
  [JumpFPNotEqual] -> [^self concretizeConditionalJump: 16r5].
  [JumpFPLess] -> [^self concretizeConditionalJump: 16r2].
  [JumpFPGreaterOrEqual] -> [^self concretizeConditionalJump: 16r3].
  [JumpFPGreater] -> [^self concretizeConditionalJump: 16r7].
  [JumpFPLessOrEqual] -> [^self concretizeConditionalJump: 16r6].
  [JumpFPOrdered] -> [^self concretizeConditionalJump: 16rB].
  [JumpFPUnordered] -> [^self concretizeConditionalJump: 16rA].
  [RetN] -> [^self concretizeRetN].
  [Stop] -> [^self concretizeStop].
  "Arithmetic"
  [AddCqR] -> [^self concretizeArithCqRWithRO: 0 raxOpcode: 16r05].
  [AddcCqR] -> [^self concretizeArithCqRWithRO: 2 raxOpcode: 16r15].
  [AddCwR] -> [^self concretizeArithCwR: 16r03].
  [AddRR] -> [^self concretizeOpRR: 16r03].
  [AddRsRs] -> [^self concretizeSEEOpRsRs: 16r58].
  [AddRdRd] -> [^self concretizeSEE2OpRdRd: 16r58].
  [AndCqR] -> [^self concretizeArithCqRWithRO: 4 raxOpcode: 16r25].
  [AndCwR] -> [^self concretizeArithCwR: 16r23].
  [AndRR] -> [^self concretizeOpRR: 16r23].
  [TstCqR] -> [^self concretizeTstCqR].
  [CmpCqR] -> [^self concretizeArithCqRWithRO: 7 raxOpcode: 16r3D].
  [CmpCwR] -> [^self concretizeArithCwR: 16r39].
  [CmpC32R] -> [^self concretizeCmpC32R].
  [CmpRR] -> [^self concretizeReverseOpRR: 16r39].
  [CmpRdRd] -> [^self concretizeCmpRdRd].
  [CmpRsRs] -> [^self concretizeCmpRsRs].
  [DivRdRd] -> [^self concretizeSEE2OpRdRd: 16r5E].
  [DivRsRs] -> [^self concretizeSEEOpRsRs: 16r5E].
  [MulRdRd] -> [^self concretizeSEE2OpRdRd: 16r59].
  [MulRsRs] -> [^self concretizeSEEOpRsRs: 16r59].
  [OrCqR] -> [^self concretizeArithCqRWithRO: 1 raxOpcode: 16r0D].
  [OrCwR] -> [^self concretizeArithCwR: 16r0B].
  [OrRR] -> [^self concretizeOpRR: 16r0B].
  [SubCqR] -> [^self concretizeArithCqRWithRO: 5 raxOpcode: 16r2D].
  [SubbCqR] -> [^self concretizeArithCqRWithRO: 3 raxOpcode: 16r1D].
  [SubCwR] -> [^self concretizeArithCwR: 16r2B].
  [SubRR] -> [^self concretizeOpRR: 16r2B].
  [SubRdRd] -> [^self concretizeSEE2OpRdRd: 16r5C].
  [SubRsRs] -> [^self concretizeSEEOpRsRs: 16r5C].
  [SqrtRd] -> [^self concretizeSqrtRd].
  [SqrtRs] -> [^self concretizeSqrtRs].
  [XorCwR] -> [^self concretizeArithCwR: 16r33].
  [XorRR] -> [^self concretizeOpRR: 16r33].
  [XorRdRd] -> [^self concretizeXorRdRd].
  [XorRsRs] -> [^self concretizeXorRsRs].
  [NegateR] -> [^self concretizeNegateR].
  [LoadEffectiveAddressMwrR] -> [^self concretizeLoadEffectiveAddressMwrR].
  [RotateLeftCqR] -> [^self concretizeShiftCqRegOpcode: 0].
  [RotateRightCqR] -> [^self concretizeShiftCqRegOpcode: 1].
  [ArithmeticShiftRightCqR] -> [^self concretizeShiftCqRegOpcode: 7].
  [LogicalShiftRightCqR] -> [^self concretizeShiftCqRegOpcode: 5].
  [LogicalShiftLeftCqR] -> [^self concretizeShiftCqRegOpcode: 4].
  [ArithmeticShiftRightRR] -> [^self concretizeShiftRegRegOpcode: 7].
  [LogicalShiftLeftRR] -> [^self concretizeShiftRegRegOpcode: 4].
+ [ClzRR] -> [^self concretizeClzRR].
  "Data Movement"
  [MoveCqR] -> [^self concretizeMoveCqR].
  [MoveCwR] -> [^self concretizeMoveCwR].
  [MoveC32R] -> [^self concretizeMoveC32R].
  [MoveRR] -> [^self concretizeReverseOpRR: 16r89].
  [MoveAwR] -> [^self concretizeMoveAwR].
  [MoveA32R] -> [^self concretizeMoveA32R].
  [MoveRAw] -> [^self concretizeMoveRAw].
  [MoveRA32] -> [^self concretizeMoveRA32].
  [MoveAbR] -> [^self concretizeMoveAbR].
  [MoveRAb] -> [^self concretizeMoveRAb].
  [MoveMbrR] -> [^self concretizeMoveMbrR].
  [MoveRMbr] -> [^self concretizeMoveRMbr].
  [MoveM8rR] -> [^self concretizeMoveMbrR].
  [MoveRM8r] -> [^self concretizeMoveRMbr].
  [MoveM16rR] -> [^self concretizeMoveM16rR].
  [MoveRM16r] -> [^self concretizeMoveRM16r].
  [MoveM32rR] -> [^self concretizeMoveM32rR].
  [MoveM32rRs] -> [^self concretizeMoveM32rRs].
  [MoveM64rRd] -> [^self concretizeMoveM64rRd].
  [MoveMwrR] -> [^self concretizeMoveMwrR].
  [MoveXbrRR] -> [^self concretizeMoveXbrRR].
  [MoveRXbrR] -> [^self concretizeMoveRXbrR].
  [MoveXwrRR] -> [^self concretizeMoveXwrRR].
  [MoveRXwrR] -> [^self concretizeMoveRXwrR].
  [MoveX32rRR] -> [^self concretizeMoveX32rRR].
  [MoveRX32rR] -> [^self concretizeMoveRX32rR].
  [MoveRMwr] -> [^self concretizeMoveRMwr].
  [MoveRM32r] -> [^self concretizeMoveRM32r].
  [MoveRsM32r] -> [^self concretizeMoveRsM32r].
  [MoveRdM64r] -> [^self concretizeMoveRdM64r].
  [MoveRdR] -> [^self concretizeMoveRdR].
  [MoveRRd] -> [^self concretizeMoveRRd].
  [MoveRdRd] -> [^self concretizeMoveRdRd].
  [MoveRsRs] -> [^self concretizeMoveRsRs].
  [PopR] -> [^self concretizePopR].
  [PushR] -> [^self concretizePushR].
  [PushCq] -> [^self concretizePushCq].
  [PushCw] -> [^self concretizePushCw].
  [PrefetchAw] -> [^self concretizePrefetchAw].
  "Conversion"
  [ConvertRRd] -> [^self concretizeConvertRRd].
  [ConvertRdR] -> [^self concretizeConvertRdR].
  [ConvertRRs] -> [^self concretizeConvertRRs].
  [ConvertRsR] -> [^self concretizeConvertRsR].
  [ConvertRsRd] -> [^self concretizeConvertRsRd].
  [ConvertRdRs] -> [^self concretizeConvertRdRs].
 
  [SignExtend8RR] -> [^self concretizeSignExtend8RR].
  [SignExtend16RR] -> [^self concretizeSignExtend16RR].
  [SignExtend32RR] -> [^self concretizeSignExtend32RR].
 
  [ZeroExtend8RR] -> [^self concretizeZeroExtend8RR].
  [ZeroExtend16RR] -> [^self concretizeZeroExtend16RR].
  [ZeroExtend32RR] -> [^self concretizeZeroExtend32RR].
  }!

Item was changed:
  ----- Method: CogX64Compiler>>dispatchConcretizeProcessorSpecific (in category 'generate machine code') -----
  dispatchConcretizeProcessorSpecific
  "Attempt to generate concrete machine code for the instruction at address.
  This is part of the inner dispatch of concretizeAt: actualAddress which exists only
  to get around the number of literals limits in the SqueakV3 (blue book derived)
  bytecode set."
  <returnTypeC: #void>
  opcode caseOf: {
  "Specific Control/Data Movement"
  [CDQ] -> [^self concretizeCDQ].
  [IDIVR] -> [^self concretizeIDIVR].
  [IMULRR] -> [^self concretizeMulRR].
+ [CPUID] -> [^self concretizeCPUID].
- "[CPUID] -> [^self concretizeCPUID]."
  "[CMPXCHGAwR] -> [^self concretizeCMPXCHGAwR]."
  "[CMPXCHGMwrR] -> [^self concretizeCMPXCHGMwrR]."
  "[LFENCE] -> [^self concretizeFENCE: 5]."
  "[MFENCE] -> [^self concretizeFENCE: 6].
  [SFENCE] -> [^self concretizeFENCE: 7]."
  "[LOCK] -> [^self concretizeLOCK]."
  "[XCHGAwR] -> [^self concretizeXCHGAwR]."
  "[XCHGMwrR] -> [^self concretizeXCHGMwrR]."
  [XCHGRR] -> [^self concretizeXCHGRR].
  [REP] -> [^self concretizeREP].
  [CLD] -> [^self concretizeCLD].
  [MOVSB] -> [^self concretizeMOVSB].
  [MOVSQ] -> [^self concretizeMOVSQ].
+ [BSR] -> [^self concretizeBSR].
  }!

Item was added:
+ ----- Method: CogX64Compiler>>genHighBitAlternativeIn:ofSmallIntegerOopWithSingleTagBit: (in category 'abstract instructions') -----
+ genHighBitAlternativeIn: destReg ofSmallIntegerOopWithSingleTagBit: srcReg
+ "When CLZ is not available, we can use the BSR (Bit Scan Reverse) instruction instead.
+ BEWARE: make sure that srcReg has a single tag bit."
+ <inline: true>
+ | jumpNegativeReceiver |
+ <var: #jumpNegativeReceiver type: #'AbstractInstruction *'>
+ <returnTypeC: #'AbstractInstruction *'>
+ "The primitive must fail if receiver is negative"
+ (cogit lastOpcode setsConditionCodesFor: JumpNegative) ifFalse:
+ [cogit CmpCq: 0 R: srcReg]. "N.B. FLAGS := srcReg - 0"
+ jumpNegativeReceiver := cogit JumpNegative: 0.
+ cogit gen: BSR operand: destReg operand: srcReg.
+ "Theoretically we should handle the case where srcReg is zero, but it can never be zero thanks to the tag bit."
+ "Thanks to the same tag bit, the +1 needed to turn a 0-based rank into a 1-based rank is unnecessary, so we are done."
+ ^jumpNegativeReceiver!
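
The tag-bit argument in the comments above can be checked with a small C model. This is a sketch under assumptions: SmallIntegers are taken to be encoded as (value << 1) | 1, BSR is modelled by a portable loop rather than the real instruction, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Portable model of BSR: 0-based index of the highest set bit.
   Like the instruction, it is not meaningful for a zero input. */
static unsigned bsr64(uint64_t x)
{
    unsigned index = 0;
    assert(x != 0);
    while (x >>= 1)
        index++;
    return index;
}

/* Assumed single-tag-bit encoding of a non-negative SmallInteger. */
static uint64_t tag_small_integer(uint64_t value)
{
    return (value << 1) | 1;
}

/* Reference 1-based highBit: answers 0 for 0, 1 for 1, 8 for 255. */
static unsigned high_bit_reference(uint64_t value)
{
    unsigned n = 0;
    while (value != 0) {
        n++;
        value >>= 1;
    }
    return n;
}

/* BSR applied directly to the tagged oop: the tag bit keeps the input
   non-zero, and the one-bit shift supplies the +1 of the 1-based rank. */
static unsigned high_bit_via_bsr(uint64_t value)
{
    return bsr64(tag_small_integer(value));
}
```

For example, the value 0 is tagged as 1, whose highest set bit is at index 0, which is exactly highBit of 0; the value 255 is tagged as 511, giving index 8.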

Item was added:
+ ----- Method: CogX64Compiler>>generateCheckLZCNT (in category 'feature detection') -----
+ generateCheckLZCNT
+ "Check whether the Leading Zero Count (LZCNT) instruction is present;
+ cf. the MSVC builtin __lzcnt documentation.
+ The result will be in bit 5 of the return value (in RAX)."
+ cogit
+ PushR: RDX;
+ PushR: RCX;
+ PushR: RBX;
+ MoveCw: 16r80000001 R: RAX;
+ gen: CPUID;
+ MoveR: RCX R: RAX;
+ PopR: RBX;
+ PopR: RCX;
+ PopR: RDX;
+ RetN: 0!

Item was added:
+ ----- Method: CogX64Compiler>>hasLZCNTInstructions (in category 'testing') -----
+ hasLZCNTInstructions
+ "Answer if we support LZCNT"
+ <inline: true>
+ ^(cogit ceCheckLZCNT bitAnd: (1 << 5)) ~= 0!
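
The same check can be written in C for comparison. This is a minimal sketch, assuming a GCC or Clang toolchain on x86 (it uses __get_cpuid from <cpuid.h>) and answering 0 elsewhere. The function names are hypothetical and it is not the code the VM generates.

```c
#include <assert.h>
#include <stdint.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <cpuid.h>
#define HAVE_GNU_CPUID 1
#else
#define HAVE_GNU_CPUID 0
#endif

/* CPUID leaf 0x80000001 reports LZCNT support in ECX bit 5. */
#define LZCNT_ECX_BIT 5u

/* Pure helper: decode the feature bit from an ECX value. */
static int ecx_reports_lzcnt(uint32_t ecx)
{
    return (int)((ecx >> LZCNT_ECX_BIT) & 1u);
}

/* Query the running CPU; answers 0 when CPUID is unavailable to this build. */
static int cpu_has_lzcnt(void)
{
#if HAVE_GNU_CPUID
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
        return 0; /* extended leaf not supported */
    return ecx_reports_lzcnt(ecx);
#else
    return 0;
#endif
}

/* Small self-check of the bit decoding. */
static void check_lzcnt_decoding(void)
{
    assert(ecx_reports_lzcnt(1u << LZCNT_ECX_BIT) == 1);
    assert(ecx_reports_lzcnt(0u) == 0);
}
```

A caller would choose between an LZCNT-based path and a BSR fallback according to cpu_has_lzcnt(), much as hasLZCNTInstructions does with the value returned in RAX.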

Item was added:
+ ----- Method: CogX64Compiler>>numCheckLZCNTOpcodes (in category 'feature detection') -----
+ numCheckLZCNTOpcodes
+ "Answer the number of opcodes required to compile the CPUID call to extract the extended features information."
+ ^11!

Item was changed:
  ----- Method: CogX64Compiler>>setsConditionCodesFor: (in category 'testing') -----
  setsConditionCodesFor: aConditionalJumpOpcode
  <inline: false> "to save Slang from having to be a real compiler (it can't inline switches that return)"
  "Answer if the receiver's opcode sets the condition codes correctly for the given conditional jump opcode."
  ^opcode caseOf:
  { [ArithmeticShiftRightCqR] -> [self shiftSetsConditionCodesFor: aConditionalJumpOpcode].
  [ArithmeticShiftRightRR] -> [self shiftSetsConditionCodesFor: aConditionalJumpOpcode].
  [LogicalShiftLeftCqR] -> [self shiftSetsConditionCodesFor: aConditionalJumpOpcode].
  [LogicalShiftLeftRR] -> [self shiftSetsConditionCodesFor: aConditionalJumpOpcode].
  [LogicalShiftRightCqR] -> [false].
+ [XorRR] -> [true].
+ [ClzRR] -> [aConditionalJumpOpcode = JumpZero or: [aConditionalJumpOpcode = JumpNonZero or: [aConditionalJumpOpcode = JumpCarry or: [aConditionalJumpOpcode = JumpNoCarry "carry flag is set if input is zero"]]]]
- [XorRR] -> [true]
  }
  otherwise: [self halt: 'unhandled opcode in setsConditionCodesFor:'. false]!
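
The flag behaviour behind the ClzRR case can be modelled in C. This sketch follows the documented LZCNT semantics for a 64-bit operand (carry set when the source is zero, zero flag set when the count is zero). The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Result of the model: the count plus the two flags LZCNT defines. */
typedef struct {
    unsigned count; /* number of leading zero bits, 0..64 */
    int carry;      /* CF: set when the source operand is zero */
    int zero;       /* ZF: set when the resulting count is zero */
} lzcnt_result;

/* Portable model of 64-bit LZCNT, scanning from the most significant bit. */
static lzcnt_result lzcnt64_model(uint64_t src)
{
    lzcnt_result r;
    unsigned count = 0;
    uint64_t probe = UINT64_C(1) << 63;

    while (probe != 0 && (src & probe) == 0) {
        count++;
        probe >>= 1;
    }
    assert(count <= 64);
    r.count = count;
    r.carry = (src == 0);
    r.zero = (count == 0);
    return r;
}
```

Only the carry and zero flags carry meaning here, which is why the case above answers true just for JumpZero, JumpNonZero, JumpCarry and JumpNoCarry.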

Item was changed:
  CogClass subclass: #Cogit
+ instanceVariableNames: 'coInterpreter objectMemory objectRepresentation processor threadManager methodZone methodZoneBase codeBase minValidCallAddress lastNInstructions simulatedAddresses simulatedTrampolines simulatedVariableGetters simulatedVariableSetters printRegisters printInstructions compilationTrace clickConfirm breakPC breakBlock singleStep guardPageSize traceFlags traceStores breakMethod methodObj enumeratingCogMethod methodHeader initialPC endPC methodOrBlockNumArgs inBlock needsFrame hasYoungReferent primitiveIndex backEnd literalsManager postCompileHook methodLabel stackCheckLabel blockEntryLabel blockEntryNoContextSwitch blockNoContextSwitchOffset stackOverflowCall sendMiss missOffset entryPointMask checkedEntryAlignment uncheckedEntryAlignment cmEntryOffset entry cmNoCheckEntryOffset noCheckEntry fullBlockEntry cbEntryOffset fullBlockNoContextSwitchEntry cbNoSwitchEntryOffset picMNUAbort picInterpretAbort endCPICCase0 endCPICCase1 firstCPICCaseOffset cPICCaseSize cPICEndSize closedPICSize openPICSize fixups abstractOpcodes generatorTable byte0 byte1 byte2 byte3 bytecodePC bytecodeSetOffset opcodeIndex numAbstractOpcodes blockStarts blockCount labelCounter cStackAlignment expectedSPAlignment expectedFPAlignment codeModified maxLitIndex ceMethodAbortTrampoline cePICAbortTrampoline ceCheckForInterruptTrampoline ceCPICMissTrampoline ceReturnToInterpreterTrampoline ceBaseFrameReturnTrampoline ceReapAndResetErrorCodeTrampoline ceSendMustBeBooleanAddTrueTrampoline ceSendMustBeBooleanAddFalseTrampoline ceCannotResumeTrampoline ceEnterCogCodePopReceiverReg ceCallCogCodePopReceiverReg ceCallCogCodePopReceiverAndClassRegs cePrimReturnEnterCogCode cePrimReturnEnterCogCodeProfiling ceNonLocalReturnTrampoline ceFetchContextInstVarTrampoline ceStoreContextInstVarTrampoline ceEnclosingObjectTrampoline ceFlushICache ceCheckFeaturesFunction ceTraceLinkedSendTrampoline ceTraceBlockActivationTrampoline ceTraceStoreTrampoline ceGetFP ceGetSP ceCaptureCStackPointers ordinarySendTrampolines superSendTrampolines directedSuperSendTrampolines directedSuperBindingSendTrampolines dynamicSuperSendTrampolines outerSendTrampolines selfSendTrampolines firstSend lastSend realCEEnterCogCodePopReceiverReg realCECallCogCodePopReceiverReg realCECallCogCodePopReceiverAndClassRegs trampolineTableIndex trampolineAddresses objectReferencesInRuntime runtimeObjectRefIndex cFramePointerInUse debugPrimCallStackOffset ceTryLockVMOwner ceUnlockVMOwner extA extB numExtB tempOop numIRCs indexOfIRC theIRCs receiverTags implicitReceiverSendTrampolines cogMethodSurrogateClass cogBlockMethodSurrogateClass nsSendCacheSurrogateClass CStackPointer CFramePointer cPICPrototype cPICEndOfCodeOffset cPICEndOfCodeLabel ceMallocTrampoline ceFreeTrampoline ceFFICalloutTrampoline debugBytecodePointers debugOpcodeIndices disassemblingMethod cogConstituentIndex directedSendUsesBinding ceCheckLZCNTFunction'
- instanceVariableNames: 'coInterpreter objectMemory objectRepresentation processor threadManager methodZone methodZoneBase codeBase minValidCallAddress lastNInstructions simulatedAddresses simulatedTrampolines simulatedVariableGetters simulatedVariableSetters printRegisters printInstructions compilationTrace clickConfirm breakPC breakBlock singleStep guardPageSize traceFlags traceStores breakMethod methodObj enumeratingCogMethod methodHeader initialPC endPC methodOrBlockNumArgs inBlock needsFrame hasYoungReferent primitiveIndex backEnd literalsManager postCompileHook methodLabel stackCheckLabel blockEntryLabel blockEntryNoContextSwitch blockNoContextSwitchOffset stackOverflowCall sendMiss missOffset entryPointMask checkedEntryAlignment uncheckedEntryAlignment cmEntryOffset entry cmNoCheckEntryOffset noCheckEntry fullBlockEntry cbEntryOffset fullBlockNoContextSwitchEntry cbNoSwitchEntryOffset picMNUAbort picInterpretAbort endCPICCase0 endCPICCase1 firstCPICCaseOffset cPICCaseSize cPICEndSize closedPICSize openPICSize fixups abstractOpcodes generatorTable byte0 byte1 byte2 byte3 bytecodePC bytecodeSetOffset opcodeIndex numAbstractOpcodes blockStarts blockCount labelCounter cStackAlignment expectedSPAlignment expectedFPAlignment codeModified maxLitIndex ceMethodAbortTrampoline cePICAbortTrampoline ceCheckForInterruptTrampoline ceCPICMissTrampoline ceReturnToInterpreterTrampoline ceBaseFrameReturnTrampoline ceReapAndResetErrorCodeTrampoline ceSendMustBeBooleanAddTrueTrampoline ceSendMustBeBooleanAddFalseTrampoline ceCannotResumeTrampoline ceEnterCogCodePopReceiverReg ceCallCogCodePopReceiverReg ceCallCogCodePopReceiverAndClassRegs cePrimReturnEnterCogCode cePrimReturnEnterCogCodeProfiling ceNonLocalReturnTrampoline ceFetchContextInstVarTrampoline ceStoreContextInstVarTrampoline ceEnclosingObjectTrampoline ceFlushICache ceCheckFeaturesFunction ceTraceLinkedSendTrampoline ceTraceBlockActivationTrampoline ceTraceStoreTrampoline ceGetFP ceGetSP ceCaptureCStackPointers ordinarySendTrampolines superSendTrampolines directedSuperSendTrampolines directedSuperBindingSendTrampolines dynamicSuperSendTrampolines outerSendTrampolines selfSendTrampolines firstSend lastSend realCEEnterCogCodePopReceiverReg realCECallCogCodePopReceiverReg realCECallCogCodePopReceiverAndClassRegs trampolineTableIndex trampolineAddresses objectReferencesInRuntime runtimeObjectRefIndex cFramePointerInUse debugPrimCallStackOffset ceTryLockVMOwner ceUnlockVMOwner extA extB numExtB tempOop numIRCs indexOfIRC theIRCs receiverTags implicitReceiverSendTrampolines cogMethodSurrogateClass cogBlockMethodSurrogateClass nsSendCacheSurrogateClass CStackPointer CFramePointer cPICPrototype cPICEndOfCodeOffset cPICEndOfCodeLabel ceMallocTrampoline ceFreeTrampoline ceFFICalloutTrampoline debugBytecodePointers debugOpcodeIndices disassemblingMethod cogConstituentIndex directedSendUsesBinding'
  classVariableNames: 'AltBlockCreationBytecodeSize AltFirstSpecialSelector AltNSSendIsPCAnnotated AltNumSpecialSelectors AnnotationConstantNames AnnotationShift AnnotationsWithBytecodePCs BlockCreationBytecodeSize Debug DisplacementMask DisplacementX2N EagerInstructionDecoration FirstAnnotation FirstSpecialSelector HasBytecodePC IsAbsPCReference IsAnnotationExtension IsDirectedSuperBindingSend IsDirectedSuperSend IsDisplacementX2N IsNSDynamicSuperSend IsNSImplicitReceiverSend IsNSSelfSend IsNSSendCall IsObjectReference IsRelativeCall IsSendCall IsSuperSend MapEnd MaxCPICCases MaxCompiledPrimitiveIndex MaxStackAllocSize MaxX2NDisplacement NSCClassTagIndex NSCEnclosingObjectIndex NSCNumArgsIndex NSCSelectorIndex NSCTargetIndex NSSendIsPCAnnotated NumObjRefsInRuntime NumOopsPerNSC NumSpecialSelectors NumTrampolines ProcessorClass RRRName'
  poolDictionaries: 'CogAbstractRegisters CogCompilationConstants CogMethodConstants CogRTLOpcodes VMBasicConstants VMBytecodeConstants VMObjectIndices VMStackFrameOffsets'
  category: 'VMMaker-JIT'!
  Cogit class
  instanceVariableNames: 'generatorTable primitiveTable'!
 
  !Cogit commentStamp: 'eem 2/25/2017 17:53' prior: 0!
  I am the code generator for the Cog VM.  My job is to produce machine code versions of methods for faster execution and to manage inline caches for faster send performance.  I can be tested in the current image using my class-side in-image compilation facilities.  e.g. try
 
  StackToRegisterMappingCogit genAndDis: (Integer >> #benchFib)
 
  I have concrete subclasses that implement different levels of optimization:
  SimpleStackBasedCogit is the simplest code generator.
 
  StackToRegisterMappingCogit is the current production code generator.  It defers pushing operands
  to the stack until necessary and implements a register-based calling convention for low-arity sends.
 
  SistaCogit is an experimental code generator with support for counting
  conditional branches, intended to support adaptive optimization.
 
  RegisterAllocatingCogit is an experimental code generator with support for allocating temporary variables
  to registers. It is intended to serve as the superclass to SistaCogit once it is working.
 
  SistaRegisterAllocatingCogit and SistaCogitClone are temporary classes that allow testing a clone of
  SistaCogit that inherits from RegisterAllocatingCogit.  Once things work these will be merged and
  will replace SistaCogit.
 
  coInterpreter <CoInterpreterSimulator>
  the VM's interpreter with which I cooperate
  methodZoneManager <CogMethodZoneManager>
  the manager of the machine code zone
  objectRepresentation <CogObjectRepresentation>
  the object used to generate object accesses
  processor <BochsIA32Alien|?>
  the simulator that executes the IA32/x86 machine code I generate when simulating execution in Smalltalk
  simulatedTrampolines <Dictionary of Integer -> MessageSend>
  the dictionary mapping trap jump addresses to run-time routines used to warp from simulated machine code in to the Smalltalk run-time.
  simulatedVariableGetters <Dictionary of Integer -> MessageSend>
  the dictionary mapping trap read addresses to variables in run-time objects used to allow simulated machine code to read variables in the Smalltalk run-time.
  simulatedVariableSetters <Dictionary of Integer -> MessageSend>
  the dictionary mapping trap write addresses to variables in run-time objects used to allow simulated machine code to write variables in the Smalltalk run-time.
  printRegisters printInstructions clickConfirm <Boolean>
  flags controlling debug printing and code simulation
  breakPC <Integer>
  machine code pc breakpoint
  cFramePointer cStackPointer <Integer>
  the variables representing the C stack & frame pointers, which must change on FFI callback and return
  selectorOop <sqInt>
  the oop of the methodObj being compiled
  methodObj <sqInt>
  the bytecode method being compiled
  initialPC endPC <Integer>
  the start and end pcs of the methodObj being compiled
  methodOrBlockNumArgs <Integer>
  argument count of current method or block being compiled
  needsFrame <Boolean>
  whether methodObj or block needs a frame to execute
  primitiveIndex <Integer>
  primitive index of current method being compiled
  methodLabel <CogAbstractOpcode>
  label for the method header
  blockEntryLabel <CogAbstractOpcode>
  label for the start of the block dispatch code
  stackOverflowCall <CogAbstractOpcode>
  label for the call of ceStackOverflow in the method prolog
  sendMissCall <CogAbstractOpcode>
  label for the call of ceSICMiss in the method prolog
  entryOffset <Integer>
  offset of method entry code from start (header) of method
  entry <CogAbstractOpcode>
  label for the first instruction of the method entry code
  noCheckEntryOffset <Integer>
  offset of the start of a method proper (after the method entry code) from start (header) of method
  noCheckEntry <CogAbstractOpcode>
  label for the first instruction of start of a method proper
  fixups <Array of <AbstractOpcode Label | nil>>
  the labels for forward jumps that will be fixed up when reaching the relevant bytecode.  fixups has one element per byte in methodObj's bytecode; initialPC maps to fixups[0].
  abstractOpcodes <Array of <AbstractOpcode>>
  the code generated when compiling methodObj
  byte0 byte1 byte2 byte3 <Integer>
  individual bytes of current bytecode being compiled in methodObj
  bytecodePointer <Integer>
  bytecode pc (same as Smalltalk) of the current bytecode being compiled
  opcodeIndex <Integer>
  the index of the next free entry in abstractOpcodes (this code is translated into C where OrderedCollection et al do not exist)
  numAbstractOpcodes <Integer>
  the number of elements in abstractOpcodes
  blockStarts <Array of <BlockStart>>
  the starts of blocks in the current method
  blockCount
  the index into blockStarts as they are being noted, and hence eventually the total number of blocks in the current method
  labelCounter <Integer>
  a nicety for numbering labels not needed in the production system but probably not expensive enough to worry about
  ceStackOverflowTrampoline <Integer>
  ceSend0ArgsTrampoline <Integer>
  ceSend1ArgsTrampoline <Integer>
  ceSend2ArgsTrampoline <Integer>
  ceSendNArgsTrampoline <Integer>
  ceSendSuper0ArgsTrampoline <Integer>
  ceSendSuper1ArgsTrampoline <Integer>
  ceSendSuper2ArgsTrampoline <Integer>
  ceSendSuperNArgsTrampoline <Integer>
  ceSICMissTrampoline <Integer>
  ceCPICMissTrampoline <Integer>
  ceStoreCheckTrampoline <Integer>
  ceReturnToInterpreterTrampoline <Integer>
  ceBaseFrameReturnTrampoline <Integer>
  ceSendMustBeBooleanTrampoline <Integer>
  ceClosureCopyTrampoline <Integer>
  the various trampolines (system-call-like jumps from machine code to the run-time).
  See Cogit>>generateTrampolines for the mapping from trampoline to run-time
  routine and then read the run-time routine for a functional description.
  ceEnterCogCodePopReceiverReg <Integer>
  the enilopmart (jump from run-time to machine-code)
  methodZoneBase <Integer>
  !
  Cogit class
  instanceVariableNames: 'generatorTable primitiveTable'!

Item was changed:
  ----- Method: Cogit class>>declareCVarsIn: (in category 'translation') -----
  declareCVarsIn: aCCodeGenerator
  #( 'coInterpreter' 'objectMemory' 'methodZone' 'objectRepresentation'
  'cogBlockMethodSurrogateClass' 'cogMethodSurrogateClass' 'nsSendCacheSurrogateClass'
  'threadManager' 'processor' 'lastNInstructions' 'simulatedAddresses'
  'simulatedTrampolines' 'simulatedVariableGetters' 'simulatedVariableSetters'
  'printRegisters' 'printInstructions' 'clickConfirm' 'singleStep') do:
  [:simulationVariableNotNeededForRealVM|
  aCCodeGenerator removeVariable: simulationVariableNotNeededForRealVM].
  NewspeakVM ifFalse:
  [#( 'selfSendTrampolines' 'dynamicSuperSendTrampolines'
  'implicitReceiverSendTrampolines' 'outerSendTrampolines'
  'ceEnclosingObjectTrampoline' 'numIRCs' 'indexOfIRC' 'theIRCs') do:
  [:variableNotNeededInNormalVM|
  aCCodeGenerator removeVariable: variableNotNeededInNormalVM]].
  aCCodeGenerator removeConstant: #COGMTVM. "this should be defined at compile time"
  aCCodeGenerator
  addHeaderFile:'<stddef.h>'; "for e.g. offsetof"
  addHeaderFile:'"sqCogStackAlignment.h"';
  addHeaderFile:'"dispdbg.h"'; "must precede cointerp.h & cogit.h otherwise NoDbgRegParms gets screwed up"
  addHeaderFile:'"cogmethod.h"'.
  NewspeakVM ifTrue:
  [aCCodeGenerator addHeaderFile:'"nssendcache.h"'].
  aCCodeGenerator
  addHeaderFile:'#if COGMTVM';
  addHeaderFile:'"cointerpmt.h"';
  addHeaderFile:'#else';
  addHeaderFile:'"cointerp.h"';
  addHeaderFile:'#endif';
  addHeaderFile:'"cogit.h"'.
  aCCodeGenerator
  var: #ceGetFP
  declareC: 'usqIntptr_t (*ceGetFP)(void)';
  var: #ceGetSP
  declareC: 'usqIntptr_t (*ceGetSP)(void)';
  var: #ceCaptureCStackPointers
  declareC: 'void (*ceCaptureCStackPointers)(void)';
  var: #ceEnterCogCodePopReceiverReg
  declareC: 'void (*ceEnterCogCodePopReceiverReg)(void)';
  var: #realCEEnterCogCodePopReceiverReg
  declareC: 'void (*realCEEnterCogCodePopReceiverReg)(void)';
  var: #ceCallCogCodePopReceiverReg
  declareC: 'void (*ceCallCogCodePopReceiverReg)(void)';
  var: #realCECallCogCodePopReceiverReg
  declareC: 'void (*realCECallCogCodePopReceiverReg)(void)';
  var: #ceCallCogCodePopReceiverAndClassRegs
  declareC: 'void (*ceCallCogCodePopReceiverAndClassRegs)(void)';
  var: #realCECallCogCodePopReceiverAndClassRegs
  declareC: 'void (*realCECallCogCodePopReceiverAndClassRegs)(void)';
  var: #ceFlushICache
  declareC: 'static void (*ceFlushICache)(usqIntptr_t from, usqIntptr_t to)';
  var: #ceCheckFeaturesFunction
  declareC: 'static usqIntptr_t (*ceCheckFeaturesFunction)(void)';
+ var: #ceCheckLZCNTFunction
+ declareC: 'static usqIntptr_t (*ceCheckLZCNTFunction)(void)';
  var: #ceTryLockVMOwner
  declareC: 'usqIntptr_t (*ceTryLockVMOwner)(void)';
  var: #ceUnlockVMOwner
  declareC: 'void (*ceUnlockVMOwner)(void)';
  var: #postCompileHook
  declareC: 'void (*postCompileHook)(CogMethod *)';
  var: #openPICList declareC: 'CogMethod *openPICList = 0';
  var: #maxMethodBefore type: #'CogBlockMethod *';
  var: 'enumeratingCogMethod' type: #'CogMethod *'.
  aCCodeGenerator
  declareVar: 'aMethodLabel' type: #'AbstractInstruction'; "Has to come lexicographically before backEnd & methodLabel"
  var: #backEnd declareC: 'AbstractInstruction * const backEnd = &aMethodLabel';
  var: #methodLabel declareC: 'AbstractInstruction * const methodLabel = &aMethodLabel'.
  self declareC: #(abstractOpcodes stackCheckLabel
  blockEntryLabel blockEntryNoContextSwitch
  stackOverflowCall sendMiss
  entry noCheckEntry selfSendEntry dynSuperEntry
  fullBlockNoContextSwitchEntry fullBlockEntry
  picMNUAbort picInterpretAbort  endCPICCase0 endCPICCase1 cPICEndOfCodeLabel)
  as: #'AbstractInstruction *'
  in: aCCodeGenerator.
  aCCodeGenerator
  declareVar: #blockStarts type: #'BlockStart *';
  declareVar: #fixups type: #'BytecodeFixup *'.
  aCCodeGenerator
  var: #ordinarySendTrampolines
  declareC: 'sqInt ordinarySendTrampolines[NumSendTrampolines]';
  var: #superSendTrampolines
  declareC: 'sqInt superSendTrampolines[NumSendTrampolines]'.
  BytecodeSetHasDirectedSuperSend ifTrue:
  [aCCodeGenerator
  var: #directedSuperSendTrampolines
  declareC: 'sqInt directedSuperSendTrampolines[NumSendTrampolines]';
  var: #directedSuperBindingSendTrampolines
  declareC: 'sqInt directedSuperBindingSendTrampolines[NumSendTrampolines]'].
  NewspeakVM ifTrue:
  [aCCodeGenerator
  var: #selfSendTrampolines
  declareC: 'sqInt selfSendTrampolines[NumSendTrampolines]';
  var: #dynamicSuperSendTrampolines
  declareC: 'sqInt dynamicSuperSendTrampolines[NumSendTrampolines]';
  var: #implicitReceiverSendTrampolines
  declareC: 'sqInt implicitReceiverSendTrampolines[NumSendTrampolines]';
  var: #outerSendTrampolines
  declareC: 'sqInt outerSendTrampolines[NumSendTrampolines]'].
  aCCodeGenerator
  var: #trampolineAddresses
  declareC: 'static char *trampolineAddresses[NumTrampolines*2]';
  var: #objectReferencesInRuntime
  declareC: 'static usqInt objectReferencesInRuntime[NumObjRefsInRuntime+1]';
  var: #labelCounter
  type: #int;
  var: #traceFlags
  declareC: 'int traceFlags = 8 /* prim trace log on by default */';
  var: #cStackAlignment
  declareC: 'const int cStackAlignment = STACK_ALIGN_BYTES'.
  aCCodeGenerator
  declareVar: #CFramePointer type: #'void *';
  declareVar: #CStackPointer type: #'void *';
  declareVar: #minValidCallAddress type: #'usqIntptr_t';
  declareVar: #debugPrimCallStackOffset type: #'usqIntptr_t'.
  aCCodeGenerator vmClass generatorTable ifNotNil:
  [:bytecodeGenTable|
  aCCodeGenerator
  var: #generatorTable
  declareC: 'static BytecodeDescriptor generatorTable[', bytecodeGenTable size printString, ']',
  (self tableInitializerFor: bytecodeGenTable
  in: aCCodeGenerator)].
  "In C the abstract opcode names clash with the Smalltalk generator syntactic sugar.
  Most of the syntactic sugar is inlined, but alas some remains.  Rename the syntactic
  sugar to avoid the clash."
  (self organization listAtCategoryNamed: #'abstract instructions') do:
  [:s|
  aCCodeGenerator addSelectorTranslation: s to: 'g', (aCCodeGenerator cFunctionNameFor: s)].
  aCCodeGenerator addSelectorTranslation: #halt: to: 'haltmsg'!

Item was added:
+ ----- Method: Cogit>>ClzR:R: (in category 'abstract instructions') -----
+ ClzR: reg1 R: reg2
+ <inline: true>
+ <returnTypeC: #'AbstractInstruction *'>
+ ^self gen: ClzRR operand: reg1 operand: reg2!

Item was added:
+ ----- Method: Cogit>>ceCheckLZCNT (in category 'testing') -----
+ ceCheckLZCNT
+ <cmacro: '() ceCheckLZCNTFunction()'>
+ ^self simulateLeafCallOf: ceCheckLZCNTFunction!

Item was added:
+ ----- Method: Cogit>>genHighBitIn:ofSmallIntegerOopWithSingleTagBit: (in category 'abstract instructions') -----
+ genHighBitIn: destReg ofSmallIntegerOopWithSingleTagBit: srcReg
+ "Generate code for storing in destReg the 1-based highBit of srcReg.
+ Assume that srcReg contains a SmallInteger Oop with a single tag bit set to 1.
+ Sender should preprocess Oop when cog representation uses numSmallIntegerTagBits > 1.
+ Return the jump instruction necessary for handling case of negative integer value.
+ Return null pointer if the abstract highBit operation is not implemented.
+
+ The implementation depends on availability of instructions on the target architecture,
+ so delegate to the backend."
+ <inline: true>
+ <returnTypeC: #'AbstractInstruction *'>
+ backEnd hasLZCNTInstructions
+ ifTrue: [^self genHighBitClzIn: destReg ofSmallIntegerOopWithSingleTagBit: srcReg]
+ ifFalse: [^self genHighBitAlternativeIn: destReg ofSmallIntegerOopWithSingleTagBit: srcReg]!
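
The preprocessing the method comment asks of the sender (reducing a multi-tag-bit oop to a single tag bit) can be illustrated with a minimal C sketch. It assumes a Spur64-style SmallInteger with three tag bits whose tag pattern is 001, and it assumes arithmetic right shift of signed values, as gcc and clang provide. The names tag_with_three_bits and to_single_tag_bit are hypothetical and do not exist in VMMaker.

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_TAG_BITS = 3 };  /* assumption: Spur64-style SmallInteger tagging */

/* Hypothetical helper: build a 3-tag-bit oop (tag pattern 001) from a value.
   Multiplication is used instead of << to stay defined for negative values. */
static int64_t tag_with_three_bits(int64_t value)
{
    return value * 8 + 1;
}

/* Hypothetical helper: turn a 3-tag-bit oop into the single-tag-bit form,
   i.e. (value << 1) | 1, by dropping the two extra tag bits and setting the
   low bit again. Relies on >> being an arithmetic shift so the sign survives. */
static int64_t to_single_tag_bit(int64_t oop)
{
    assert((oop & 7) == 1);                 /* must carry the SmallInteger tag */
    return (oop >> (NUM_TAG_BITS - 1)) | 1;
}
```

With this form, the sign of the value is still the sign of the word, and the lowest bit is always set, so the word is never zero.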

Item was changed:
  ----- Method: Cogit>>initializeCodeZoneFrom:upTo: (in category 'initialization') -----
  initializeCodeZoneFrom: startAddress upTo: endAddress
  <api>
  self initializeBackend.
  backEnd stopsFrom: startAddress to: endAddress - 1.
  self cCode: [self sqMakeMemoryExecutableFrom: startAddress To: endAddress]
  inSmalltalk:
  [startAddress = self class guardPageSize ifTrue:
  [backEnd stopsFrom: 0 to: endAddress - 1].
  self initializeProcessor].
  codeBase := methodZoneBase := startAddress.
  minValidCallAddress := (codeBase min: coInterpreter interpretAddress)
  min: coInterpreter primitiveFailAddress.
  methodZone manageFrom: methodZoneBase to: endAddress.
  self maybeGenerateCheckFeatures.
+ self maybeGenerateCheckLZCNT.
  self maybeGenerateICacheFlush.
  self generateVMOwnerLockFunctions.
  self genGetLeafCallStackPointer.
  self generateStackPointerCapture.
  self generateTrampolines.
  self computeEntryOffsets.
  self computeFullBlockEntryOffsets.
  self generateClosedPICPrototype.
  "repeat so that now the methodZone ignores the generated run-time"
  methodZone manageFrom: methodZoneBase to: endAddress.
  "N.B. this is assumed to be the last thing done in initialization; see Cogit>>initialized"
  self generateOpenPICPrototype!

Item was added:
+ ----- Method: Cogit>>maybeGenerateCheckLZCNT (in category 'initialization') -----
+ maybeGenerateCheckLZCNT
+ | startAddress |
+ <inline: true>
+ backEnd numCheckLZCNTOpcodes > 0 ifTrue:
+ [self allocateOpcodes: backEnd numCheckLZCNTOpcodes bytecodes: 0.
+ startAddress := methodZoneBase.
+ backEnd generateCheckLZCNT.
+ self outputInstructionsForGeneratedRuntimeAt: startAddress.
+ self recordGeneratedRunTime: 'ceCheckLZCNTFunction' address: startAddress.
+ ceCheckLZCNTFunction := self cCoerceSimple: startAddress to: #'usqIntptr_t (*)(void)']!
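
For readers unfamiliar with the CPUID check that the generated ceCheckLZCNTFunction is meant to perform, here is a rough C equivalent. It is only a sketch under assumptions: it uses the gcc/clang <cpuid.h> helper __get_cpuid rather than generated machine code, it tests bit 5 of ECX in extended leaf 0x80000001 (the LZCNT/ABM feature flag), and it simply reports "not available" on non-x86 targets. The function names cpu_has_lzcnt and lzcnt_supported are hypothetical.

```c
#include <assert.h>
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>
#endif

/* Hypothetical helper: ask the CPU whether LZCNT is available. */
static int cpu_has_lzcnt(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    /* __get_cpuid returns 0 when the requested leaf is not supported */
    if (!__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
        return 0;
    return (int)((ecx >> 5) & 1u);   /* ECX bit 5: LZCNT */
#else
    return 0;                        /* assumption: treat other CPUs as "no LZCNT" */
#endif
}

/* Hypothetical helper: query once and cache, as a VM would do at start-up. */
static int lzcnt_supported(void)
{
    static int cached = -1;
    if (cached < 0)
        cached = cpu_has_lzcnt();
    assert(cached == 0 || cached == 1);
    return cached;
}
```

A code generator could consult such a cached flag to choose between an LZCNT-based sequence and a BSR-based fallback.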

Item was added:
+ ----- Method: Cogit>>processorHasLZCNTSupport (in category 'initialization') -----
+ processorHasLZCNTSupport
+ <option: #DPFPReg0>
+ <inline: true>
+ ^backEnd hasLZCNTInstructions!

Item was changed:
  ----- Method: Interpreter class>>initializePrimitiveTable (in category 'initialization') -----
(excessive size, no diff calculated)

Item was added:
+ ----- Method: InterpreterPrimitives>>primitiveHighBit (in category 'arithmetic integer primitives') -----
+ primitiveHighBit
+ | integerReceiverOop leadingZeroCount highestBitZeroBased |
+ integerReceiverOop := self stackTop.
+ "Convert the receiver Oop to use a single tag bit"
+ self numSmallIntegerTagBits > 1
+ ifTrue: [integerReceiverOop := objectMemory integerValueOf: (integerReceiverOop >>> (self numSmallIntegerTagBits-1) bitOr: 1)].
+ self cppIf: #'__GNUC__' defined
+ ifTrue:
+ ["Note: in gcc, the result is undefined if the input is zero (for compatibility with the BSR fallback when no CLZ instruction is available),
+ but the input is never zero because we pass the oop with tag bits set, so we are safe"
+ objectMemory wordSize = 4
+ ifTrue: [leadingZeroCount := self __builtin_clz: integerReceiverOop]
+ ifFalse: [leadingZeroCount := self __builtin_clzll: integerReceiverOop].
+ leadingZeroCount = 0
+ ifTrue:
+ ["highBit is not defined for negative Integer"
+ self primitiveFail]
+ ifFalse:
+ ["Nice bit trick: on a 32-bit word the 1-based high-bit is (32 - clz) - 1 to account for the tag bit.
+ This is like two's complement - clz - 1 on 5 bits, or in other words a bit-invert operation clz ^16r1F"
+ self pop: 1 thenPushInteger: (leadingZeroCount bitXor: (BytesPerWord * 8 - 1))]]
+ ifFalse: [self cppIf: #'_MSC_VER' defined
+ ifTrue:
+ ["In MSVC, the _lzcnt and _lzcnt64 builtins do not fall back to BSR when LZCNT is not supported by the CPU.
+ Instead of messing with __cpuid() we always use the BSR intrinsic"
+
+ "Trick: we test the oop sign rather than the integerValue. Assume oop are signed (so far, they are, sqInt are signed)"
+ integerReceiverOop < 0 ifTrue: [self primitiveFail] ifFalse: [
+ "Setting this variable is useless, but VMMaker will generate it at a worse place"
+ highestBitZeroBased := 0.
+ "We do not even test the return value, because integerReceiverOop is never zero"
+ objectMemory wordSize = 4
+ ifTrue: [self _BitScanReverse: highestBitZeroBased address _: integerReceiverOop]
+ ifFalse: [self _BitScanReverse64: highestBitZeroBased address _: integerReceiverOop].
+ "thanks to the tag bit, the +1 operation for getting 1-based rank is not necessary"
+ self pop: 1 thenPushInteger: highestBitZeroBased]]
+ ifFalse:
+ ["neither gcc/clang nor MSVC: you have to implement this if your compiler provides useful builtins"
+ self primitiveFail]].!
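
The bit trick in the gcc branch can be checked with a small C sketch. It assumes a 32-bit word, an oop of the form (value << 1) | 1, and the gcc/clang builtin __builtin_clz. The helper names are hypothetical, and returning -1 stands in for primitiveFail.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: tag a small non-negative value with a single low tag bit. */
static uint32_t tag_small_integer(uint32_t value)
{
    return (value << 1) | 1u;
}

/* 1-based high bit of the value held in a single-tag-bit oop,
   or -1 when the value is negative (where the primitive would fail). */
static int high_bit_of_tagged(uint32_t oop)
{
    assert(oop & 1u);                 /* tag bit set, so oop is never zero */
    int clz = __builtin_clz(oop);     /* defined because oop != 0 */
    if (clz == 0)
        return -1;                    /* sign bit set: negative SmallInteger */
    /* (32 - clz) - 1 == 31 - clz == clz ^ 31 for clz in 0..31 */
    return clz ^ 31;
}
```

For example, the value 5 is tagged as 11 (binary 1011), which has 28 leading zeros, and 28 XOR 31 gives 3, the 1-based high bit of 5. The tag bit shifts every value bit up by one position, which is why no explicit +1 is needed.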

Item was changed:
  ----- Method: SimpleStackBasedCogit class>>initializePrimitiveTableForNewsqueak (in category 'class initialization') -----
  initializePrimitiveTableForNewsqueak
  "Initialize the table of primitive generators.  This does not include normal primitives implemented in the coInterpreter.
  N.B. primitives that don't have an explicit arg count (the integer following the generator) may be variadic."
  "SimpleStackBasedCogit initializePrimitiveTableForNewsqueak"
  MaxCompiledPrimitiveIndex := self objectRepresentationClass wordSize = 8
+ ifTrue: [575]
+ ifFalse: [575].
- ifTrue: [555]
- ifFalse: [222].
  primitiveTable := CArrayAccessor on: (Array new: MaxCompiledPrimitiveIndex + 1).
  self table: primitiveTable from:
  #( "Integer Primitives (0-19)"
  (1 genPrimitiveAdd 1)
  (2 genPrimitiveSubtract 1)
  (3 genPrimitiveLessThan 1)
  (4 genPrimitiveGreaterThan 1)
  (5 genPrimitiveLessOrEqual 1)
  (6 genPrimitiveGreaterOrEqual 1)
  (7 genPrimitiveEqual 1)
  (8 genPrimitiveNotEqual 1)
  (9 genPrimitiveMultiply 1)
  (10 genPrimitiveDivide 1)
  (11 genPrimitiveMod 1)
  (12 genPrimitiveDiv 1)
  (13 genPrimitiveQuo 1)
  (14 genPrimitiveBitAnd 1)
  (15 genPrimitiveBitOr 1)
  (16 genPrimitiveBitXor 1)
  (17 genPrimitiveBitShift 1)
  "(18 primitiveMakePoint)"
  "(19 primitiveFail)" "Guard primitive for simulation -- *must* fail"
 
  "LargeInteger Primitives (20-39)"
  "(20 primitiveFail)"
  "(21 primitiveAddLargeIntegers)"
  "(22 primitiveSubtractLargeIntegers)"
  "(23 primitiveLessThanLargeIntegers)"
  "(24 primitiveGreaterThanLargeIntegers)"
  "(25 primitiveLessOrEqualLargeIntegers)"
  "(26 primitiveGreaterOrEqualLargeIntegers)"
  "(27 primitiveEqualLargeIntegers)"
  "(28 primitiveNotEqualLargeIntegers)"
  "(29 primitiveMultiplyLargeIntegers)"
  "(30 primitiveDivideLargeIntegers)"
  "(31 primitiveModLargeIntegers)"
  "(32 primitiveDivLargeIntegers)"
  "(33 primitiveQuoLargeIntegers)"
  "(34 primitiveBitAndLargeIntegers)"
  "(35 primitiveBitOrLargeIntegers)"
  "(36 primitiveBitXorLargeIntegers)"
  "(37 primitiveBitShiftLargeIntegers)"
 
  "Float Primitives (38-59)"
  "(38 genPrimitiveFloatAt)"
  "(39 genPrimitiveFloatAtPut)"
  (40 genPrimitiveAsFloat 0)
  (41 genPrimitiveFloatAdd 1)
  (42 genPrimitiveFloatSubtract 1)
  (43 genPrimitiveFloatLessThan 1)
  (44 genPrimitiveFloatGreaterThan 1)
  (45 genPrimitiveFloatLessOrEqual 1)
  (46 genPrimitiveFloatGreaterOrEqual 1)
  (47 genPrimitiveFloatEqual 1)
  (48 genPrimitiveFloatNotEqual 1)
  (49 genPrimitiveFloatMultiply 1)
  (50 genPrimitiveFloatDivide 1)
  "(51 genPrimitiveTruncated)"
  "(52 genPrimitiveFractionalPart)"
  "(53 genPrimitiveExponent)"
  "(54 genPrimitiveTimesTwoPower)"
  (55 genPrimitiveFloatSquareRoot 0)
  "(56 genPrimitiveSine)"
  "(57 genPrimitiveArctan)"
  "(58 genPrimitiveLogN)"
  "(59 genPrimitiveExp)"
 
  "Subscript and Stream Primitives (60-67)"
  (60 genPrimitiveAt 1)
  (61 genPrimitiveAtPut 2)
  (62 genPrimitiveSize 0)
  (63 genPrimitiveStringAt 1)
  (64 genPrimitiveStringAtPut 2)
  "The stream primitives no longer pay their way; normal Smalltalk code is faster."
  (65 genFastPrimFail)"was primitiveNext"
  (66 genFastPrimFail) "was primitiveNextPut"
  (67 genFastPrimFail) "was primitiveAtEnd"
 
  "StorageManagement Primitives (68-79)"
  (68 genPrimitiveObjectAt 1) "Good for debugger/InstructionStream performance"
  "(69 primitiveObjectAtPut)"
  (70 genPrimitiveNew) "For VMMirror support 1 argument instantiateFixedClass: as well as basicNew"
  (71 genPrimitiveNewWithArg) "For VMMirror support 2 argument instantiateVariableClass:withSize: as well as basicNew:"
  "(72 primitiveArrayBecomeOneWay)" "Blue Book: primitiveBecome"
  "(73 primitiveInstVarAt)"
  "(74 primitiveInstVarAtPut)"
  (75 genPrimitiveIdentityHash 0)
  "(76 primitiveStoreStackp)" "Blue Book: primitiveAsObject"
  "(77 primitiveSomeInstance)"
  "(78 primitiveNextInstance)"
  (79 genPrimitiveNewMethod 2)
 
  "Control Primitives (80-89)"
  "(80 primitiveFail)" "Blue Book: primitiveBlockCopy"
  "(81 primitiveFail)" "Blue Book: primitiveValue"
  "(82 primitiveFail)" "Blue Book: primitiveValueWithArgs"
  (83 genPrimitivePerform)
  "(84 primitivePerformWithArgs)"
  "(85 primitiveSignal)"
  "(86 primitiveWait)"
  "(87 primitiveResume)"
  "(88 primitiveSuspend)"
  "(89 primitiveFlushCache)"
 
  "System Primitives (110-119)"
  (110 genPrimitiveIdentical 1)
  (111 genPrimitiveClass) "For objectClass: and VMMirror support 1 argument classOf: as well as class"
  "(112 primitiveBytesLeft)"
  "(113 primitiveQuit)"
  "(114 primitiveExitToDebugger)"
  "(115 primitiveChangeClass)" "Blue Book: primitiveOopsLeft"
  "(116 primitiveFlushCacheByMethod)"
  "(117 primitiveExternalCall)"
  "(118 primitiveDoPrimitiveWithArgs)"
  "(119 primitiveFlushCacheSelective)"
 
  (148 genPrimitiveShallowCopy 0) "a.k.a. clone"
 
  (165 genPrimitiveIntegerAt 1) "Signed version of genPrimitiveAt"
  (166 genPrimitiveIntegerAtPut 2) "Signed version of genPrimitiveAtPut"
 
  (169 genPrimitiveNotIdentical 1)
 
  (170 genPrimitiveAsCharacter) "SmallInteger>>asCharacter, Character class>>value:"
  (171 genPrimitiveImmediateAsInteger 0) "Character>>value SmallFloat64>>asInteger"
 
  "(173 primitiveSlotAt 1)"
  "(174 primitiveSlotAtPut 2)"
  (175 genPrimitiveIdentityHash 0) "Behavior>>identityHash"
 
  "Old closure primitives"
  "(186 primitiveFail)" "was primitiveClosureValue"
  "(187 primitiveFail)" "was primitiveClosureValueWithArgs"
 
  "Perform method directly"
  "(188 primitiveExecuteMethodArgsArray)"
  "(189 primitiveExecuteMethod)"
 
  "Unwind primitives"
  "(195 primitiveFindNextUnwindContext)"
  "(196 primitiveTerminateTo)"
  "(197 primitiveFindHandlerContext)"
  (198 genFastPrimFail "primitiveMarkUnwindMethod")
  (199 genFastPrimFail "primitiveMarkHandlerMethod")
 
  "new closure primitives"
  "(200 primitiveClosureCopyWithCopiedValues)"
  (201 genPrimitiveClosureValue 0) "value"
  (202 genPrimitiveClosureValue 1) "value:"
  (203 genPrimitiveClosureValue 2) "value:value:"
  (204 genPrimitiveClosureValue 3) "value:value:value:"
  (205 genPrimitiveClosureValue 4) "value:value:value:value:"
  "(206 genPrimitiveClosureValueWithArgs)" "valueWithArguments:"
 
  "(210 primitiveContextAt)"
  "(211 primitiveContextAtPut)"
  "(212 primitiveContextSize)"
 
  "(218 primitiveDoNamedPrimitiveWithArgs)"
  "(219 primitiveFail)" "reserved for Cog primitives"
 
  "(220 primitiveFail)" "reserved for Cog primitives"
 
  (221 genPrimitiveClosureValue 0) "valueNoContextSwitch"
  (222 genPrimitiveClosureValue 1) "valueNoContextSwitch:"
 
  "SmallFloat primitives (540-559)"
  (541 genPrimitiveSmallFloatAdd 1)
  (542 genPrimitiveSmallFloatSubtract 1)
  (543 genPrimitiveSmallFloatLessThan 1)
  (544 genPrimitiveSmallFloatGreaterThan 1)
  (545 genPrimitiveSmallFloatLessOrEqual 1)
  (546 genPrimitiveSmallFloatGreaterOrEqual 1)
  (547 genPrimitiveSmallFloatEqual 1)
  (548 genPrimitiveSmallFloatNotEqual 1)
  (549 genPrimitiveSmallFloatMultiply 1)
  (550 genPrimitiveSmallFloatDivide 1)
  "(551 genPrimitiveSmallFloatTruncated 0)"
  "(552 genPrimitiveSmallFloatFractionalPart 0)"
  "(553 genPrimitiveSmallFloatExponent 0)"
  "(554 genPrimitiveSmallFloatTimesTwoPower 1)"
  (555 genPrimitiveSmallFloatSquareRoot 0)
  "(556 genPrimitiveSmallFloatSine 0)"
  "(557 genPrimitiveSmallFloatArctan 0)"
  "(558 genPrimitiveSmallFloatLogN 0)"
  "(559 genPrimitiveSmallFloatExp 0)"
+ (575 genPrimitiveHighBit 0)
  )!

Item was changed:
  ----- Method: SimpleStackBasedCogit class>>initializePrimitiveTableForSqueak (in category 'class initialization') -----
  initializePrimitiveTableForSqueak
  "Initialize the table of primitive generators.  This does not include normal primitives implemented in the coInterpreter.
  N.B. primitives that don't have an explicit arg count (the integer following the generator) may be variadic."
  "SimpleStackBasedCogit initializePrimitiveTableForSqueak"
  MaxCompiledPrimitiveIndex := self objectRepresentationClass wordSize = 8
+ ifTrue: [575]
+ ifFalse: [575].
- ifTrue: [555]
- ifFalse: [222].
  primitiveTable := CArrayAccessor on: (Array new: MaxCompiledPrimitiveIndex + 1).
  self table: primitiveTable from:
  #( "Integer Primitives (0-19)"
  (1 genPrimitiveAdd 1)
  (2 genPrimitiveSubtract 1)
  (3 genPrimitiveLessThan 1)
  (4 genPrimitiveGreaterThan 1)
  (5 genPrimitiveLessOrEqual 1)
  (6 genPrimitiveGreaterOrEqual 1)
  (7 genPrimitiveEqual 1)
  (8 genPrimitiveNotEqual 1)
  (9 genPrimitiveMultiply 1)
  (10 genPrimitiveDivide 1)
  (11 genPrimitiveMod 1)
  (12 genPrimitiveDiv 1)
  (13 genPrimitiveQuo 1)
  (14 genPrimitiveBitAnd 1)
  (15 genPrimitiveBitOr 1)
  (16 genPrimitiveBitXor 1)
  (17 genPrimitiveBitShift 1)
  "(18 primitiveMakePoint)"
  "(19 primitiveFail)" "Guard primitive for simulation -- *must* fail"
 
  "LargeInteger Primitives (20-39)"
  "(20 primitiveFail)"
  "(21 primitiveAddLargeIntegers)"
  "(22 primitiveSubtractLargeIntegers)"
  "(23 primitiveLessThanLargeIntegers)"
  "(24 primitiveGreaterThanLargeIntegers)"
  "(25 primitiveLessOrEqualLargeIntegers)"
  "(26 primitiveGreaterOrEqualLargeIntegers)"
  "(27 primitiveEqualLargeIntegers)"
  "(28 primitiveNotEqualLargeIntegers)"
  "(29 primitiveMultiplyLargeIntegers)"
  "(30 primitiveDivideLargeIntegers)"
  "(31 primitiveModLargeIntegers)"
  "(32 primitiveDivLargeIntegers)"
  "(33 primitiveQuoLargeIntegers)"
  "(34 primitiveBitAndLargeIntegers)"
  "(35 primitiveBitOrLargeIntegers)"
  "(36 primitiveBitXorLargeIntegers)"
  "(37 primitiveBitShiftLargeIntegers)"
 
  "Float Primitives (38-59)"
  "(38 genPrimitiveFloatAt)"
  "(39 genPrimitiveFloatAtPut)"
  (40 genPrimitiveAsFloat 0)
  (41 genPrimitiveFloatAdd 1)
  (42 genPrimitiveFloatSubtract 1)
  (43 genPrimitiveFloatLessThan 1)
  (44 genPrimitiveFloatGreaterThan 1)
  (45 genPrimitiveFloatLessOrEqual 1)
  (46 genPrimitiveFloatGreaterOrEqual 1)
  (47 genPrimitiveFloatEqual 1)
  (48 genPrimitiveFloatNotEqual 1)
  (49 genPrimitiveFloatMultiply 1)
  (50 genPrimitiveFloatDivide 1)
  "(51 genPrimitiveTruncated)"
  "(52 genPrimitiveFractionalPart)"
  "(53 genPrimitiveExponent)"
  "(54 genPrimitiveTimesTwoPower)"
  (55 genPrimitiveFloatSquareRoot 0)
  "(56 genPrimitiveSine)"
  "(57 genPrimitiveArctan)"
  "(58 genPrimitiveLogN)"
  "(59 genPrimitiveExp)"
 
  "Subscript and Stream Primitives (60-67)"
  (60 genPrimitiveAt 1)
  (61 genPrimitiveAtPut 2)
  (62 genPrimitiveSize 0)
  (63 genPrimitiveStringAt 1)
  (64 genPrimitiveStringAtPut 2)
  "The stream primitives no longer pay their way; normal Smalltalk code is faster."
  (65 genFastPrimFail)"was primitiveNext"
  (66 genFastPrimFail) "was primitiveNextPut"
  (67 genFastPrimFail) "was primitiveAtEnd"
 
  "StorageManagement Primitives (68-79)"
  (68 genPrimitiveObjectAt 1) "Good for debugger/InstructionStream performance"
  "(69 primitiveObjectAtPut)"
  (70 genPrimitiveNew 0)
  (71 genPrimitiveNewWithArg 1)
  "(72 primitiveArrayBecomeOneWay)" "Blue Book: primitiveBecome"
  "(73 primitiveInstVarAt)"
  "(74 primitiveInstVarAtPut)"
  (75 genPrimitiveIdentityHash 0)
  "(76 primitiveStoreStackp)" "Blue Book: primitiveAsObject"
  "(77 primitiveSomeInstance)"
  "(78 primitiveNextInstance)"
  (79 genPrimitiveNewMethod 2)
 
  "Control Primitives (80-89)"
  "(80 primitiveFail)" "Blue Book: primitiveBlockCopy"
  "(81 primitiveFail)" "Blue Book: primitiveValue"
  "(82 primitiveFail)" "Blue Book: primitiveValueWithArgs"
  (83 genPrimitivePerform)
  "(84 primitivePerformWithArgs)"
  "(85 primitiveSignal)"
  "(86 primitiveWait)"
  "(87 primitiveResume)"
  "(88 primitiveSuspend)"
  "(89 primitiveFlushCache)"
 
  "(90 primitiveMousePoint)"
  "(91 primitiveTestDisplayDepth)" "Blue Book: primitiveCursorLocPut"
  "(92 primitiveSetDisplayMode)" "Blue Book: primitiveCursorLink"
  "(93 primitiveInputSemaphore)"
  "(94 primitiveGetNextEvent)" "Blue Book: primitiveSampleInterval"
  "(95 primitiveInputWord)"
  "(96 primitiveFail)" "primitiveCopyBits"
  "(97 primitiveSnapshot)"
  "(98 primitiveStoreImageSegment)"
  "(99 primitiveLoadImageSegment)"
  "(100 primitivePerformInSuperclass)" "Blue Book: primitiveSignalAtTick"
  "(101 primitiveBeCursor)"
  "(102 primitiveBeDisplay)"
  "(103 primitiveScanCharacters)"
  "(104 primitiveFail)" "primitiveDrawLoop"
  (105 genPrimitiveStringReplace)
  "(106 primitiveScreenSize)"
  "(107 primitiveMouseButtons)"
  "(108 primitiveKbdNext)"
  "(109 primitiveKbdPeek)"
 
 
  "System Primitives (110-119)"
  (110 genPrimitiveIdentical 1)
  (111 genPrimitiveClass) "Support both class and Context>>objectClass:"
  "(112 primitiveBytesLeft)"
  "(113 primitiveQuit)"
  "(114 primitiveExitToDebugger)"
  "(115 primitiveChangeClass)" "Blue Book: primitiveOopsLeft"
  "(116 primitiveFlushCacheByMethod)"
  "(117 primitiveExternalCall)"
  "(118 primitiveDoPrimitiveWithArgs)"
  "(119 primitiveFlushCacheSelective)"
 
  (148 genPrimitiveShallowCopy 0) "a.k.a. clone"
 
  (158 genPrimitiveStringCompareWith 1)
  (159 genPrimitiveHashMultiply 0)
 
  (165 genPrimitiveIntegerAt 1) "Signed version of genPrimitiveAt"
  (166 genPrimitiveIntegerAtPut 2) "Signed version of genPrimitiveAtPut"
 
  (169 genPrimitiveNotIdentical 1)
 
  (170 genPrimitiveAsCharacter) "SmallInteger>>asCharacter, Character class>>value:"
  (171 genPrimitiveImmediateAsInteger 0) "Character>>value SmallFloat64>>asInteger"
 
  "(173 primitiveSlotAt 1)"
  "(174 primitiveSlotAtPut 2)"
  (175 genPrimitiveIdentityHash 0) "Behavior>>identityHash"
 
  "Old closure primitives"
  "(186 primitiveFail)" "was primitiveClosureValue"
  "(187 primitiveFail)" "was primitiveClosureValueWithArgs"
 
  "Perform method directly"
  "(188 primitiveExecuteMethodArgsArray)"
  "(189 primitiveExecuteMethod)"
 
  "Unwind primitives"
  "(195 primitiveFindNextUnwindContext)"
  "(196 primitiveTerminateTo)"
  "(197 primitiveFindHandlerContext)"
  (198 genFastPrimFail "primitiveMarkUnwindMethod")
  (199 genFastPrimFail "primitiveMarkHandlerMethod")
 
  "new closure primitives"
  "(200 primitiveClosureCopyWithCopiedValues)"
  (201 genPrimitiveClosureValue 0) "value"
  (202 genPrimitiveClosureValue 1) "value:"
  (203 genPrimitiveClosureValue 2) "value:value:"
  (204 genPrimitiveClosureValue 3) "value:value:value:"
  (205 genPrimitiveClosureValue 4) "value:value:value:value:"
  "(206 genPrimitiveClosureValueWithArgs)" "valueWithArguments:"
 
  (207 genPrimitiveFullClosureValue) "value[:value:value:value:] et al"
  "(208 genPrimitiveFullClosureValueWithArgs)" "valueWithArguments:"
  (209 genPrimitiveFullClosureValue) "valueNoContextSwitch[:value:] et al"
 
  "(210 primitiveContextAt)"
  "(211 primitiveContextAtPut)"
  "(212 primitiveContextSize)"
 
  "(218 primitiveDoNamedPrimitiveWithArgs)"
  "(219 primitiveFail)" "reserved for Cog primitives"
 
  "(220 primitiveFail)" "reserved for Cog primitives"
 
  (221 genPrimitiveClosureValue 0) "valueNoContextSwitch"
  (222 genPrimitiveClosureValue 1) "valueNoContextSwitch:"
 
  "SmallFloat primitives (540-559)"
  (541 genPrimitiveSmallFloatAdd 1)
  (542 genPrimitiveSmallFloatSubtract 1)
  (543 genPrimitiveSmallFloatLessThan 1)
  (544 genPrimitiveSmallFloatGreaterThan 1)
  (545 genPrimitiveSmallFloatLessOrEqual 1)
  (546 genPrimitiveSmallFloatGreaterOrEqual 1)
  (547 genPrimitiveSmallFloatEqual 1)
  (548 genPrimitiveSmallFloatNotEqual 1)
  (549 genPrimitiveSmallFloatMultiply 1)
  (550 genPrimitiveSmallFloatDivide 1)
  "(551 genPrimitiveSmallFloatTruncated 0)"
  "(552 genPrimitiveSmallFloatFractionalPart 0)"
  "(553 genPrimitiveSmallFloatExponent 0)"
  "(554 genPrimitiveSmallFloatTimesTwoPower 1)"
  (555 genPrimitiveSmallFloatSquareRoot 0)
  "(556 genPrimitiveSmallFloatSine 0)"
  "(557 genPrimitiveSmallFloatArctan 0)"
  "(558 genPrimitiveSmallFloatLogN 0)"
  "(559 genPrimitiveSmallFloatExp 0)"
+ (575 genPrimitiveHighBit 0)
  )!

Item was changed:
  ----- Method: StackInterpreter class>>initializePrimitiveTable (in category 'initialization') -----
(excessive size, no diff calculated)