VM Maker: VMMaker.oscog-eem.2602.mcz

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

VM Maker: VMMaker.oscog-eem.2602.mcz

commits-2
 
Eliot Miranda uploaded a new version of VMMaker to project VM Maker:
http://source.squeak.org/VMMaker/VMMaker.oscog-eem.2602.mcz

==================== Summary ====================

Name: VMMaker.oscog-eem.2602
Author: eem
Time: 3 December 2019, 6:33:22.314018 pm
UUID: d0c7f3ff-f03b-4072-a00a-9425ed2b301a
Ancestors: VMMaker.oscog-eem.2601

More work for CogARMv8Compiler.  Add concretizeAddCqR and concretizeAndCqRDest:.  Now seven instructions into generating ceDereferenceSelectorIndex.

ARMv8 has an amazingly exotic scheme for encoding bitfields as immediates in logical operations.  Copy LLVM's encoder.  So far this is unverified because I cannot make sense of
        immN:NOT(imms)
in
                len = HighestSetBit(immN:NOT(imms));
in DecodeBitMasks on page J1-7389 of the Arm ARM.
If you can make sense of this either consider fully implementing CogARMv8Compiler>>decodeBitMasksN:imms:immr: or explaining to me what is meant by "immN:NOT(imms)" (via email to OpenSMalltalk-VM ?).

AdvThanksance.

=============== Diff against VMMaker.oscog-eem.2601 ===============

Item was changed:
  CogAbstractInstruction subclass: #CogARMv8Compiler
  instanceVariableNames: ''
+ classVariableNames: 'AL AddOpcode AndOpcode CC CS CmpOpcode D0 D1 D10 D11 D12 D13 D14 D15 D16 D17 D18 D19 D2 D20 D21 D22 D23 D24 D25 D26 D27 D28 D29 D3 D30 D31 D4 D5 D6 D7 D8 D9 EQ FP GE GT HI LE LR LS LT MI NE OrOpcode PL R0 R1 R10 R11 R12 R13 R14 R15 R16 R17 R18 R19 R2 R20 R21 R22 R23 R24 R25 R26 R27 R28 R29 R3 R30 R31 R4 R5 R6 R7 R8 R9 SP SXTB SXTH SXTW SXTX SubOpcode UXTB UXTH UXTW UXTX VC VS XZR XorOpcode'
- classVariableNames: 'AL AddOpcode AndOpcode CC CS CmpOpcode D0 D1 D10 D11 D12 D13 D14 D15 D16 D17 D18 D19 D2 D20 D21 D22 D23 D24 D25 D26 D27 D28 D29 D3 D30 D31 D4 D5 D6 D7 D8 D9 EQ FP GE GT HI LE LR LS LT MI NE OrOpcode PL R0 R1 R10 R11 R12 R13 R14 R15 R16 R17 R18 R19 R2 R20 R21 R22 R23 R24 R25 R26 R27 R28 R29 R3 R30 R31 R4 R5 R6 R7 R8 R9 SP SXTB SXTH SXTW SXTX SubOpcode UXTB UXTH UXTW UXTX VC VS XorOpcode'
  poolDictionaries: 'ARMv8A64Opcodes'
  category: 'VMMaker-JIT'!
 
  !CogARMv8Compiler commentStamp: 'eem 12/3/2019 08:45' prior: 0!
  I generate ARMv8 machine code instructions from CogAbstractInstructions with CogRTLOpcodes.
  Here in "Arm ARM" refers to
  Arm® Architecture Reference Manual
  Armv8, for Armv8-A architecture profile
  https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
 
  Some things to know about ARMv8 instructions:
  Whether 31 in a register field implies the zero register or the SP register(s) depends on the specific instruction.
 
  C3.2.1 Load/Store register
  If a Load instruction specifies writeback and the register being loaded is also the base register,
  then behavior is CONSTRAINED UNPREDICTABLE and one of the following behaviors must occur:
  - The instruction is treated as UNDEFINED.
  - The instruction is treated as a NOP.
  - The instruction performs the load using the specified addressing mode and the base register
    becomes UNKNOWN.  In addition, if an exception occurs during the execution of such an
    instruction, the base address might be corrupted so that the instruction cannot be repeated.
  If a Store instruction performs a writeback and the register that is stored is also the base register,
  then behavior is CONSTRAINED UNPREDICTABLE and one of the following behaviors must occur:
  - The instruction is treated as UNDEFINED.
  - The instruction is treated as a NOP.
  - The instruction performs the store to the designated register using the specified addressing
    mode, but the value stored is UNKNOWN.!

Item was changed:
  ----- Method: CogARMv8Compiler class>>initialize (in category 'class initialization') -----
  initialize
  "Initialize various ARM64 instruction-related constants."
  "self initialize"
 
  "main registers; a minor complication in reading the doc.
  ARM refer to the 64bit registers as X0...30 and use R0...30 to refer to the 32bit lower halves.They also use a whole suite of names for the floating point/SIMD registers. See ARMARM DDI0487 B1.2.1 etc for the gory details.
  Note that R30 (yes, yes, X30) is used as the link register and as such is not really a general purpose register.
  Also note that R31 (named XZR in ARM doc) is a pseudo-register that always reads as 0 and writes to /dev/null.  
  And note that unlike the ARM32, there is no general purpose register for either the PC or SP; a big difference. See ARMARM DDI0487 C1.2.5. wrt to both issues.
 
  We will stick with R0...30 to refer to the 64 bit general regs and D0...31 (note the extra reg here!!) for the FP/SIMD regs"
  #( (D0 D1 D2 D3 D4 D5 D6 D7 D8 D9 D10 D11 D12 D13 D14 D15 D16 D17 D18 D19 D20 D21 D22 D23 D24 D25 D26 D27 D28 D29 D30 D31)
  (R0 R1 R2 R3 R4 R5 R6 R7 R8 R9 R10 R11 R12 R13 R14 R15 R16 R17 R18 R19 R20 R21 R22 R23 R24 R25 R26 R27 R28 R29 R30 R31)) do:
  [:classVarNames|
  classVarNames doWithIndex:
  [:k :v|
  CogARMv8Compiler classPool at: k put: v - 1]].
+
+ SP := XZR := R31.
+ LR := R30.
+ FP := R29.
+
-
  "Condition Codes. Note that cc=16rF is mapped back to AL in AARCH64. Generally it shouldn't be used"
  "Perhaps have these in the ARMv8A64Opcodes pool?"
  EQ := 0.
  NE := 1.
  CS := 2.
  CC := 3.
  MI := 4.
  PL := 5.
  VS := 6.
  VC := 7.
  HI := 8.
  LS := 9.
  GE := 10.
  LT := 11.
  GT := 12.
  LE := 13.
  AL := 14.
 
  "Extension Methods "
  UXTB := 2r000.
  UXTH := 2r001.
  UXTW := 2r010.
  UXTX := 2r011. "a.k.a. LSL"
  SXTB := 2r100.
  SXTH := 2r101.
  SXTW := 2r110.
  SXTX := 2r111!

Item was changed:
  ----- Method: CogARMv8Compiler class>>initializeAbstractRegisters (in category 'class initialization') -----
  initializeAbstractRegisters
  "Assign the abstract registers with the identities/indices of the relevant concrete registers."
 
  super initializeAbstractRegisters.
 
  "See Table 3-1 Register Usage in AArch64 SMC32, HVC32, SMC64, and HVC64 calls in
  SMC CALLING CONVENTION
  System Software on ARM® Platforms Document number: ARM DEN 0028B
  http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf
  "
 
  CallerSavedRegisterMask := (self registerMaskFor: 0 and: 1 and: 2 and: 3 and: 4 and: 5 and: 6 and: 7 and: 8 and: 9)
  + (self registerMaskFor: 10 and: 11 and: 12 and: 13 and: 14 and: 15 and: 16 and: 17).
 
  TempReg := R0.
  ClassReg := R4.
  ReceiverResultReg := R5.
  SendNumArgsReg := R6.
  Arg0Reg := R7.
  Arg1Reg := R8.
  "X18/R18 is the Platform register; leave it be"
  Extra0Reg := R19. "R18 through R28 are callee saved"
  Extra1Reg := R20.
  Extra2Reg := R21.
  Extra3Reg := R22.
  Extra4Reg := R23.
  Extra5Reg := R24.
  Extra6Reg := R25.
  Extra7Reg := R26.
  VarBaseReg := R27.
  RISCTempReg := R28.
+ FPReg := FP "a.k.a. R29".
+ LinkReg := LR "a.k.a. R30".
+ SPReg := SP "a.k.a. R31".
- FPReg := FP := R29.
- LinkReg := LR := R30.
- SPReg := SP := R31.
 
  NumRegisters := 32.
 
  DPFPReg0 := D0.
  DPFPReg1 := D1.
  DPFPReg2 := D2.
  DPFPReg3 := D3.
  DPFPReg4 := D4.
  DPFPReg5 := D5.
  DPFPReg6 := D6.
  DPFPReg7 := D7.
  DPFPReg8 := D8.
  DPFPReg9 := D9.
  DPFPReg10 := D10.
  DPFPReg11 := D11.
  DPFPReg12 := D12.
  DPFPReg13 := D13.
  DPFPReg14 := D14.
  DPFPReg15 := D15.
 
  NumFloatRegisters := 32!

Item was added:
+ ----- Method: CogARMv8Compiler>>concretizeAddCqR (in category 'generate machine code - concretize') -----
+ concretizeAddCqR
+ "Will get inlined into concretizeAt: switch."
+ <inline: true>
+ | constant reg offset |
+ constant := operands at: 0.
+ reg := operands at: 1.
+
+ self deny: reg = SP. "For now; how to add an immediate to the SP?"
+ self isPossiblyShiftableImm12: constant
+ ifTrue:
+ [:shift|
+ "C6.2.8 ADDS (immediate) C6-769"
+ machineCode
+ at: 0
+ put: 2r101100010 << 23
+ + (shift ifTrue: [constant >> 2 + (1 << 22)] ifFalse: [constant << 10])
+ + (reg << 5)
+ + reg.
+ ^machineCodeSize := 4]
+ ifFalse: [].
+ offset := self moveCw: constant intoR: RISCTempReg.
+ "C6.2.7 ADDS (extended register) C6-766"
+ machineCode
+ at: offset // 4
+ put: 2r10101011001
+ + (RISCTempReg << 16)
+ + (SXTX << 13)
+ + (reg << 5)
+ + reg.
+ ^machineCodeSize := offset + 4!

Item was added:
+ ----- Method: CogARMv8Compiler>>concretizeAndCqRDest: (in category 'generate machine code - concretize') -----
+ concretizeAndCqRDest: destReg
+ "C6.2.14 ANDS (immediate) C6-779
+ C6.2.329 TST (immediate) C6-1346"
+ | srcReg constant offset |
+ constant := operands at: 0.
+ srcReg := operands at: 1.
+ self isImmNImmSImmREncodableBitmask: constant
+ ifTrue:
+ [:n :imms :immr|
+ machineCode
+ at: 0
+ put: 2r111100100 << 23
+ + (n << 22)
+ + (immr << 16)
+ + (imms << 10)
+ + (srcReg << 5)
+ + destReg.
+ ^machineCodeSize := 4]
+ ifFalse: [].
+ offset := self moveCw: constant intoR: RISCTempReg.
+ "C6.2.15 ANDS (shifted register) C6-781"
+ machineCode
+ at: offset // 4
+ put: 2r11101010 << 24
+ + (RISCTempReg << 16)
+ + (srcReg << 5)
+ + destReg.
+ ^machineCodeSize := offset + 4!

Item was added:
+ ----- Method: CogARMv8Compiler>>countTrailingOnes: (in category 'generate machine code - support') -----
+ countTrailingOnes: anInteger
+ | bits count |
+ self assert: anInteger ~= 0.
+ "Lazy; we could use binary chop"
+ bits := anInteger.
+ count := 0.
+ [(bits bitAnd: 1) = 1] whileTrue:
+ [bits := bits bitShift: -1.
+ count := count + 1].
+ ^count!

Item was added:
+ ----- Method: CogARMv8Compiler>>countTrailingZeros: (in category 'generate machine code - support') -----
+ countTrailingZeros: anInteger
+ self assert: anInteger ~= 0.
+ ^self cCode: "Lazy; we could use binary chop"
+ [| bits count |
+ bits := anInteger.
+ count := 0.
+ [(bits bitAnd: 1) = 0] whileTrue:
+ [bits := bits bitShift: -1.
+ count := count + 1].
+ count]
+ inSmalltalk: [anInteger lowBit - 1]!

Item was added:
+ ----- Method: CogARMv8Compiler>>decodeBitMasksN:imms:immr: (in category 'generate machine code - support') -----
+ decodeBitMasksN: immN imms: imms immr: immr
+ "See aarch64/instrs/integer/bitmasks/DecodeBitMasks J1-7389"
+ | tmask wmask levels |
+ "eem 12/3/2019 18:17
+ In
+
+ len = HighestSetBit(immN:NOT(imms));
+
+ I have no clue what is meant by
+
+ immN:NOT(imms)
+
+ Reading Arm Pseudocode Definition K13.4 Operators leaves me none the wiser."
+ self shouldBeImplemented!

Item was changed:
  ----- Method: CogARMv8Compiler>>dispatchConcretize (in category 'generate machine code') -----
  dispatchConcretize
  "Attempt to generate concrete machine code for the instruction at address.
  This is the inner dispatch of concretizeAt: actualAddress which exists only
  to get around the branch size limits in the SqueakV3 (blue book derived)
  bytecode set."
  <returnTypeC: #void>
 
  opcode caseOf: {
  "Noops & Pseudo Ops"
  [Label] -> [^self concretizeLabel].
  [Literal] -> [^self concretizeLiteral].
  [AlignmentNops] -> [^self concretizeAlignmentNops].
  [Fill32] -> [^self concretizeFill32].
  [Nop] -> [^self concretizeNop].
  "Control"
  [Call] -> [^self concretizeCall]. "call code within code space"
  [CallFull] -> [^self concretizeCallFull]. "call code anywhere in address space"
  [JumpR] -> [^self concretizeJumpR].
  [JumpFull] -> [^self concretizeJumpFull]."jump within address space"
  [JumpLong] -> [^self concretizeConditionalJump: AL]."jumps witihn code space"
  [JumpLongZero] -> [^self concretizeConditionalJump: EQ].
  [JumpLongNonZero] -> [^self concretizeConditionalJump: NE].
  [Jump] -> [^self concretizeConditionalJump: AL].
  [JumpZero] -> [^self concretizeConditionalJump: EQ].
  [JumpNonZero] -> [^self concretizeConditionalJump: NE].
  [JumpNegative] -> [^self concretizeConditionalJump: MI].
  [JumpNonNegative] -> [^self concretizeConditionalJump: PL].
  [JumpOverflow] -> [^self concretizeConditionalJump: VS].
  [JumpNoOverflow] -> [^self concretizeConditionalJump: VC].
  [JumpCarry] -> [^self concretizeConditionalJump: CS].
  [JumpNoCarry] -> [^self concretizeConditionalJump: CC].
  [JumpLess] -> [^self concretizeConditionalJump: LT].
  [JumpGreaterOrEqual] -> [^self concretizeConditionalJump: GE].
  [JumpGreater] -> [^self concretizeConditionalJump: GT].
  [JumpLessOrEqual] -> [^self concretizeConditionalJump: LE].
  [JumpBelow] -> [^self concretizeConditionalJump: CC]. "unsigned lower"
  [JumpAboveOrEqual] -> [^self concretizeConditionalJump: CS]. "unsigned greater or equal"
  [JumpAbove] -> [^self concretizeConditionalJump: HI].
  [JumpBelowOrEqual] -> [^self concretizeConditionalJump: LS].
  [JumpFPEqual] -> [^self concretizeFPConditionalJump: EQ].
  [JumpFPNotEqual] -> [^self concretizeFPConditionalJump: NE].
  [JumpFPLess] -> [^self concretizeFPConditionalJump: LT].
  [JumpFPGreaterOrEqual] -> [^self concretizeFPConditionalJump: GE].
  [JumpFPGreater] -> [^self concretizeFPConditionalJump: GT].
  [JumpFPLessOrEqual] -> [^self concretizeFPConditionalJump: LE].
  [JumpFPOrdered] -> [^self concretizeFPConditionalJump: VC].
  [JumpFPUnordered] -> [^self concretizeFPConditionalJump: VS].
  [RetN] -> [^self concretizeRetN].
  [Stop] -> [^self concretizeStop].
  "Arithmetic"
+ [AddCqR] -> [^self concretizeAddCqR].
- [AddCqR] -> [^self concretizeNegateableDataOperationCqR: AddOpcode].
  [AndCqR] -> [^self concretizeInvertibleDataOperationCqR: AndOpcode].
+ [AndCqRR] -> [^self concretizeAndCqRDest: (operands at: 1)].
- [AndCqRR] -> [^self concretizeAndCqRR].
  [CmpCqR] -> [^self concretizeCmpCqR].
  [OrCqR] -> [^self concretizeDataOperationCqR: OrOpcode].
  [SubCqR] -> [^self concretizeSubCqR].
+ [TstCqR] -> [^self concretizeAndCqRDest: XZR].
- [TstCqR] -> [^self concretizeTstCqR].
  [XorCqR] -> [^self concretizeInvertibleDataOperationCqR: XorOpcode].
  [AddCwR] -> [^self concretizeDataOperationCwR: AddOpcode].
  [AndCwR] -> [^self concretizeDataOperationCwR: AndOpcode].
  [CmpCwR] -> [^self concretizeDataOperationCwR: CmpOpcode].
  [OrCwR] -> [^self concretizeDataOperationCwR: OrOpcode].
  [SubCwR] -> [^self concretizeDataOperationCwR: SubOpcode].
  [XorCwR] -> [^self concretizeDataOperationCwR: XorOpcode].
  [AddRR] -> [^self concretizeDataOperationRR: AddOpcode].
  [AndRR] -> [^self concretizeDataOperationRR: AndOpcode].
  [CmpRR] -> [^self concretizeDataOperationRR: CmpOpcode].
  [OrRR] -> [^self concretizeDataOperationRR: OrOpcode].
  [SubRR] -> [^self concretizeDataOperationRR: SubOpcode].
  [XorRR] -> [^self concretizeDataOperationRR: XorOpcode].
  [AddRdRd] -> [^self concretizeAddRdRd].
  [CmpRdRd] -> [^self concretizeCmpRdRd].
  [DivRdRd] -> [^self concretizeDivRdRd].
  [MulRdRd] -> [^self concretizeMulRdRd].
  [SubRdRd] -> [^self concretizeSubRdRd].
  [SqrtRd] -> [^self concretizeSqrtRd].
  [NegateR] -> [^self concretizeNegateR].
  [LoadEffectiveAddressMwrR] -> [^self concretizeLoadEffectiveAddressMwrR].
  [ArithmeticShiftRightCqR] -> [^self concretizeArithmeticShiftRightCqR].
  [LogicalShiftRightCqR] -> [^self concretizeLogicalShiftRightCqR].
  [LogicalShiftLeftCqR] -> [^self concretizeLogicalShiftLeftCqR].
  [ArithmeticShiftRightRR] -> [^self concretizeArithmeticShiftRightRR].
  [LogicalShiftLeftRR] -> [^self concretizeLogicalShiftLeftRR].
  [LogicalShiftRightRR] -> [^self concretizeLogicalShiftRightRR].
  [ClzRR] -> [^self concretizeClzRR].
  "Data Movement"
  [MoveCqR] -> [^self concretizeMoveCqR].
  [MoveCwR] -> [^self concretizeMoveCwR].
  [MoveRR] -> [^self concretizeMoveRR].
  [MoveAwR] -> [^self concretizeMoveAwR].
  [MoveRAw] -> [^self concretizeMoveRAw].
  [MoveAbR] -> [^self concretizeMoveAbR].
    [MoveRAb] -> [^self concretizeMoveRAb].
  [MoveMbrR] -> [^self concretizeMoveMbrR].
  [MoveRMbr] -> [^self concretizeMoveRMbr].
  [MoveRM16r] -> [^self concretizeMoveRM16r].
  [MoveM16rR] -> [^self concretizeMoveM16rR].
  [MoveM64rRd] -> [^self concretizeMoveM64rRd].
  [MoveMwrR] -> [^self concretizeMoveMwrR].
  [MoveXbrRR] -> [^self concretizeMoveXbrRR].
  [MoveRXbrR] -> [^self concretizeMoveRXbrR].
  [MoveXwrRR] -> [^self concretizeMoveXwrRR].
  [MoveRXwrR] -> [^self concretizeMoveRXwrR].
  [MoveRMwr] -> [^self concretizeMoveRMwr].
  [MoveRdM64r] -> [^self concretizeMoveRdM64r].
  [PopR] -> [^self concretizePopR].
  [PushR] -> [^self concretizePushR].
  [PushCq] -> [^self concretizePushCq].
  [PushCw] -> [^self concretizePushCw].
  [PrefetchAw] -> [^self concretizePrefetchAw].
  "Conversion"
  [ConvertRRd] -> [^self concretizeConvertRRd].
  [ConvertRdR] -> [^self concretizeConvertRdR].
  [ConvertRRs] -> [^self concretizeConvertRRs].
  [ConvertRsR] -> [^self concretizeConvertRsR].
  [ConvertRsRd] -> [^self concretizeConvertRsRd].
  [ConvertRdRs] -> [^self concretizeConvertRdRs].
 
  [SignExtend8RR] -> [^self concretizeSignExtend8RR].
  [SignExtend16RR] -> [^self concretizeSignExtend16RR].
  [SignExtend32RR] -> [^self concretizeSignExtend32RR].
 
  [ZeroExtend8RR] -> [^self concretizeZeroExtend8RR].
  [ZeroExtend16RR] -> [^self concretizeZeroExtend16RR].
  [ZeroExtend32RR] -> [^self concretizeZeroExtend32RR].}!

Item was added:
+ ----- Method: CogARMv8Compiler>>isImmNImmSImmREncodableBitmask:ifTrue:ifFalse: (in category 'generate machine code - support') -----
+ isImmNImmSImmREncodableBitmask: constant ifTrue: trinaryBlock "[:n :imms :immr|...]" ifFalse: nullaryBlock
+ "See DecodeBitMasks J1-7389.
+ See https://dinfuehr.github.io/blog/encoding-of-immediate-values-on-aarch64/
+ This method is adapted from The LLVM Compiler Infrastructure, AArch64AddressingModes.h processLogicalImmediate"
+
+ | imm size mask countLeadingOnes countTrailingOnes immr n nImms rotateCount |
+ (constant between: -1 and: 0) ifTrue:
+ [^nullaryBlock value].
+ imm := constant.
+  
+ "First, determine the element size."
+ size := 32.
+ [mask := 1 << size - 1.
+ (imm bitAnd: mask) ~= (imm >> size)
+ ifTrue: [size := size * 2. false]
+ ifFalse: [size > 2]]
+ whileTrue: [size := size / 2].
+
+ "Second, determine the rotation to make the element be: 0^m 1^n."
+ mask := 1 << 64 - 1 >> (64 - size).
+ imm := imm bitAnd: mask.
+
+ (self isShiftedMask: imm)
+ ifTrue:
+ [rotateCount := self countTrailingZeros: imm.
+ countTrailingOnes := self countTrailingOnes: imm >> rotateCount]
+ ifFalse:
+ [imm := imm bitOr: mask bitInvert64.
+ (self isShiftedMask: imm) ifFalse:
+ [^nullaryBlock value].
+ countLeadingOnes := self countLeadingOnes: imm.
+ rotateCount := 64 - countLeadingOnes.
+ countTrailingOnes := countLeadingOnes + (self countTrailingOnes: imm) - (64 - size)].
+
+ "Encode in Immr the number of RORs it would take to get *from* 0^m 1^n
+ to our target value, where I is the number of RORs to go the opposite direction."
+  
+ self assert: size > rotateCount. "rotateCount should be smaller than element size"
+ immr := size - rotateCount bitAnd: size - 1.
+
+ "If size has a 1 in the n'th bit, create a value that has zeroes in bits [0, n] and ones above that."
+ nImms := (size - 1) bitInvert64 << 1.
+
+ "Or the CTO value into the low bits, which must be below the Nth bit mentioned above."
+ nImms := nImms bitOr:  countTrailingOnes - 1.
+
+ "Extract the seventh bit and toggle it to create the N field."
+ n := ((nImms >> 6) bitAnd: 1) bitXor: 1.
+
+ nImms := nImms bitAnd: 16r3F.
+
+ false ifTrue: [self assert: (self decodeBitMasksN: n imms: nImms immr: immr) = constant].
+
+ ^trinaryBlock
+ value: n
+ value: nImms
+ value: immr
+ !

Item was added:
+ ----- Method: CogARMv8Compiler>>isShiftedMask: (in category 'generate machine code - support') -----
+ isShiftedMask: anInteger
+ | bits |
+ ^anInteger ~= 0
+  and: [bits := anInteger - 1 bitOr: anInteger.
+   (bits bitAnd: bits + 1) = 0]!

Reply | Threaded
Open this post in threaded view
|

Re: VM Maker: VMMaker.oscog-eem.2602.mcz

Henrik Sperre Johansen
 
commits-2 wrote

> ARMv8 has an amazingly exotic scheme for encoding bitfields as immediates
> in logical operations.  Copy LLVM's encoder.  So far this is unverified
> because I cannot make sense of
> immN:NOT(imms)
> in
> len = HighestSetBit(immN:NOT(imms));
> in DecodeBitMasks on page J1-7389 of the Arm ARM.
> If you can make sense of this either consider fully implementing
> CogARMv8Compiler>>decodeBitMasksN:imms:immr: or explaining to me what is
> meant by "immN:NOT(imms)" (via email to OpenSMalltalk-VM ?).
>
> AdvThanksance.

Is line 91 of
https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Processors/AARCH64/src/main/java/DecodeBitMasks.java
of any help?

Iow,
: -> concatenate bits
NOT -> ... not

Cheers,
Henry




--
Sent from: http://forum.world.st/Squeak-VM-f104410.html
Reply | Threaded
Open this post in threaded view
|

Re: VM Maker: VMMaker.oscog-eem.2602.mcz

Henrik Sperre Johansen
In reply to this post by commits-2
 


On Wed, Dec 4, 2019 at 3:33 AM <[hidden email]> wrote:
ARMv8 has an amazingly exotic scheme for encoding bitfields as immediates in logical operations.  Copy LLVM's encoder.  So far this is unverified because I cannot make sense of
        immN:NOT(imms)
in
                len = HighestSetBit(immN:NOT(imms));
in DecodeBitMasks on page J1-7389 of the Arm ARM.
If you can make sense of this either consider fully implementing CogARMv8Compiler>>decodeBitMasksN:imms:immr: or explaining to me what is meant by "immN:NOT(imms)" (via email to OpenSMalltalk-VM ?).

AdvThanksance.

Is line 91 of https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Processors/AARCH64/src/main/java/DecodeBitMasks.java of any use?

Basically,
: -> concatenate bit fields
NOT -> ... not ;)

Cheers,
Henry
Reply | Threaded
Open this post in threaded view
|

Re: VM Maker: VMMaker.oscog-eem.2602.mcz

Eliot Miranda-2
In reply to this post by Henrik Sperre Johansen
 
Thanks, Henry!

On Fri, Dec 6, 2019 at 1:53 AM Henrik Sperre Johansen <[hidden email]> wrote:
 
commits-2 wrote
> ARMv8 has an amazingly exotic scheme for encoding bitfields as immediates
> in logical operations.  Copy LLVM's encoder.  So far this is unverified
> because I cannot make sense of
>       immN:NOT(imms)
> in
>               len = HighestSetBit(immN:NOT(imms));
> in DecodeBitMasks on page J1-7389 of the Arm ARM.
> If you can make sense of this either consider fully implementing
> CogARMv8Compiler>>decodeBitMasksN:imms:immr: or explaining to me what is
> meant by "immN:NOT(imms)" (via email to OpenSMalltalk-VM ?).
>
> AdvThanksance.

Is line 91 of
https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Processors/AARCH64/src/main/java/DecodeBitMasks.java
of any help?

Iow,
: -> concatenate bits
NOT -> ... not

In the end no.  I found other more concise code to study.  But thanks fir the heads up.  Clever, though, isn't it?  I think I would never have come up with a similar scheme myself.

_,,,^..^,,,_
best, Eliot
Reply | Threaded
Open this post in threaded view
|

Re: VM Maker: VMMaker.oscog-eem.2602.mcz

Henrik Sperre Johansen
 
Mmmhm, reading “Immediate operands” on http://www.peter-cockerell.net/aalp/html/ch-3.html, it is kind of elegant.
I like that a sufficiently smart assembler (like, Cog? ;) ), can translate a pseudo-assembly logical op with *any* immediate operand into 1 - 4(8, for 64 bit) machine code instructions by combining different positions.

“All ops are 32 bits! … but some times you need 1, some times you need 8"

Cheers,
Henry 

On 10 Jan 2020, at 09:28, Eliot Miranda <[hidden email]> wrote:

Thanks, Henry!

On Fri, Dec 6, 2019 at 1:53 AM Henrik Sperre Johansen <[hidden email]> wrote:
 
commits-2 wrote
> ARMv8 has an amazingly exotic scheme for encoding bitfields as immediates
> in logical operations.  Copy LLVM's encoder.  So far this is unverified
> because I cannot make sense of
>       immN:NOT(imms)
> in
>               len = HighestSetBit(immN:NOT(imms));
> in DecodeBitMasks on page J1-7389 of the Arm ARM.
> If you can make sense of this either consider fully implementing
> CogARMv8Compiler>>decodeBitMasksN:imms:immr: or explaining to me what is
> meant by "immN:NOT(imms)" (via email to OpenSMalltalk-VM ?).
>
> AdvThanksance.

Is line 91 of
https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Processors/AARCH64/src/main/java/DecodeBitMasks.java
of any help?

Iow,
: -> concatenate bits
NOT -> ... not

In the end no.  I found other more concise code to study.  But thanks fir the heads up.  Clever, though, isn't it?  I think I would never have come up with a similar scheme myself.

_,,,^..^,,,_
best, Eliot

Reply | Threaded
Open this post in threaded view
|

Re: VM Maker: VMMaker.oscog-eem.2602.mcz

Henrik Sperre Johansen
 
Orrr, I guess that was the immediate format for arithmetic ops.

What's described here:
https://dinfuehr.github.io/blog/encoding-of-immediate-values-on-aarch64/ 
is indeed quite a bit more clever.

Covering all 32/64 bit 0+1* patterns, with rotation, in and understandable
13 bit format seems the most clever to me.
 I have a harder time seeing a use for the smaller element sizes, but I
guess I'm just narrow-minded :)
 (... or I'm missing something, again)

Cheers,
Henry


Henrik Sperre Johansen wrote

> Mmmhm, reading “Immediate operands” on
> http://www.peter-cockerell.net/aalp/html/ch-3.html
> &lt;http://www.peter-cockerell.net/aalp/html/ch-3.html&gt;, it is kind of
> elegant.
> I like that a sufficiently smart assembler (like, Cog? ;) ), can translate
> a pseudo-assembly logical op with *any* immediate operand into 1 - 4(8,
> for 64 bit) machine code instructions by combining different positions.
>
> “All ops are 32 bits! … but some times you need 1, some times you need 8"
>
> Cheers,
> Henry
>
>> On 10 Jan 2020, at 09:28, Eliot Miranda &lt;

> eliot.miranda@

> &gt; wrote:
>>
>> Thanks, Henry!
>>
>> On Fri, Dec 6, 2019 at 1:53 AM Henrik Sperre Johansen &lt;

> henrik.s.johansen@

>  &lt;mailto:

> henrik.s.johansen@

> &gt;> wrote:
>>  
>> commits-2 wrote
>> > ARMv8 has an amazingly exotic scheme for encoding bitfields as
>> immediates
>> > in logical operations.  Copy LLVM's encoder.  So far this is unverified
>> > because I cannot make sense of
>> >       immN:NOT(imms)
>> > in
>> >               len = HighestSetBit(immN:NOT(imms));
>> > in DecodeBitMasks on page J1-7389 of the Arm ARM.
>> > If you can make sense of this either consider fully implementing
>> > CogARMv8Compiler>>decodeBitMasksN:imms:immr: or explaining to me what
>> is
>> > meant by "immN:NOT(imms)" (via email to OpenSMalltalk-VM ?).
>> >
>> > AdvThanksance.
>>
>> Is line 91 of
>> https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Processors/AARCH64/src/main/java/DecodeBitMasks.java
>> &lt;https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Processors/AARCH64/src/main/java/DecodeBitMasks.java&gt;
>> of any help?
>>
>> Iow,
>> : -> concatenate bits
>> NOT -> ... not
>>
>> In the end no.  I found other more concise code to study.  But thanks fir
>> the heads up.  Clever, though, isn't it?  I think I would never have come
>> up with a similar scheme myself.
>>
>> _,,,^..^,,,_
>> best, Eliot





--
Sent from: http://forum.world.st/Squeak-VM-f104410.html
Reply | Threaded
Open this post in threaded view
|

Re: VM Maker: VMMaker.oscog-eem.2602.mcz

Henrik Sperre Johansen
 
Henrik Sperre Johansen wrote
>  I have a harder time seeing a use for the smaller element sizes, but I
> guess I'm just narrow-minded :)
>  (... or I'm missing something, again)

Like, does ARM have packed arithmetic operations ala SSE?
That would seem a natural fit, at least...

Cheers,
Henry



--
Sent from: http://forum.world.st/Squeak-VM-f104410.html
Reply | Threaded
Open this post in threaded view
|

Re: VM Maker: VMMaker.oscog-eem.2602.mcz

Eliot Miranda-2
 


On Fri, Jan 10, 2020 at 2:10 PM Henrik Sperre Johansen <[hidden email]> wrote:
 
Henrik Sperre Johansen wrote
>  I have a harder time seeing a use for the smaller element sizes, but I
> guess I'm just narrow-minded :)
>  (... or I'm missing something, again)

Like, does ARM have packed arithmetic operations ala SSE?
That would seem a natural fit, at least...

It does.  There's a 7,900 page manual.  Section 7 covers the SIMD processing model.  It is a very powerful processor. 32 64 bit registers.  32 128 bit FP registers, accessible (I *think*) as 32-bit FP, 4 32-bit FP, 64-bit FP or 2 64-bit FP, and a 16-bit FP format I haven't looked into [blush].


I also have the PDF armv8_arch_ref_man_DDI0487E_a.pdf which takes over 10 seconds to load on my rather powerful last year MacBook Pro (gulp).

Instruction density is about 20% worse than x86-64 but it provides much more power (three address instructions, twice as much register state, more SIMD instructions, etc).  It doesn't;t have the raws compute power either, being a low power design, but it is simpler and more regular than x86-64.  I admire it very much. Unlike ARM32 which, with its PC reg and naked delay slots, never really appealed, =he ARM64 set seems carefully designed to deliver excellent processing/power ratios.  Apart from the asymmetry in the SP (a field of 31 means the SP reg in some instructions and the zero register in others) and the trickyness of its immediate bit pattern specification it is a pleasure to compile to.

Cheers,
Henry

_,,,^..^,,,_
best, Eliot