Nicolas Cellier uploaded a new version of VMMaker to project VM Maker:
http://source.squeak.org/VMMaker/VMMaker.oscog-nice.2687.mcz

==================== Summary ====================

Name: VMMaker.oscog-nice.2687
Author: nice
Time: 30 January 2020, 12:48:42.987807 am
UUID: c5a0bd9f-2dd5-4f17-a014-28827e5008c2
Ancestors: VMMaker.oscog-eem.2686

FFI X64 SysV: check for unaligned structs and pass them in MEMORY (alloca'd memory passed through a pointer).

Note: this assumes that an unaligned struct can be recognized by its declared size differing from the size of a properly aligned struct with the same fields.

Note: this is WIP. It should work for packed structs passed by value, but does not yet work for returned structs, because #ffiCall:ArgArrayOrNil:NumArgs: uses #returnStructInRegisters:, which only checks the size, not the alignment.

Use a new (branchless) formulation for aligning the byteSize to the next multiple of fieldAlignment.

Encode the registerType of an invalid unaligned candidate as 2r110, and pass the struct address returned by the foreign function in the $RAX register in place of the callout limit when the struct is returned in MEMORY.

NEXT IDEA: currently, we recompute the registerType at each call with recursive functions, which does not sound like the right thing for an efficient FFI. I plan to use some unused bits of the compiledSpec as a cache to store this registerType information (using bitXor: 2r1111, so that an uninitialized cache is still 0; that's why I have encoded INVALID as 2r110, leaving 2r111 for UNINITIALIZED). Such a cache would be ABI-defined. I propose the highest 4 bytes, 16rF0000000.

=============== Diff against VMMaker.oscog-eem.2686 ===============

Item was changed:
----- Method: ThreadedFFIPlugin>>alignmentOfStructSpec:OfLength:StartingAt: (in category 'marshalling-struct') -----
alignmentOfStructSpec: specs OfLength: specSize StartingAt: indexPtr
"Answer with the alignment requirement for a structure/union.
Note that indexPtr is a pointer so as to be changed on return. On input, the index points to the structure header (the one with FFIFlagStructure + structSize). On output, the index points the the structure trailer (the FFIFlagStructure)." | spec byteAlignment thisAlignment | <var: #specs type: #'unsigned int*'> <var: #indexPtr type: #'unsigned int*'> <inline: false> spec := specs at: (indexPtr at: 0). self assert: (spec bitAnd: FFIFlagPointer + FFIFlagAtomic + FFIFlagStructure) = FFIFlagStructure. byteAlignment := 1. [indexPtr at: 0 put: (indexPtr at: 0) + 1. (indexPtr at: 0) < specSize] whileTrue: [spec := specs at: (indexPtr at: 0). spec = FFIFlagStructure ifTrue: [^byteAlignment]. thisAlignment := (spec anyMask: FFIFlagPointer) ifTrue: [BytesPerWord] ifFalse: [(spec anyMask: FFIFlagStructure) ifTrue: [self alignmentOfStructSpec: specs OfLength: specSize StartingAt: indexPtr] ifFalse: [spec bitAnd: FFIStructSizeMask]]. byteAlignment := byteAlignment max: thisAlignment]. + self assert: false. "should not reach here - because only ever called for sub-struct" + ^byteAlignment! - self assert: false. "should not reach here" - ^-1! Item was added: + ----- Method: ThreadedFFIPlugin>>checkAlignmentOfStructSpec:OfLength:StartingAt: (in category 'marshalling-struct') ----- + checkAlignmentOfStructSpec: specs OfLength: specSize StartingAt: startIndex + "Check the alignment of a structure and return true if correctly aligned. + If computed size = declared size, then the struct is assumed correctly aligned." + | index spec computedSize fieldAlignment fieldSize declaredSize maxAlignment | + <var: #specs type: #'unsigned int*'> + <var: #indexPtr type: #'unsigned int*'> + <inline: false> + index := startIndex. + spec := specs at: index. + self assert: (spec bitAnd: FFIFlagPointer + FFIFlagAtomic + FFIFlagStructure) = FFIFlagStructure. + (self isUnionSpec: specs OfLength: specSize StartingAt: index) ifTrue: [^true]. + declaredSize := spec bitAnd: FFIStructSizeMask. 
+ computedSize := 0. + maxAlignment := 1. + [index := index + 1. + index < specSize] + whileTrue: + [spec := specs at: index. + spec = FFIFlagStructure + ifTrue: [^(computedSize - 1 bitOr: maxAlignment - 1) + 1 = declaredSize]. + (spec anyMask: FFIFlagPointer) + ifTrue: + [fieldSize := BytesPerWord. + fieldAlignment := fieldSize] + ifFalse: + [fieldSize := spec bitAnd: FFIStructSizeMask. + (spec anyMask: FFIFlagStructure) + ifTrue: + [(self checkAlignmentOfStructSpec: specs OfLength: specSize StartingAt: index) + ifFalse: [^false]. + fieldAlignment := self alignmentOfStructSpec: specs OfLength: specSize StartingAt: (self addressOf: index)] + ifFalse: [fieldAlignment := fieldSize]]. + "round to fieldAlignment" + maxAlignment := maxAlignment max: fieldAlignment. + computedSize := (computedSize - 1 bitOr: fieldAlignment - 1) + 1. + computedSize := computedSize + fieldSize]. + ^(computedSize - 1 bitOr: maxAlignment - 1) + 1 = declaredSize! Item was changed: ----- Method: ThreadedFFIPlugin>>isUnionSpec:OfLength:StartingAt: (in category 'marshalling-struct') ----- isUnionSpec: specs OfLength: specSize StartingAt: startIndex "We can't easily distinguish union from structures with available flags. But we have a trick: a union should have one field size equal to its own size." | index spec unionSize thisSize | <var: #specs type: #'unsigned int*'> + <inline: false> index := startIndex. spec := specs at: index. self assert: (spec bitAnd: FFIFlagPointer + FFIFlagAtomic + FFIFlagStructure) = FFIFlagStructure. unionSize := spec bitAnd: FFIStructSizeMask. [index := index + 1. index < specSize] whileTrue: [spec := specs at: index. spec = FFIFlagStructure ifTrue: [^false]. thisSize := spec bitAnd: FFIStructSizeMask. thisSize = unionSize ifTrue: [^true]. 
((spec bitAnd: FFIFlagPointer + FFIFlagStructure) = FFIFlagStructure) ifTrue: ["Asking for alignment is a trick for skipping this sub structure/union" self alignmentOfStructSpec: specs OfLength: specSize StartingAt: (self addressOf: index)]]. - self assert: false. "should not reach here" ^false! Item was changed: ----- Method: ThreadedX64SysVFFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') ----- ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState <var: #procAddr type: #'void *'> <var: #calloutState type: #'CalloutState *'> <var: #loadFloatRegs declareC: 'extern void loadFloatRegs(double, double, double, double, double, double, double, double)'> "Go out, call this guy and create the return value. This *must* be inlined because of the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:" | myThreadIndex atomicType floatRet intRet sddRet sdiRet sidRet siiRet returnStructByValue registerType sRetPtr | <var: #floatRet type: #double> <var: #intRet type: #sqInt> <var: #siiRet type: #SixteenByteReturnII> <var: #sidRet type: #SixteenByteReturnID> <var: #sdiRet type: #SixteenByteReturnDI> <var: #sddRet type: #SixteenByteReturnDD> <var: #sRetPtr type: #'void *'> <inline: true> returnStructByValue := (calloutState ffiRetHeader bitAnd: FFIFlagStructure + FFIFlagPointer + FFIFlagAtomic) = FFIFlagStructure. returnStructByValue ifTrue: [(self returnStructInRegisters: calloutState structReturnSize) ifTrue: [registerType := self registerTypeForStructSpecs: (interpreterProxy firstIndexableField: calloutState ffiRetSpec) OfLength: (interpreterProxy slotSizeOf: calloutState ffiRetSpec)] + ifFalse: [registerType := 2r110 "cannot pass by register"]]. - ifFalse: [registerType := 2r101 "encodes a single sqInt"]]. myThreadIndex := interpreterProxy disownVM: (self disownFlagsFor: calloutState). 
calloutState floatRegisterIndex > 0 ifTrue: [self load: (calloutState floatRegisters at: 0) Flo: (calloutState floatRegisters at: 1) a: (calloutState floatRegisters at: 2) t: (calloutState floatRegisters at: 3) R: (calloutState floatRegisters at: 4) e: (calloutState floatRegisters at: 5) g: (calloutState floatRegisters at: 6) s: (calloutState floatRegisters at: 7)]. (self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue: [self setsp: calloutState argVector]. atomicType := self atomicTypeOf: calloutState ffiRetHeader. (atomicType >> 1) = (FFITypeSingleFloat >> 1) ifTrue: [atomicType = FFITypeSingleFloat ifTrue: [floatRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') with: (calloutState integerRegisters at: 0) with: (calloutState integerRegisters at: 1) with: (calloutState integerRegisters at: 2) with: (calloutState integerRegisters at: 3) with: (calloutState integerRegisters at: 4) with: (calloutState integerRegisters at: 5)] ifFalse: "atomicType = FFITypeDoubleFloat" [floatRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') with: (calloutState integerRegisters at: 0) with: (calloutState integerRegisters at: 1) with: (calloutState integerRegisters at: 2) with: (calloutState integerRegisters at: 3) with: (calloutState integerRegisters at: 4) with: (calloutState integerRegisters at: 5)]. interpreterProxy ownVM: myThreadIndex. ^interpreterProxy floatObjectOf: floatRet]. 
returnStructByValue ifFalse: [intRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'sqInt (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') with: (calloutState integerRegisters at: 0) with: (calloutState integerRegisters at: 1) with: (calloutState integerRegisters at: 2) with: (calloutState integerRegisters at: 3) with: (calloutState integerRegisters at: 4) with: (calloutState integerRegisters at: 5). interpreterProxy ownVM: myThreadIndex. (calloutState ffiRetHeader anyMask: FFIFlagPointer) ifTrue: [^self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState]. ^self ffiCreateIntegralResultOop: intRet ofAtomicType: atomicType in: calloutState]. registerType caseOf: {[2r00] -> [sddRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'SixteenByteReturnDD (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') with: (calloutState integerRegisters at: 0) with: (calloutState integerRegisters at: 1) with: (calloutState integerRegisters at: 2) with: (calloutState integerRegisters at: 3) with: (calloutState integerRegisters at: 4) with: (calloutState integerRegisters at: 5). sRetPtr := (self addressOf: sddRet) asVoidPointer]. [2r01] -> [sidRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'SixteenByteReturnID (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') with: (calloutState integerRegisters at: 0) with: (calloutState integerRegisters at: 1) with: (calloutState integerRegisters at: 2) with: (calloutState integerRegisters at: 3) with: (calloutState integerRegisters at: 4) with: (calloutState integerRegisters at: 5). sRetPtr := (self addressOf: sidRet) asVoidPointer]. 
[2r10] -> [sdiRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'SixteenByteReturnDI (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') with: (calloutState integerRegisters at: 0) with: (calloutState integerRegisters at: 1) with: (calloutState integerRegisters at: 2) with: (calloutState integerRegisters at: 3) with: (calloutState integerRegisters at: 4) with: (calloutState integerRegisters at: 5). sRetPtr := (self addressOf: sdiRet) asVoidPointer]. [2r11] -> [siiRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'SixteenByteReturnII (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') with: (calloutState integerRegisters at: 0) with: (calloutState integerRegisters at: 1) with: (calloutState integerRegisters at: 2) with: (calloutState integerRegisters at: 3) with: (calloutState integerRegisters at: 4) with: (calloutState integerRegisters at: 5). sRetPtr := (self addressOf: siiRet) asVoidPointer]. [2r100] -> [floatRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') with: (calloutState integerRegisters at: 0) with: (calloutState integerRegisters at: 1) with: (calloutState integerRegisters at: 2) with: (calloutState integerRegisters at: 3) with: (calloutState integerRegisters at: 4) with: (calloutState integerRegisters at: 5). sRetPtr := (self addressOf: floatRet) asVoidPointer]. [2r101] -> [intRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'sqInt (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') with: (calloutState integerRegisters at: 0) with: (calloutState integerRegisters at: 1) with: (calloutState integerRegisters at: 2) with: (calloutState integerRegisters at: 3) with: (calloutState integerRegisters at: 4) with: (calloutState integerRegisters at: 5). + sRetPtr := (self addressOf: intRet) asVoidPointer]. 
+ [2r110] -> + ["return a pointer to alloca'd memory" + intRet := self + dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'sqInt (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') + with: (calloutState integerRegisters at: 0) + with: (calloutState integerRegisters at: 1) + with: (calloutState integerRegisters at: 2) + with: (calloutState integerRegisters at: 3) + with: (calloutState integerRegisters at: 4) + with: (calloutState integerRegisters at: 5). + sRetPtr := intRet asVoidPointer "address of struct is returned in RAX, which also is calloutState limit"]} - sRetPtr := (self addressOf: intRet) asVoidPointer]} otherwise: [interpreterProxy ownVM: myThreadIndex. self ffiFail: FFIErrorWrongType. ^nil]. interpreterProxy ownVM: myThreadIndex. ^self ffiReturnStruct: sRetPtr ofType: (self ffiReturnType: specOnStack) in: calloutState! Item was changed: ----- Method: ThreadedX64SysVFFIPlugin>>ffiPushStructure:ofSize:typeSpec:ofLength:in: (in category 'marshalling') ----- ffiPushStructure: pointer ofSize: structSize typeSpec: argSpec ofLength: argSpecSize in: calloutState <var: #pointer type: #'void *'> <var: #argSpec type: #'sqInt *'> <var: #calloutState type: #'CalloutState *'> <inline: true> | roundedSize registerType numDoubleRegisters numIntegerRegisters passField0InXmmReg passField1InXmmReg | structSize <= 16 ifTrue: ["See sec 3.2.3 of http://people.freebsd.org/~obrien/amd64-elf-abi.pdf. (dravft version 0.90). 
All of the folowing are passed in registers: typedef struct { long a; } s0; typedef struct { double a; } s1; typedef struct { long a; double b; } s2; typedef struct { int a; int b; double c; } s2a; typedef struct { short a; short b; short c; short d; double e; } s2b; typedef struct { long a; float b; } s2f; typedef struct { long a; float b; float c; } s2g; typedef struct { int a; float b; int c; float d; } s2h;" registerType := self registerTypeForStructSpecs: (self cCoerce: argSpec to: #'unsigned int *') OfLength: argSpecSize. + registerType = 2r110 "check case of invalid alignment => pass by memory" + ifFalse: + [passField0InXmmReg := (registerType bitAnd: 1) = 0. + structSize <= 8 + ifTrue: + [numIntegerRegisters := registerType bitAnd: 1. + numDoubleRegisters := 1 - numIntegerRegisters] + ifFalse: + [passField1InXmmReg := (registerType bitAnd: 2) = 0. + numIntegerRegisters := (registerType bitAnd: 2) >> 1 + (registerType bitAnd: 1). + numDoubleRegisters := 2 - numIntegerRegisters]. + (calloutState floatRegisterIndex + numDoubleRegisters <= NumFloatRegArgs + and: [calloutState integerRegisterIndex + numIntegerRegisters <= NumIntRegArgs]) ifTrue: + [passField0InXmmReg + ifTrue: [self ffiPushDoubleFloat: ((self cCoerceSimple: pointer to: #'double *') at: 0) in: calloutState] + ifFalse: [self ffiPushSignedLongLong: ((self cCoerceSimple: pointer to: #'long long *') at: 0) in: calloutState]. + structSize > 8 ifTrue: + [passField1InXmmReg + ifTrue: [self ffiPushDoubleFloat: ((self cCoerceSimple: pointer to: #'double *') at: 1) in: calloutState] + ifFalse: [self ffiPushSignedLongLong: ((self cCoerceSimple: pointer to: #'long long *') at: 1) in: calloutState]]. + ^0]]]. - passField0InXmmReg := (registerType bitAnd: 1) = 0. - structSize <= 8 - ifTrue: - [numIntegerRegisters := registerType bitAnd: 1. - numDoubleRegisters := 1 - numIntegerRegisters] - ifFalse: - [passField1InXmmReg := (registerType bitAnd: 2) = 0. 
- numIntegerRegisters := (registerType bitAnd: 2) >> 1 + (registerType bitAnd: 1). - numDoubleRegisters := 2 - numIntegerRegisters]. - (calloutState floatRegisterIndex + numDoubleRegisters <= NumFloatRegArgs - and: [calloutState integerRegisterIndex + numIntegerRegisters <= NumIntRegArgs]) ifTrue: - [passField0InXmmReg - ifTrue: [self ffiPushDoubleFloat: ((self cCoerceSimple: pointer to: #'double *') at: 0) in: calloutState] - ifFalse: [self ffiPushSignedLongLong: ((self cCoerceSimple: pointer to: #'long long *') at: 0) in: calloutState]. - structSize > 8 ifTrue: - [passField1InXmmReg - ifTrue: [self ffiPushDoubleFloat: ((self cCoerceSimple: pointer to: #'double *') at: 1) in: calloutState] - ifFalse: [self ffiPushSignedLongLong: ((self cCoerceSimple: pointer to: #'long long *') at: 1) in: calloutState]]. - ^0]]. roundedSize := structSize + 7 bitClear: 7. calloutState currentArg + roundedSize > calloutState limit ifTrue: [^FFIErrorCallFrameTooBig]. self memcpy: calloutState currentArg _: (self cCoerceSimple: pointer to: 'char *') _: structSize. calloutState currentArg: calloutState currentArg + roundedSize. ^0! Item was changed: ----- Method: ThreadedX64SysVFFIPlugin>>ffiReturnStruct:ofType:in: (in category 'callout support') ----- + ffiReturnStruct: structRetPtr ofType: ffiRetType in: calloutState + <var: #structRetPtr type: #'void *'> - ffiReturnStruct: sixteenByteRetPtr ofType: ffiRetType in: calloutState - <var: #sixteenByteRetPtr type: #'void *'> <var: #calloutState type: #'CalloutState *'> "Create a structure return value from an external function call. The value has been stored in alloca'ed space pointed to by the calloutState or in the return value passed by pointer." | retOop retClass oop | <inline: true> retClass := interpreterProxy fetchPointer: 1 ofObject: ffiRetType. retOop := interpreterProxy instantiateClass: retClass indexableSize: 0. 
self remapOop: retOop in: [oop := interpreterProxy instantiateClass: interpreterProxy classByteArray indexableSize: calloutState structReturnSize]. self memcpy: (interpreterProxy firstIndexableField: oop) + _: structRetPtr - _: ((self returnStructInRegisters: calloutState structReturnSize) - ifTrue: [sixteenByteRetPtr] - ifFalse: [calloutState limit]) _: calloutState structReturnSize. interpreterProxy storePointer: 0 ofObject: retOop withValue: oop. ^retOop! Item was changed: ----- Method: ThreadedX64SysVFFIPlugin>>registerTypeForStructSpecs:OfLength: (in category 'marshalling') ----- registerTypeForStructSpecs: specs OfLength: specSize "Answer with a number characterizing the register type for passing a struct of size <= 16 bytes. The bit at offset i of registerType is set to 1 if eightbyte at offset i is a int register (RAX ...) The bit at offset 2 indicates if there is a single eightbyte (struct size <= 8) * 2r00 for float float (XMM0 XMM1) * 2r01 for int float (RAX XMM0) * 2r10 for float int (XMM0 RAX) * 2r11 for int int (RAX RDX) * 2r100 for float (XMM0) * 2r101 for int (RAX) + * 2r110 INVALID (not aligned) Beware, the bits must be read from right to left for decoding register type. Note: this method reconstructs the struct layout according to X64 alignment rules. Therefore, it will not work for packed struct or other exotic alignment." <var: #specs type: #'unsigned int*'> <var: #subIndex type: #'unsigned int'> | eightByteOffset byteOffset index registerType spec fieldSize alignment atomic subIndex isInt | + index := 0. + (self checkAlignmentOfStructSpec: specs OfLength: specSize StartingAt: index) + ifFalse: [^2r110]. eightByteOffset := 0. byteOffset := 0. - index := 0. registerType := ((specs at: index) bitAnd: FFIStructSizeMask) <= 8 ifTrue: [2r100] ifFalse: [0]. [(index := index + 1) < specSize] whileTrue: [spec := specs at: index. isInt := false. 
spec = FFIFlagStructure "this marks end of structure and should be ignored" ifFalse: [(spec anyMask: FFIFlagPointer) ifTrue: [fieldSize := BytesPerWord. alignment := fieldSize. isInt := true] ifFalse: [(spec bitAnd: FFIFlagStructure + FFIFlagAtomic) caseOf: {[FFIFlagStructure] -> [fieldSize := 0. subIndex := index. alignment := self alignmentOfStructSpec: specs OfLength: specSize StartingAt: (self addressOf: subIndex)]. [FFIFlagAtomic] -> [fieldSize := spec bitAnd: FFIStructSizeMask. alignment := fieldSize. atomic := self atomicTypeOf: spec. isInt := (atomic >> 1) ~= (FFITypeSingleFloat >> 1)]} otherwise: ["invalid spec" ^-1]]. (byteOffset bitAnd: alignment - 1) = 0 ifFalse: ["this field requires alignment" byteOffset := (byteOffset bitClear: alignment - 1) + alignment]. byteOffset + fieldSize > 8 ifTrue: ["Not enough room on current eightbyte for this field, skip to next one" eightByteOffset := eightByteOffset + 1. byteOffset := 0]. isInt ifTrue: ["If this eightbyte contains an int field, then we must use an int register" registerType := registerType bitOr: 1 << eightByteOffset]. "where to put the next field?" byteOffset := byteOffset + fieldSize. byteOffset >= 8 ifTrue: ["This eightbyte is full, skip to next one" eightByteOffset := eightByteOffset + 1. byteOffset := 0]]]. ^registerType! |
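For readers following the size check in checkAlignmentOfStructSpec:OfLength:StartingAt:, here is a minimal C sketch of the branchless rounding and of the "computed size = declared size" test. It is not the generated plugin code: the names align_up and looks_naturally_aligned are hypothetical, nested structs and unions are left out, and every field is assumed to be atomic (or a pointer) with alignment equal to its size, which is what the diff does for those fields.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of align, without a branch.
   Same formulation as the diff: (size - 1 bitOr: align - 1) + 1.
   align must be a power of two; size 0 stays 0 thanks to unsigned wrap-around. */
static size_t align_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((size - 1) | (align - 1)) + 1;
}

/* Hypothetical helper: lay out n atomic fields with natural alignment
   (alignment == field size) and compare the result with the declared size.
   Returns 1 when the sizes agree, 0 when the struct looks packed/unaligned. */
static int looks_naturally_aligned(const size_t *field_sizes, size_t n,
                                   size_t declared_size)
{
    size_t computed = 0;
    size_t max_align = 1;
    for (size_t i = 0; i < n; i++) {
        size_t field_align = field_sizes[i];
        if (field_align > max_align)
            max_align = field_align;
        /* pad up to the field's alignment, then append the field */
        computed = align_up(computed, field_align) + field_sizes[i];
    }
    /* trailing padding: the whole struct is a multiple of its largest alignment */
    return align_up(computed, max_align) == declared_size;
}
```

With fields { char; double } the natural size is 16, so a declared size of 9 (what a packed variant would report) fails the check, and the caller would answer 2r110 and pass the struct in memory. As the commit note says, this is only a heuristic: it relies on the declared size being different from the naturally aligned size.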
On Thu, 30 Jan 2020 at 00:50, <[hidden email]> wrote:
Err, the highest four bits, of course... The compiledSpec cache should be reset when the image is restarted on a different OS, which is another detail to handle (the compiledSpec is currently copied as a literal into the method invoking the FFI primitive).
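To make the cache idea concrete, here is a small C sketch of storing the registerType in the highest four bits of a 32-bit spec word. It is only an illustration under stated assumptions: the function and macro names are hypothetical, the four bits under 16rF0000000 are assumed to be free in the compiledSpec header, and a raw field value of 0 is taken to mean "not cached yet". The post does not pin down the exact encoding of UNINITIALIZED, so that part is a guess.

```c
#include <assert.h>
#include <stdint.h>

#define REGTYPE_CACHE_SHIFT 28
#define REGTYPE_CACHE_MASK  0xF0000000u /* 16rF0000000, the highest four bits */
#define REGTYPE_XOR         0xFu        /* bitXor: 2r1111, as proposed in the post */

/* Assumption: an all-zero field means the cache has not been filled yet. */
static int regtype_cache_is_set(uint32_t spec_header)
{
    return (spec_header & REGTYPE_CACHE_MASK) != 0;
}

/* Store a register type (2r000 .. 2r110) in the top four bits,
   leaving the low 28 bits of the spec word untouched. */
static uint32_t regtype_cache_store(uint32_t spec_header, unsigned register_type)
{
    assert(register_type <= 6u);
    uint32_t field = ((uint32_t)register_type ^ REGTYPE_XOR) & 0xFu;
    return (spec_header & ~REGTYPE_CACHE_MASK) | (field << REGTYPE_CACHE_SHIFT);
}

/* Read the register type back; only meaningful if regtype_cache_is_set(). */
static unsigned regtype_cache_load(uint32_t spec_header)
{
    return (unsigned)((spec_header >> REGTYPE_CACHE_SHIFT) & 0xFu) ^ REGTYPE_XOR;
}

/* Clear the cache, e.g. when the image starts on a different OS/ABI. */
static uint32_t regtype_cache_reset(uint32_t spec_header)
{
    return spec_header & ~REGTYPE_CACHE_MASK;
}
```

Because every valid register type XOR-ed with 2r1111 is non-zero, a freshly compiled spec (top bits 0) is never mistaken for a cached value in this sketch. The reset function corresponds to the extra step mentioned above for images moved between platforms.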