latest changes (no idea from when it started) are making FFI win32 crash (and FFI Callbacks are not reliable anymore either)


EstebanLM
 
Hi,

So, I’m building the PharoVM along with all its dependencies. For me, this is a major step because I can finally drop the old build process.
Now I’m having serious problems with FFI that were not present before:


1. CRASH IN WINDOWS (32bits):

In Win32, it crashes immediately when trying to call this function:

getEnvSize: nameString
        ^ self ffiCall: #( int GetEnvironmentVariableA ( String nameString, nil, 0 ) ) module: #Kernel32

 (this works perfectly fine in older versions)
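
For readers who do not know the Win32 call: when GetEnvironmentVariableA is given a buffer that is too small (here nil and 0), it returns the size needed to hold the value including the terminating null, and it returns 0 when the variable does not exist. The C sketch below only imitates that return convention in portable code; `store_value` and `env_size` are hypothetical names for illustration, not Win32 or Pharo API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper imitating the return convention of
   GetEnvironmentVariableA: 0 when there is no value, the required size
   (terminator included) when the buffer is too small, otherwise the
   number of characters copied (terminator excluded). */
unsigned long store_value(const char *value, char *buffer, unsigned long size) {
    assert(buffer != NULL || size == 0); /* no buffer means size 0 */
    if (value == NULL)
        return 0;
    unsigned long needed = (unsigned long)strlen(value) + 1;
    if (size < needed)
        return needed;
    memcpy(buffer, value, needed);
    return needed - 1;
}

/* Size query in the style of getEnvSize: pass no buffer and a size of 0. */
unsigned long env_size(const char *name) {
    return store_value(getenv(name), NULL, 0);
}
```

A caller would typically use the size query first, allocate that many bytes, and then call again with the real buffer.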

2. CALLBACKS FAILING:

Callbacks have problems. The examples pass, but they are very simple… as soon as I try to do something complicated (like the unqlite or libgit2 bindings, which use callbacks intensively), callbacks stop working.
I traced the problem to this method:

StackInterpreter>>#returnAs:ThroughCallback:Context:

returnAs: returnTypeOop ThroughCallback: vmCallbackContext Context: callbackMethodContext
        "callbackMethodContext is an activation of invokeCallback:[stack:registers:jmpbuf:].
         Its sender is the VM's state prior to the callback.  Reestablish that state (via longjmp),
         and mark callbackMethodContext as dead."
        <export: true>
        <var: #vmCallbackContext type: #'VMCallbackContext *'>
        | calloutMethodContext theFP thePage |
        <var: #theFP type: #'char *'>
        <var: #thePage type: #'StackPage *'>
        ((self isIntegerObject: returnTypeOop)
         and: [self isLiveContext: callbackMethodContext]) ifFalse:
                [^false].
        calloutMethodContext := self externalInstVar: SenderIndex ofContext: callbackMethodContext.
        (self isLiveContext: calloutMethodContext) ifFalse:
                [^false].
        "We're about to leave this stack page; must save the current frame's instructionPointer."
        self push: instructionPointer.
        self externalWriteBackHeadFramePointers.
        "Mark callbackMethodContext as dead; the common case is that it is the current frame.
         We go the extra mile for the debugger."
        (self isSingleContext: callbackMethodContext)
                ifTrue: [self markContextAsDead: callbackMethodContext]
                ifFalse:
                        [theFP := self frameOfMarriedContext: callbackMethodContext.
                         framePointer = theFP "common case"
                                ifTrue:
                                        [(self isBaseFrame: theFP)
                                                ifTrue: [stackPages freeStackPage: stackPage]
                                                ifFalse: "calloutMethodContext is immediately below on the same page.  Make it current."
                                                        [instructionPointer := (self frameCallerSavedIP: framePointer) asUnsignedInteger.
                                                         stackPointer := framePointer + (self frameStackedReceiverOffset: framePointer) + objectMemory wordSize.
                                                         framePointer := self frameCallerFP: framePointer.
                                                         self setMethod: (self frameMethodObject: framePointer).
                                                         self restoreCStackStateForCallbackContext: vmCallbackContext.
                                                         "N.B. siglongjmp is defined as _longjmp on non-win32 platforms.
                                                          This matches the use of _setjmp in ia32abicc.c."
                                                         self siglong: vmCallbackContext trampoline jmp: (self integerValueOf: returnTypeOop).
                                                         ^true]]
                                ifFalse:
                                        [self externalDivorceFrame: theFP andContext: callbackMethodContext.
                                         self markContextAsDead: callbackMethodContext]].
        "Make the calloutMethodContext the active frame.  The case where calloutMethodContext
         is immediately below callbackMethodContext on the same page is handled above."
        (self isStillMarriedContext: calloutMethodContext)
                ifTrue:
                        [theFP := self frameOfMarriedContext: calloutMethodContext.
                         thePage := stackPages stackPageFor: theFP.
                         "findSPOf:on: points to the word beneath the instructionPointer, but
                          there is no instructionPointer on the top frame of the current page."
                         self assert: thePage ~= stackPage.
                         stackPointer := (self findSPOf: theFP on: thePage) - objectMemory wordSize.
                         framePointer := theFP]
                ifFalse:
                        [thePage := self makeBaseFrameFor: calloutMethodContext.
                         framePointer := thePage headFP.
                         stackPointer := thePage headSP].
        instructionPointer := self popStack.
        self setMethod: (objectMemory fetchPointer: MethodIndex ofObject: calloutMethodContext).
        self setStackPageAndLimit: thePage.
        self restoreCStackStateForCallbackContext: vmCallbackContext.
         "N.B. siglongjmp is defined as _longjmp on non-win32 platforms.
          This matches the use of _setjmp in ia32abicc.c."
        self siglong: vmCallbackContext trampoline jmp: (self integerValueOf: returnTypeOop).
        "NOTREACHED"
        ^true

With the first siglongjmp, callbacks pass fine.
With the last one (it would be if framePointer = theFP AND !(isBaseFrame: theFP)), they don’t.
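
For anyone following along without the VM sources at hand, here is a minimal C sketch of the general setjmp/longjmp pattern the method relies on: the callout side saves its C stack state, and returning from the callback jumps straight back to that saved state, carrying a small integer. This is only an illustration under my own assumptions. `CallbackContext`, `return_through_callback`, `invoke_callback` and the `RETURN_*` values are hypothetical stand-ins, not the VM's actual definitions, and the sketch uses plain setjmp/longjmp rather than the _setjmp/siglongjmp variants mentioned in the comments.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical return kinds; the real VM encodes its own values. */
enum { RETURN_WORD = 1, RETURN_DOUBLE = 2 };

/* Hypothetical, much reduced stand-in for VMCallbackContext. */
typedef struct {
    jmp_buf trampoline; /* C stack state saved on the callout side */
} CallbackContext;

/* Plays the role of returning through the callback: it never returns
   to its caller, it jumps back to the saved state instead. */
static void return_through_callback(CallbackContext *context, int kind) {
    assert(kind != 0); /* longjmp would silently turn 0 into 1 */
    longjmp(context->trampoline, kind);
}

/* Plays the role of the code that invoked the callback. */
int invoke_callback(int kind) {
    CallbackContext context;
    switch (setjmp(context.trampoline)) {
    case 0: /* first pass: state saved, now run the "callback" */
        return_through_callback(&context, kind);
        return -1; /* not reached */
    case RETURN_WORD:
        return RETURN_WORD;
    case RETURN_DOUBLE:
        return RETURN_DOUBLE;
    default:
        return -1; /* unknown return kind */
    }
}
```

The point of the sketch is that everything between the setjmp and the longjmp is abandoned, so the interpreter state has to be consistent before the jump happens.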

So… from here I’m a bit lost… I need some help :)

thanks,
Esteban



Reply | Threaded
Open this post in threaded view
|

Re: latest changes (no idea from when it started) are making FFI win32 crash (and FFI Callbacks are not reliable anymore either)

Clément Béra
 
Hi,

Can you confirm that this bug happens only on Windows?

Do you have the version number (both VMMaker and git commit) of the last version that was working?

Thanks.


Re: latest changes (no idea from when it started) are making FFI win32 crash (and FFI Callbacks are not reliable anymore either)

EstebanLM
 

On 29 Nov 2016, at 13:04, Clément Bera <[hidden email]> wrote:

> Can you confirm that this bug happens only on Windows?

Yes, the crash happens only on Windows.
The callback problem is general. Note that FFICallbackTests works fine, but I think that is because, with the qsort function, it never enters the second condition.

> Do you have the version number (both VMMaker and git commit) of the last version that was working?

Sadly, no… I tried to find the latest working version, but with the mess I had getting the VM to build with opensmalltalk-vm, I couldn’t track it down.
I suspect it is related to the work on 64-bit support for Windows, but I have no proof of that :P
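
To show what kind of callback qsort drives, here is a small C sketch. It is not the Smalltalk code of FFICallbackTests, only the plain C shape of the same idea: qsort repeatedly calls back into a comparison function, and each call returns straight to qsort without calling out again. `compare_ints` and `sort_ints` are names I made up for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparison callback: qsort calls back into this for pairs of elements.
   Each invocation returns directly to qsort. */
static int compare_ints(const void *left, const void *right) {
    int a = *(const int *)left;
    int b = *(const int *)right;
    return (a > b) - (a < b); /* negative, zero or positive; no overflow */
}

/* Sorts the array in ascending order using the callback above. */
void sort_ints(int *values, size_t count) {
    assert(values != NULL || count == 0);
    if (count > 1)
        qsort(values, count, sizeof values[0], compare_ints);
}
```

Bindings such as libgit2 use callbacks in less uniform ways than this, which may be why the simple test does not show the problem.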

Esteban


Re: latest changes (no idea from when it started) are making FFI win32 crash (and FFI Callbacks are not reliable anymore either)

Ronie Salgado
 
Last week I was having exactly the same crash in the MinimalisticHeadless branch, with both MinGW and Visual Studio. I managed to get the VM working with MinGW (not yet with MSVC) by using the following defines, which I copied from the old Pharo CMake scripts:

-DSTACK_ALIGN_BYTES=16 -DALLOCA_LIES_SO_USE_GETSP=0
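
As a rough illustration of what a 16-byte stack alignment requirement means, the C sketch below rounds an address down to a multiple of STACK_ALIGN_BYTES. It is not the FFI plugin's code; `align_stack_down` is a hypothetical helper, and the default of 16 simply mirrors the define above.

```c
#include <assert.h>
#include <stdint.h>

/* Default mirrors the -DSTACK_ALIGN_BYTES=16 define; a build may override it. */
#ifndef STACK_ALIGN_BYTES
# define STACK_ALIGN_BYTES 16
#endif

/* Hypothetical helper: round a stack address down to the alignment boundary.
   Rounding down is used here on the assumption that the stack grows
   towards lower addresses. */
uintptr_t align_stack_down(uintptr_t sp) {
    /* The mask trick below only works for powers of two. */
    assert((STACK_ALIGN_BYTES & (STACK_ALIGN_BYTES - 1)) == 0);
    return sp & ~(uintptr_t)(STACK_ALIGN_BYTES - 1);
}
```

If a compiler's alloca returns an address that is off by a few bytes, an alignment computed from it is off as well, which is the situation the second define is meant to describe.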

In the pharo-vm, the CogFamilyWindowsConfig >> #commonCompilerFlags method starts with the following comment:
commonCompilerFlags
    "omit -ggdb2 to prevent generating debug info"
    "Some flags explanation:
   
    STACK_ALIGN_BYTES=16 is needed in mingw and FFI (and I suppose on other modules too).
    DALLOCA_LIES_SO_USE_GETSP=0 Some compilers return the stack address+4 on alloca function,
    then FFI module needs to adjust that. It is NOT the case of mingw.
    For more information see this thread: http://forum.world.st/There-are-something-fishy-with-FFI-plugin-td4584226.html
    "


Re: latest changes (no idea from when it started) are making FFI win32 crash (and FFI Callbacks are not reliable anymore either)

EstebanLM
 
Hah!
You know what the sad part is? I wrote that message… it was meant for future me, but I forgot to check our flags :P
I lost 2.5 days then, plus 2 days now.

This fixes the problem with the Windows crashes (yay!) but not the problem with callbacks (booo!)… any ideas in that area?

cheers, 
Esteban

                        [thePage := self makeBaseFrameFor: calloutMethodContext.
                         framePointer := thePage headFP.
                         stackPointer := thePage headSP].
        instructionPointer := self popStack.
        self setMethod: (objectMemory fetchPointer: MethodIndex ofObject: calloutMethodContext).
        self setStackPageAndLimit: thePage.
        self restoreCStackStateForCallbackContext: vmCallbackContext.
         "N.B. siglongjmp is defines as _longjmp on non-win32 platforms.
          This matches the use of _setjmp in ia32abicc.c."
        self siglong: vmCallbackContext trampoline jmp: (self integerValueOf: returnTypeOop).
        "NOTREACHED"
        ^true

With the first siglongjmp, callbacks pass fine.
With the last one (it would be if  framePointer = theFP AND !(isBaseFrame: theFP) ), they don’t.

So… from here I’m a bit lost… I need some help :)

thanks,
Esteban









Re: latest changes (no idea from when it started) are making FFI win32 crash (and FFI Callbacks are not reliable anymore either)

Nicolas Cellier
 
Thanks Ronie and Esteban.
This seems to be an alignment problem indeed.
What I see is that alignment is defined at least in 3 different places:
- platforms/Cross/vm/sqCogStackAlignment.h
- platforms/Cross/plugins/IA32ABI/ia32abicc.c
- src/plugins/SqueakFFIPrims/IA32FFIPlugin.c and friends...
That's just too many different opinions!!! We have to unify that rather than adding a 4th opinion in a Makefile.

However, about ALLOCA_LIES_SO_USE_GETSP, I'm not so sure that "It is NOT the case of mingw."
Last time I used gdb, it WAS still the case, alloca was STILL lying.
See http://lists.squeakfoundation.org/pipermail/vm-dev/2016-August/022985.html

BUT:
-----
forcing 16-byte alignment supersedes the alloca hack, so the hack is no longer strictly necessary;
see below in the generated src/plugins/SqueakFFIPrims/IA32FFIPlugin.c:

        allocation = alloca(((stackSize + ((calloutState->structReturnSize)))) + (cStackAlignment()));
        if (allocaLiesSoUseGetsp()) {
                allocation = getsp();
        }
        if ((cStackAlignment()) != 0) {
                allocation = ((char *) ((((((usqInt)allocation)) | ((cStackAlignment()) - 1)) - ((cStackAlignment()) - 1))));
        }
        (calloutState->argVector = allocation);

but we further do:

        if ((0 + (cStackAlignment())) > 0) {
                setsp((calloutState->argVector));
        }

So if the stack pointer ever is greater than the value alloca returned, and we have removed the ALLOCA_LIES hack,
the stack pointer is then set back to the (aligned) value alloca returned, avoiding the stack pointer offset problem.
It would be worth writing a unit test case, and asking on the gcc mailing list why it lies, to be sure...
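
For readers who find the generated expression hard to parse, here is a minimal sketch of the arithmetic it performs. The helper name align_down is made up for illustration and is not part of the plugin; the assumption (mine, not stated in the generated code) is that the alignment is a power of two, as 16 is.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the VM sources.
   Rounds addr down to a multiple of align using the same expression
   shape as the generated code: (addr | (align - 1)) - (align - 1).
   Assumes align is a non-zero power of two. */
static uintptr_t align_down(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    /* OR-ing sets all the low bits, subtracting clears them again. */
    return (addr | (align - 1)) - (align - 1);
}
```

With a 16-byte alignment an address such as 0x1007 becomes 0x1000, and an already aligned address is left unchanged. The setsp call quoted above then moves the stack pointer to that rounded value.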

cheers

2016-11-29 18:14 GMT+01:00 Esteban Lorenzano <[hidden email]>:
 
hah! 
you know what the sad part of this is? I wrote that message… it was for the future me, but I forgot to check our flags :P
I lost 2.5 days then + 2 days now. 

this fixes the problem with Windows crashes (yay!) but not the problem with callbacks (booo!)… any idea in that area?

cheers, 
Esteban

On 29 Nov 2016, at 17:30, Ronie Salgado <[hidden email]> wrote:

Last week I was having this exact same crash in the MinimalisticHeadless branch, with both MinGW and Visual Studio. I managed to get the VM working with MinGW (not yet with MSVC) by using the following defines, which I copied from the old Pharo CMake scripts:

-DSTACK_ALIGN_BYTES=16 -DALLOCA_LIES_SO_USE_GETSP=0

In the pharo-vm, the CogFamilyWindowsConfig >> #commonCompilerFlags method starts with the following comment:
commonCompilerFlags
    "omit -ggdb2 to prevent generating debug info"
    "Some flags explanation:
   
    STACK_ALIGN_BYTES=16 is needed in mingw and FFI (and I suppose on other modules too).
    DALLOCA_LIES_SO_USE_GETSP=0 Some compilers return the stack address+4 on alloca function,
    then FFI module needs to adjust that. It is NOT the case of mingw.
    For more information see this thread: http://forum.world.st/There-are-something-fishy-with-FFI-plugin-td4584226.html
    "
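
As an illustration of why forgetting these flags changes behaviour silently, here is a sketch of the usual pattern by which a -D flag overrides a compiled-in default. The default values and the two accessor names below are assumptions made up for this example; the real VM headers may pick different defaults and map the macros differently.

```c
#include <assert.h>

/* ASSUMED defaults, for illustration only; not taken from the VM sources.
   Passing -DSTACK_ALIGN_BYTES=16 -DALLOCA_LIES_SO_USE_GETSP=0 on the
   compiler command line replaces them. */
#ifndef STACK_ALIGN_BYTES
#  define STACK_ALIGN_BYTES 4
#endif
#ifndef ALLOCA_LIES_SO_USE_GETSP
#  define ALLOCA_LIES_SO_USE_GETSP 1
#endif

/* Hypothetical accessors standing in for the generated code's helpers. */
static int c_stack_alignment(void)
{
    /* The alignment has to be a non-zero power of two to be usable as a mask. */
    assert(STACK_ALIGN_BYTES > 0 &&
           (STACK_ALIGN_BYTES & (STACK_ALIGN_BYTES - 1)) == 0);
    return STACK_ALIGN_BYTES;
}

static int alloca_lies_so_use_getsp(void)
{
    return ALLOCA_LIES_SO_USE_GETSP != 0;
}
```

If the build scripts drop the -D options, the code still compiles, only with whatever the defaults are, which is one way a crash like this can appear after a build-system change.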


2016-11-29 9:32 GMT-03:00 Esteban Lorenzano <[hidden email]>:
 

On 29 Nov 2016, at 13:04, Clément Bera <[hidden email]> wrote:

Hi,

Can you confirm this bug happens only on Windows?

yes, the crash is just on Windows.
the callback problem is general (note that FFICallbackTests works fine, but I think that is because it never enters the 2nd condition with the qsort function).


Do you have the version number (both VMMaker and git commit) of the last version you had that was working?

sadly, no… I tried to find the latest working version, but with the mess I have getting the VM to build with opensmalltalk-vm, I couldn’t track it down.
I suspect it is related to the work on 64 bits for Windows, but I have no proof of that :P

Esteban


Thanks.


On Tue, Nov 29, 2016 at 11:54 AM, Esteban Lorenzano <[hidden email]> wrote:

[snip]











Re: latest changes (no idea from when it started) are making FFI win32 crash (and FFI Callbacks are not reliable anymore either)

Nicolas Cellier
 
Though, it's necessary to define ALLOCA_LIES_SO_USE_GETSP to zero to make FFI work with gcc.
That does not mean that alloca does not lie, just that there is another problem with stack management...

2016-11-29 21:22 GMT+01:00 Nicolas Cellier <[hidden email]>:
[snip]












Re: latest changes (no idea from when it started) are making FFI win32 crash (and FFI Callbacks are not reliable anymore either)

Andres Valloud-4
 
To prove alloca() *lies*, one needs to show e.g. a 5-10 line C program
independent of anything else exemplifying a clear specification
violation.  Otherwise, how do you know the LIARS_LIARS_PANTS_ON_FIRE
macros are not compensating for undefined behavior elsewhere?
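
A possible starting point for such a program is sketched below. It only measures how far the pointer returned by alloca sits above the stack pointer; it does not prove a specification violation, since alloca makes no promise that the two are equal. The function name is made up, and the sketch assumes GCC or Clang (__builtin_alloca and inline asm) on x86, x86-64 or AArch64; on other targets it simply returns 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical probe, not part of the VM sources.
   Returns (address returned by alloca) - (stack pointer right after it),
   or 0 when the target is not covered by the inline asm below. */
static ptrdiff_t alloca_offset_from_sp(size_t n)
{
    volatile char *block;
    uintptr_t sp = 0;

    assert(n > 0);
    block = (volatile char *)__builtin_alloca(n);
    block[0] = 0;   /* touch the block so the allocation is kept */

#if defined(__x86_64__)
    __asm__ volatile ("movq %%rsp, %0" : "=r"(sp));
#elif defined(__i386__)
    __asm__ volatile ("movl %%esp, %0" : "=r"(sp));
#elif defined(__aarch64__)
    __asm__ volatile ("mov %0, sp" : "=r"(sp));
#endif

    if (sp == 0)
        return 0;   /* unsupported target: nothing measured */
    return (ptrdiff_t)((uintptr_t)block - sp);
}
```

Printing the result for a few sizes under the exact compiler and flags used for the VM would show whether there is a non-zero offset at all, which is the observation the macro is meant to compensate for.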

On 11/29/16 16:23 , Nicolas Cellier wrote:

>
>
>
>
> Though, it's necessary to define ALLOCA_LIES_SO_USE_GETSP to zero to
> make FFI work with gcc.
> That does not mean that alloca does not lie, just that there is another
> problem with stack management...
>
> 2016-11-29 21:22 GMT+01:00 Nicolas Cellier
> <[hidden email]
> <mailto:[hidden email]>>:
>
>     Thanks Ronie and Esteban.
>     This seems to be an alignment problem indeed.
>     What I see is that alignment is defined at least in 3 different places:
>     - platforms/Cross/vm/sqCogStackAlignment.h
>     - platforms/Cross/plugins/IA32ABI/ia32abicc.c
>     - src/plugins/SqueakFFIPrims/IA32FFIPlugin.c and friends...
>     That's just too many different opinions!!! We have to unify that
>     rather than adding a 4th opinion in a Makefile.
>
>     However, about ALLOCA_LIES_SO_USE_GETSP, I'm not so sure that "It is
>     NOT the case of mingw."
>     Last time I used gdb, it WAS still the case, alloca was STILL lying.
>     See
>     http://lists.squeakfoundation.org/pipermail/vm-dev/2016-August/022985.html
>     <http://lists.squeakfoundation.org/pipermail/vm-dev/2016-August/022985.html>
>
>     BUT:
>     -----
>     forcing 16 bytes alignment supersedes the alloca hack, making it not
>     strictly necessary anymore
>     see below in generated src/plgins/IA32FFIPlugin.c:
>
>             allocation = alloca(((stackSize +
>     ((calloutState->structReturnSize)))) + (cStackAlignment()));
>             if (allocaLiesSoUseGetsp()) {
>                     allocation = getsp();
>             }
>             if ((cStackAlignment()) != 0) {
>                     allocation = ((char *) ((((((usqInt)allocation)) |
>     ((cStackAlignment()) - 1)) - ((cStackAlignment()) - 1))));
>             }
>             (calloutState->argVector = allocation);
>
>     but we further do:
>
>             if ((0 + (cStackAlignment())) > 0) {
>                     setsp((calloutState->argVector));
>             }
>
>     So if ever the stack pointer is greater than alloca return value,
>     but we removed the ALLOCA_LIES hack,
>     the stack pointer is then set back to alloca returned value,
>     avoiding the stack pointer offset problem
>     It would be worth writing  a unit test case, and inquiring the
>     reason why it lies in gcc mailing list to be sure...
>
>     cheers
>
>     2016-11-29 18:14 GMT+01:00 Esteban Lorenzano <[hidden email]
>     <mailto:[hidden email]>>:
>
>
>         hah!
>         you know what is the sad part of this? I wrote that message… it
>         was for the future me, but I forget to check our flags :P
>         I lost 2.5 days then + 2 days now.
>
>         this fixes the problem with Windows crashes (yay!) but not the
>         problem with callbacks (booo!)… any idea in that area?
>
>         cheers,
>         Esteban
>
>>         On 29 Nov 2016, at 17:30, Ronie Salgado <[hidden email]
>>         <mailto:[hidden email]>> wrote:
>>
>>         The last week I was having this exactly same crash in the
>>         MinimalisticHeadless branch, with both MinGW and with Visual
>>         Studio. I managed to get the VM working with MinGW (not yet
>>         with MSVC) by using the following defines,which I copied from
>>         the old Pharo CMake scripts:
>>
>>         -DSTACK_ALIGN_BYTES=16 -DALLOCA_LIES_SO_USE_GETSP=0
>>
>>         In the pharo-vm, the CogFamilyWindowsConfig >>
>>         #commonCompilerFlags method starts with the following comment:
>>         commonCompilerFlags
>>             "omit -ggdb2 to prevent generating debug info"
>>             "Some flags explanation:
>>
>>             STACK_ALIGN_BYTES=16 is needed in mingw and FFI (and I
>>         suppose on other modules too).
>>             DALLOCA_LIES_SO_USE_GETSP=0 Some compilers return the
>>         stack address+4 on alloca function,
>>             then FFI module needs to adjust that. It is NOT the case
>>         of mingw.
>>             For more information see this thread:
>>         http://forum.world.st/There-are-something-fishy-with-FFI-plugin-td4584226.html
>>         <http://forum.world.st/There-are-something-fishy-with-FFI-plugin-td4584226.html>
>>             "
>>
>>
>>         2016-11-29 9:32 GMT-03:00 Esteban Lorenzano
>>         <[hidden email] <mailto:[hidden email]>>:
>>
>>
>>
>>>             On 29 Nov 2016, at 13:04, Clément Bera
>>>             <[hidden email] <mailto:[hidden email]>>
>>>             wrote:
>>>
>>>             Hi,
>>>
>>>             Can you confirm this bug happen only in Windows ?
>>
>>             yes, the crash is just in windows.
>>             the callback problem is general (note that
>>             FFICallbackTests works fine, but I think this is related
>>             to the fact that it never enters the 2nd condition with
>>             the qsort function) .
>>
>>>
>>>             Do you have version number (both VMMaker and git commit)
>>>             of the last version you have that was working ?
>>
>>             sadly, not… I tried to get the latest working version, but
>>             with the mess I have to get the VM to build with
>>             opensmalltalk-vm, I couldn’t track it.
>>             I suspect is related to the work on 64bits for windows,
>>             but I have no proof of that :P
>>
>>             Esteban
>>
>>>
>>>             Thanks.
>>>
>>>
>>>             On Tue, Nov 29, 2016 at 11:54 AM, Esteban Lorenzano
>>>             <[hidden email] <mailto:[hidden email]>> wrote:
>>>
>>>
>>>                 Hi,
>>>
>>>                 So, I’m building the PharoVM along with all his
>>>                 dependencies. For me, this is a major step because I
>>>                 can drop the old build process finally.
>>>                 Now, I’m having serious problems with FFI (that they
>>>                 were not present before), :
>>>
>>>
>>>                 1. CRASH IN WINDOWS (32bits):
>>>
>>>                 In Win32, it crashes automatically when trying to
>>>                 access this funtion:
>>>
>>>                 getEnvSize: nameString
>>>                         ^ self ffiCall: #( int
>>>                 GetEnvironmentVariableA ( String nameString, nil, 0 )
>>>                 ) module: #Kernel32
>>>
>>>                  (this works perfectly fine in older versions)
>>>
>>>                 2. CALLBACKS FAILING:
>>>
>>>                 Callbacks have problems. The examples passes but they
>>>                 are very simple… as soon as I try to do something
>>>                 complicates (like unqlite bindings or libgit2
>>>                 bindings, who use callbacks intensively), callbacks
>>>                 stops working.

Re: latest changes (no idea from when it started) are making FFI win32 crash (and FFI Callbacks are not reliable anymore either)

EstebanLM
In reply to this post by Nicolas Cellier
 

On 30 Nov 2016, at 01:23, Nicolas Cellier <[hidden email]> wrote:

Though, it's necessary to define ALLOCA_LIES_SO_USE_GETSP to zero to make FFI work with gcc.
That does not mean that alloca does not lie, just that there is another problem with stack management…

so, this workaround might be incorrect…  


2016-11-29 21:22 GMT+01:00 Nicolas Cellier <[hidden email]>:
Thanks Ronie and Esteban.
This seems to be an alignment problem indeed.
What I see is that alignment is defined at least in 3 different places:
- platforms/Cross/vm/sqCogStackAlignment.h
- platforms/Cross/plugins/IA32ABI/ia32abicc.c
- src/plugins/SqueakFFIPrims/IA32FFIPlugin.c and friends...
That's just too many different opinions!!! We have to unify that rather than adding a 4th opinion in a Makefile.

well, yes :)


However, about ALLOCA_LIES_SO_USE_GETSP, I'm not so sure that "It is NOT the case of mingw."
Last time I used gdb, it WAS still the case, alloca was STILL lying.
See http://lists.squeakfoundation.org/pipermail/vm-dev/2016-August/022985.html

BUT:
-----
Forcing 16-byte alignment supersedes the alloca hack, so the hack is no longer strictly necessary.
See below, in the generated src/plugins/SqueakFFIPrims/IA32FFIPlugin.c:

        allocation = alloca(((stackSize + ((calloutState->structReturnSize)))) + (cStackAlignment()));
        if (allocaLiesSoUseGetsp()) {
                allocation = getsp();
        }
        if ((cStackAlignment()) != 0) {
                allocation = ((char *) ((((((usqInt)allocation)) | ((cStackAlignment()) - 1)) - ((cStackAlignment()) - 1))));
        }
        (calloutState->argVector = allocation);

but we further do:

        if ((0 + (cStackAlignment())) > 0) {
                setsp((calloutState->argVector));
        }

So if the stack pointer is ever greater than the value alloca returned, and we have removed the ALLOCA_LIES hack,
the stack pointer is then set back to the value alloca returned, which avoids the stack pointer offset problem.
It would be worth writing a unit test case, and asking on the gcc mailing list why alloca lies, to be sure...
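
For what it's worth, the rounding step in the generated code above can be checked in isolation. This is only a minimal sketch, assuming the alignment is a power of two; the function names are made up for illustration and are not part of the plugin. It shows that the "or, then subtract" expression rounds an address down to a multiple of the alignment, exactly like the more common mask form:

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as the generated plugin code:
   (addr | (align - 1)) - (align - 1) rounds addr DOWN to a multiple of align.
   Hypothetical helper name; align must be a power of two. */
static uintptr_t align_down_or_sub(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr | (align - 1)) - (align - 1);
}

/* The more common spelling of the same operation, for comparison. */
static uintptr_t align_down_mask(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & ~(align - 1);
}
```

With a 16-byte alignment, both helpers map 0x1004 to 0x1000 and leave an already aligned 0x1000 untouched.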

cheers

2016-11-29 18:14 GMT+01:00 Esteban Lorenzano <[hidden email]>:
 
hah! 
you know what is the sad part of this? I wrote that message… it was for the future me, but I forgot to check our flags :P
I lost 2.5 days then + 2 days now. 

this fixes the problem with Windows crashes (yay!) but not the problem with callbacks (booo!)… any idea in that area?

cheers, 
Esteban

On 29 Nov 2016, at 17:30, Ronie Salgado <[hidden email]> wrote:

Last week I was having exactly this same crash in the MinimalisticHeadless branch, with both MinGW and Visual Studio. I managed to get the VM working with MinGW (not yet with MSVC) by using the following defines, which I copied from the old Pharo CMake scripts:

-DSTACK_ALIGN_BYTES=16 -DALLOCA_LIES_SO_USE_GETSP=0

In the pharo-vm, the CogFamilyWindowsConfig >> #commonCompilerFlags method starts with the following comment:
commonCompilerFlags
    "omit -ggdb2 to prevent generating debug info"
    "Some flags explanation:
   
    STACK_ALIGN_BYTES=16 is needed in mingw and FFI (and I suppose on other modules too).
    DALLOCA_LIES_SO_USE_GETSP=0 Some compilers return the stack address+4 on alloca function,
    then FFI module needs to adjust that. It is NOT the case of mingw.
    For more information see this thread: http://forum.world.st/There-are-something-fishy-with-FFI-plugin-td4584226.html
    "


2016-11-29 9:32 GMT-03:00 Esteban Lorenzano <[hidden email]>:
 

On 29 Nov 2016, at 13:04, Clément Bera <[hidden email]> wrote:

Hi,

Can you confirm this bug happens only on Windows?

yes, the crash is just in windows.
the callback problem is general (note that FFICallbackTests works fine, but I think this is related to the fact that it never enters the 2nd condition with the qsort function) .


Do you have version number (both VMMaker and git commit) of the last version you have that was working ?

sadly, not… I tried to find the latest working version, but with the mess I had getting the VM to build with opensmalltalk-vm, I couldn’t track it down.
I suspect it is related to the work on 64 bits for Windows, but I have no proof of that :P

Esteban


Thanks.


On Tue, Nov 29, 2016 at 11:54 AM, Esteban Lorenzano <[hidden email]> wrote:

Hi,

So, I’m building the PharoVM along with all its dependencies. For me, this is a major step because I can finally drop the old build process.
Now I’m having serious problems with FFI (which were not present before):


1. CRASH IN WINDOWS (32bits):

In Win32, it crashes automatically when trying to access this function:

getEnvSize: nameString
        ^ self ffiCall: #( int GetEnvironmentVariableA ( String nameString, nil, 0 ) ) module: #Kernel32

 (this works perfectly fine in older versions)

2. CALLBACKS FAILING:

Callbacks have problems. The examples pass, but they are very simple… as soon as I try to do something complicated (like the unqlite or libgit2 bindings, which use callbacks intensively), callbacks stop working.
I traced the problem up to this method:

StackInterpreter>>#returnAs:ThroughCallback:Context:

returnAs: returnTypeOop ThroughCallback: vmCallbackContext Context: callbackMethodContext
        "callbackMethodContext is an activation of invokeCallback:[stack:registers:jmpbuf:].
         Its sender is the VM's state prior to the callback.  Reestablish that state (via longjmp),
         and mark callbackMethodContext as dead."
        <export: true>
        <var: #vmCallbackContext type: #'VMCallbackContext *'>
        | calloutMethodContext theFP thePage |
        <var: #theFP type: #'char *'>
        <var: #thePage type: #'StackPage *'>
        ((self isIntegerObject: returnTypeOop)
         and: [self isLiveContext: callbackMethodContext]) ifFalse:
                [^false].
        calloutMethodContext := self externalInstVar: SenderIndex ofContext: callbackMethodContext.
        (self isLiveContext: calloutMethodContext) ifFalse:
                [^false].
        "We're about to leave this stack page; must save the current frame's instructionPointer."
        self push: instructionPointer.
        self externalWriteBackHeadFramePointers.
        "Mark callbackMethodContext as dead; the common case is that it is the current frame.
         We go the extra mile for the debugger."
        (self isSingleContext: callbackMethodContext)
                ifTrue: [self markContextAsDead: callbackMethodContext]
                ifFalse:
                        [theFP := self frameOfMarriedContext: callbackMethodContext.
                         framePointer = theFP "common case"
                                ifTrue:
                                        [(self isBaseFrame: theFP)
                                                ifTrue: [stackPages freeStackPage: stackPage]
                                                ifFalse: "calloutMethodContext is immediately below on the same page.  Make it current."
                                                        [instructionPointer := (self frameCallerSavedIP: framePointer) asUnsignedInteger.
                                                         stackPointer := framePointer + (self frameStackedReceiverOffset: framePointer) + objectMemory wordSize.
                                                         framePointer := self frameCallerFP: framePointer.
                                                         self setMethod: (self frameMethodObject: framePointer).
                                                         self restoreCStackStateForCallbackContext: vmCallbackContext.
                                                         "N.B. siglongjmp is defines as _longjmp on non-win32 platforms.
                                                          This matches the use of _setjmp in ia32abicc.c."
                                                         self siglong: vmCallbackContext trampoline jmp: (self integerValueOf: returnTypeOop).
                                                         ^true]]
                                ifFalse:
                                        [self externalDivorceFrame: theFP andContext: callbackMethodContext.
                                         self markContextAsDead: callbackMethodContext]].
        "Make the calloutMethodContext the active frame.  The case where calloutMethodContext
         is immediately below callbackMethodContext on the same page is handled above."
        (self isStillMarriedContext: calloutMethodContext)
                ifTrue:
                        [theFP := self frameOfMarriedContext: calloutMethodContext.
                         thePage := stackPages stackPageFor: theFP.
                         "findSPOf:on: points to the word beneath the instructionPointer, but
                          there is no instructionPointer on the top frame of the current page."
                         self assert: thePage ~= stackPage.
                         stackPointer := (self findSPOf: theFP on: thePage) - objectMemory wordSize.
                         framePointer := theFP]
                ifFalse:
                        [thePage := self makeBaseFrameFor: calloutMethodContext.
                         framePointer := thePage headFP.
                         stackPointer := thePage headSP].
        instructionPointer := self popStack.
        self setMethod: (objectMemory fetchPointer: MethodIndex ofObject: calloutMethodContext).
        self setStackPageAndLimit: thePage.
        self restoreCStackStateForCallbackContext: vmCallbackContext.
         "N.B. siglongjmp is defines as _longjmp on non-win32 platforms.
          This matches the use of _setjmp in ia32abicc.c."
        self siglong: vmCallbackContext trampoline jmp: (self integerValueOf: returnTypeOop).
        "NOTREACHED"
        ^true

With the first siglongjmp, callbacks pass fine.
With the last one (it would be if framePointer = theFP AND !(isBaseFrame: theFP)), they don’t.
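
To make the mechanism concrete for anyone following along: both exits of the method above hand the return kind to the C side through a longjmp on the trampoline. The following is only a sketch of that pattern in plain C, not the VM's code; `ret_kind`, `trampoline`, `return_through_callback` and `run_callback` are hypothetical stand-ins for returnTypeOop, vmCallbackContext trampoline, and the two sides of the callback.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical return kinds; the VM passes (self integerValueOf: returnTypeOop). */
enum ret_kind { RET_WORD = 1, RET_DOUBLE = 2 };

/* Stand-in for vmCallbackContext->trampoline. */
static jmp_buf trampoline;

/* Plays the role of returnAs:ThroughCallback:Context: it never returns normally. */
static void return_through_callback(enum ret_kind kind)
{
    /* longjmp turns a value of 0 into 1, so 0 cannot be used as a kind. */
    assert(kind != 0);
    longjmp(trampoline, kind);
}

/* Plays the role of the callout side: arm the trampoline, run the callback,
   and report which kind came back through longjmp. */
static int run_callback(enum ret_kind kind)
{
    switch (setjmp(trampoline)) {
    case 0:                      /* first pass: trampoline armed, run the callback */
        return_through_callback(kind);
        return 0;                /* not reached */
    case RET_WORD:
        return RET_WORD;
    case RET_DOUBLE:
        return RET_DOUBLE;
    default:
        return -1;               /* unexpected value */
    }
}
```

Calling `run_callback(RET_WORD)` returns `RET_WORD`: control re-enters the `switch` a second time with the value given to longjmp. The sketch says nothing about why the second exit fails in the VM; it only illustrates what both exits are expected to do.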

So… from here I’m a bit lost… I need some help :)

thanks,
Esteban













Re: latest changes (no idea from when it started) are making FFI win32 crash (and FFI Callbacks are not reliable anymore either)

Nicolas Cellier
In reply to this post by Andres Valloud-4
 
Hi Andres
"It would be worth writing a unit test case" meant exactly that: write a few lines of C.

I don't know what you call specification here, probably our expectation?
The behavior we expect, though it sounds reasonable, is not specified by any standard I know of.
It's probably unspecified and at best implementation-defined.
That's why I suggest asking about the gcc implementation.

Both i686-w64-mingw32-gcc and x86_64-w64-mingw32-gcc do reserve bytes on the stack below the alloca'ed block.
My finding is that this space depends on the maximum number of parameters of the functions called.
For example, calling fprintf(stdout,"\n") after alloca would reserve 8 additional bytes on i686 and 16 on x86_64.
Calling fprintf(stdout,"%d\n",x); would reserve 12 and 24 bytes respectively.

This does not happen with clang.
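
A probe along these lines could look like the sketch below. It is not the program I actually ran, just an illustration under stated assumptions: it relies on GCC/Clang extensions (`__builtin_alloca`, inline asm, `__attribute__((noinline))`), it only knows how to read the stack pointer on x86, x86_64 and aarch64, and the names `alloca_gap` and `sink` are made up. It measures how many bytes lie between the stack pointer and the address alloca returned, in a function that calls something else afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A call made after the alloca, so the compiler may reserve room for its
   outgoing arguments. Hypothetical helper; it just touches the block. */
__attribute__((noinline))
static void sink(void *p, size_t n, int tag)
{
    volatile char *c = p;
    if (n > 0)
        c[0] = (char)tag;
}

/* Returns (address returned by alloca) - (stack pointer right after alloca).
   A result of 0 means alloca returned the stack pointer itself. */
__attribute__((noinline))
static ptrdiff_t alloca_gap(size_t n)
{
    assert(n > 0);
    char *p = __builtin_alloca(n);
    uintptr_t sp;
    /* p is passed as an input so the read cannot be moved before the alloca. */
#if defined(__x86_64__)
    __asm__ volatile ("movq %%rsp, %0" : "=r"(sp) : "r"(p) : "memory");
#elif defined(__i386__)
    __asm__ volatile ("movl %%esp, %0" : "=r"(sp) : "r"(p) : "memory");
#elif defined(__aarch64__)
    __asm__ volatile ("mov %0, sp" : "=r"(sp) : "r"(p) : "memory");
#else
    sp = (uintptr_t)p;   /* unknown CPU: report no gap rather than guess */
#endif
    sink(p, n, 42);
    return (ptrdiff_t)((uintptr_t)p - sp);
}
```

Printing `alloca_gap(64)` from a small `main` with each compiler is enough to compare them. The value depends on the compiler, target and flags, so no particular number should be expected.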

2016-11-30 3:07 GMT+01:00 Andres Valloud <[hidden email]>:

To prove alloca() *lies*, one needs to show e.g. a 5-10 line C program, independent of anything else, exemplifying a clear specification violation.  Otherwise, how do you know the LIARS_LIARS_PANTS_ON_FIRE macros are not compensating for undefined behavior elsewhere?


Re: latest changes (no idea from when it started) are making FFI win32 crash (and FFI Callbacks are not reliable anymore either)

David T. Lewis
 
Checking a couple of man pages, on Linux we are warned that one should not
expect too much in the way of standards and specifications:


CONFORMING TO
       This function is not in POSIX.1-2001.

       There is evidence that the alloca() function appeared in 32V, PWB, PWB.2, 3BSD, and 4BSD.  There is a man page for it in 4.3BSD.  Linux
       uses the GNU version.


Noting from the above that alloca() seems to have originated with BSD, check
the man page on FreeBSD:


BUGS
     The alloca() function is machine and compiler dependent; its use is
     discouraged.

     The alloca() function is slightly unsafe because it cannot ensure that
     the pointer returned points to a valid and usable block of memory.  The
     allocation made may exceed the bounds of the stack, or even go further
     into other objects in memory, and alloca() cannot determine such an
     error.  Avoid alloca() with large unbounded allocations.

FreeBSD 6.2                    September 5, 2006                   FreeBSD 6.2


So it is not expected to be portable or well specified or well behaved, and
we should not be too surprised if implementation details vary on different
platforms and compilers.

Dave


On Thu, Dec 01, 2016 at 12:27:03AM +0100, Nicolas Cellier wrote:

>  
> Hi Andres
> "It would be worth writing a unit test case" meant exactly that: write
> a few lines of C.
>
> I don't know what you call specification here, probably our expectation?
> The behavior we expect, though it sounds reasonable, is not specified by
> any standard I know of.
> It's probably unspecified and at best implementation-defined.
> That's why I suggest asking about the gcc implementation.
>
> Both i686-w64-mingw32-gcc and x86_64-w64-mingw32-gcc do reserve bytes on
> the stack below the alloca'ed block.
> My finding is that this space depends on the maximum number of parameters
> of the functions called.
> For example, calling fprintf(stdout,"\n") after alloca would reserve 8
> additional bytes on i686 and 16 on x86_64.
> Calling fprintf(stdout,"%d\n",x); would reserve 12 and 24 bytes respectively.
>
> This does not happen with clang.
>
> 2016-11-30 3:07 GMT+01:00 Andres Valloud <[hidden email]>:
>
> >
> > To prove alloca() *lies*, one needs to show e.g. a 5-10 line C program,
> > independent of anything else, exemplifying a clear specification violation.
> > Otherwise, how do you know the LIARS_LIARS_PANTS_ON_FIRE macros are not
> > compensating for undefined behavior elsewhere?
> >
> > On 11/29/16 16:23 , Nicolas Cellier wrote:
> >
> >>
> >>
> >>
> >>
> >> Though, it's necessary to define ALLOCA_LIES_SO_USE_GETSP to zero to
> >> make FFI work with gcc.
> >> That does not mean that alloca does not lie, just that there is another
> >> problem with stack management...
> >>
> >> 2016-11-29 21:22 GMT+01:00 Nicolas Cellier
> >> <[hidden email]
> >> <mailto:[hidden email]>>:
> >>
> >>
> >>     Thanks Ronie and Esteban.
> >>     This seems to be an alignment problem indeed.
> >>     What I see is that alignment is defined in at least 3 different places:
> >>     - platforms/Cross/vm/sqCogStackAlignment.h
> >>     - platforms/Cross/plugins/IA32ABI/ia32abicc.c
> >>     - src/plugins/SqueakFFIPrims/IA32FFIPlugin.c and friends...
> >>     That's just too many different opinions!!! We have to unify that
> >>     rather than adding a 4th opinion in a Makefile.
> >>
> >>     However, about ALLOCA_LIES_SO_USE_GETSP, I'm not so sure that "It is
> >>     NOT the case of mingw."
> >>     Last time I used gdb, it WAS still the case, alloca was STILL lying.
> >>     See
> >>     http://lists.squeakfoundation.org/pipermail/vm-dev/2016-August/022985.html
> >>
> >>     BUT:
> >>     -----
> >>     forcing 16 bytes alignment supersedes the alloca hack, making it not
> >>     strictly necessary anymore
> >>     see below in generated src/plugins/IA32FFIPlugin.c:
> >>
> >>             allocation = alloca(((stackSize + ((calloutState->structReturnSize)))) + (cStackAlignment()));
> >>             if (allocaLiesSoUseGetsp()) {
> >>                     allocation = getsp();
> >>             }
> >>             if ((cStackAlignment()) != 0) {
> >>                     allocation = ((char *) ((((((usqInt)allocation)) | ((cStackAlignment()) - 1)) - ((cStackAlignment()) - 1))));
> >>             }
> >>             (calloutState->argVector = allocation);
> >>
> >>     but we further do:
> >>
> >>             if ((0 + (cStackAlignment())) > 0) {
> >>                     setsp((calloutState->argVector));
> >>             }
> >>
> >>     So if ever the stack pointer is greater than alloca return value,
> >>     but we removed the ALLOCA_LIES hack,
> >>     the stack pointer is then set back to alloca returned value,
> >>     avoiding the stack pointer offset problem
> >>     It would be worth writing  a unit test case, and inquiring the
> >>     reason why it lies in gcc mailing list to be sure...
> >>
> >>     cheers
> >>
> >>     2016-11-29 18:14 GMT+01:00 Esteban Lorenzano <[hidden email]
> >>     <mailto:[hidden email]>>:
> >>
> >>
> >>         hah!
> >>         you know what is the sad part of this? I wrote that message… it
> >>         was for the future me, but I forget to check our flags :P
> >>         I lost 2.5 days then + 2 days now.
> >>
> >>         this fixes the problem with Windows crashes (yay!) but not the
> >>         problem with callbacks (booo!)… any idea in that area?
> >>
> >>         cheers,
> >>         Esteban
> >>
> >>         On 29 Nov 2016, at 17:30, Ronie Salgado <[hidden email]
> >>>         <mailto:[hidden email]>> wrote:
> >>>
> >>>         The last week I was having this exactly same crash in the
> >>>         MinimalisticHeadless branch, with both MinGW and with Visual
> >>>         Studio. I managed to get the VM working with MinGW (not yet
> >>>         with MSVC) by using the following defines,which I copied from
> >>>         the old Pharo CMake scripts:
> >>>
> >>>         -DSTACK_ALIGN_BYTES=16 -DALLOCA_LIES_SO_USE_GETSP=0
> >>>
> >>>         In the pharo-vm, the CogFamilyWindowsConfig >>
> >>>         #commonCompilerFlags method starts with the following comment:
> >>>         commonCompilerFlags
> >>>             "omit -ggdb2 to prevent generating debug info"
> >>>             "Some flags explanation:
> >>>
> >>>             STACK_ALIGN_BYTES=16 is needed in mingw and FFI (and I
> >>>         suppose on other modules too).
> >>>             DALLOCA_LIES_SO_USE_GETSP=0 Some compilers return the
> >>>         stack address+4 on alloca function,
> >>>             then FFI module needs to adjust that. It is NOT the case
> >>>         of mingw.
> >>>             For more information see this thread:
> >>>         http://forum.world.st/There-are-something-fishy-with-FFI-plugin-td4584226.html
> >>>             "
> >>>
> >>>
> >>>         2016-11-29 9:32 GMT-03:00 Esteban Lorenzano
> >>>         <[hidden email] <mailto:[hidden email]>>:
> >>>
> >>>
> >>>
> >>>             On 29 Nov 2016, at 13:04, Clément Bera
> >>>>             <[hidden email] <mailto:[hidden email]>>
> >>>>             wrote:
> >>>>
> >>>>             Hi,
> >>>>
> >>>>             Can you confirm this bug happen only in Windows ?
> >>>>
> >>>
> >>>             yes, the crash is just in windows.
> >>>             the callback problem is general (note that
> >>>             FFICallbackTests works fine, but I think this is related
> >>>             to the fact that it never enters the 2nd condition with
> >>>             the qsort function) .
> >>>
> >>>
> >>>>             Do you have version number (both VMMaker and git commit)
> >>>>             of the last version you have that was working ?
> >>>>
> >>>
> >>>             sadly, not… I tried to get the latest working version, but
> >>>             with the mess I have to get the VM to build with
> >>>             opensmalltalk-vm, I couldn’t track it.
> >>>             I suspect is related to the work on 64bits for windows,
> >>>             but I have no proof of that :P
> >>>
> >>>             Esteban
> >>>
> >>>
> >>>>             Thanks.
> >>>>
> >>>>
> >>>>             On Tue, Nov 29, 2016 at 11:54 AM, Esteban Lorenzano
> >>>>             <[hidden email] <mailto:[hidden email]>> wrote:
> >>>>
> >>>>
> >>>>                 Hi,
> >>>>
> >>>>                 So, I’m building the PharoVM along with all his
> >>>>                 dependencies. For me, this is a major step because I
> >>>>                 can drop the old build process finally.
> >>>>                 Now, I’m having serious problems with FFI (that they
> >>>>                 were not present before), :
> >>>>
> >>>>
> >>>>                 1. CRASH IN WINDOWS (32bits):
> >>>>
> >>>>                 In Win32, it crashes automatically when trying to
> >>>>                 access this funtion:
> >>>>
> >>>>                 getEnvSize: nameString
> >>>>                         ^ self ffiCall: #( int GetEnvironmentVariableA ( String nameString, nil, 0 ) ) module: #Kernel32
> >>>>
> >>>>                  (this works perfectly fine in older versions)
> >>>>
> >>>>                 2. CALLBACKS FAILING:
> >>>>
> >>>>                 Callbacks have problems. The examples passes but they
> >>>>                 are very simple… as soon as I try to do something
> >>>>                 complicates (like unqlite bindings or libgit2
> >>>>                 bindings, who use callbacks intensively), callbacks
> >>>>                 stops working.
> >>>>                 I traced the problem up to this method:
> >>>>
> >>>>                 StackInterpreter>>#returnAs:ThroughCallback:Context:
> >>>>
> >>>>                 returnAs: returnTypeOop ThroughCallback: vmCallbackContext Context: callbackMethodContext
> >>>>                         "callbackMethodContext is an activation of invokeCallback:[stack:registers:jmpbuf:].
> >>>>                          Its sender is the VM's state prior to the callback.  Reestablish that state (via longjmp),
> >>>>                          and mark callbackMethodContext as dead."
> >>>>                         <export: true>
> >>>>                         <var: #vmCallbackContext type: #'VMCallbackContext *'>
> >>>>                         | calloutMethodContext theFP thePage |
> >>>>                         <var: #theFP type: #'char *'>
> >>>>                         <var: #thePage type: #'StackPage *'>
> >>>>                         ((self isIntegerObject: returnTypeOop)
> >>>>                          and: [self isLiveContext: callbackMethodContext]) ifFalse:
> >>>>                                 [^false].
> >>>>                         calloutMethodContext := self externalInstVar: SenderIndex ofContext: callbackMethodContext.
> >>>>                         (self isLiveContext: calloutMethodContext) ifFalse:
> >>>>                                 [^false].
> >>>>                         "We're about to leave this stack page; must save the current frame's instructionPointer."
> >>>>                         self push: instructionPointer.
> >>>>                         self externalWriteBackHeadFramePointers.
> >>>>                         "Mark callbackMethodContext as dead; the common case is that it is the current frame.
> >>>>                          We go the extra mile for the debugger."
> >>>>                         (self isSingleContext: callbackMethodContext)
> >>>>                                 ifTrue: [self markContextAsDead: callbackMethodContext]
> >>>>                                 ifFalse:
> >>>>                                         [theFP := self frameOfMarriedContext: callbackMethodContext.
> >>>>                                          framePointer = theFP "common case"
> >>>>                                                 ifTrue:
> >>>>                                                         [(self isBaseFrame: theFP)
> >>>>                                                                 ifTrue: [stackPages freeStackPage: stackPage]
> >>>>                                                                 ifFalse: "calloutMethodContext is immediately below on the same page.  Make it current."
> >>>>                                                                         [instructionPointer := (self frameCallerSavedIP: framePointer) asUnsignedInteger.
> >>>>                                                                          stackPointer := framePointer + (self frameStackedReceiverOffset: framePointer) + objectMemory wordSize.
> >>>>                                                                          framePointer := self frameCallerFP: framePointer.
> >>>>                                                                          self setMethod: (self frameMethodObject: framePointer).
> >>>>                                                                          self restoreCStackStateForCallbackContext: vmCallbackContext.
> >>>>                                                                          "N.B. siglongjmp is defines as _longjmp on non-win32 platforms.
> >>>>                                                                           This matches the use of _setjmp in ia32abicc.c."
> >>>>                                                                          self siglong: vmCallbackContext trampoline jmp: (self integerValueOf: returnTypeOop).
> >>>>                                                                          ^true]]
> >>>>                                                 ifFalse:
> >>>>                                                         [self externalDivorceFrame: theFP andContext: callbackMethodContext.
> >>>>                                                          self markContextAsDead: callbackMethodContext]].
> >>>>                         "Make the calloutMethodContext the active frame.  The case where calloutMethodContext
> >>>>                          is immediately below callbackMethodContext on the same page is handled above."
> >>>>                         (self isStillMarriedContext: calloutMethodContext)
> >>>>                                 ifTrue:
> >>>>                                         [theFP := self frameOfMarriedContext: calloutMethodContext.
> >>>>                                          thePage := stackPages stackPageFor: theFP.
> >>>>                                          "findSPOf:on: points to the word beneath the instructionPointer, but
> >>>>                                           there is no instructionPointer on the top frame of the current page."
> >>>>                                          self assert: thePage ~= stackPage.
> >>>>                                          stackPointer := (self findSPOf: theFP on: thePage) - objectMemory wordSize.
> >>>>                                          framePointer := theFP]
> >>>>                                 ifFalse:
> >>>>                                         [thePage := self makeBaseFrameFor: calloutMethodContext.
> >>>>                                          framePointer := thePage headFP.
> >>>>                                          stackPointer := thePage headSP].
> >>>>                         instructionPointer := self popStack.
> >>>>                         self setMethod: (objectMemory fetchPointer: MethodIndex ofObject: calloutMethodContext).
> >>>>                         self setStackPageAndLimit: thePage.
> >>>>                         self restoreCStackStateForCallbackContext: vmCallbackContext.
> >>>>                          "N.B. siglongjmp is defines as _longjmp on non-win32 platforms.
> >>>>                           This matches the use of _setjmp in ia32abicc.c."
> >>>>                         self siglong: vmCallbackContext trampoline jmp: (self integerValueOf: returnTypeOop).
> >>>>                         "NOTREACHED"
> >>>>                         ^true
> >>>>
> >>>>                 with the first siglongjmp callbacks are passing fine.
> >>>>                 with the last (it would be if  framePointer = theFP
> >>>>                 AND !(isBaseFrame: theFP) ) it doesn’t.
> >>>>
> >>>>                 So… from here I’m a bit lost… I need some help :)
> >>>>
> >>>>                 thanks,
> >>>>                 Esteban
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>
> >>
> >>
> >>
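
As a side note on the generated IA32FFIPlugin.c quoted above: the alignment step is a round-down of the allocation address to a multiple of cStackAlignment(), written as an OR followed by a subtraction. A minimal sketch of that arithmetic in isolation, assuming the alignment is a power of two (the function names align_down and align_down_mask are made up for illustration and are not part of the VM sources):

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to a multiple of alignment, using the same
   OR-then-subtract form as the quoted plugin code:
   (addr | (alignment - 1)) - (alignment - 1).
   Only valid when alignment is a power of two. */
uintptr_t align_down(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr | (alignment - 1)) - (alignment - 1);
}

/* The more commonly seen mask form; it gives the same result for
   power-of-two alignments. */
uintptr_t align_down_mask(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return addr & ~(alignment - 1);
}
```

With a 16-byte alignment, align_down(0x1237, 16) gives 0x1230, and an address that is already aligned is returned unchanged. Rounding down can move the pointer by up to alignment - 1 bytes, which is presumably why the quoted alloca() call asks for cStackAlignment() extra bytes.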

Re: latest changes (no idea from when it started) are making FFI win32 crash (and FFI Callbacks are not reliable anymore either)

Andres Valloud-4
 
Yes, precisely.

Side comment: my want of reliable VMs is stronger than my lack of
enthusiasm about spec lawyering.  Ideally the current state of affairs
wouldn't be what it is, with more spec than one can read in a lifetime.
Let's not contribute to the confusion.  It's hard work.  However, keep
in mind it's so nice and encouraging to find something that is well
done, that often one tends to copy the good pattern! :)

On 11/30/16 16:17 , David T. Lewis wrote:

>
> Checking a couple of man pages, on Linux we are warned that one should not
> expect too much in the way of standards and specifications:
>
>
> CONFORMING TO
>        This function is not in POSIX.1-2001.
>
>        There is evidence that the alloca() function appeared in 32V, PWB, PWB.2, 3BSD, and 4BSD.  There is a man page for it in 4.3BSD.  Linux
>        uses the GNU version.
>
>
> Noting from the above that alloca() seems to have originated with BSD, check
> the man page on FreeBSD:
>
>
> BUGS
>      The alloca() function is machine and compiler dependent; its use is
>      discouraged.
>
>      The alloca() function is slightly unsafe because it cannot ensure that
>      the pointer returned points to a valid and usable block of memory.  The
>      allocation made may exceed the bounds of the stack, or even go further
>      into other objects in memory, and alloca() cannot determine such an
>      error.  Avoid alloca() with large unbounded allocations.
>
> FreeBSD 6.2                    September 5, 2006                   FreeBSD 6.2
>
>
> So it is not expected to be portable or well specified or well behaved, and
> we should not be too surprised if implementation details vary on different
> platforms and compilers.
>
> Dave
>