[VM-dev] Terminated process with zero pc


Denis Kudriashov
 
Hi.

I have encountered some interesting behavior in process termination. I can't reproduce it locally, but a test fails consistently on Pharo CI (as part of the full test suite) under the following conditions:

ended := false.
process := [ Processor activeProcess suspend. ended := true] 
                       forkAt: Processor activePriority + 1.
self assert: process isSuspended description: 'should be suspended'.

process resume.

self assert: ended description: 'last statement is done'.
self assert: process suspendedContext pc equals: 0

When I run it locally, the pc is always greater than #startpc.
On CI the test very rarely behaves the same way. The process always finishes (ended is true), but in most runs the pc of the #terminate context is somehow reset to zero (probably related to the order of the overall test suite).
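
To compare the two environments directly, the values in question can be printed rather than asserted. The following is only a minimal sketch, assuming it is evaluated in a Playground, that Context responds to #startpc, #pc and #method, and that #suspendedContext may be nil in some image versions (hence the guard). It is not part of the failing test.

```smalltalk
| ended process context |
ended := false.
"Same setup as the test: the forked process suspends itself first."
process := [ Processor activeProcess suspend. ended := true ]
	forkAt: Processor activePriority + 1.
process resume.
"The process runs at a higher priority, so it has finished by now."
context := process suspendedContext.
context
	ifNil: [ Transcript show: 'no suspended context'; cr ]
	ifNotNil: [ :ctx |
		Transcript
			show: 'method:  ', ctx method printString; cr;
			show: 'pc:      ', ctx pc printString; cr;
			show: 'startpc: ', ctx startpc printString; cr ].
```

Running this both in a full image and in the image used on CI should show whether the zero pc belongs to the #terminate context or to some other method.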

In Pharo, a process that completes successfully ends here:

terminate
	...........
	self isActiveProcess ifTrue: [
		thisContext terminateTo: nil.
		self suspend ]

It seems that Context>>#terminateTo: can reset the pc to zero under some conditions (JIT-related?).

So I wonder what the explanation could be.

I would like to understand this behavior before I propose a change for Pharo (the #isTerminate implementation needs to be fixed).

Best regards,
Denis.


Re: [VM-dev] Terminated process with zero pc

Eliot Miranda-2
 
Hi Denis,

On May 5, 2020, at 12:39 PM, Denis Kudriashov <[hidden email]> wrote:



> It seems that Context>>#terminateTo: can reset the pc to zero under some condition (jitter related?).

The JIT maps machine-code PCs to bytecode PCs whenever the pc of a context that is running JITted code is accessed. If the mapping fails, 0 may be returned. But that is evidence of a bug in the mapping computation.

> So I wonder what could be the explanation?
>
> I would like to understand this behavior before I will propose the change for Pharo (#isTerminate implementation needs to be fixed).

There are exhaustive tests in VMMaker.oscog which check that all pairs of mappable PCs map correctly. Those tests could be run against all methods in Pharo. The tests use in-image compilation to produce the JITted methods.

But if there is a bug such that either the tests are deficient or the VM is somehow suspending at a non-suspension point, the tests will not show the problem.

I would work on trying to reproduce the problem, especially in an assert-enabled VM run from the command line. Then you will see issues in the mapping algorithm well before it fails and answers a zero pc.




Re: [VM-dev] Terminated process with zero pc

Denis Kudriashov
 
Thanks Eliot.
That is interesting.


Re: [VM-dev] Terminated process with zero pc

Eliot Miranda-2
 
Hi Denis,


On May 5, 2020, at 1:03 PM, Denis Kudriashov <[hidden email]> wrote:


> Thanks Eliot.
> That is interesting.

At Cadence we always ran the full test suite on an assert-enabled VM, and my boss insisted that this VM report no assertion failures when running the test suite. We caught two or three bugs this way, at least one of them with pic mapping (Bob, do you have a better recollection?). It is well worth the effort.



Re: [VM-dev] Terminated process with zero pc

Denis Kudriashov
In reply to this post by Eliot Miranda-2
 

On Tue, May 5, 2020 at 20:52, Eliot Miranda <[hidden email]> wrote:
 

> The JIT maps machine code PCs to bytecode PCs whenever the pc is accessed of a context which is running JITted code.  If the mapping fails then 0 may be returned.  But this is evidence of a bug in the mapping computation.

And in that case the Process>>terminate context would be a JITted method context. In Pharo that method is quite monstrous.
I will try to rerun my test with a smaller, refactored version.

Re: [VM-dev] Terminated process with zero pc

Denis Kudriashov
 
Hi Eliot.

I hope you did not spend much time on it. But if you did, sorry for that.
I found that my issue is related to the Pharo bootstrap process. On CI we run some kernel tests on a freshly bootstrapped image to validate its correctness. The code for this image is loaded from prebuilt binary files (containing precompiled methods). Nothing is recompiled there, so any change to the related part of the system requires a special update of those files.

I am not sure what exactly happens in this tiny image. But I am quite confident that it is not a VM issue.

Anyway, thanks for your help.



Re: [VM-dev] Terminated process with zero pc

Eliot Miranda-2
 
Hi Denis,

On Mon, May 11, 2020 at 2:34 PM Denis Kudriashov <[hidden email]> wrote:
 

> I am not sure what exactly happens in this tiny image. But I am quite confident that it is not a VM issue.

:-)  That's a relief!!  Good to hear.

> Anyway thanks for your help

You are most welcome.  Glad to help.



--
_,,,^..^,,,_
best, Eliot

Re: [VM-dev] Terminated process with zero pc

Eliot Miranda-2
In reply to this post by Denis Kudriashov
 


On Tue, May 5, 2020 at 1:12 PM Denis Kudriashov <[hidden email]> wrote:
 


> And in that case Process>>terminate would be a jitted method context. In Pharo it is quite monstrous:

SQUEAK IS NO BETTER.  THERE BE DRAGONS THERE.

> I will try to rerun my test with smaller refactored version.

That would be good to see. Please send any refactoring to us here in Squeakland; I'm sure we'd like to integrate it.

_,,,^..^,,,_
best, Eliot