unexpected semantics of #doTransaction: with multithreading (used in WAGemStoneServiceTask)

Johan Brichau-2
Hi folks, Dale,

I have been chasing down regular occurrences of 'blocked' service VMs (i.e. they were no longer processing tasks and had to be restarted). I ultimately tracked it down to (for me) unexpected semantics of the GRGemStonePlatform>>doTransaction: implementation when multiple processes run in the same VM (as is the case in the service VM).

In a WAGemStoneServiceTaskVM, up to 100 concurrent WAGemStoneServiceTask instances can be executed at the same time. Each of these processes (running in the same service VM) starts transactions using the #doTransaction: method, which is implemented as follows:

        System inTransaction
                ifTrue: [ "We are already in a transaction, so just evaluate the block"
                        aBlock value.
                        ^true].
        self transactionMutex critical: [
                "Get the transactionMutex, and perform the transaction."
                [
                        self doBeginTransaction.
                        aBlock value.
                ] ensure: [
                        ^self doCommitTransaction]].


If the VM is already in a tx, the task block is executed right away and the method always returns true. Only when it is not do we wait for the tx-mutex and execute a transaction.
So it seems that nested calls of #doTransaction: should always work, but that is not true when multiple processes are running.

The way we use the service tasks, their task blocks also call #doTransaction: _and_ make external (http socket) calls while inside that transaction block.
As a result, the scheduler interleaves processes while they are executing their #doTransaction: blocks, so one process can abort or commit the partial result of another, etc.
Some of the executed blocks get committed and some do not, purely at random, depending on how the scheduler happened to interleave the blocks.
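
To make the interleaving concrete, here is a minimal sketch of two processes in one service VM calling the original #doTransaction:. It assumes the platform is reached through `GRPlatform current` and that the scheduler switches to the second process while the first is blocked on the socket. `fetchRemoteData` and `store:` are hypothetical placeholders for the HTTP call and the persistent update; they are not part of Grease or Seaside.

```smalltalk
"Sketch only, against the ORIGINAL #doTransaction:.
 #fetchRemoteData and #store: are hypothetical placeholders."
| platform |
platform := GRPlatform current.

"Process A: not in a transaction yet, so it takes the mutex and begins one.
 The socket call gives the scheduler a chance to switch processes."
[ platform doTransaction: [
        | data |
        data := self fetchRemoteData.
        self store: data ] ] fork.

"Process B: assumed to run while A waits on the socket.
 System inTransaction now answers true, so B skips the mutex,
 evaluates its block inside A's transaction and answers true
 without committing anything itself."
[ platform doTransaction: [ self store: #fromB ] ] fork.
```

In this sketch B's update is not committed by B at all: it rides along with whatever A does next, even though B's call has already answered true.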

I therefore changed the implementation of #doTransaction: as follows: you always need to acquire the tx-mutex, whether you are already in a tx or not, because otherwise you might be interfering with another process's tx-block.
This means tx-blocks are mutually exclusive across all processes, which I think they were meant to be?


        self transactionMutex critical: [
                "Get the transactionMutex, and perform the transaction."
                System inTransaction
                        ifTrue: [ "We already are in a transaction, so just evaluate the block"
                                        aBlock value.
                                        ^ true]
                        ifFalse:[
                                self doBeginTransaction.
                                [aBlock value] ensure: [ ^self doCommitTransaction]]]


What are your thoughts?
thx
Johan

Re: unexpected semantics of #doTransaction: with multithreading (used in WAGemStoneServiceTask)

Dale Henrichs
Johan,

Your changes are correct ... the only time it makes sense to perform aBlock without acquiring the mutex when already in a transaction is when you are in a _process_ that has already acquired the mutex, and that case is handled by the #critical: implementation of TransientRecursionLock ...

Good catch!

Dale
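
As an illustration of the nested case Dale describes, the following sketch shows a call to #doTransaction: made from inside another one in the same process, under the revised implementation. It assumes the platform is reached through `GRPlatform current`; `updateOrder` and `updateStock` are hypothetical placeholders.

```smalltalk
"Sketch only, against the REVISED #doTransaction:.
 #updateOrder and #updateStock are hypothetical placeholders."
GRPlatform current doTransaction: [
        "Outer call: this process acquires transactionMutex
         and begins the transaction."
        self updateOrder.
        GRPlatform current doTransaction: [
                "Inner call, same process: the recursion lock lets its
                 owner re-enter #critical: without blocking.
                 System inTransaction answers true, so this block is
                 just evaluated and the inner call answers true."
                self updateStock ] ]
"The commit happens once, when the outer call's ensure: block runs."
```

A different process making the inner call would instead wait at #critical: until the outer call has committed and released the mutex.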

----- Original Message -----
| From: "Johan Brichau" <[hidden email]>
| To: "GemStone Seaside beta discussion" <[hidden email]>
| Sent: Monday, February 11, 2013 7:24:14 AM
| Subject: [GS/SS Beta] unexpected semantics of #doTransaction: with multithreading (used in WAGemStoneServiceTask)

Re: unexpected semantics of #doTransaction: with multithreading (used in WAGemStoneServiceTask)

Johan Brichau-2
Hi Dale,

Good to know ;-)
I noticed a typo in the code I posted. This is the correct one:

        self transactionMutex critical: [
                "Get the transactionMutex, and perform the transaction."
                System inTransaction
                        ifTrue: [ "We already are in a transaction, so just evaluate the block"
                                        aBlock value.
                                        ^ true]
                        ifFalse:[
                                        [
                                                self doBeginTransaction.
                                                aBlock value
                                        ] ensure: [ ^self doCommitTransaction]]]
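
For callers, the value answered by #doTransaction: matters: the nested branch always answers true, while the outermost call answers whatever #doCommitTransaction answers. A minimal usage sketch follows, assuming #doCommitTransaction answers a Boolean (true on a successful commit) and that the platform is reached through `GRPlatform current`. `saveResult` and `retryLater` are hypothetical placeholders.

```smalltalk
"Sketch only. #saveResult and #retryLater are hypothetical placeholders;
 a Boolean answer from #doCommitTransaction is an assumption."
| committed |
committed := GRPlatform current doTransaction: [ self saveResult ].
committed ifFalse: [ self retryLater ]
```

When this runs inside an enclosing transaction the answer is always true, so it says nothing about whether the eventual commit succeeds.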


On 11 Feb 2013, at 18:39, Dale Henrichs wrote:

> Johan,
>
> Your changes are correct ... the only time that it makes sense to perform aBlock when already in transaction without acquiring the mutex is when you are in a _process_ that has already acquired the mutex and that is handled in the #critical: implementation of TransientRecursionLock ...
>
> Good catch!
>
> Dale

Re: unexpected semantics of #doTransaction: with multithreading (used in WAGemStoneServiceTask)

Paul DeBruicker
It appears that Johan's changes haven't been adopted in Grease for GemStone. Are these changes meant only for transactions in the ServiceVM? From reading the thread, it seems you both agreed it was a good change. I'm just not sure where to put it in my stone.

Thanks

Paul






Re: [Glass] unexpected semantics of #doTransaction: with multithreading (used in WAGemStoneServiceTask)

Johan Brichau-3
Ow, I did not realise this was part of Grease. I always thought it was some part of the GLASS codebase.

This code has been running in production with us for almost a year now, and all troubles with the service vm have stopped.
So yes, let's include it.

Actually, Grease 1.1 should be ported properly, and I will do that once I get to pick up the Seaside 3.1 work for GemStone.

Johan

On 29 Jan 2014, at 19:37, Paul DeBruicker <[hidden email]> wrote:

> It appears that Johan's changes haven't been adopted in Grease for GemStone.
> Are these changes just for those transactions in the ServiceVM?   From
> reading the thread it seems like you both agreed it was a good change.
> I'm just not sure where to put it in my stone.
>
> Thanks
>
> Paul
> --
> View this message in context: http://forum.world.st/unexpected-semantics-of-doTransaction-with-multithreading-used-in-WAGemStoneServiceTask-tp4669267p4740174.html
> Sent from the GLASS mailing list archive at Nabble.com.

_______________________________________________
Glass mailing list
[hidden email]
http://lists.gemtalksystems.com/mailman/listinfo/glass