Hi folks, Dale,
I have been chasing down regular occurrences of 'blocked' service VMs (i.e. they were no longer processing tasks and had to be restarted). I ultimately tracked it down to (for me) unexpected semantics of the GRGemStonePlatform>>doTransaction: implementation in the context of multiple processes in the same vm (as is the case in the service vm).

In a WAGemStoneServiceTaskVM, up to 100 concurrent WAGemStoneServiceTask instances can be executing at the same time. Each of these processes (running in the same service vm) starts transactions using the #doTransaction: method, which is implemented as follows:

    System inTransaction
        ifTrue: [ "We are already in a transaction, so just evaluate the block"
            aBlock value.
            ^true].
    self transactionMutex critical: [
        "Get the transactionMutex, and perform the transaction."
        [
            self doBeginTransaction.
            aBlock value.
        ] ensure: [
            ^self doCommitTransaction]].

If the VM is already in a transaction, the task block is executed right away and the method always returns true. Only if it is not do we wait for the tx-mutex and execute a transaction. So it seems that nested calls of #doTransaction: should always work, but that is not true when multiple processes are running.

The way we are using the service tasks is that inside their task block, they also use the #doTransaction: method _and_ they are making external (http socket) calls while inside that transaction block. The result is that the scheduler will interleave processes while they are executing their #doTransaction: blocks. So one process can abort or commit the partial results of another process, and so on. Some of the executed blocks get committed, some do not, essentially at random, depending on how the scheduler interleaves the blocks.

I therefore changed the implementation of #doTransaction: as follows: you always need to acquire the tx-mutex, whether you are already in a transaction or not, because otherwise you might be interfering with another process's tx-block. This means tx-blocks are mutually exclusive for all processes, which I think is what they were meant to be?

    self transactionMutex critical: [
        "Get the transactionMutex, and perform the transaction."
        System inTransaction
            ifTrue: [ "We are already in a transaction, so just evaluate the block"
                aBlock value.
                ^ true]
            ifFalse: [
                self doBeginTransaction.
                [aBlock value] ensure: [ ^self doCommitTransaction]]]

What are your thoughts?
thx
Johan
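To make the interleaving concrete, here is a minimal workspace sketch (not from the original post) of the schedule described above, assuming a gem running in manual transaction mode like the service vm. GRPlatform current and #doTransaction: are the Grease entry points under discussion; the OrderLog class and the Delay standing in for the blocking http call are made up purely for illustration:

    | platform |
    platform := GRPlatform current.
    "Process 1: not yet in a transaction, so it acquires the mutex and begins one,
     then blocks on I/O, letting the scheduler switch to another process."
    [ platform doTransaction: [
        OrderLog current add: 'first half of process 1''s work'.
        (Delay forSeconds: 2) wait.   "stands in for the external http call"
        OrderLog current add: 'second half of process 1''s work' ] ] fork.
    (Delay forSeconds: 1) wait.
    "Process 2: System inTransaction is already true, so with the original
     implementation its block runs outside the mutex, inside process 1's
     still-open transaction; whatever it modifies is committed (or aborted)
     together with process 1's half-finished work."
    [ platform doTransaction: [
        OrderLog current add: 'process 2''s work' ] ] fork.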
Johan,
Your changes are correct ... the only time that it makes sense to perform aBlock when already in transaction without acquiring the mutex is when you are in a _process_ that has already acquired the mutex, and that case is handled in the #critical: implementation of TransientRecursionLock ...

Good catch!

Dale
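For readers unfamiliar with the recursion lock Dale mentions, here is a minimal sketch of what a re-entrant #critical: typically looks like. This is an illustration only, not the actual TransientRecursionLock source; the instance variables lockOwner and semaphore (created with one signal available) are assumptions:

    critical: aBlock
        "If the active process already owns the lock, just run the block;
         otherwise wait for the semaphore, record ownership, and release on the way out."
        lockOwner == Processor activeProcess
            ifTrue: [ ^ aBlock value ].
        semaphore wait.
        lockOwner := Processor activeProcess.
        ^ [ aBlock value ] ensure: [
            lockOwner := nil.
            semaphore signal ]

Because #critical: re-enters for the owning process, a nested #doTransaction: call issued from inside the transaction block does not deadlock under Johan's change: it re-acquires the lock, sees System inTransaction, and simply evaluates its block.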
Hi Dale,
Good to know ;-)
I noticed a typo in the code I posted. This is the correct one:

    self transactionMutex critical: [
        "Get the transactionMutex, and perform the transaction."
        System inTransaction
            ifTrue: [ "We are already in a transaction, so just evaluate the block"
                aBlock value.
                ^ true]
            ifFalse: [
                [
                    self doBeginTransaction.
                    aBlock value
                ] ensure: [ ^self doCommitTransaction]]]
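As a hedged usage sketch (the MyReport class and its methods are invented for illustration), this is why the corrected version still supports nesting from a single process:

    "The outer call acquires the recursion lock and begins the transaction;
     the inner call re-enters the lock, sees System inTransaction,
     and simply evaluates its block without beginning or committing anything."
    GRPlatform current doTransaction: [
        MyReport new buildHeader.
        GRPlatform current doTransaction: [
            MyReport new buildBody ] ]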
It appears that Johan's changes haven't been adopted in Grease for GemStone. Are these changes just for those transactions in the ServiceVM? From reading the thread it seems like you both agreed it was a good change. I'm just not sure where to put it in my stone.
Thanks Paul
Ow, I did not realise this was part of Grease. I always thought it was some part of the GLASS codebase.
This code has been running in production with us for almost a year now, and all troubles with the service vm have stopped. So yes, let's include it. Actually, Grease 1.1 should be ported properly anyway, and I will do that once I get to pick up the Seaside 3.1 work for GemStone.

Johan