[Glass] Nginx , FastCGI and GLASS problem

Re: [Glass] Nginx , FastCGI and GLASS problem

Mariano Martinez Peck



On Thu, Dec 5, 2013 at 10:30 PM, Dale K. Henrichs <[hidden email]> wrote:
Johan and Mariano,

For 3.1, I think the following should work in the startSeaside30_Adaptor script. NOTE: I've used `true ifTrue: [] ifFalse: []` to enclose the new-style handler. In addition to handling (and logging) the SigAbort, it installs a handler for the RepositoryViewLost notification that dumps a stack, so we can get more info about what the gem is doing when the lost OT is issued:

true ifTrue: [.
Transcript cr; show: 'New style SigAbort handler'.
TransactionBacklog
  addDefaultHandler: [ :ex |.
    "Run the abort in a lowPriority process, since we must acquire the
     transactionMutex."
    Transcript cr; show: 'handled sigabort: ', DateAndTime now printString.
    [
      GRPlatform current transactionMutex
        critical: [ GRPlatform current doAbortTransaction ].
      TransactionBacklog enableSignalling ]
        forkAt: Processor lowestPriority.
      ex resume ].
 RepositoryViewLost
   addDefaultHandler: [ :ex |.
     GRPlatform current logError: ex description title: 'Lost OT' shouldCommit: false.
     ex pass ].
TransactionBacklog enableSignalling.
] ifFalse: [
Transcript cr; show: 'Old style SigAbort handler'.
Exception.
  installStaticException:.
    [:ex :cat :num :args |
      "Run the abort in a lowPriority process, since we must acquire the
       transactionMutex."
      Transcript cr; show: 'handled sigabort: ', DateAndTime now printString.
      [
        GRPlatform current transactionMutex.
          critical: [.
            GRPlatform current doAbortTransaction ].
        System enableSignaledAbortError.
      ] forkAt: Processor lowestPriority.
    ]
  category: GemStoneError
  number: 6009
  subtype: nil.
System enableSignaledAbortError.
].



Thanks Dale, I have updated my code with that. But notice that your script is full of dots where they should not be (maybe a weird copy/paste from your email client)...
Here is the version that works:



true ifTrue: [
Transcript cr; show: 'New style SigAbort handler'.
TransactionBacklog
  addDefaultHandler: [ :ex |
    "Run the abort in a lowPriority process, since we must acquire the
     transactionMutex."
    Transcript cr; show: 'handled sigabort: ', DateAndTime now printString.
    [
      GRPlatform current transactionMutex
        critical: [ GRPlatform current doAbortTransaction ].
      TransactionBacklog enableSignalling ]
        forkAt: Processor lowestPriority.
      ex resume ].
 RepositoryViewLost
   addDefaultHandler: [ :ex |
     GRPlatform current logError: ex description title: 'Lost OT' shouldCommit: false.
     ex pass ].
TransactionBacklog enableSignalling.
] ifFalse: [
Transcript cr; show: 'Old style SigAbort handler'.
Exception
  installStaticException:
    [:ex :cat :num :args |
      "Run the abort in a lowPriority process, since we must acquire the
       transactionMutex."
      Transcript cr; show: 'handled sigabort: ', DateAndTime now printString.
      [
        GRPlatform current transactionMutex
          critical: [
            GRPlatform current doAbortTransaction ].
        System enableSignaledAbortError.
      ] forkAt: Processor lowestPriority.
    ]
  category: GemStoneError
  number: 6009
  subtype: nil.
System enableSignaledAbortError.
].



 
If this is happening in 2.4 (Johan), I'll have to do a little more research to see if there is an exception equivalent to RepositoryViewLost signalled.

Dale

From: "Johan Brichau" <[hidden email]>
To: "Mariano Martinez Peck" <[hidden email]>
Cc: [hidden email]
Sent: Thursday, December 5, 2013 9:09:46 AM
Subject: Re: [Glass] Nginx , FastCGI and GLASS problem

And were you being blocked outside of a transaction or not? Because that is what the error says. I am getting this from time to time and still looking for a reason. 


Johan (sent from my mobile)

On 05 Dec 2013, at 17:48, Mariano Martinez Peck <[hidden email]> wrote:




On Thu, Dec 5, 2013 at 1:43 PM, Johan Brichau <[hidden email]> wrote:
Mariano,

The logs you show look very similar to what I see in our gem logs when a gem stops responding.
When it occurs, can you look at the object with the mentioned oop in the write-write conflicts:

> Write-Write Conflicts...
>     383168257
> Write-Dependency Conflicts...

Inspect this:

        Object _objectForOop: 383168257

In my case this always opens on a dictionary with the #cacheTimeout property in it.
Is this also true in your case?
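
For a session without a GUI inspector, the same lookup can be sketched as below. It uses only `_objectForOop:` from the message above plus standard printing; the oop is the one from the log excerpt and will differ on your system.

```smalltalk
"Look up the conflicting object by the oop reported in the gem log."
| obj |
obj := Object _objectForOop: 383168257.
Transcript
  cr; show: 'class: ', obj class name asString;
  cr; show: 'value: ', obj printString.
```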


Thanks Johan for the advice. Unfortunately, I restarted everything, made some changes, etc., so that OOP doesn't exist anymore :(

But I will keep it in case I need it again.

BTW, in my case I discovered the problem: reading from /dev/random blocks, and when you run the OS in a virtual machine (in my case, a VirtualBox VM) it is likely to block for some time. There is a lot of info about this on the internet.
So at least that was my problem.
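
One way to check for this from inside a gem, as a rough sketch: it assumes a Linux host (the `/proc` path below is Linux-specific) and that `System performOnServer:` is available in your GemStone version to run a shell command on the gem's host. A low number suggests that reads from /dev/random may block.

```smalltalk
"Assumption: Linux host; the kernel reports its available entropy in this file."
| entropy |
entropy := System performOnServer: 'cat /proc/sys/kernel/random/entropy_avail'.
Transcript cr; show: 'entropy_avail: ', entropy.
```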

Thanks,

--
Mariano
http://marianopeck.wordpress.com

_______________________________________________
Glass mailing list
[hidden email]
http://lists.gemtalksystems.com/mailman/listinfo/glass





Re: [Glass] Nginx , FastCGI and GLASS problem

Dale Henrichs-3
In reply to this post by Mariano Martinez Peck



From: "Mariano Martinez Peck" <[hidden email]>
To: "Johan Brichau" <[hidden email]>
Cc: [hidden email]
Sent: Thursday, December 5, 2013 5:27:04 PM
Subject: Re: [Glass] Nginx , FastCGI and GLASS problem




On Thu, Dec 5, 2013 at 2:09 PM, Johan Brichau <[hidden email]> wrote:
And were you being blocked outside of a transaction or not?

Sorry Johan, I didn't understand the question. 
But yes, I think I was being blocked outside the transaction.
I guess the fundamental question is why you are out of transaction. In the normal Seaside case the gem is sitting on a socket accept and is outside of transaction while waiting for a connection ... when the stone decides that a session needs to abort, it signals the sigabort, and the little background process is supposed to wake up and allow the sigabort to be processed ...

In Johan's case I suspect that the gem process may be slow to respond on a loaded system, but that is only a guess ...

In your case, it seems that you've described a scenario where you are getting this error while doing processing ... if that is truly the case, then we need to understand how you got "outside of transaction" ... two possibilities come to mind: forking a process while doing HTTP request processing, or doing an explicit abort/commit while doing HTTP request processing ...
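
To narrow down which of these is happening, one option is to log the transaction state at suspect points in the request handling code. A minimal sketch, assuming `System inTransaction` is available in your GemStone version; where the line goes is a hypothetical choice, for example before and after any fork or explicit commit/abort.

```smalltalk
"Hypothetical diagnostic line: record whether the session is in a transaction right now."
Transcript cr; show: 'inTransaction: ', System inTransaction printString,
  ' at ', DateAndTime now printString.
```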

 
Because that is what the error says. I am getting this from time to time and still looking for a reason. 



Re: [Glass] Nginx , FastCGI and GLASS problem

Dale Henrichs-3
In reply to this post by Mariano Martinez Peck
Ah, the only thing I can think of is that vi likes to display dots for trailing blank spaces and those puppies got picked up by my copy ... sorry about that ...

Dale


From: "Mariano Martinez Peck" <[hidden email]>
To: "Dale K. Henrichs" <[hidden email]>
Cc: "Johan Brichau" <[hidden email]>, [hidden email]
Sent: Friday, December 6, 2013 5:52:38 AM
Subject: Re: [Glass] Nginx , FastCGI and GLASS problem





Re: [Glass] Nginx , FastCGI and GLASS problem

Johan Brichau-3
In reply to this post by Dale Henrichs-3

On 06 Dec 2013, at 02:30, Dale K. Henrichs <[hidden email]> wrote:

> If this is happening in 2.4 (Johan), I'll have to do a little more research to see if there is an exception equivalent to RepositoryViewLost signalled.

Yes, those things happen in our "busy" repositories, which are on 2.4.
If there is an equivalent, I would be glad to install that handler.

I also suspect that it occurs on a loaded system. However, statmonitor has shown me that the Tx state of a gem was 'in transaction' even when this 'termination from stone' occurred.
I will try to retrieve that statmonitor file again...

Johan

Re: [Glass] Nginx , FastCGI and GLASS problem

Dale Henrichs-3
Johan,

I'll put together a 2.4 script later today ... it is possible that on a loaded system the sigabort gets sent ... is not responded to ... the stone shrugs and sends lost OT ... the gem wakes up, aborts, starts handling a request, and receives the lost OT ...

I haven't analyzed statmons for this kind of issue before, but do you see a stat for when the sigabort was sent from the stone? There might be something in the stone log as well ...

Dale

----- Original Message -----
| From: "Johan Brichau" <[hidden email]>
| To: "Dale K. Henrichs" <[hidden email]>
| Cc: [hidden email], "Mariano Martinez Peck" <[hidden email]>
| Sent: Friday, December 6, 2013 8:10:01 AM
| Subject: Re: [Glass] Nginx , FastCGI and GLASS problem
|
|
| On 06 Dec 2013, at 02:30, Dale K. Henrichs
| <[hidden email]> wrote:
|
| > If this is happening in 2.4 (Johan), I'll have to do a little more
| > research to see if there is an exception equivalent to
| > RepositoryViewLost signalled.
|
| Yes, those things happen in our "busy" repositories, which are on
| 2.4.
| If there is an equivalent, I would be glad to install that handler.
|
| I also suspect that it occurs on a loaded system. However,
| statmonitor has shown me that the Tx state of a gem was 'in
| transaction' even when this 'termination from stone' occurred.
| I will try to retrieve that statmonitor file again...
|
| Johan