Smalltalk › Gemtalk › GLASS

Random Internal Server Error

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

2 messages Options

keropotter

Random Internal Server Error

Hi!!

This time i need to ask about a random "internal server error" that occurs on our application. Looking at the fastCGI logs (we have 3 fastCGI running) we get this kind of lines when the error occurs:

Write-Dependency Conflicts...
Write-ReadLock Conflicts...
Write-WriteLock Conflicts...
Rc-Write-Write Conflicts...
Synchronized-Commit Conflicts...
----------- Commit failure - retrying LOG ENTRY: aSymbolDictionary-----------
----------- Unreportable ERROR Encountered: 2012-04-19T11:40:51.53651309013367-07:00UserDefinedError: send(69,0x2b7e586c3e10,117,0) failed with errno=107,ENOTCONN, Socket is not connected-----------
failure
Read-Write Conflicts...
Write-Write Conflicts...
16975315969
16975317249
16975319297
16975674881
16975675137
16975675393
16975675649
16975675905
16976480513
16976484353
16976484609
16976484865
16976485121
16976485377
16976485889
16976486145
16976486401
16976486657
16976486913
16979452673
16984274177
Write-Dependency Conflicts...
Write-ReadLock Conflicts...
Write-WriteLock Conflicts...
Rc-Write-Write Conflicts...
Synchronized-Commit Conflicts...
----------- Commit failure - retrying LOG ENTRY: aSymbolDictionary-----------
----------- Unreportable ERROR Encountered: 2012-04-19T11:41:05.54662108421326-07:00UserDefinedError: send(70,0x2b7e587f2688,117,0) failed with errno=107,ENOTCONN, Socket is not connected-----------
----------- Unreportable ERROR Encountered: 2012-04-19T16:16:42.11317491531372-07:00UserDefinedError: send(69,0x2b7e5874a2e0,94,0) failed with errno=107,ENOTCONN, Socket is not connected-----------

Does anybody know how to solve this issue or why it happens? It's absolutely random. Sometimes you just click on a link that requires almost no processing on the server and it crashes like that.

We're working on Gemstone 3.0.1 64bits, Seaside 3.0.

Any hints? We're about to go live with the project and this should not happen on a production environment.

Thanks in advance!!
Alejandro

Dale Henrichs

Re: Random Internal Server Error

Alejandro,

You have two different errors showing up in the log:

Commit failure
ENOTCONN

The "Commit failure" is not quite as bad as it looks. The system is actually working as expected:

When you get a commit failure while processing a transaction, the system aborts
and retries the original HTTP request. If you get 10 retry failures, you'll get a
"Too many retries" log entry.

Since you have reported any "Too many retries" messages from your log, then it is likely that the commit failures are affecting the user's experience.

With that said, I agree that you should dig into this a bit deeper ... under load a commit conflict like this may turn into a 'real problem'.

If you look at GRGemstonePlatfrom>>seasideProcessRequestWtihRetry:resultBlock: you'll see the code that is logging the commit conflict:

self doCommitTransaction
ifFalse: [ | conflicts |
conflicts := System transactionConflicts.
self doAbortTransaction.
self
saveLogEntry: (WAObjectLogEntry warn: 'Commit failure - retrying' request: aNativeRequest object: conflicts)
shouldCommit: true.
^nil "retry request"].

Notice that we are using the #saveLogEntry:shouldCommit: and passing in the result of System class>>transactionConflicts, which is a dictionary reporting the objects that are involved in the commit conflict. Besides writing to the gem log file (those numbers are oops), the dictionary is stashed in the Object Log.

You can inspect the Object Log from a GemTools image (see the ObjectLog class for access methods) or better, you can look at the Object Log through Seaside. If you haven't already enabled the object log tool, evaluate the following:

WAAdmin
register: WAObjectLog
asApplicationAt: 'tools/objectLog'
user: 'admin' password: 'tool'.

For a production systems you want to make sure that your customer facing web browser will not allow access to 'tools/objectLog' ... oh and to be safe, change the password:). Then set up a separate seaside gem that runs Swazoo instead of fastCGI and make the port for Swazoo accessible to your development engineers.

When you hit the object log tool, you'll be able to filter out all log entries except errors and inspect the conflicts dictionary directly ... with this you can see which classes are involved in the conflicts:

You are getting write-write conflicts which means that two
separate Seaside requests are concurrently updating the same object.

If the objects look to be Seaside session state objects, then it is likely a bug that I will have to fix. If the objects are your business objects, then you will have to address the problem ... Most likely you are updating a shared collection of users, inventory or something similar and you will need to replace the collection with one of the RC classes.

When you've characterized your issue a bit I can help there ...

If you can't tell why the conflicts are occurring then I'l try to help with debugging (there are some techniques that you can try).

The ENOTCONN sounds like a case where the client disconnected before we had finished processing. The error is coming from FSGsSocketServer and it could provide more information[1] to help with understanding the root cause ...

Basically, an ENOTCONN during accept (one of the callers) has a different significance, than getting an ENOTCONN during a read or write (client disconnect)...

Dale

[1] http://code.google.com/p/glassdb/issues/detail?id=339
----- Original Message -----
| From: "keropotter" <[hidden email]>
| To: [hidden email]
| Sent: Saturday, April 21, 2012 6:45:42 PM
| Subject: [GS/SS Beta] Random Internal Server Error
|
| Hi!!
|
| This time i need to ask about a random "internal server error" that
| occurs
| on our application. Looking at the fastCGI logs (we have 3 fastCGI
| running)
| we get this kind of lines when the error occurs:
|
|
| Write-Dependency Conflicts...
| Write-ReadLock Conflicts...
| Write-WriteLock Conflicts...
| Rc-Write-Write Conflicts...
| Synchronized-Commit Conflicts...
| ----------- Commit failure - retrying LOG ENTRY:
| aSymbolDictionary-----------
| ----------- Unreportable ERROR Encountered:
| 2012-04-19T11:40:51.53651309013367-07:00UserDefinedError:
| send(69,0x2b7e586c3e10,117,0) failed with errno=107,ENOTCONN, Socket
| is not
| connected-----------
| failure
| Read-Write Conflicts...
| Write-Write Conflicts...
| 16975315969
| 16975317249
| 16975319297
| 16975674881
| 16975675137
| 16975675393
| 16975675649
| 16975675905
| 16976480513
| 16976484353
| 16976484609
| 16976484865
| 16976485121
| 16976485377
| 16976485889
| 16976486145
| 16976486401
| 16976486657
| 16976486913
| 16979452673
| 16984274177
| Write-Dependency Conflicts...
| Write-ReadLock Conflicts...
| Write-WriteLock Conflicts...
| Rc-Write-Write Conflicts...
| Synchronized-Commit Conflicts...
| ----------- Commit failure - retrying LOG ENTRY:
| aSymbolDictionary-----------
| ----------- Unreportable ERROR Encountered:
| 2012-04-19T11:41:05.54662108421326-07:00UserDefinedError:
| send(70,0x2b7e587f2688,117,0) failed with errno=107,ENOTCONN, Socket
| is not
| connected-----------
| ----------- Unreportable ERROR Encountered:
| 2012-04-19T16:16:42.11317491531372-07:00UserDefinedError:
| send(69,0x2b7e5874a2e0,94,0) failed with errno=107,ENOTCONN, Socket
| is not
| connected-----------
|
|
|
|
|
| Does anybody know how to solve this issue or why it happens? It's
| absolutely
| random. Sometimes you just click on a link that requires almost no
| processing on the server and it crashes like that.
|
| We're working on Gemstone 3.0.1 64bits, Seaside 3.0.
|
| Any hints? We're about to go live with the project and this should
| not
| happen on a production environment.
|
| Thanks in advance!!
| Alejandro
|
| --
| View this message in context:
| http://forum.world.st/Random-Internal-Server-Error-tp4577501p4577501.html
| Sent from the GLASS mailing list archive at Nabble.com.
|