Hi!!
This time i need to ask about a random "internal server error" that occurs on our application. Looking at the fastCGI logs (we have 3 fastCGI running) we get this kind of lines when the error occurs: Write-Dependency Conflicts... Write-ReadLock Conflicts... Write-WriteLock Conflicts... Rc-Write-Write Conflicts... Synchronized-Commit Conflicts... ----------- Commit failure - retrying LOG ENTRY: aSymbolDictionary----------- ----------- Unreportable ERROR Encountered: 2012-04-19T11:40:51.53651309013367-07:00UserDefinedError: send(69,0x2b7e586c3e10,117,0) failed with errno=107,ENOTCONN, Socket is not connected----------- failure Read-Write Conflicts... Write-Write Conflicts... 16975315969 16975317249 16975319297 16975674881 16975675137 16975675393 16975675649 16975675905 16976480513 16976484353 16976484609 16976484865 16976485121 16976485377 16976485889 16976486145 16976486401 16976486657 16976486913 16979452673 16984274177 Write-Dependency Conflicts... Write-ReadLock Conflicts... Write-WriteLock Conflicts... Rc-Write-Write Conflicts... Synchronized-Commit Conflicts... ----------- Commit failure - retrying LOG ENTRY: aSymbolDictionary----------- ----------- Unreportable ERROR Encountered: 2012-04-19T11:41:05.54662108421326-07:00UserDefinedError: send(70,0x2b7e587f2688,117,0) failed with errno=107,ENOTCONN, Socket is not connected----------- ----------- Unreportable ERROR Encountered: 2012-04-19T16:16:42.11317491531372-07:00UserDefinedError: send(69,0x2b7e5874a2e0,94,0) failed with errno=107,ENOTCONN, Socket is not connected----------- Does anybody know how to solve this issue or why it happens? It's absolutely random. Sometimes you just click on a link that requires almost no processing on the server and it crashes like that. We're working on Gemstone 3.0.1 64bits, Seaside 3.0. Any hints? We're about to go live with the project and this should not happen on a production environment. Thanks in advance!! Alejandro |
Alejandro,
You have two different errors showing up in the log: Commit failure ENOTCONN The "Commit failure" is not quite as bad as it looks. The system is actually working as expected: When you get a commit failure while processing a transaction, the system aborts and retries the original HTTP request. If you get 10 retry failures, you'll get a "Too many retries" log entry. Since you have reported any "Too many retries" messages from your log, then it is likely that the commit failures are affecting the user's experience. With that said, I agree that you should dig into this a bit deeper ... under load a commit conflict like this may turn into a 'real problem'. If you look at GRGemstonePlatfrom>>seasideProcessRequestWtihRetry:resultBlock: you'll see the code that is logging the commit conflict: self doCommitTransaction ifFalse: [ | conflicts | conflicts := System transactionConflicts. self doAbortTransaction. self saveLogEntry: (WAObjectLogEntry warn: 'Commit failure - retrying' request: aNativeRequest object: conflicts) shouldCommit: true. ^nil "retry request"]. Notice that we are using the #saveLogEntry:shouldCommit: and passing in the result of System class>>transactionConflicts, which is a dictionary reporting the objects that are involved in the commit conflict. Besides writing to the gem log file (those numbers are oops), the dictionary is stashed in the Object Log. You can inspect the Object Log from a GemTools image (see the ObjectLog class for access methods) or better, you can look at the Object Log through Seaside. If you haven't already enabled the object log tool, evaluate the following: WAAdmin register: WAObjectLog asApplicationAt: 'tools/objectLog' user: 'admin' password: 'tool'. For a production systems you want to make sure that your customer facing web browser will not allow access to 'tools/objectLog' ... oh and to be safe, change the password:). Then set up a separate seaside gem that runs Swazoo instead of fastCGI and make the port for Swazoo accessible to your development engineers. When you hit the object log tool, you'll be able to filter out all log entries except errors and inspect the conflicts dictionary directly ... with this you can see which classes are involved in the conflicts: You are getting write-write conflicts which means that two separate Seaside requests are concurrently updating the same object. If the objects look to be Seaside session state objects, then it is likely a bug that I will have to fix. If the objects are your business objects, then you will have to address the problem ... Most likely you are updating a shared collection of users, inventory or something similar and you will need to replace the collection with one of the RC classes. When you've characterized your issue a bit I can help there ... If you can't tell why the conflicts are occurring then I'l try to help with debugging (there are some techniques that you can try). The ENOTCONN sounds like a case where the client disconnected before we had finished processing. The error is coming from FSGsSocketServer and it could provide more information[1] to help with understanding the root cause ... Basically, an ENOTCONN during accept (one of the callers) has a different significance, than getting an ENOTCONN during a read or write (client disconnect)... Dale [1] http://code.google.com/p/glassdb/issues/detail?id=339 ----- Original Message ----- | From: "keropotter" <[hidden email]> | To: [hidden email] | Sent: Saturday, April 21, 2012 6:45:42 PM | Subject: [GS/SS Beta] Random Internal Server Error | | Hi!! | | This time i need to ask about a random "internal server error" that | occurs | on our application. Looking at the fastCGI logs (we have 3 fastCGI | running) | we get this kind of lines when the error occurs: | | | Write-Dependency Conflicts... | Write-ReadLock Conflicts... | Write-WriteLock Conflicts... | Rc-Write-Write Conflicts... | Synchronized-Commit Conflicts... | ----------- Commit failure - retrying LOG ENTRY: | aSymbolDictionary----------- | ----------- Unreportable ERROR Encountered: | 2012-04-19T11:40:51.53651309013367-07:00UserDefinedError: | send(69,0x2b7e586c3e10,117,0) failed with errno=107,ENOTCONN, Socket | is not | connected----------- | failure | Read-Write Conflicts... | Write-Write Conflicts... | 16975315969 | 16975317249 | 16975319297 | 16975674881 | 16975675137 | 16975675393 | 16975675649 | 16975675905 | 16976480513 | 16976484353 | 16976484609 | 16976484865 | 16976485121 | 16976485377 | 16976485889 | 16976486145 | 16976486401 | 16976486657 | 16976486913 | 16979452673 | 16984274177 | Write-Dependency Conflicts... | Write-ReadLock Conflicts... | Write-WriteLock Conflicts... | Rc-Write-Write Conflicts... | Synchronized-Commit Conflicts... | ----------- Commit failure - retrying LOG ENTRY: | aSymbolDictionary----------- | ----------- Unreportable ERROR Encountered: | 2012-04-19T11:41:05.54662108421326-07:00UserDefinedError: | send(70,0x2b7e587f2688,117,0) failed with errno=107,ENOTCONN, Socket | is not | connected----------- | ----------- Unreportable ERROR Encountered: | 2012-04-19T16:16:42.11317491531372-07:00UserDefinedError: | send(69,0x2b7e5874a2e0,94,0) failed with errno=107,ENOTCONN, Socket | is not | connected----------- | | | | | | Does anybody know how to solve this issue or why it happens? It's | absolutely | random. Sometimes you just click on a link that requires almost no | processing on the server and it crashes like that. | | We're working on Gemstone 3.0.1 64bits, Seaside 3.0. | | Any hints? We're about to go live with the project and this should | not | happen on a production environment. | | Thanks in advance!! | Alejandro | | -- | View this message in context: | http://forum.world.st/Random-Internal-Server-Error-tp4577501p4577501.html | Sent from the GLASS mailing list archive at Nabble.com. | |
Free forum by Nabble | Edit this page |