Hi,
My code that runs background jobs in a separate gem (based on Otto's code, similar to ServiceVM) stores the commit conflict information inside the persistent background process instance for post-commit conflict analysis. I have a job that failed today with the following commit conflict info:

Inspect aFaBackgroundProcess/aSymbolDictionary( #'WriteWrite_minusRcReadSet'->anArray( aDictionary( )), #'commitResult'->#'failure', #'RcReadSet'->anArray( aRcCollisionBucket( aRcKeyValueDictionary( ,........)
--------------------
. -> aSymbolDictionary( #'WriteWrite_minusRcReadSet'->anArray( aDictionary( )), #'commitResult'->#'failure', #'RcReadSet'->anArray( aRcCollisionB...
.. -> aFaBackgroundProcess
(class)@ -> SymbolDictionary
(oop)@ -> 15059544321
(committed)@ -> true
(notTranlogged)@ -> nil
1@ -> #'commitResult'->#'failure'
2@ -> #'RcReadSet'->anArray( aRcCollisionBucket( aRcKeyValueDictionary( 'siteDB-debris-gemstone'->aFaGemStoneDataStore)), aRcCollisionBucket( aRcK...
3@ -> #'Write-Write'->anArray( aDictionary( ))
4@ -> #'WriteWrite_minusRcReadSet'->anArray( aDictionary( ))

So... commitResult is #failure and the only thing set is RcReadSet. I have searched the programming guides for RcReadSet and found nothing. I am inspecting its array, but it is of size 973, which makes it impossible to understand what really caused the conflict.

Is there any tip on how I can analyze this better?

Thanks,

_______________________________________________
Glass mailing list
[hidden email]
http://lists.gemtalksystems.com/mailman/listinfo/glass
Mariano,
It has been a while since I've done any reduced conflict work, so the details of the reduced conflict mechanism need to be paged back in ... with a bit of poking around I believe that the RcReadSet is not involved directly in the actual conflict, especially since the #Write-Write set is empty ... from the information I've seen here, there shouldn't have been a conflict ...

Is there a reason that you are not showing the full set of fields in a conflictDictionary? I would expect to see fields for each of the following (from System class>>transactionConflicts):

Key Conflicts

Both the Write-Dependency and Rc-Write-Write fields would have useful information if they weren't empty ...

Barring additional information I suppose it might be useful to see the error message and any other information from the log file (perhaps a stack?)

Dale

On 6/16/16 7:54 AM, Mariano Martinez Peck via Glass wrote:
On Fri, Jun 17, 2016 at 3:17 PM, Dale Henrichs via Glass <[hidden email]> wrote:
That was the part I didn't understand. As the rest of the keys were empty and the result was #failure, I couldn't understand why RcReadSet (the only non-empty key) would make the commit fail.
No. Actually, I pasted all I can see from that dictionary. So maybe the problem is how I am capturing and persisting this? Here [1] you can see how I do my best to capture the transaction conflicts for later analysis. Note that since this is a background process I cannot immediately debug it. That's why I try to persist the commit conflicts together with my persistent background instance before I lose them.
I suspect I am capturing/persisting the commit conflicts incorrectly. Thanks in advance for any help.
Okay ... I'm interested in the TransactionError that is not a TransactionError ... that is a bit surprising to me and I'd be interested in additional details ... I don't think that it's related to the current problem, but who knows :)

What does the #commitConflicts: method do?

So far I don't see anything suspicious ... Is this problem reproducible?

It may be time to record a stack in the log and perhaps take a look at System class>>_commitPrintingDiagnostics for logging the commit conflicts to the log as well ...

The code in System class>>_commitPrintingDiagnostics implies that not all fields in the transactionConflicts dictionary are set, so that may be a red herring.

Dale

On 6/17/16 11:56 AM, Mariano Martinez Peck wrote:
Hi Dale,
On Fri, Jun 17, 2016 at 4:20 PM, Dale Henrichs <[hidden email]> wrote:
Yes, but I don't know how to give more information. More below.
It's a simple setter. The method #runInForeground I published is in FaBackgroundProcess (a persistent domain object), and #commitConflicts: simply stores the commit conflicts into one of its instVars for further analysis.
No, unfortunately, it's not :(
I already record the stack, maybe not exactly the way you suggest next, but please see the attached file. As you can see in the stack, it looks like the normal domain logic finished (in #runProcessBlock), and then line 10 of #runInForeground does the #commit. So it simply looks like a commit conflict when the job finished. Does this stack help somehow? Note that it shows some OOPs etc., and I can still get those objects if you need them.
Thanks for pointing me to #_commitPrintingDiagnostics. In fact it looks pretty similar to what I do, right? And yeah, those #ifAbsent:[] may suggest we may not be able to get all fields. But then... how do we get them? Thanks!
Attachment: gemLog.txt (3K)
On 6/17/16 12:56 PM, Mariano Martinez Peck wrote:
Well, right now you are resuming a TransactionError and I would want to see the stack and the description of the actual error ... it might be related to the current problem ... if you are "ignoring a meaningful" error it may have an impact on future commits.

Well, the transactionConflicts report shows no conflicts, so we are getting a commit conflict where there are no actual conflicts ... this could be a bug, but I would like to know for sure that the "ignored" TransactionError is not involved ... so perhaps some additional logging for the "resumed TransactionError" is called for ...

The stack isn't useful, as you claim, so right now I can only be suspicious of the "resumed Transaction errors" ...

I'm not in the office today (not feeling well), so on Monday I will talk to some folks about additional measures we can take. It's also time to ask about the version of GemStone that you are using :)

Dale
On Fri, Jun 17, 2016 at 5:30 PM, Dale Henrichs <[hidden email]> wrote:
Ahhh yes, if it was #success I do #resume. However, just for the record, note that the TransactionError that caused the final error has #failure as #commitResult. So we are not talking about the same TransactionError. But maybe... the original TransactionError that we #pass may have somehow raised the next #failure TransactionError? Is that what you have in mind? Thanks!
On 6/17/16 1:40 PM, Mariano Martinez Peck wrote:
Yes ... I can't be sure but I don't think you should be getting a TransactionError for a successful commit ... so that means something went wrong and _if_ this particular error was immediately preceded by a "resumed TransactionError" then we might have a lot more information and perhaps we can get to the bottom of this ...

I also don't think that you should get a TransactionError claiming that there were commit conflicts when there are no commit conflicts present in the transaction conflicts dictionary ...

Dale
On Fri, Jun 17, 2016 at 5:50 PM, Dale Henrichs <[hidden email]> wrote:
Mmmm, I am not sure. The code of #_commitPrintingDiagnostics (which I guess is correct) expects to have "(conflicts at: #'commitResult') == #'success'" in the case of "self commitTransaction" answering false. And isn't "self commitTransaction" answering false the same as doing #commit and getting the TransactionError?
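For readers following along, here is a minimal sketch of the Boolean-answering commit path being discussed. It is not the code of #_commitPrintingDiagnostics and not Mariano's #runInForeground; it only assumes that System commitTransaction answers false on failure, that System transactionConflicts answers the conflicts dictionary shown earlier in the thread, and that the dictionary should be read before the transaction is aborted. Logging through GsFile gciLogServer: is just one possible way to get the text into the gem log.

```smalltalk
"Sketch only, under the assumptions stated above.
 On a failed commit, log the commitResult and the size of every
 other entry, then abort so the session can continue."
| conflicts |
System commitTransaction
    ifFalse: [
        "Assumption: read the conflicts before aborting."
        conflicts := System transactionConflicts.
        GsFile gciLogServer: 'commit failed, commitResult: ',
            (conflicts at: #commitResult ifAbsent: [nil]) printString.
        conflicts keysAndValuesDo: [:key :value |
            key == #commitResult
                ifFalse: [
                    GsFile gciLogServer:
                        key printString, ' size: ', value size printString]].
        System abortTransaction]
```

Logging sizes rather than the full printString keeps a large entry such as the 973-element RcReadSet from drowning out the smaller ones.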
On 06/20/2016 06:28 AM, Mariano Martinez Peck wrote:
Well, this is why I want to have more information about your TransactionError ... there _are_ several different places (in the C code) that signal a TransactionError that aren't directly related to a commit failure ... your comment leads me to believe that you hit a reduced-conflict-related TransactionError, which led you to then resume the error ... without knowing the details of the "ignored" TransactionError, I can only guess about what might be going on ...

The fact that you have seen a commit failure with no conflicts and no additional information is odd and I am hoping that an "ignored" TransactionError might provide additional information ...

I'm finally in the office today (although I may not last long ... unfortunately) and I'll ask around for any other possibilities ...

Dale
On 06/16/2016 07:54 AM, Mariano Martinez Peck via Glass wrote:
>
> Inspect aFaBackgroundProcess/aSymbolDictionary(
> #'WriteWrite_minusRcReadSet'->anArray( aDictionary( )),
> #'commitResult'->#'failure', #'RcReadSet'->anArray(
> aRcCollisionBucket( aRcKeyValueDictionary( ,........)
> --------------------
> . -> aSymbolDictionary(
> #'WriteWrite_minusRcReadSet'->anArray( aDictionary( )),
> #'commitResult'->#'failure', #'RcReadSet'->anArray( aRcCollisionB...
> .. -> aFaBackgroundProcess
> (class)@ -> SymbolDictionary
> (oop)@ -> 15059544321
> (committed)@ -> true
> (notTranlogged)@ -> nil
> 1@ -> #'commitResult'->#'failure'
> 2@ -> #'RcReadSet'->anArray( aRcCollisionBucket(
> aRcKeyValueDictionary(
> 'siteDB-debris-gemstone'->aFaGemStoneDataStore)), aRcCollisionBucket(
> aRcK...
> 3@ -> #'Write-Write'->anArray( aDictionary( ))
> 4@ -> #'WriteWrite_minusRcReadSet'->anArray( aDictionary( ))
>

I guess we are both blind :) ... While describing the "problem" to a co-worker I noticed that the Write-Write set is NOT EMPTY :):

3@ -> #'Write-Write'->anArray( aDictionary( ))

The conflict that you are getting is on the empty dictionary (aDictionary( )) ... presumably you are creating an empty Dictionary somewhere in shared state and two sessions are trying to update the dictionary ...

Dale
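The trap here is that anArray( aDictionary( )) prints as if it were empty, while the Array itself holds one element. A rough sketch of a summary that makes this visible follows. It assumes the dictionary comes from System transactionConflicts and that every value other than #commitResult is an Array of conflicting objects, as in the inspector output above; the cap of 10 members per entry is an arbitrary choice for illustration.

```smalltalk
"Sketch only: build a text report with the size of each conflict set
 plus the class and oop of its first few members."
| conflicts report |
conflicts := System transactionConflicts.
report := WriteStream on: String new.
conflicts keysAndValuesDo: [:key :value |
    key == #commitResult
        ifTrue: [
            report nextPutAll: 'commitResult: ';
                nextPutAll: value printString;
                nextPut: Character lf]
        ifFalse: [
            report nextPutAll: key printString;
                nextPutAll: ' size: ';
                nextPutAll: value size printString;
                nextPut: Character lf.
            "Cap the detail so a very large set stays readable."
            1 to: (value size min: 10) do: [:i | | each |
                each := value at: i.
                report nextPut: Character tab;
                    nextPutAll: each class name;
                    nextPutAll: ' oop: ';
                    nextPutAll: each asOop printString;
                    nextPut: Character lf]]].
report contents
```

With a report of this shape, a Write-Write entry holding a single empty Dictionary would be listed with a non-zero size and the oop of that Dictionary, which is the object to feed into the reference-path search.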
On Wed, Jun 22, 2016 at 2:52 PM, Dale Henrichs via Glass <[hidden email]> wrote:
Uff wtf!!!! Crap, we were both blind. Sorry to bother. OK... the hunting (findAllReferencePathsTo...) has begun. I will see if I can track down this guilty empty dict. Thanks!
On 06/22/2016 12:20 PM, Mariano Martinez Peck wrote:
haha ... happy hunting ...

BTW, if you find traipsing through the output of findAllReferencePathsTo... to be frustrating, you might want to try Obex[1] for visualizing the reference path structure ... Obex is still under development and it might require a bit of tweaking, but one of its features is to allow you to build a visual network of the objects in the findAllReferencePathsTo... result set ...

Dale

[1] https://github.com/dalehenrich/obex#object-explorer-for-gemstones-64
On Wed, Jun 22, 2016 at 5:21 PM, Dale Henrichs <[hidden email]> wrote:
OK, it was quite easy. Thanks for the extra pair of eyes.
Yes, I am aware of it... but:

1) I still cannot use the latest tODE and won't be able to until we have all the non-DataCurator thingy.
2) findAllReferencePathsTo takes a lifetime for my repo. I am talking about an hour or so, so I cannot imagine a UI with that performance.

What I end up doing is firing the findAllReferencePathsTo in a gem (like in tODE) and then doing a "tail -f" of the gem log. Most of the time, the most important reference path is the first one or so. I look at that one, analyze the code, then get the OOP which may be pointing to it. Then I kill the gem and start the findAllReferencePathsTo again with the new OOP, and so on until I find the guilty one.

Thanks!
On 6/23/16 12:20 PM, Mariano Martinez Peck wrote:
For 3.4 we are working on a fast reference scan based on quickly finding the parents of a particular set of objects in a single scan, and this scan can be done relatively quickly ... the result of this scan will also yield a single reference path ... if I'm not mistaken this is equivalent to "getting the first reference and killing the gem" ... the obex gui is really targeted at supporting this model as well as other schemes for analyzing objects in your repo ... while working on this I am creating visualizations for all of the existing repo analysis methods as well ...

Dale