On 8/27/15 7:37 PM, Mariano Martinez Peck via Glass wrote:
...

Okay, while the iron is hot, we need to capture the exact errors that are leading to the creation of the errorDefinitions (second pass errors) and work our way back to the set of definitions that can be used to construct a failing test case ... on the surface there is nothing that could lead to the errors, so identifying the specific errors would be useful ...

Specifically, we need to instrument MCPackageLoader>>tryToLoad: and record the exception along with the problematic definition at each place where a definition is added to errorDefinitions ... including the places where we're removing definitions from errorDefinitions ... the best would be to just drop the exception and definition into the object log ... but dumping printStrings of the exception and definition to the gem log (or Transcript) would probably give us enough info (we would want to see the details for problematic definitions as well) ...

Dale
_______________________________________________
Glass mailing list
[hidden email]
http://lists.gemtalksystems.com/mailman/listinfo/glass
On Fri, Aug 28, 2015 at 4:29 PM, Dale Henrichs via Glass <[hidden email]> wrote:
Exactly!!!! This would make debugging a lot easier. The way it is now, it is very complicated to debug: I have to put a halt in the add: of the error definition to see which exception was raised. Exceptions MUST be recorded. In fact, why not store a dedicated exception that has an instVar holding the definition? That way we can easily keep both.

That said, I will try to reproduce my problem with dummy classes in an empty stone ... but you know, it's like when you take the car to the garage and you want the mechanic to hear the noise, and then you cannot reproduce it. So I am not optimistic that I will be able to reproduce it with a test. But I will try.

Thanks for your explanations.
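A minimal sketch of the "record instead of halt" idea discussed above, written as plain workspace code. It does not use the real MCPackageLoader internals: the definitions collection, loadBlock, and the simulated failure are hypothetical stand-ins, and the Transcript is used because the thread mentions it as an acceptable target.

```smalltalk
"Hypothetical stand-ins: 'definitions' plays the role of the Monticello
 definitions and 'loadBlock' plays the role of loading one definition."
| definitions errorLog loadBlock |
errorLog := OrderedCollection new.
definitions := #('ClassA' 'ClassB' 'ClassC').
loadBlock := [:def |
    "Simulate a failure for one definition only."
    def = 'ClassB' ifTrue: [Error new signal: 'superclass not found'].
    def].
definitions do: [:def |
    [loadBlock value: def]
        on: Error
        do: [:ex |
            "Keep the definition and the exception text together,
             so nothing has to be inspected at a halt."
            errorLog add: def -> ex messageText.
            Transcript
                show: 'load failed: ', def printString, ' -> ', ex messageText;
                cr.
            ex return: nil]].
```

After the loop, errorLog holds one association per failing definition, which can be examined later instead of stopping the load at the point of failure.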
On 8/27/15 7:37 PM, Mariano Martinez Peck via Glass wrote:

What is the exception that is being raised in this code?

Dale
On 8/28/15 12:38 PM, Mariano Martinez Peck wrote:

Don't put halts in the code. Either log the definition/exception combo or put them into the object log so we can get all of the information later ... I want to know the specific error and the definition that leads to the error ... and then I want to work backwards to understand exactly what it is that's causing the error. It will make creating a dummy case much easier ... unless a dummy case is trivial :)

Dale
This looks like a GemStone bug (at first blush), since the error is originating in

[13] Repository >> _doListInstancesFrom:with:includeMemory: (envId 0)

so if you could extract some detail from that frame about the source of the nil, we might be able to work our way back to the root cause ...

Dale

On 8/28/15 7:58 AM, Mariano Martinez Peck via Glass wrote:
> [13] Repository >> _doListInstancesFrom:with:includeMemory: (envId 0)
Definitely a GemStone bug, since

Repository>>_scanPomWithMaxThreads:waitForLock:pageBufSize:percentCpuActiveLimit:identSet:limit:scanKind:toDirectory:

is a primitive call and is not documented to return nil ... unless I misread the stack and calls?

Dale

On 8/28/15 12:50 PM, Dale Henrichs wrote:
> This looks like a GemStone bug (at first blush), since the error is
> originating in
>
> [13] Repository >> _doListInstancesFrom:with:includeMemory: (envId 0)
>
> so if you could extract some detail from that frame about where the
> source of the nil, we might be able to work our way back to the root
> cause ...
>
> Dale
>
> On 8/28/15 7:58 AM, Mariano Martinez Peck via Glass wrote:
>> [13] Repository >> _doListInstancesFrom:with:includeMemory: (envId 0)
On Fri, Aug 28, 2015 at 4:56 PM, Dale Henrichs via Glass <[hidden email]> wrote:
> Definitely a GemStone bug, since
> Repository>>_scanPomWithMaxThreads:waitForLock:pageBufSize:percentCpuActiveLimit:identSet:limit:scanKind:toDirectory:
> is a primitive call and is not documented to return nil ... unless I
> misread the stack and calls?

No, I got exactly to that point too. I read the method comment, and it doesn't say anywhere why it would answer nil, so I kind of gave up.
On Fri, Aug 28, 2015 at 5:00 PM, Mariano Martinez Peck <[hidden email]> wrote:
BTW, I will try to see if I can get there again. But I suspect this is related to the fact that I end up having 2 metaclasses and 2 classes (as I showed a few emails back) for the same "class" after loading.
Nonetheless, this looks like a bug in GemStone and now I need to
understand why so I can provide a test case to the server guys ... I
don't think that 2 metaclasses and 2 classes is a good reason for
nil, since GemStone is supposed to survive this kind of situation
... likely it is something else ... this is a multi-thread operation
and it is possible that something related to that is the root cause ...
could you check the stone log and the gem log to see if any error
information is written there?
Thanks for pushing on this like you did ... I was an email behind and you were way ahead of me ... but ... I have an "expectation" that certain class loading scenarios could result in an error, but this kind of error is not part of my expectation and the monticello loading scheme does hide these kinds of problems ... Need to make the error more visible at the very least ... and fix the GemStone bug ... it's possible that the bug is already fixed and it is not likely that we will ship a 3.1.0.7 with a bugfix. Let's get a simplified test case and move forward from there ...

Is it possible that you're in a limited memory situation in your development environment?

Dale

On 8/28/15 1:04 PM, Mariano Martinez Peck wrote:
Mariano,

Does the following return nil consistently when run in your image:

  System abortTransaction.
  SystemRepository
    listInstances: {FaSecurityAdjustedClosingPriceRecord}
    limit: 0
    toDirectory: nil
    withMaxThreads: 1
    maxCpuUsage: 95
    memoryOnly: false

I think this is the equivalent call that is causing a nil return value. I've looked at the code and there is no obvious code path for returning nil instead of an Array ... I've submitted an internal bug.

Dale

On 8/28/15 1:20 PM, Dale Henrichs wrote:
> Nonetheless, this looks like a bug in GemStone and now I need to
> understand why so I can provide a test case to the server guys ... I
> don't think that 2 metaclasses and 2 classes is a good reason for
> nil, since GemStone is supposed to survive this kind of situation
> ... likely it is something else ... this is a multi-thread operation
> and it is possible that something related to that is the root cause
> could you check in the stone log and the gem log to see if any error
> information is written there ...
Here's an untested patch for
Repository>>listInstances:limit:toDirectory:withMaxThreads:maxCpuUsage:memoryOnly:
that you can try (you will probably have to use topaz logged in as SystemUser
to install this patch), based on 3.1.0.6:

listInstances: anArray limit: aSmallInt toDirectory: directoryString withMaxThreads: maxThreads maxCpuUsage: aPercentage memoryOnly: memOnlyBool
  "If directoryString == nil, result includes in-memory objects.
   If memBool == true, result contains only the in-memory objects,
   and maxThreads and aPercentage are ignored."

  | inputSet resultInSetOrder result code scanBlk |
  memOnlyBool
    ifFalse: [ System needsCommit
        ifTrue: [ self _error: #'rtErrAbortWouldLoseData' ] ].
  inputSet := self _arrayOfClassesAsSet: anArray.
  inputSet size < 1
    ifTrue: [ ^ {} ].
  memOnlyBool
    ifFalse: [
      scanBlk := [ :scanSetThisTime |
        (self
          _scanPomWithMaxThreads: maxThreads
          waitForLock: 60
          pageBufSize: 8
          percentCpuActiveLimit: aPercentage
          identSet: scanSetThisTime
          limit: aSmallInt
          scanKind: code
          toDirectory: directoryString) ifNil: [ #() ] ] ].
  code := directoryString ifNotNil: [ 2 ] ifNil: [ 0 ].
  inputSet := IdentitySet withAll: anArray.
  resultInSetOrder := self
    _doListInstancesFrom: inputSet
    with: scanBlk
    includeMemory: directoryString == nil.
  directoryString
    ifNotNil: [ result := resultInSetOrder "primitive wrote the files, so we are done" ]
    ifNil: [
      | inputArraySize |
      inputArraySize := anArray size.
      result := Array new: inputArraySize * 2.
      1 to: inputArraySize do: [ :j |
        | soIdx resIdx anObj |
        anObj := anArray at: j.
        soIdx := (inputSet _offsetOf: anObj) * 2.
        resIdx := j * 2.
        result at: resIdx - 1 put: (resultInSetOrder at: soIdx - 1). "totalCount"
        result at: resIdx put: (resultInSetOrder at: soIdx) "array of instances" ] ].
  ^ result

On 8/28/15 3:04 PM, Dale Henrichs wrote:
> Mariano,
Mariano,

Here's a better patch (from engineering) ... there are bugs in the handling of temps and blocks (in 3.1.x) that appear to be the culprit (the value of the code temp was not correctly picked up), so do the calculation in the context of the block ... this is the preferred patch because the nil return value did not imply an empty result set, as my patch implied ... this one should give you correct results ... let me know if there are further problems:

listInstances: anArray limit: aSmallInt toDirectory: directoryString withMaxThreads: maxThreads maxCpuUsage: aPercentage memoryOnly: memOnlyBool
  "If directoryString == nil, result includes in-memory objects.
   If memBool == true, result contains only the in-memory objects,
   and maxThreads and aPercentage are ignored."

  | inputSet resultInSetOrder result scanBlk |
  memOnlyBool
    ifFalse: [ System needsCommit
        ifTrue: [ self _error: #'rtErrAbortWouldLoseData' ] ].
  inputSet := self _arrayOfClassesAsSet: anArray.
  inputSet size < 1
    ifTrue: [ ^ {} ].
  memOnlyBool
    ifFalse: [
      scanBlk := [ :scanSetThisTime |
        self
          _scanPomWithMaxThreads: maxThreads
          waitForLock: 60
          pageBufSize: 8
          percentCpuActiveLimit: aPercentage
          identSet: scanSetThisTime
          limit: aSmallInt
          scanKind: (directoryString ifNotNil: [ 2 ] ifNil: [ 0 ])
          toDirectory: directoryString ] ].
  inputSet := IdentitySet withAll: anArray.
  resultInSetOrder := self
    _doListInstancesFrom: inputSet
    with: scanBlk
    includeMemory: directoryString == nil.
  directoryString
    ifNotNil: [ result := resultInSetOrder "primitive wrote the files, so we are done" ]
    ifNil: [
      | inputArraySize |
      inputArraySize := anArray size.
      result := Array new: inputArraySize * 2.
      1 to: inputArraySize do: [ :j |
        | soIdx resIdx anObj |
        anObj := anArray at: j.
        soIdx := (inputSet _offsetOf: anObj) * 2.
        resIdx := j * 2.
        result at: resIdx - 1 put: (resultInSetOrder at: soIdx - 1). "totalCount"
        result at: resIdx put: (resultInSetOrder at: soIdx) "array of instances" ] ].
  ^ result

On 08/28/2015 03:09 PM, Dale Henrichs wrote:
> Here's an untested patch for
> Repository>>listInstances:limit:toDirectory:withMaxThreads:maxCpuUsage:memoryOnly:
> that you can try (Probably have to use topaz logged in as SystemUser
> to install this patch) based on 3.1.0.6:
On Mon, Aug 31, 2015 at 3:38 PM, Dale Henrichs <[hidden email]> wrote:
Thanks to Dale and the rest of the engineers for the effort. I will test it soon (I must restore from backup first and do a few things in order to reproduce it again). In the meantime, let me ask:

1) Were these "temps and blocks" bugs fixed in 3.2 or 3.3?
2) Is there anything I should take care of when I am using temps and closures? I mean, were you able to find the exact issue, so that I can avoid having those problems in my own code? I do remember the known problem with Seaside callbacks and temps and the usage of ValueHolder, but I don't know if this is related or not.

Thanks in advance,
On 8/31/15 11:49 AM, Mariano Martinez Peck wrote:

I don't know the exact details ... the engineer I talked to was pretty confident that this bug wasn't present in 3.2.x or 3.3.x. Since you are using 3.1.0.6, your best course of action might be to read through the release notes for each of the 3.2.x releases[1] and see if block/temp bugs were fixed or mentioned, and then cross-reference with the bugnotes[2] using the site-search[3].

IIRC the particular bug in this case had to do with assigning a temp that was referenced only in a block created before the assignment was made. Moving the temp assignment before the block creation could be a workaround (I don't recall any additional details) ...

Dale

[1] https://gemtalksystems.com/products/gs64/versions32x/
[2] https://gemtalksystems.com/techsupport/bugnotes/
[3] https://cse.google.com/cse/home?cx=014580976650604809618:1cdoq5jo3te
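To make the three code shapes from this part of the thread concrete, here is a small workspace sketch. It is an illustration only, not a reproduction of the 3.1.x bug: the temp names are made up, `isNil ifTrue:ifFalse:` is used in place of `ifNotNil:ifNil:` for portability across Smalltalk dialects, and directoryString is nil as in the failing call.

```smalltalk
| directoryString lateCode earlyCode lateBlk earlyBlk inlineBlk
  lateResult earlyResult inlineResult |
directoryString := nil.

"1. Shape of the original method: the block is created first and the
    temp it reads is assigned afterwards."
lateBlk := [:ignored | lateCode].
lateCode := directoryString isNil ifTrue: [0] ifFalse: [2].
lateResult := lateBlk value: nil.

"2. Suggested workaround: assign the temp before creating the block."
earlyCode := directoryString isNil ifTrue: [0] ifFalse: [2].
earlyBlk := [:ignored | earlyCode].
earlyResult := earlyBlk value: nil.

"3. Shape of the engineering patch: compute the value inside the block,
    so no method temp is involved at all."
inlineBlk := [:ignored | directoryString isNil ifTrue: [0] ifFalse: [2]].
inlineResult := inlineBlk value: nil.
```

In a correct implementation all three results are 0, because a block shares the enclosing method's temps rather than copying their values at creation time. The bug described above would affect only the first shape.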
Hi Dale,

I just tried, but unfortunately I still got the same error. I do not want to make you lose more time if this was fixed in newer GemStone versions. I just hope I do not get this error frequently until I am able to migrate to a newer version. If you still want me to dig or try something else for your own purposes (for example, if you think there might still be a bug in newer versions), then let me know.

Thanks anyway for the help. Very much appreciated.

On Mon, Aug 31, 2015 at 4:14 PM, Dale Henrichs <[hidden email]> wrote:
Could you arrange to get a stack trace from your most recent error
and a listing of the method that you used ... I want to make sure
that we understand the failure mechanism ... if it is related to
block temps then it is fixed in 3.2.x, but if it is not related to
block temps then it could be present in later versions of GemStone
and we'll want to characterize the problem .... Obviously, this
particular call doesn't reproduce very frequently (I wasn't able to
make it break with trivial examples) so there is likely to be
something a little more complex going on ...
There's always time to track down bugs, even if I get cranky from time to time :)

Dale

On 9/1/15 9:44 AM, Mariano Martinez Peck wrote:
On Tue, Sep 1, 2015 at 2:14 PM, Dale Henrichs <[hidden email]> wrote:
Dale, the exception I get is the one I originally shared with you, and you came to the same conclusion as I did. What I can offer is this: I log the error (continuation) in the object log and then give you a web user for our app. From there I can let you open a kind of Seaside debugger/inspector, which will be much richer than a plain string stack, and you can also print/inspect from there. I cannot send you the extent because it's quite big.

If you think this is OK, then I need to ask you to share the login info only with GemTalk engineers. Since the site is somewhat in use (but with a working extent), I must recover from backup, so the system will be running with a "broken" extent for a while. That is no problem as long as it is only for a couple of hours, or 1-2 days at most. So if we do this, I would appreciate it if you let me know when you (or the engineer) would be available to take a look.

Let me know if you want this.
On 9/1/15 10:59 AM, Mariano Martinez Peck wrote:

Thanks for the offer ... we might want to instrument the method a bit more instead of looking at a continuation ... so I will get back to you ... I won't be in the office until Thursday, and that's when I will talk things over with the engineer ...

Dale
OK then. Perfect. Let me know. Thanks!

On Tue, Sep 1, 2015 at 3:28 PM, Dale Henrichs <[hidden email]> wrote:
Mariano,

Sorry for the delay, but I'm back in the office today, and what we would like to do is capture the args that are being used for the primitive, so replacing the `memOnlyBool` block logic in the listInstances:... method with the following will help us get them:

  memOnlyBool
    ifFalse: [
      scanBlk := [ :scanSetThisTime |
        | ret sKind |
        sKind := (directoryString ifNotNil: [ 2 ] ifNil: [ 0 ]).
        ret := self
          _scanPomWithMaxThreads: maxThreads
          waitForLock: 60
          pageBufSize: 8
          percentCpuActiveLimit: aPercentage
          identSet: scanSetThisTime
          limit: aSmallInt
          scanKind: sKind
          toDirectory: directoryString.
        ret
          ifNil: [
            Transcript
              cr;
              show:
                '_scanPomWithMaxThreads failure: ', maxThreads printString, ' ',
                aPercentage printString, ' ', scanSetThisTime printString, ' ',
                aSmallInt printString, ' ', sKind printString, ' ',
                directoryString printString ].
        ret ] ].

We thought the problem might have been related to the method temp reference for `(directoryString ifNotNil:[ 2 ] ifNil:[ 0 ])`, but since the prim is still failing with that expression inlined, there must be a different (less obvious) failure mechanism.

Dale

On 09/01/2015 11:45 AM, Mariano Martinez Peck wrote: