Hi all,
I just ran into an error and was wondering if anyone has ever seen this before. I have a sorted collection with a computationally intensive sorting block. It blew up on a nil. Low res jpg attached. It appears to have worked for the first 12 items: newCollection in SortedCollection>>collect: has 12 items in it. The 13th item returned a nil. This is because array on the sorted collection starts with 16 nils. firstIndex is set to 17. Notice on the jpg it says firstIndex = 17 but index is 13. So I can only conclude that since index is 13 and 12 other objects were processed successfully that the array and first index changed during processing. I am not sure that I can reproduce this error. I've been running the same code for quite a while without seeing it. If I can I will post it. Has anyone seen this before or does anyone know why this may have happened? Thanks for your help! Ron Teitelbaum [hidden email] http://bugs.impara.de/view.php?id=6030 SortedCollectionError.jpg (13K) Download Attachment |
I'm making a wild guess here, Ron: are you ABSOLUTELY sure that your
sortblock represents an ordering relation, that is, for EVERY pair of values a and b, ((sortBlock value: a value: b) AND (sortBlock value: b value: a)) implies a = b? If this is not the case, the sort will very likely run off the end of the legitimate values in the array, and into the nils. Andrew On 12 Feb 2007, at 08:09, Ron Teitelbaum wrote: > Hi all, > > I just ran into an error and was wondering if anyone has ever seen > this > before. I have a sorted collection with a computationally > intensive sorting > block. > > It blew up on a nil. Low res jpg attached. > > It appears to have worked for the first 12 items: > newCollection in SortedCollection>>collect: has 12 items in it. > > The 13th item returned a nil. This is because array on the sorted > collection starts with 16 nils. firstIndex is set to 17. > > Notice on the jpg it says firstIndex = 17 but index is 13. So I > can only > conclude that since index is 13 and 12 other objects were processed > successfully that the array and first index changed during processing. > > I am not sure that I can reproduce this error. I've been running > the same > code for quite a while without seeing it. If I can I will post it. > > Has anyone seen this before or does anyone know why this may have > happened? > > Thanks for your help! > > Ron Teitelbaum > [hidden email] > > http://bugs.impara.de/view.php?id=6030 > <SortedCollectionError.jpg> > |
Hi Andrew,
I'm not actually getting an error in the sorting, and if I take the exact same collection and reevaluate it in the debugger it comes up fine. Here is the sortBlock: ^[:a :b | (a nodeID bitXor: self networkID) < (b nodeID bitXor: self networkID)] The values are 256 bit integers. self networkID is an instance variable on the model that creates the sortBlock. I am doing some threading (the error always happens in the forked thread) but I'm pretty sure there is no way that I might be adding to the collection at the same time that I'm evaluating it. I thought maybe if the collection grew during the do: that would explain it, but I'm pretty sure that is not happening. The error was coming up more frequently with more activity. I was able to fix the problem as far as I can tell with aSortedCollection asOrderedCollection do:. I figured that would just push the problem to the orderedCollection conversion but it seems to work just fine (no errors, no nils in the collection and it appears that nothing is lost although the last part is hard to prove). I guessing that the extra message sends in asOrderedCollection allow the sorted collection to finish its sort. Also this appears to only happen when there is intensive activity. I'm running SSL connections, with seaside and multiple threads. It does worry me that I may loose values during the asOrderedCollection conversion. I may take it out and put a semaphore around it to make sure that it's not growing from an add during the read. I've tried to reproduce it with a simple case but so far no luck. Thanks for your suggestion and for looking at it. Ron Teitelbaum > -----Original Message----- > From: Andrew P. Black > Sent: Thursday, February 15, 2007 7:10 PM > > I'm making a wild guess here, Ron: are you ABSOLUTELY sure that your > sortblock represents an ordering relation, that is, for EVERY pair of > values a and b, ((sortBlock value: a value: b) AND (sortBlock > value: b value: a)) implies a = b? If this is not the case, the > sort will very likely run off the end of the legitimate values in the > array, and into the nils. > > Andrew > > > On 12 Feb 2007, at 08:09, Ron Teitelbaum wrote: > > > Hi all, > > > > I just ran into an error and was wondering if anyone has ever seen > > this > > before. I have a sorted collection with a computationally > > intensive sorting > > block. > > > > It blew up on a nil. Low res jpg attached. > > > > It appears to have worked for the first 12 items: > > newCollection in SortedCollection>>collect: has 12 items in it. > > > > The 13th item returned a nil. This is because array on the sorted > > collection starts with 16 nils. firstIndex is set to 17. > > > > Notice on the jpg it says firstIndex = 17 but index is 13. So I > > can only > > conclude that since index is 13 and 12 other objects were processed > > successfully that the array and first index changed during processing. > > > > I am not sure that I can reproduce this error. I've been running > > the same > > code for quite a while without seeing it. If I can I will post it. > > > > Has anyone seen this before or does anyone know why this may have > > happened? > > > > Thanks for your help! > > > > Ron Teitelbaum > > [hidden email] > > > > http://bugs.impara.de/view.php?id=6030 > > <SortedCollectionError.jpg> > > > > |
You are describing all the classic symptoms of a race condition. Try
using a semaphore/mutex to protect your collection (and the nodes inside it!). Cheers, - Andreas Ron Teitelbaum wrote: > Hi Andrew, > > I'm not actually getting an error in the sorting, and if I take the exact > same collection and reevaluate it in the debugger it comes up fine. > > Here is the sortBlock: > > ^[:a :b | (a nodeID bitXor: self networkID) < (b nodeID bitXor: self > networkID)] > > The values are 256 bit integers. self networkID is an instance variable on > the model that creates the sortBlock. > > I am doing some threading (the error always happens in the forked thread) > but I'm pretty sure there is no way that I might be adding to the collection > at the same time that I'm evaluating it. I thought maybe if the collection > grew during the do: that would explain it, but I'm pretty sure that is not > happening. > > The error was coming up more frequently with more activity. I was able to > fix the problem as far as I can tell with aSortedCollection > asOrderedCollection do:. I figured that would just push the problem to the > orderedCollection conversion but it seems to work just fine (no errors, no > nils in the collection and it appears that nothing is lost although the last > part is hard to prove). I guessing that the extra message sends in > asOrderedCollection allow the sorted collection to finish its sort. Also > this appears to only happen when there is intensive activity. I'm running > SSL connections, with seaside and multiple threads. > > It does worry me that I may loose values during the asOrderedCollection > conversion. I may take it out and put a semaphore around it to make sure > that it's not growing from an add during the read. I've tried to reproduce > it with a simple case but so far no luck. > > Thanks for your suggestion and for looking at it. > > Ron Teitelbaum > >> -----Original Message----- >> From: Andrew P. Black >> Sent: Thursday, February 15, 2007 7:10 PM >> >> I'm making a wild guess here, Ron: are you ABSOLUTELY sure that your >> sortblock represents an ordering relation, that is, for EVERY pair of >> values a and b, ((sortBlock value: a value: b) AND (sortBlock >> value: b value: a)) implies a = b? If this is not the case, the >> sort will very likely run off the end of the legitimate values in the >> array, and into the nils. >> >> Andrew >> >> >> On 12 Feb 2007, at 08:09, Ron Teitelbaum wrote: >> >>> Hi all, >>> >>> I just ran into an error and was wondering if anyone has ever seen >>> this >>> before. I have a sorted collection with a computationally >>> intensive sorting >>> block. >>> >>> It blew up on a nil. Low res jpg attached. >>> >>> It appears to have worked for the first 12 items: >>> newCollection in SortedCollection>>collect: has 12 items in it. >>> >>> The 13th item returned a nil. This is because array on the sorted >>> collection starts with 16 nils. firstIndex is set to 17. >>> >>> Notice on the jpg it says firstIndex = 17 but index is 13. So I >>> can only >>> conclude that since index is 13 and 12 other objects were processed >>> successfully that the array and first index changed during processing. >>> >>> I am not sure that I can reproduce this error. I've been running >>> the same >>> code for quite a while without seeing it. If I can I will post it. >>> >>> Has anyone seen this before or does anyone know why this may have >>> happened? >>> >>> Thanks for your help! >>> >>> Ron Teitelbaum >>> [hidden email] >>> >>> http://bugs.impara.de/view.php?id=6030 >>> <SortedCollectionError.jpg> >>> >> > > > > |
Andreas,
Thanks I will. I'd already created a SharedQueueSorted for something similar. I'll replace my sortedCollection with it and see if that fixes things. I really didn't want the extra overhead for this piece, oh well. Any news about QWAQ you can share yet without a NDA? Nice mention in that CIO article http://weeklysqueak.wordpress.com/2007/02/15/cio-insight-interview-with-alan -kay/ . Take care, Ron > -----Original Message----- > From: [hidden email] [mailto:squeak-dev- > [hidden email]] On Behalf Of Andreas Raab > Sent: Thursday, February 15, 2007 8:08 PM > To: [hidden email]; The general-purpose Squeak developers list > Subject: Re: Error in SortedCollection do: > > You are describing all the classic symptoms of a race condition. Try > using a semaphore/mutex to protect your collection (and the nodes inside > it!). > > Cheers, > - Andreas > > Ron Teitelbaum wrote: > > Hi Andrew, > > > > I'm not actually getting an error in the sorting, and if I take the > exact > > same collection and reevaluate it in the debugger it comes up fine. > > > > Here is the sortBlock: > > > > ^[:a :b | (a nodeID bitXor: self networkID) < (b nodeID bitXor: self > > networkID)] > > > > The values are 256 bit integers. self networkID is an instance variable > on > > the model that creates the sortBlock. > > > > I am doing some threading (the error always happens in the forked > thread) > > but I'm pretty sure there is no way that I might be adding to the > collection > > at the same time that I'm evaluating it. I thought maybe if the > collection > > grew during the do: that would explain it, but I'm pretty sure that is > not > > happening. > > > > The error was coming up more frequently with more activity. I was able > to > > fix the problem as far as I can tell with aSortedCollection > > asOrderedCollection do:. I figured that would just push the problem to > the > > orderedCollection conversion but it seems to work just fine (no errors, > no > > nils in the collection and it appears that nothing is lost although the > last > > part is hard to prove). I guessing that the extra message sends in > > asOrderedCollection allow the sorted collection to finish its sort. > Also > > this appears to only happen when there is intensive activity. I'm > running > > SSL connections, with seaside and multiple threads. > > > > It does worry me that I may loose values during the asOrderedCollection > > conversion. I may take it out and put a semaphore around it to make > sure > > that it's not growing from an add during the read. I've tried to > reproduce > > it with a simple case but so far no luck. > > > > Thanks for your suggestion and for looking at it. > > > > Ron Teitelbaum > > > >> -----Original Message----- > >> From: Andrew P. Black > >> Sent: Thursday, February 15, 2007 7:10 PM > >> > >> I'm making a wild guess here, Ron: are you ABSOLUTELY sure that your > >> sortblock represents an ordering relation, that is, for EVERY pair of > >> values a and b, ((sortBlock value: a value: b) AND (sortBlock > >> value: b value: a)) implies a = b? If this is not the case, the > >> sort will very likely run off the end of the legitimate values in the > >> array, and into the nils. > >> > >> Andrew > >> > >> > >> On 12 Feb 2007, at 08:09, Ron Teitelbaum wrote: > >> > >>> Hi all, > >>> > >>> I just ran into an error and was wondering if anyone has ever seen > >>> this > >>> before. I have a sorted collection with a computationally > >>> intensive sorting > >>> block. > >>> > >>> It blew up on a nil. Low res jpg attached. > >>> > >>> It appears to have worked for the first 12 items: > >>> newCollection in SortedCollection>>collect: has 12 items in it. > >>> > >>> The 13th item returned a nil. This is because array on the sorted > >>> collection starts with 16 nils. firstIndex is set to 17. > >>> > >>> Notice on the jpg it says firstIndex = 17 but index is 13. So I > >>> can only > >>> conclude that since index is 13 and 12 other objects were processed > >>> successfully that the array and first index changed during processing. > >>> > >>> I am not sure that I can reproduce this error. I've been running > >>> the same > >>> code for quite a while without seeing it. If I can I will post it. > >>> > >>> Has anyone seen this before or does anyone know why this may have > >>> happened? > >>> > >>> Thanks for your help! > >>> > >>> Ron Teitelbaum > >>> [hidden email] > >>> > >>> http://bugs.impara.de/view.php?id=6030 > >>> <SortedCollectionError.jpg> > >>> > >> > > > > > > > > > > |
In reply to this post by Ron Teitelbaum
Ron,
I bet Andreas is right. I didn't notice in your original post that this code was concurrent. Let us know! On 15 Feb 2007, at 16:59, Ron Teitelbaum wrote:
On a different note: I inferred that you were concerned with speed. If this is the case, wouldn't it make sense to do the self networkID once when the block is created, rather than 2 n log n times when you do the comparisons. Andrew Andrew P. Black Department of Computer Science Portland State University +1 503 725 2411 |
________________________________________
From: Andrew P. Black Sent: Friday, February 16, 2007 3:39 PM On a different note: I inferred that you were concerned with speed. If this is the case, wouldn't it make sense to do the self networkID once when the block is created, rather than 2 n log n times when you do the comparisons. Andrew That's interesting, I hadn't really considered it, since the block lives in a model that creates the block but it is used somewhere else I figured that the networkID would be a temp in the block. But now that you point it out, I can see that the call would be made to the model from the transferred context. That's a really silly mistake easily remedied by a temp variable. Thanks for the suggestion. Ron |
Free forum by Nabble | Edit this page |