Today, I had a ProcessMonitor that, when I tried to select a Process and
debug it, kept debugging a different Process. I finally noticed that the Process that was being erroneously debugged had the same identity hash as the one I was trying to debug. Some code reading shows that the only way a ProcessMonitorClient has to identify a Process to its ProcessMonitorServer is via its identityHash. And identityHashes are not unique, in fact in 32-bit VW there are only 16383 possible ones, with each Process getting a random hash within this range. If I have 100 Processes, my chances of having at least two with the same identityHash are better than 1 in 4. This is slightly alarming. Operations such as "terminate" also have the same problem, and could have much greater impact on a debugging session. With small numbers of Processes, the chances of duplication are small. However, there doesn't seem to be any really good reason to do it the way it's being done. The main motivation seems to be to only hold a weak reference to the Process itself. The ProcessMonitorService has a WeakArray that contains Processes, and all other reference is done by identity hash. The service could fairly easily use another truly unique identifier (index in the WeakArray, perhaps). Alternatively, the ProcessMonitorRecords, shared by the service and the client, could be ephemerons and hold onto the actual Process, eliminating the WeakArray altogether. Regards, -Martin _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
I only spent about two minutes looking at it, but here is another trick to help uniquely identify a Process without a strong reference to the process...
A way to indirectly reference the Process is through an identifier attribute; interruptProtect already looks suitable. Process.interruptProtect contains a Semaphore created unique to the process and that does not change. #processHash can return the bit shifted combination of both the process identityHash and the interruptProtect identityHash. Or, just gather and compare the objects from interruptProtect. WeakArrays are slow to cleanup. I'd avoid them for process level stuff. Paul Baumann -----Original Message----- From: [hidden email] [mailto:[hidden email]] On Behalf Of Martin McClure Sent: Monday, August 03, 2009 9:23 PM To: VWNC Subject: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify processes Today, I had a ProcessMonitor that, when I tried to select a Process and debug it, kept debugging a different Process. I finally noticed that the Process that was being erroneously debugged had the same identity hash as the one I was trying to debug. Some code reading shows that the only way a ProcessMonitorClient has to identify a Process to its ProcessMonitorServer is via its identityHash. And identityHashes are not unique, in fact in 32-bit VW there are only 16383 possible ones, with each Process getting a random hash within this range. If I have 100 Processes, my chances of having at least two with the same identityHash are better than 1 in 4. This is slightly alarming. Operations such as "terminate" also have the same problem, and could have much greater impact on a debugging session. With small numbers of Processes, the chances of duplication are small. However, there doesn't seem to be any really good reason to do it the way it's being done. The main motivation seems to be to only hold a weak reference to the Process itself. The ProcessMonitorService has a WeakArray that contains Processes, and all other reference is done by identity hash. The service could fairly easily use another truly unique identifier (index in the WeakArray, perhaps). Alternatively, the ProcessMonitorRecords, shared by the service and the client, could be ephemerons and hold onto the actual Process, eliminating the WeakArray altogether. Regards, -Martin _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired. _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
It looks like it can be fixed with these changes...
Add: Process>>processHash ^(self identityHash * (ObjectMemory maximumIdentityHashValue + 1)) + interruptProtect identityHash or else simply: Process>>processHash ^interruptProtect then change these methods to send #processHash instead of #identityHash: ProcessMonitorService>>do: aBlock forProcess: anInteger ^aBlock value: (processes detect: [:p | p ~= 0 and: [p processHash = anInteger]] ifNone: [^nil]) ProcessMontiorRecord>>basicOn: aProcess ... hash := aProcess processHash. Paul Baumann -----Original Message----- From: Paul Baumann Sent: Tuesday, August 04, 2009 9:34 AM To: 'Martin McClure'; VWNC Subject: RE: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify processes I only spent about two minutes looking at it, but here is another trick to help uniquely identify a Process without a strong reference to the process... A way to indirectly reference the Process is through an identifier attribute; interruptProtect already looks suitable. Process.interruptProtect contains a Semaphore created unique to the process and that does not change. #processHash can return the bit shifted combination of both the process identityHash and the interruptProtect identityHash. Or, just gather and compare the objects from interruptProtect. WeakArrays are slow to cleanup. I'd avoid them for process level stuff. Paul Baumann -----Original Message----- From: [hidden email] [mailto:[hidden email]] On Behalf Of Martin McClure Sent: Monday, August 03, 2009 9:23 PM To: VWNC Subject: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify processes Today, I had a ProcessMonitor that, when I tried to select a Process and debug it, kept debugging a different Process. I finally noticed that the Process that was being erroneously debugged had the same identity hash as the one I was trying to debug. Some code reading shows that the only way a ProcessMonitorClient has to identify a Process to its ProcessMonitorServer is via its identityHash. And identityHashes are not unique, in fact in 32-bit VW there are only 16383 possible ones, with each Process getting a random hash within this range. If I have 100 Processes, my chances of having at least two with the same identityHash are better than 1 in 4. This is slightly alarming. Operations such as "terminate" also have the same problem, and could have much greater impact on a debugging session. With small numbers of Processes, the chances of duplication are small. However, there doesn't seem to be any really good reason to do it the way it's being done. The main motivation seems to be to only hold a weak reference to the Process itself. The ProcessMonitorService has a WeakArray that contains Processes, and all other reference is done by identity hash. The service could fairly easily use another truly unique identifier (index in the WeakArray, perhaps). Alternatively, the ProcessMonitorRecords, shared by the service and the client, could be ephemerons and hold onto the actual Process, eliminating the WeakArray altogether. Regards, -Martin _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired. _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
In reply to this post by Paul Baumann
Paul Baumann wrote:
> I only spent about two minutes looking at it, but here is another trick to help uniquely identify a Process without a strong reference to the process... > > A way to indirectly reference the Process is through an identifier attribute; interruptProtect already looks suitable. Process.interruptProtect contains a Semaphore created unique to the process and that does not change. #processHash can return the bit shifted combination of both the process identityHash and the interruptProtect identityHash. Or, just gather and compare the objects from interruptProtect. > Hmm. Combining the identity hashes still doesn't create a unique identifier. It does reduce the probability of a collision. It's easy enough to be truly unique (see below) that one might as well be truly unique. Hanging on to the interruptProtect itself would be unique, but would keep the interruptProtect alive. That may not be a problem. > WeakArrays are slow to cleanup. I'd avoid them for process level stuff. Keeping a Process instance alive doesn't create any problems that I know of, AFAIK it's just another object unless the ProcessorScheduler is working with it. And you have to reference a Process somehow. The choices seem to be: * Do an #allInstances every time you need to access any Process, then search the results for the one you want. Finding the one you want is slightly tricky; comparing the interruptProtect would work. This would be fairly slow. * Keep a strong reference to each Process. This keeps the Process alive. The main disadvantage of this approach is that if you leave a ProcessMonitor open while many Processes are created and discarded, you'll accumulate dead Processes in your list and it will be harder to find the ones you're interested in. * Keep a weak reference to the Process. This is the current approach, which seems reasonable. * Keep an ephemeron reference to the Process. This has approximately the same advantages and disadvantages as the weak reference. With either weak or ephemeron, by using a weak-valued dictionary you can create whatever unique key you want. This could be as simple as an 'Object new' for each key, passed back and forth in the record. Regards, -Martin _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
Martin,
Understood, combining the hashes only cuts odds of collision in half. That is why I also showed the method that answers interruptProtect to entirely avoid hash collision issues. I showed an existing attribute so the general idea is clear and simple. Process can be given a durable identifier that is simply an assigned object that is accessed by way of a method called #processIdentifier. The slow WeakArray cleanup that I mentioned is a non-issue if the collection is rebuilt with each query instead of maintained. Weakarray finalization delays affect enduring and maintained weakarray-based collections--the kind that we are most familiar with. Sorry to have mentioned that for this context. Your initial suggestion to use an index into the weak array was a good one. It looks like ProcessMonitorRecord.index was a half step toward that kind of implementation. ProcessMonitorRecord should be created with an index (the weekarray of all processes) and a position in that index. I'm glad you found this problem. Paul Baumann -----Original Message----- From: Martin McClure [mailto:[hidden email]] Sent: Tuesday, August 04, 2009 1:42 PM To: Paul Baumann Cc: VWNC Subject: Re: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify processes Importance: High Paul Baumann wrote: > I only spent about two minutes looking at it, but here is another trick to help uniquely identify a Process without a strong reference to the process... > > A way to indirectly reference the Process is through an identifier attribute; interruptProtect already looks suitable. Process.interruptProtect contains a Semaphore created unique to the process and that does not change. #processHash can return the bit shifted combination of both the process identityHash and the interruptProtect identityHash. Or, just gather and compare the objects from interruptProtect. > Hmm. Combining the identity hashes still doesn't create a unique identifier. It does reduce the probability of a collision. It's easy enough to be truly unique (see below) that one might as well be truly unique. Hanging on to the interruptProtect itself would be unique, but would keep the interruptProtect alive. That may not be a problem. > WeakArrays are slow to cleanup. I'd avoid them for process level stuff. Keeping a Process instance alive doesn't create any problems that I know of, AFAIK it's just another object unless the ProcessorScheduler is working with it. And you have to reference a Process somehow. The choices seem to be: * Do an #allInstances every time you need to access any Process, then search the results for the one you want. Finding the one you want is slightly tricky; comparing the interruptProtect would work. This would be fairly slow. * Keep a strong reference to each Process. This keeps the Process alive. The main disadvantage of this approach is that if you leave a ProcessMonitor open while many Processes are created and discarded, you'll accumulate dead Processes in your list and it will be harder to find the ones you're interested in. * Keep a weak reference to the Process. This is the current approach, which seems reasonable. * Keep an ephemeron reference to the Process. This has approximately the same advantages and disadvantages as the weak reference. With either weak or ephemeron, by using a weak-valued dictionary you can create whatever unique key you want. This could be as simple as an 'Object new' for each key, passed back and forth in the record. Regards, -Martin This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired. _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
A simple solution would be to enable the commands only when
the sampler has stopped. This would prohibit a different process with the same identity hash from replacing a previous one and a command can be safely issued. Of course there could still be some confusion when the display is updating and what you think is the same process really is not. But if it is one of your processes, you have the option of assigning a name to it when you create the process. Terry =========================================================== Terry Raymond Crafted Smalltalk 80 Lazywood Ln. Tiverton, RI 02878 (401) 624-4517 [hidden email] <http://www.craftedsmalltalk.com> =========================================================== > -----Original Message----- > From: [hidden email] [mailto:[hidden email]] On Behalf Of Paul Baumann > Sent: Tuesday, August 04, 2009 5:38 PM > To: 'Martin McClure' > Cc: VWNC > Subject: Re: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify > processes > > Martin, > > Understood, combining the hashes only cuts odds of collision in half. That is why I also showed the > method that answers interruptProtect to entirely avoid hash collision issues. I showed an existing > attribute so the general idea is clear and simple. Process can be given a durable identifier that is > simply an assigned object that is accessed by way of a method called #processIdentifier. > > The slow WeakArray cleanup that I mentioned is a non-issue if the collection is rebuilt with each > query instead of maintained. Weakarray finalization delays affect enduring and maintained weakarray- > based collections--the kind that we are most familiar with. Sorry to have mentioned that for this > context. > > Your initial suggestion to use an index into the weak array was a good one. It looks like > ProcessMonitorRecord.index was a half step toward that kind of implementation. ProcessMonitorRecord > should be created with an index (the weekarray of all processes) and a position in that index. > > I'm glad you found this problem. > > Paul Baumann > > -----Original Message----- > From: Martin McClure [mailto:[hidden email]] > Sent: Tuesday, August 04, 2009 1:42 PM > To: Paul Baumann > Cc: VWNC > Subject: Re: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify > processes > Importance: High > > Paul Baumann wrote: > > I only spent about two minutes looking at it, but here is another trick to help uniquely identify a > Process without a strong reference to the process... > > > > A way to indirectly reference the Process is through an identifier attribute; interruptProtect > already looks suitable. Process.interruptProtect contains a Semaphore created unique to the process > and that does not change. #processHash can return the bit shifted combination of both the process > identityHash and the interruptProtect identityHash. Or, just gather and compare the objects from > interruptProtect. > > > > Hmm. Combining the identity hashes still doesn't create a unique identifier. It does reduce the > probability of a collision. It's easy enough to be truly unique (see below) that one might as well be > truly unique. > > Hanging on to the interruptProtect itself would be unique, but would keep the interruptProtect alive. > That may not be a problem. > > > WeakArrays are slow to cleanup. I'd avoid them for process level stuff. > > Keeping a Process instance alive doesn't create any problems that I know of, AFAIK it's just another > object unless the ProcessorScheduler is working with it. And you have to reference a Process somehow. > The choices seem to be: > > * Do an #allInstances every time you need to access any Process, then search the results for the one > you want. Finding the one you want is slightly tricky; comparing the interruptProtect would work. > This would be fairly slow. > > * Keep a strong reference to each Process. This keeps the Process alive. > The main disadvantage of this approach is that if you leave a ProcessMonitor open while many > Processes are created and discarded, you'll accumulate dead Processes in your list and it will be > harder to find the ones you're interested in. > > * Keep a weak reference to the Process. This is the current approach, which seems reasonable. > > * Keep an ephemeron reference to the Process. This has approximately the same advantages and > disadvantages as the weak reference. > > With either weak or ephemeron, by using a weak-valued dictionary you can create whatever unique key > you want. This could be as simple as an 'Object new' for each key, passed back and forth in the > record. > > Regards, > > -Martin > > > This message may contain confidential information and is intended for specific recipients unless > explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this > message, please delete it and notify the sender. This message may not represent the opinion of > IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a > contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is > expected to provide safeguards from viruses and pursue alternate means of communication where privacy > or a binding message is desired. > > > _______________________________________________ > vwnc mailing list > [hidden email] > http://lists.cs.uiuc.edu/mailman/listinfo/vwnc _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
Terry Raymond wrote:
> A simple solution would be to enable the commands only when > the sampler has stopped. This would prohibit a different > process with the same identity hash from replacing > a previous one and a command can be safely issued. This is not a solution. In the real-world case I ran into, there were two concurrently live processes with the same identity hash. Regards, -Martin _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
AR 58888.
-----Original Message----- From: [hidden email] [mailto:[hidden email]] On Behalf Of Martin McClure Sent: Tuesday, August 04, 2009 4:04 PM To: Raymond, Terry Cc: 'VWNC' Subject: Re: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify processes Terry Raymond wrote: > A simple solution would be to enable the commands only when the > sampler has stopped. This would prohibit a different process with the > same identity hash from replacing a previous one and a command can be > safely issued. This is not a solution. In the real-world case I ran into, there were two concurrently live processes with the same identity hash. Regards, -Martin _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc _______________________________________________ vwnc mailing list [hidden email] http://lists.cs.uiuc.edu/mailman/listinfo/vwnc |
Free forum by Nabble | Edit this page |