[vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify processes

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

[vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify processes

Martin McClure
Today, I had a ProcessMonitor that, when I tried to select a Process and
debug it, kept debugging a different Process. I finally noticed that the
Process that was being erroneously debugged had the same identity hash
as the one I was trying to debug.

Some code reading shows that the only way a ProcessMonitorClient has to
identify a Process to its ProcessMonitorServer is via its identityHash.
And identityHashes are not unique, in fact in 32-bit VW there are only
16383 possible ones, with each Process getting a random hash within this
range. If I have 100 Processes, my chances of having at least two with
the same identityHash are better than 1 in 4.

This is slightly alarming. Operations such as "terminate" also have the
same problem, and could have much greater impact on a debugging session.

With small numbers of Processes, the chances of duplication are small.
However, there doesn't seem to be any really good reason to do it the
way it's being done. The main motivation seems to be to only hold a weak
reference to the Process itself. The ProcessMonitorService has a
WeakArray that contains Processes, and all other reference is done by
identity hash. The service could fairly easily use another truly unique
identifier (index in the WeakArray, perhaps).

Alternatively, the ProcessMonitorRecords, shared by the service and the
client, could be ephemerons and hold onto the actual Process,
eliminating the WeakArray altogether.

Regards,

-Martin
_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
Reply | Threaded
Open this post in threaded view
|

Re: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify processes

Paul Baumann
I only spent about two minutes looking at it, but here is another trick to help uniquely identify a Process without a strong reference to the process...

A way to indirectly reference the Process is through an identifier attribute; interruptProtect already looks suitable. Process.interruptProtect contains a Semaphore created unique to the process and that does not change. #processHash can return the bit shifted combination of both the process identityHash and the interruptProtect identityHash. Or, just gather and compare the objects from interruptProtect.

WeakArrays are slow to cleanup. I'd avoid them for process level stuff.

Paul Baumann


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Martin McClure
Sent: Monday, August 03, 2009 9:23 PM
To: VWNC
Subject: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify processes

Today, I had a ProcessMonitor that, when I tried to select a Process and debug it, kept debugging a different Process. I finally noticed that the Process that was being erroneously debugged had the same identity hash as the one I was trying to debug.

Some code reading shows that the only way a ProcessMonitorClient has to identify a Process to its ProcessMonitorServer is via its identityHash.
And identityHashes are not unique, in fact in 32-bit VW there are only
16383 possible ones, with each Process getting a random hash within this range. If I have 100 Processes, my chances of having at least two with the same identityHash are better than 1 in 4.

This is slightly alarming. Operations such as "terminate" also have the same problem, and could have much greater impact on a debugging session.

With small numbers of Processes, the chances of duplication are small.
However, there doesn't seem to be any really good reason to do it the way it's being done. The main motivation seems to be to only hold a weak reference to the Process itself. The ProcessMonitorService has a WeakArray that contains Processes, and all other reference is done by identity hash. The service could fairly easily use another truly unique identifier (index in the WeakArray, perhaps).

Alternatively, the ProcessMonitorRecords, shared by the service and the client, could be ephemerons and hold onto the actual Process, eliminating the WeakArray altogether.

Regards,

-Martin
_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc


This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired.


_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
Reply | Threaded
Open this post in threaded view
|

Re: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify processes

Paul Baumann
It looks like it can be fixed with these changes...

Add:

Process>>processHash
        ^(self identityHash * (ObjectMemory maximumIdentityHashValue + 1))
                + interruptProtect identityHash

or else simply:

Process>>processHash
        ^interruptProtect

then change these methods to send #processHash instead of #identityHash:

ProcessMonitorService>>do: aBlock forProcess: anInteger
        ^aBlock value: (processes detect: [:p | p ~= 0 and: [p processHash = anInteger]] ifNone: [^nil])

ProcessMontiorRecord>>basicOn: aProcess

...
        hash := aProcess processHash.


Paul Baumann


-----Original Message-----
From: Paul Baumann
Sent: Tuesday, August 04, 2009 9:34 AM
To: 'Martin McClure'; VWNC
Subject: RE: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify processes

I only spent about two minutes looking at it, but here is another trick to help uniquely identify a Process without a strong reference to the process...

A way to indirectly reference the Process is through an identifier attribute; interruptProtect already looks suitable. Process.interruptProtect contains a Semaphore created unique to the process and that does not change. #processHash can return the bit shifted combination of both the process identityHash and the interruptProtect identityHash. Or, just gather and compare the objects from interruptProtect.

WeakArrays are slow to cleanup. I'd avoid them for process level stuff.

Paul Baumann


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Martin McClure
Sent: Monday, August 03, 2009 9:23 PM
To: VWNC
Subject: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify processes

Today, I had a ProcessMonitor that, when I tried to select a Process and debug it, kept debugging a different Process. I finally noticed that the Process that was being erroneously debugged had the same identity hash as the one I was trying to debug.

Some code reading shows that the only way a ProcessMonitorClient has to identify a Process to its ProcessMonitorServer is via its identityHash.
And identityHashes are not unique, in fact in 32-bit VW there are only
16383 possible ones, with each Process getting a random hash within this range. If I have 100 Processes, my chances of having at least two with the same identityHash are better than 1 in 4.

This is slightly alarming. Operations such as "terminate" also have the same problem, and could have much greater impact on a debugging session.

With small numbers of Processes, the chances of duplication are small.
However, there doesn't seem to be any really good reason to do it the way it's being done. The main motivation seems to be to only hold a weak reference to the Process itself. The ProcessMonitorService has a WeakArray that contains Processes, and all other reference is done by identity hash. The service could fairly easily use another truly unique identifier (index in the WeakArray, perhaps).

Alternatively, the ProcessMonitorRecords, shared by the service and the client, could be ephemerons and hold onto the actual Process, eliminating the WeakArray altogether.

Regards,

-Martin
_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc


This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired.


_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
Reply | Threaded
Open this post in threaded view
|

Re: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify processes

Martin McClure
In reply to this post by Paul Baumann
Paul Baumann wrote:
> I only spent about two minutes looking at it, but here is another trick to help uniquely identify a Process without a strong reference to the process...
>
> A way to indirectly reference the Process is through an identifier attribute; interruptProtect already looks suitable. Process.interruptProtect contains a Semaphore created unique to the process and that does not change. #processHash can return the bit shifted combination of both the process identityHash and the interruptProtect identityHash. Or, just gather and compare the objects from interruptProtect.
>

Hmm. Combining the identity hashes still doesn't create a unique
identifier. It does reduce the probability of a collision. It's easy
enough to be truly unique (see below) that one might as well be truly
unique.

Hanging on to the interruptProtect itself would be unique, but would
keep the interruptProtect alive. That may not be a problem.

> WeakArrays are slow to cleanup. I'd avoid them for process level stuff.

Keeping a Process instance alive doesn't create any problems that I know
of, AFAIK it's just another object unless the ProcessorScheduler is
working with it. And you have to reference a Process somehow. The
choices seem to be:

* Do an #allInstances every time you need to access any Process, then
search the results for the one you want. Finding the one you want is
slightly tricky; comparing the interruptProtect would work. This would
be fairly slow.

* Keep a strong reference to each Process. This keeps the Process alive.
The main disadvantage of this approach is that if you leave a
ProcessMonitor open while many Processes are created and discarded,
you'll accumulate dead Processes in your list and it will be harder to
find the ones you're interested in.

* Keep a weak reference to the Process. This is the current approach,
which seems reasonable.

* Keep an ephemeron reference to the Process. This has approximately the
same advantages and disadvantages as the weak reference.

With either weak or ephemeron, by using a weak-valued dictionary you can
create whatever unique key you want. This could be as simple as an
'Object new' for each key, passed back and forth in the record.

Regards,

-Martin
_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
Reply | Threaded
Open this post in threaded view
|

Re: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify processes

Paul Baumann
Martin,

Understood, combining the hashes only cuts odds of collision in half. That is why I also showed the method that answers interruptProtect to entirely avoid hash collision issues. I showed an existing attribute so the general idea is clear and simple. Process can be given a durable identifier that is simply an assigned object that is accessed by way of a method called #processIdentifier.

The slow WeakArray cleanup that I mentioned is a non-issue if the collection is rebuilt with each query instead of maintained. Weakarray finalization delays affect enduring and maintained weakarray-based collections--the kind that we are most familiar with. Sorry to have mentioned that for this context.

Your initial suggestion to use an index into the weak array was a good one. It looks like ProcessMonitorRecord.index was a half step toward that kind of implementation. ProcessMonitorRecord should be created with an index (the weekarray of all processes) and a position in that index.

I'm glad you found this problem.

Paul Baumann

-----Original Message-----
From: Martin McClure [mailto:[hidden email]]
Sent: Tuesday, August 04, 2009 1:42 PM
To: Paul Baumann
Cc: VWNC
Subject: Re: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify processes
Importance: High

Paul Baumann wrote:
> I only spent about two minutes looking at it, but here is another trick to help uniquely identify a Process without a strong reference to the process...
>
> A way to indirectly reference the Process is through an identifier attribute; interruptProtect already looks suitable. Process.interruptProtect contains a Semaphore created unique to the process and that does not change. #processHash can return the bit shifted combination of both the process identityHash and the interruptProtect identityHash. Or, just gather and compare the objects from interruptProtect.
>

Hmm. Combining the identity hashes still doesn't create a unique identifier. It does reduce the probability of a collision. It's easy enough to be truly unique (see below) that one might as well be truly unique.

Hanging on to the interruptProtect itself would be unique, but would keep the interruptProtect alive. That may not be a problem.

> WeakArrays are slow to cleanup. I'd avoid them for process level stuff.

Keeping a Process instance alive doesn't create any problems that I know of, AFAIK it's just another object unless the ProcessorScheduler is working with it. And you have to reference a Process somehow. The choices seem to be:

* Do an #allInstances every time you need to access any Process, then search the results for the one you want. Finding the one you want is slightly tricky; comparing the interruptProtect would work. This would be fairly slow.

* Keep a strong reference to each Process. This keeps the Process alive.
The main disadvantage of this approach is that if you leave a ProcessMonitor open while many Processes are created and discarded, you'll accumulate dead Processes in your list and it will be harder to find the ones you're interested in.

* Keep a weak reference to the Process. This is the current approach, which seems reasonable.

* Keep an ephemeron reference to the Process. This has approximately the same advantages and disadvantages as the weak reference.

With either weak or ephemeron, by using a weak-valued dictionary you can create whatever unique key you want. This could be as simple as an 'Object new' for each key, passed back and forth in the record.

Regards,

-Martin


This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired.


_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
Reply | Threaded
Open this post in threaded view
|

Re: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify processes

Terry Raymond
A simple solution would be to enable the commands only when
the sampler has stopped. This would prohibit a different
process with the same identity hash from replacing
a previous one and a command can be safely issued.

Of course there could still be some confusion when the
display is updating and what you think is the same process
really is not. But if it is one of your processes, you have
the option of assigning a name to it when you create the
process.

Terry
 
===========================================================
Terry Raymond
Crafted Smalltalk
80 Lazywood Ln.
Tiverton, RI  02878
(401) 624-4517      [hidden email]
<http://www.craftedsmalltalk.com>
===========================================================

> -----Original Message-----
> From: [hidden email] [mailto:[hidden email]] On Behalf Of Paul Baumann
> Sent: Tuesday, August 04, 2009 5:38 PM
> To: 'Martin McClure'
> Cc: VWNC
> Subject: Re: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify
> processes
>
> Martin,
>
> Understood, combining the hashes only cuts odds of collision in half. That is why I also showed the
> method that answers interruptProtect to entirely avoid hash collision issues. I showed an existing
> attribute so the general idea is clear and simple. Process can be given a durable identifier that is
> simply an assigned object that is accessed by way of a method called #processIdentifier.
>
> The slow WeakArray cleanup that I mentioned is a non-issue if the collection is rebuilt with each
> query instead of maintained. Weakarray finalization delays affect enduring and maintained weakarray-
> based collections--the kind that we are most familiar with. Sorry to have mentioned that for this
> context.
>
> Your initial suggestion to use an index into the weak array was a good one. It looks like
> ProcessMonitorRecord.index was a half step toward that kind of implementation. ProcessMonitorRecord
> should be created with an index (the weekarray of all processes) and a position in that index.
>
> I'm glad you found this problem.
>
> Paul Baumann
>
> -----Original Message-----
> From: Martin McClure [mailto:[hidden email]]
> Sent: Tuesday, August 04, 2009 1:42 PM
> To: Paul Baumann
> Cc: VWNC
> Subject: Re: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify
> processes
> Importance: High
>
> Paul Baumann wrote:
> > I only spent about two minutes looking at it, but here is another trick to help uniquely identify a
> Process without a strong reference to the process...
> >
> > A way to indirectly reference the Process is through an identifier attribute; interruptProtect
> already looks suitable. Process.interruptProtect contains a Semaphore created unique to the process
> and that does not change. #processHash can return the bit shifted combination of both the process
> identityHash and the interruptProtect identityHash. Or, just gather and compare the objects from
> interruptProtect.
> >
>
> Hmm. Combining the identity hashes still doesn't create a unique identifier. It does reduce the
> probability of a collision. It's easy enough to be truly unique (see below) that one might as well be
> truly unique.
>
> Hanging on to the interruptProtect itself would be unique, but would keep the interruptProtect alive.
> That may not be a problem.
>
> > WeakArrays are slow to cleanup. I'd avoid them for process level stuff.
>
> Keeping a Process instance alive doesn't create any problems that I know of, AFAIK it's just another
> object unless the ProcessorScheduler is working with it. And you have to reference a Process somehow.
> The choices seem to be:
>
> * Do an #allInstances every time you need to access any Process, then search the results for the one
> you want. Finding the one you want is slightly tricky; comparing the interruptProtect would work.
> This would be fairly slow.
>
> * Keep a strong reference to each Process. This keeps the Process alive.
> The main disadvantage of this approach is that if you leave a ProcessMonitor open while many
> Processes are created and discarded, you'll accumulate dead Processes in your list and it will be
> harder to find the ones you're interested in.
>
> * Keep a weak reference to the Process. This is the current approach, which seems reasonable.
>
> * Keep an ephemeron reference to the Process. This has approximately the same advantages and
> disadvantages as the weak reference.
>
> With either weak or ephemeron, by using a weak-valued dictionary you can create whatever unique key
> you want. This could be as simple as an 'Object new' for each key, passed back and forth in the
> record.
>
> Regards,
>
> -Martin
>
>
> This message may contain confidential information and is intended for specific recipients unless
> explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this
> message, please delete it and notify the sender. This message may not represent the opinion of
> IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a
> contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is
> expected to provide safeguards from viruses and pursue alternate means of communication where privacy
> or a binding message is desired.
>
>
> _______________________________________________
> vwnc mailing list
> [hidden email]
> http://lists.cs.uiuc.edu/mailman/listinfo/vwnc

_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
Reply | Threaded
Open this post in threaded view
|

Re: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify processes

Martin McClure
Terry Raymond wrote:
> A simple solution would be to enable the commands only when
> the sampler has stopped. This would prohibit a different
> process with the same identity hash from replacing
> a previous one and a command can be safely issued.

This is not a solution. In the real-world case I ran into, there were
two concurrently live processes with the same identity hash.

Regards,

-Martin
_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
Reply | Threaded
Open this post in threaded view
|

Re: [vwnc] [bug][vw7.6]Process monitor tries to use identity hash to uniquely identify processes

Andres Valloud-6
AR 58888.

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On
Behalf Of Martin McClure
Sent: Tuesday, August 04, 2009 4:04 PM
To: Raymond, Terry
Cc: 'VWNC'
Subject: Re: [vwnc] [bug][vw7.6]Process monitor tries to use identity
hash to uniquely identify processes

Terry Raymond wrote:
> A simple solution would be to enable the commands only when the
> sampler has stopped. This would prohibit a different process with the
> same identity hash from replacing a previous one and a command can be
> safely issued.

This is not a solution. In the real-world case I ran into, there were
two concurrently live processes with the same identity hash.

Regards,

-Martin
_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc

_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc