I'm using "%VW75%\bin\win\debug\vwntconsole.exe -o10s <image>" to attempt to debug what appears to be random VM freezes. I haven't caught a freeze yet, but so far I've noticed that the VM seems to pause for a few seconds after:

Prim fail in ..\..\..\src\mman\mmAllocate.c @ 247

Also, there are many thousands of these types of failures:

Prim fail in ..\..\..\src\mman\mmSubscript.c @ 827

The mmAllocate failures tend to follow mmSubscript failures, but not always. The position after the @ symbol is sometimes different, but the values shown are the most common.

Anyone know what those failures are? Are they real problems, or just hooks for things like memory growth?

Paul Baumann
This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired. |
Hi Paul,

these failures merely indicate that a primitive failed. They are not a serious problem and are typically handled in the image.

Paul Baumann wrote:
> I'm using "%VW75%\bin\win\debug\vwntconsole.exe -o10s <image>" to
> attempt to debug what appears to be random VM freezes. I haven't caught
> a freeze yet, but so far I've noticed that the VM seems to pause for a
> few seconds after:

-o10s does not list every method that is executed. It only shows when a method is translated or linked to, ... That is, typically only the first time a method is called for an instance of a particular class shows up in the log.

> Prim fail in ..\..\..\src\mman\mmAllocate.c @ 247

new is called for an indexable class where new: would be appropriate.

> Also, there are many thousands of these types of failures:
>
> Prim fail in ..\..\..\src\mman\mmSubscript.c @ 827

<primitive: 105>
ByteEncodedString replaceFrom: start to: stop with: sourceString startingAt: sourceStart

Often only the simple cases are handled in the primitive. More complicated cases (e.g. sourceString and the receiver differ in type) are done in Smalltalk.

HTH,
Ralf Propach

--
Ralf Propach, [hidden email]
Tel: +49 231 975 99 38 Fax: +49 231 975 99 20
Georg Heeg eK (Dortmund)
Handelsregister: Amtsgericht Dortmund A 12812
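The pattern Ralf describes for primitive 105 (a fast path that handles only the simple cases and otherwise fails over to general code) can be sketched roughly as below. This is an illustration in Python, not VisualWorks code: `PrimitiveFailed`, `primitive_replace` and `replace_from_to` are hypothetical names, and the exact conditions under which the real primitive fails are an assumption.

```python
class PrimitiveFailed(Exception):
    """Signals that the fast path declined the request."""


def primitive_replace(dest, start, stop, source, source_start):
    """Hypothetical fast path: only byte-to-byte copies that are in range."""
    if not isinstance(dest, bytearray) or not isinstance(source, (bytes, bytearray)):
        raise PrimitiveFailed("receiver and source differ in type")
    count = stop - start + 1
    in_range = (count >= 0 and 1 <= start and stop <= len(dest)
                and 1 <= source_start
                and source_start + count - 1 <= len(source))
    if not in_range:
        raise PrimitiveFailed("index out of range")
    # 1-based inclusive indices, as in the Smalltalk selector.
    dest[start - 1:stop] = source[source_start - 1:source_start - 1 + count]


def replace_from_to(dest, start, stop, source, source_start):
    """Try the fast path first; on failure run the slower general code."""
    try:
        primitive_replace(dest, start, stop, source, source_start)
        return "primitive"
    except PrimitiveFailed:
        pass  # Not an error in itself: the general code below takes over.
    count = stop - start + 1
    if (start < 1 or stop > len(dest) or source_start < 1
            or source_start + count - 1 > len(source)):
        raise IndexError("index out of range")
    for offset in range(count):
        element = source[source_start - 1 + offset]
        # Convert characters to byte values one element at a time.
        dest[start - 1 + offset] = ord(element) if isinstance(element, str) else element
    return "fallback"
```

In this sketch a "prim fail" is only a signal that the slow path ran, which would be consistent with seeing thousands of such log lines in a healthy image.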
Ok. After I posted I realized that I had the VM source code and could track these down myself. I see that the mmAllocate is just a memory growth thing. The mmSubscript failures are so frequent that they aren't likely a real problem either. Thanks!

Paul Baumann
Something happened to me (I doubt it's your problem, it would be too much of a coincidence :) ), but just for the record -- my image appears to freeze sometimes -- it usually unfreezes eventually -- can be anywhere from a couple of seconds to a minute or more. Turns out it's my PC (as noted on this forum re something else a few days ago). What happens is that my CPU clock suddenly either stops or goes backwards. The end result is that

(Delay forMilliseconds: 100) wait

can take from 100ms up to a minute or two -- and during that time the image sometimes appears totally frozen (or sometimes interruptible) depending on where the Delay is being invoked.

--
Dennis Smith +1 416.798.7948
Cherniak Software Development Corporation Fax: +1 416.798.0948
509-2001 Sheppard Avenue East [hidden email]
Toronto, ON M2J 4Z8 sip:dennis@...
Canada http://www.CherniakSoftware.com
Entrance off Yorkland Blvd south of Sheppard Ave east of the DVP
Ouch. The CPU clock goes backwards? I recall someone saying that VW does something to ensure that chronometric values always increase. It's kind of hard to adjust values after they've been computed (for a future chronometric value that isn't reached in the expected duration). Perhaps the resumptionTime of Delay instances could be adjusted by the amount that the CPU clock has stepped backwards. If you are using some program to update your CPU time then see if you can find one that only makes gradual positive adjustments to get back in sync. It sounds like a hardware problem if the CPU clock is being adjusted in such large increments. Just throwing random ideas out there... sorry if already discussed elsewhere.
I expect the freeze you're debugging could be
interrupted. The freeze I'm attempting to debug can't be broken into with ctrl-b
or ctrl-\. It seems unresponsive at the VM level. It is random, but often
involves StORE queries. It doesn't appear communication/socket related, but that
hasn't been completely ruled out. Right now I don't have a scenario to
reproduce. It happens 1-2 times a day.
Paul Baumann
Paul,

the freeze issue is not to do with maintaining a monotonic clock but with whether the OS delivers a timeout correctly. The VM can always hack the clock to make it appear that the clock only increases monotonically (it doesn't, by the way), but that won't help if the VM asks the OS for an event at a specific time (so it can implement a delay) but the OS fails to deliver it (e.g. because its clock is wonky).

The VM implements delays on Windows by having a high-priority thread spin-loop waiting on an OS semaphore with a timeout. When the thread is quiescent (no delays are scheduled) the timeout is infinite. When the VM wants to schedule a delay it sets the desired timeout and signals the semaphore, unblocking the delay thread. The delay thread then reenters the wait with the desired timeout. When the wait-with-timeout returns, the thread checks whether the timeout expired or the semaphore was signalled, signalling the VM that the delay has expired if the timeout was reached. I.e. the VM depends on OS services (on Windows and all other platforms) for its delay services. If the OS is broken VW will be broken.

I can't see a low-overhead alternative. VW is dependent on the OS substrate for delays.
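The delay-thread scheme described above can be sketched roughly as follows. This is a minimal Python illustration, not the VM's code: `threading.Semaphore` stands in for the Windows semaphore, and `DelayThread`, `schedule`, `stop` and the `on_expired` callback are hypothetical names chosen for this example.

```python
import threading


class DelayThread:
    """A thread that waits on a semaphore with a timeout to implement delays."""

    def __init__(self, on_expired):
        self._wakeup = threading.Semaphore(0)
        self._lock = threading.Lock()
        self._timeout = None      # None means wait forever (quiescent).
        self._generation = 0      # Bumped on every schedule() call.
        self._running = True
        self._on_expired = on_expired
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def schedule(self, seconds):
        """Set the desired timeout, then signal the semaphore to wake the thread."""
        with self._lock:
            self._timeout = seconds
            self._generation += 1
        self._wakeup.release()

    def stop(self):
        with self._lock:
            self._running = False
        self._wakeup.release()
        self._thread.join()

    def _run(self):
        while True:
            with self._lock:
                if not self._running:
                    return
                timeout, generation = self._timeout, self._generation
            # acquire() returns True if signalled, False if the timeout expired.
            if self._wakeup.acquire(timeout=timeout):
                continue          # Signalled: re-read the timeout and wait again.
            with self._lock:
                if generation != self._generation:
                    continue      # A new delay was scheduled while timing out.
                self._timeout = None
            self._on_expired()    # Timeout reached: tell the "VM" the delay is over.
```

The only source of time in this sketch is the timed wait itself, so if the OS returns from that wait late, `on_expired` fires late and nothing in the thread can detect it. That is the dependency on the OS that the post describes.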
Doesn't bother me that a broken OS (or H/W perhaps) can break VW -- I would be surprised otherwise.
We are just saying forget it and getting a new PC -- it's really neat to watch clock values on the thing :)

--
Dennis Smith
I wonder how ntp systems adjust computer clock backwards... Do they slowly trickle time back or make each second last a fraction longer?

-Boris

--
+1.604.689.0322
DeepCove Labs Ltd.
4th floor 595 Howe Street
Vancouver, Canada V6C 2T5
http://tinyurl.com/r7uw4
[hidden email]

CONFIDENTIALITY NOTICE
This email is intended only for the persons named in the message header. Unless otherwise indicated, it contains information that is private and confidential. If you have received it in error, please notify the sender and delete the entire message including any attachments. Thank you.
Boris Popov wrote:
> I wonder how ntp systems adjust computer clock backwards... Do they
> slowly trickle time back or make each second last a fraction longer?

It varies by configuration, but in some configurations NTP tries to keep the clock monotonic, and only varies the length of a second by a small fraction.

-Martin
Boris Popov wrote:
> I wonder how ntp systems adjust computer clock backwards... Do they
> slowly trickle time back or make each second last a fraction longer?

NTP4 has a 'panic threshold' of 1000 seconds. If the time discrepancy is larger than this threshold the daemon will abort with a warning (and the operator is expected to adjust the clock to within the panic threshold). Older daemons (NTP3 and before) will 'step' the clock once at startup if the deviation is outside the panic threshold (which is much smaller on these versions, 16 seconds IIRC).

Once running within the panic threshold NTP will adjust the clock rate such that OS time stays monotonic. So the clock is never run backwards, just a bit slower until the 'real world' time catches up with it.

So essentially NTP4 never 'steps' the clock and NTP3 only once at startup of the daemon.

Find tons of further info at http://www.ntp.org

R -
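To get a feel for what "just a bit slower until the real world time catches up" means in practice, here is a small back-of-the-envelope sketch in Python. The thread does not state a slew rate; the default of 500 parts per million is an assumption made for illustration, and `slew_duration_seconds` is a hypothetical helper, not part of any NTP tool.

```python
def slew_duration_seconds(offset_seconds, max_slew_ppm=500):
    """Time needed to absorb a clock offset by running the clock fast or slow.

    offset_seconds: how far the local clock is from true time (sign ignored).
    max_slew_ppm:   assumed maximum rate adjustment, in parts per million.
    """
    if max_slew_ppm <= 0:
        raise ValueError("max_slew_ppm must be positive")
    # At N ppm the clock gains or loses N microseconds per real second.
    return abs(offset_seconds) * 1_000_000 / max_slew_ppm
```

Under that assumption, a clock that is one second ahead takes `slew_duration_seconds(1.0)` seconds, i.e. 2000 seconds (a little over half an hour), to drift back into sync without ever running backwards.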