> The welcome (sic) message on the SqueakSource home page is overly alarming,
> and IMHO should be changed to something that encourages new projects to
> be created elsewhere, but that does not cause alarm for existing users.
> But that is a policy decision, and I will defer to the Squeak board and
> the Squeak community on this.

+1

I think we should delete the "ATTENTION!" line but leave the note about
creation of projects being disabled. The entire "Migration to SmalltalkHub"
section should be deleted. What's more alarming than a self-repelling
web-site? That information belongs on SmalltalkHub, not SqueakSource.

On 14.11.2013, at 13:04, Chris Muller <[hidden email]> wrote:

> I think we should delete the "ATTENTION!" line but leave the
> note about creation of projects being disabled.

Yep.

- Bert -

Yes, let's remove the alarms. But it has to be functional.
Currently, I can connect on the web interface and I can download, but all
my uploads are failing with timeouts... Any idea?

On Thu, Nov 14, 2013 at 10:55:55PM +0100, Nicolas Cellier wrote:
> Yes, let's remove the alarms.
> But it has to be functional.
> Currently, I can connect on the web interface and I can download, but all
> my uploads are failing with timeouts... Any idea?

I made some changes to the launch script for squeaksource.com earlier today:

  http://lists.squeakfoundation.org/pipermail/box-admins/2013-November/001598.html

It is possible that this may be related to the problem you are seeing (I am
not sure at this point).

I tried loading some packages from squeaksource a few minutes ago, and it
was slow but functional. However, checking the image I see 15 active
SSession handlers in a ProcessBrowser. This is not right, and it appears
to be a recurrence of a problem that we have seen previously on an
intermittent basis, both on squeaksource.com and (probably) on
source.squeak.org.

I will terminate the runaway session handler processes, which I hope will
clear up the immediate problem.

More to follow I'm sure ...

Dave

Thanks David, it went back to normal speed for a moment, but is now
rejecting my upload requests again (most time out, some work
intermittently)...

The image is now showing 10 session handler processes in the process
browser. Presumably these are related to the failed upload requests.

I do not understand the cause of this problem, and it may be that I
should revert the changes that I made earlier today (in which I put
squeaksource under the control of supervise(8) for starting the
image).

But I suspect that the problem lies elsewhere, so for now I will make
a copy of the broken image for debugging, then terminate the excess
processes. This should clear the problem temporarily. I will follow
up with another email within about 30 minutes.

Dave

On Fri, Nov 15, 2013 at 01:28:18AM +0100, Nicolas Cellier wrote:
> Thanks David, it went back to normal speed for a moment, but is now
> rejecting my upload requests again (most will timeout, some do work
> intermittently)...

Attached is a screen shot of the process browser in the squeaksource.com
image, showing the excess SSSession processes. They are deadlocked on
accessing DateAndTime now, which contains a critical section using the
LastTickSemaphore in class DateAndTime.

In the squeaksource.com image, LastTickSemaphore has 0 excess signals,
whereas in other images I look at, it has 1 excess signal. This looks
to me like a mutex that has gotten confused.

I sent a signal to LastTickSemaphore in class DateAndTime, and now it
looks like a mutex again. Let's see if that clears the problem.

This certainly has a bad smell about it :-( But I note also that
we are running our SqueakSource services on older images, and a number
of changes have been made to DateAndTime since then.

Nicolas, I will send private email to give you the VNC password for access
to the squeaksource.com image in case you need it (I am going to get some
sleep soon).

Dave

On Thu, Nov 14, 2013 at 08:26:45PM -0500, David T. Lewis wrote:
> The image is now showing 10 session handler processes in the process
> browser. Presumably these are related to the failed upload requests.

Attachment: SqueakSourceProcesses.png (81K)

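The semaphore state described in this message can be illustrated with a small sketch. It is written in Python rather than Smalltalk and is not the DateAndTime code; the names SemaphoreMutex, lose_signal, signal and is_stuck are hypothetical. It assumes only what the message states: a semaphore used as a mutex rests at one excess signal, and once it sits at zero every caller waits until someone signals it.

```python
import threading


class SemaphoreMutex:
    """A mutex built from a counting semaphore (hypothetical illustration)."""

    def __init__(self):
        # One excess signal means "unlocked".
        self._sem = threading.Semaphore(1)

    def critical(self, block, timeout=0.05):
        """Run block() while holding the mutex; give up after `timeout` seconds."""
        if not self._sem.acquire(timeout=timeout):
            raise TimeoutError("no excess signal: the mutex looks permanently held")
        try:
            return block()
        finally:
            self._sem.release()  # restore the single excess signal

    def lose_signal(self):
        """Simulate a holder that waited on the semaphore but never signalled it."""
        self._sem.acquire()

    def signal(self):
        """Send one signal by hand, as an operator might on a live system."""
        self._sem.release()


def is_stuck(mutex):
    """Return True if a caller cannot enter the critical section."""
    try:
        mutex.critical(lambda: None)
        return False
    except TimeoutError:
        return True


if __name__ == "__main__":
    m = SemaphoreMutex()
    print("healthy, stuck?", is_stuck(m))
    m.lose_signal()
    print("signal lost, stuck?", is_stuck(m))
    m.signal()
    print("signalled by hand, stuck?", is_stuck(m))
```

The sketch says nothing about how the signal was lost on the real image; it only shows why a single manual signal is enough to make the semaphore behave like a mutex again.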
On Thu, Nov 14, 2013 at 8:00 PM, David T. Lewis <[hidden email]> wrote:
> Attached is a screen shot of the process browser in the squeaksource.com
> image, showing the excess SSSession processes. They are deadlocked on
> accessing DateAndTime now, which contains a critical section using the
> LastTickSemaphore in class DateAndTime.
>
> In the squeaksource.com image, LastTickSemaphore has 0 excess signals,
> whereas in other images I look at, it has 1 excess signal. This looks
> to me like a mutex that has gotten confused.

This might be a problem I think I observed with using Seaside's
#returnResponse: inside a Mutex's critical: block. The block is
entered, the Sema waited but never resignaled, leaving all subsequent
processes stuck waiting.

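The general failure mode described here, a critical section that is entered but never signalled on the way out, can be sketched as follows. This is a generic Python illustration, not Seaside or Squeak code: ResponseSent is a hypothetical stand-in for any non-local exit, and whether #returnResponse: really bypasses the unwind protection of critical: is not established in this thread.

```python
import threading


class ResponseSent(Exception):
    """Hypothetical stand-in for a non-local exit that unwinds the stack."""


def send_response():
    raise ResponseSent()


def unsafe_critical(sem, block):
    """Wait, run block, signal. The signal is skipped if block() exits abnormally."""
    sem.acquire()
    result = block()
    sem.release()
    return result


def safe_critical(sem, block):
    """Same, but the signal is sent on every exit path."""
    sem.acquire()
    try:
        return block()
    finally:
        sem.release()


def is_free(sem):
    """Probe without blocking: True if the semaphore still has its excess signal."""
    if sem.acquire(blocking=False):
        sem.release()
        return True
    return False


def run(critical):
    """Leave a critical section through a non-local exit, then report the mutex state."""
    sem = threading.Semaphore(1)
    try:
        critical(sem, send_response)
    except ResponseSent:
        pass
    return is_free(sem)


if __name__ == "__main__":
    print("unprotected section leaves mutex free?", run(unsafe_critical))
    print("protected section leaves mutex free?", run(safe_critical))
```

If the exit path skips the release, the first caller gets through and every later caller waits forever, which matches the symptom of processes piling up behind one semaphore.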
In reply to this post by David T. Lewis
Hi David,
On Thu, Nov 14, 2013 at 6:00 PM, David T. Lewis <[hidden email]> wrote:

> Attached is a screen shot of the process browser in the squeaksource.com [...]

All this LastTickSemaphore stuff is complete nonsense, wasting on average
1/2 a second on startup spinning until the clock rolls over. If we move to
the 64-bit microsecond timebase which is provided by the Cog time
primitives we don't need to sync the second and the millisecond clocks
because they are replaced by a single microsecond clock. If the current
Interpreter VMs do support the 64-bit microsecond primitives I suggest we
move ASAP. We've already done this in our images at Cadence and been
running happily with it for several months. Would this help?

best, Eliot
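
As a rough illustration of the single-clock idea, here is a minimal Python sketch. It is not the Cog primitive or the Squeak DateAndTime code: the function names are hypothetical, the clock is Python's time.time_ns() (Unix epoch), and the only point being made is that one integer microsecond reading yields both the seconds and the sub-second part, so no shared "last tick" state needs to be kept in step or protected by a semaphore.

```python
import time

MICROS_PER_SECOND = 1_000_000


def utc_microseconds():
    """Read one integer microsecond clock (here derived from time.time_ns())."""
    return time.time_ns() // 1_000


def split_microseconds(micros):
    """Derive whole seconds and the sub-second microseconds from a single reading."""
    return divmod(micros, MICROS_PER_SECOND)


def now():
    """Return (seconds, microseconds) from one clock read, with no shared state."""
    return split_microseconds(utc_microseconds())


if __name__ == "__main__":
    seconds, micros = now()
    print(seconds, micros)
```

Because now() touches no state shared between callers, there is nothing here for concurrent callers to deadlock on.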
On Fri, Nov 15, 2013 at 09:31:18AM -0800, Eliot Miranda wrote:
> All this LastTickSemaphore stuff is complete nonsense, wasting on average
> 1/2 a second on startup spinning until the clock rolls over. If we move to
> the 64-bit microsecond timebase which is provided by the Cog time
> primitives we don't need to sync the second and the millisecond clocks
> because they are replaced by a single microsecond clock. If the current
> Interpreter VMs do support the 64-bit microsecond primitives I suggest we
> move ASAP. We've already done this in our images at Cadence and been
> running happily with it for several months. Would this help?

Yes, it probably would help, in the sense that it would make this particular
failure scenario impossible.

But I think that something else must be going on here, and it would be
worth getting to the bottom of it. The particular deadlock we are seeing
here should be impossible, regardless of how nonsensical the LastTickSemaphore
stuff may be. We are looking at a small section of code evaluated within
a critical section. If semaphores and process scheduling are working
correctly, it should be impossible for two different processes to deadlock
on that section.

I recall that Andreas made an important fix to process scheduling perhaps
a couple of years ago, but I can't remember the details. I wonder if our
SqueakSource images may be lacking that fix?

Also, Chris Muller pointed out that he has seen similar symptoms related
to Seaside:

  This might be a problem I think I observed with using Seaside's
  #returnResponse: inside a Mutex's critical: block. The block is
  entered, the Sema waited but never resignaled, leaving all subsequent
  processes stuck waiting.

So I am concerned about the following: How is it possible that a semaphore
that is used privately by a small section of uncomplicated code could ever
get itself into a state where it has missed a signal and no longer
functions as a mutex? In normal operation this never happens, but is
there some scenario related to Seaside operations, socket timeouts,
process scheduling, or image save and restart that might lead to this
condition?

BTW, squeaksource.com seems to be working nicely since I signalled
that semaphore yesterday to break things loose. I uploaded a few packages
today without problems. But the problem will be back, I am certain
of that.

Dave

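The expectation stated here, that a small critical section guarded by a correctly working semaphore cannot deadlock, can be checked in miniature. The following is a Python sketch with hypothetical names (run_workers and its parameters), using operating-system threads rather than Squeak processes, so it says nothing about the Squeak scheduler itself.

```python
import threading


def run_workers(worker_count=8, rounds=1000):
    """Have several threads contend for one semaphore-guarded section."""
    sem = threading.Semaphore(1)   # used strictly as a mutex
    state = {"counter": 0}

    def worker():
        for _ in range(rounds):
            sem.acquire()
            try:
                state["counter"] += 1   # the small guarded section
            finally:
                sem.release()           # always hand the signal back

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=10)

    still_blocked = sum(t.is_alive() for t in threads)

    # After all workers finish, exactly one excess signal should remain.
    free_again = sem.acquire(blocking=False)
    if free_again:
        sem.release()
    return state["counter"], still_blocked, free_again


if __name__ == "__main__":
    print(run_workers())
```

When every entry is matched by a release, no worker is left waiting and the semaphore ends with its single excess signal. A semaphore found at zero with waiters queued behind it therefore points to a signal that was skipped somewhere, which is the question raised in the message.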
On 15 November 2013 18:11, David T. Lewis <[hidden email]> wrote:
> So I am concerned about the following: How is it possible that a semaphore
> that is used privately by a small section of uncomplicated code could ever
> get itself into a state where it has missed a signal and no longer
> functions as a mutex? In normal operation this never happens, but is
> there some scenario related to Seaside operations, socket timeouts,
> process scheduling, or image save and restart that might lead to this
> condition?

I really don't know if this is anything more than a wild shot in the
dark, but Seaside does (or used to) perform stack slicing. It was
famed as being based on continuations. So anyway, is this Seaside
installation using continuations? I'm pretty sure Seaside's
continuations play correctly with #ensure: and stuff (being based on
resumable exceptions), but you never know...?

frank