SqueakSource.com home page (was: Fix for OSProcess - Where to commit?)

Chris Muller-3
> The welcome (sic) message on the SqueakSource home page is overly alarming,
> and IMHO should be changed to something that encourages new projects to
> be created elsewhere, but that does not cause alarm for existing users.
> But that is a policy decision, and I will defer to the Squeak board and
> the Squeak community on this.

+1

I think we should delete the "ATTENTION!" line but leave the
note about creation of projects being disabled.

The entire "Migration to SmalltalkHub" section should be deleted.
What's more alarming than a self-repelling web-site?  That information
belongs on SmalltalkHub, not SqueakSource.

Re: SqueakSource.com home page (was: Fix for OSProcess - Where to commit?)

Bert Freudenberg

On 14.11.2013, at 13:04, Chris Muller <[hidden email]> wrote:

>> The welcome (sic) message on the SqueakSource home page is overly alarming,
>> and IMHO should be changed to something that encourages new projects to
>> be created elsewhere, but that does not cause alarm for existing users.
>> But that is a policy decision, and I will defer to the Squeak board and
>> the Squeak community on this.
>
> +1
>
> I think we should delete the "ATTENTION!" line but leave the
> note about creation of projects being disabled.

Yep.

- Bert -



Re: SqueakSource.com home page (was: Fix for OSProcess - Where to commit?)

Nicolas Cellier
Yes, let's remove the alarms.
But it has to be functional.
Currently, I can connect to the web interface and I can download, but all my uploads are failing with a timeout... Any idea?


Re: SqueakSource.com home page (was: Fix for OSProcess - Where to commit?)

David T. Lewis
On Thu, Nov 14, 2013 at 10:55:55PM +0100, Nicolas Cellier wrote:
> Yes, let's remove the alarms.
> But it has to be functional.
> Currently, I can connect to the web interface and I can download, but all
> my uploads are failing with a timeout... Any idea?
>

I made some changes to the launch script for squeaksource.com earlier today:

http://lists.squeakfoundation.org/pipermail/box-admins/2013-November/001598.html

This may be related to the problem you are seeing (I am
not sure at this point).

I tried loading some packages from squeaksource a few minutes ago, and it
was slow but functional. However, checking the image I see 15 active
SSSession handlers in a ProcessBrowser. This is not right, and it appears
to be a recurrence of a problem that we have seen intermittently before,
both on squeaksource.com and (probably) on source.squeak.org.

I will terminate the runaway session handler processes, which I hope will
clear up the immediate problem.

More to follow I'm sure ...

Dave

Re: SqueakSource.com home page (was: Fix for OSProcess - Where to commit?)

Nicolas Cellier
Thanks David, it went back to normal speed for a moment, but it is now rejecting my upload requests again (most time out, some work intermittently)...


Re: SqueakSource.com home page (was: Fix for OSProcess - Where to commit?)

David T. Lewis
The image is now showing 10 session handler processes in the process
browser. Presumably these are related to the failed upload requests.

I do not understand the cause of this problem, and it may be that I
should revert the changes that I made earlier today (in which I put
squeaksource under the control of supervise(8) for starting the
image).

But I suspect that the problem lies elsewhere, so for now I will make
a copy of the broken image for debugging, then terminate the excess
processes. This should clear the problem temporarily. I will follow
up with another email within about 30 minutes.

Dave



On Fri, Nov 15, 2013 at 01:28:18AM +0100, Nicolas Cellier wrote:

> Thanks David, it went back to normal speed for a moment, but it is now
> rejecting my upload requests again (most time out, some work
> intermittently)...

Re: [Box-Admins] Re: [squeak-dev] SqueakSource.com home page (was: Fix for OSProcess - Where to commit?)

David T. Lewis
Attached is a screen shot of the process browser in the squeaksource.com
image, showing the excess SSSession processes. They are deadlocked on
accessing DateAndTime now, which contains a critical section using the
LastTickSemaphore in class DateAndTime.

In the squeaksource.com image, LastTickSemaphore has 0 excess signals,
whereas in other images I look at, it has 1 excess signal. This looks
to me like a mutex that has gotten confused.

I sent a signal to LastTickSemaphore in class DateAndTime, and now it
looks like a mutex again. Let's see if that clears the problem.
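
The "0 excess signals versus 1" observation can be illustrated with a small sketch. This is Python rather than the Squeak code in question, and the names tick_lock and read_clock_guarded are hypothetical stand-ins, not anything from the squeaksource.com image. It only shows the general behavior of a counting semaphore used as a mutex: idle with one signal available, stuck for everyone once that signal is lost, and usable again after one manual signal.

```python
import threading

# Hypothetical stand-in for a semaphore guarding a clock-reading critical section.
tick_lock = threading.Semaphore(1)  # idle mutex: one excess signal available

def read_clock_guarded(timeout=0.1):
    """Return True if the critical section ran, False if waiting timed out."""
    if not tick_lock.acquire(timeout=timeout):
        return False           # without the timeout this would wait forever
    try:
        return True            # the real code would read the clock here
    finally:
        tick_lock.release()    # put the excess signal back

healthy = read_clock_guarded()

# Simulate the confused state: one signal is lost, so the semaphore has
# 0 excess signals although nobody is inside the critical section.
tick_lock.acquire()
stuck = read_clock_guarded()

# The manual repair: send one signal to the semaphore.
tick_lock.release()
repaired = read_clock_guarded()
```

In this sketch the lost signal is simulated directly; it says nothing about how the real semaphore lost its signal.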

This certainly has a bad smell about it :-(  But I note also that
we are running our SqueakSource services on older images, and a number
of changes have been made to DateAndTime since then.

Nicolas, I will send private email to give you the VNC password for access
to the squeaksource.com image in case you need it (I am going to get some
sleep soon).

Dave


On Thu, Nov 14, 2013 at 08:26:45PM -0500, David T. Lewis wrote:

> The image is now showing 10 session handler processes in the process
> browser. Presumably these are related to the failed upload requests.
>
> I do not understand the cause of this problem, and it may be that I
> should revert the changes that I made earlier today (in which I put
> squeaksource under the control of supervise(8) for starting the
> image).
>
> But I suspect that the problem lies elsewhere, so for now I will make
> a copy of the broken image for debugging, then terminate the excess
> processes. This should clear the problem temporarily. I will follow
> up with another email within about 30 minutes.
>
> Dave

Attachment: SqueakSourceProcesses.png (81K)

Re: [Box-Admins] Re: [squeak-dev] SqueakSource.com home page (was: Fix for OSProcess - Where to commit?)

Chris Muller-3
On Thu, Nov 14, 2013 at 8:00 PM, David T. Lewis <[hidden email]> wrote:
> Attached is a screen shot of the process browser in the squeaksource.com
> image, showing the excess SSSession processes. They are deadlocked on
> accessing DateAndTime now, which contains a critical section using the
> LastTickSemaphore in class DateAndTime.
>
> In the squeaksource.com image, LastTickSemaphore has 0 excess signals,
> whereas in other images I look at, it has 1 excess signal. This looks
> to me like a mutex that has gotten confused.

This might be a problem I think I observed when using Seaside's
#returnResponse: inside a Mutex's critical: block.  The block is
entered and the semaphore is waited on but never re-signaled, leaving
all subsequent processes stuck waiting.
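
A rough sketch of that failure mode, in Python rather than Smalltalk and with entirely hypothetical names (ResponseReady, critical_unprotected, critical_protected). It does not claim to show what Seaside or Squeak's critical: actually do. It only contrasts a critical section that skips its signal when the block exits non-locally with one that signals during unwinding.

```python
import threading

class ResponseReady(Exception):
    """Hypothetical non-local exit, standing in for an early response return."""

def critical_unprotected(sema, block):
    sema.acquire()
    result = block()       # if block raises, the release below is skipped
    sema.release()
    return result

def critical_protected(sema, block):
    sema.acquire()
    try:
        return block()
    finally:
        sema.release()     # runs on normal return and while unwinding

def handler():
    raise ResponseReady()

def is_free(sema):
    """Probe without blocking; put the signal back if we took it."""
    if sema.acquire(blocking=False):
        sema.release()
        return True
    return False

leaky = threading.Semaphore(1)
try:
    critical_unprotected(leaky, handler)
except ResponseReady:
    pass
leaky_free = is_free(leaky)    # the signal was never sent back

safe = threading.Semaphore(1)
try:
    critical_protected(safe, handler)
except ResponseReady:
    pass
safe_free = is_free(safe)      # the finally clause restored the signal
```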

Re: [Box-Admins] Re: [squeak-dev] SqueakSource.com home page (was: Fix for OSProcess - Where to commit?)

Eliot Miranda-2
Hi David,

On Thu, Nov 14, 2013 at 6:00 PM, David T. Lewis <[hidden email]> wrote:
> Attached is a screen shot of the process browser in the squeaksource.com
> image, showing the excess SSSession processes. They are deadlocked on
> accessing DateAndTime now, which contains a critical section using the
> LastTickSemaphore in class DateAndTime.
>
> In the squeaksource.com image, LastTickSemaphore has 0 excess signals,
> whereas in other images I look at, it has 1 excess signal. This looks
> to me like a mutex that has gotten confused.
>
> I sent a signal to LastTickSemaphore in class DateAndTime, and now it
> looks like a mutex again. Let's see if that clears the problem.
>
> This certainly has a bad smell about it :-(  But I note also that
> we are running our SqueakSource services on older images, and a number
> of changes have been made to DateAndTime since then.
>
> Nicolas, I will send private email to give you the VNC password for access
> to the squeaksource.com image in case you need it (I am going to get some
> sleep soon).

All this LastTickSemaphore stuff is complete nonsense, wasting on average 1/2 a second on startup spinning until the clock rolls over.  If we move to the 64-bit microsecond timebase provided by the Cog time primitives, we don't need to sync the second and the millisecond clocks because they are replaced by a single microsecond clock.  If the current Interpreter VMs do support the 64-bit microsecond primitives I suggest we move ASAP.  We've already done this in our images at Cadence and have been running happily with it for several months.  Would this help?
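
To make the single-clock idea concrete, here is a minimal sketch in Python. It uses time.time_ns() as a stand-in for a microsecond time primitive and is not the Cog primitive itself; the function names are hypothetical. The point is only that seconds and milliseconds derived from one reading are consistent by construction, so there are no two clocks to synchronize.

```python
import time

def utc_microseconds():
    """One reading: microseconds since the Unix epoch (fits in 64 bits)."""
    return time.time_ns() // 1_000

def split_reading(usecs):
    """Derive whole seconds and the millisecond part from the same reading."""
    seconds, rest = divmod(usecs, 1_000_000)
    return seconds, rest // 1_000

reading = utc_microseconds()
seconds, millis = split_reading(reading)
```

Because both values come from the same integer, no startup spin or guarding semaphore is needed to keep them in step.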
 

--
best,
Eliot



Re: [Box-Admins] Re: [squeak-dev] SqueakSource.com home page (was: Fix for OSProcess - Where to commit?)

David T. Lewis
On Fri, Nov 15, 2013 at 09:31:18AM -0800, Eliot Miranda wrote:

> Hi David,
>
> All this LastTickSemaphore stuff is complete nonsense, wasting on average
> 1/2 a second on startup spinning until the clock rolls over.  If we move to
> the 64-bit microsecond timebase which is provided by the Cog time
> primitives we don't need to sync the second and the millisecond clocks
> because they are replaced by a single microsecond clock.  If the current
> Interpreter VMs do support the 64-bit microsecond primitives I suggest we
> move ASAP.  We've already done this in our images at Cadence and have been
> running happily with it for several months.  Would this help?
>

Yes, it probably would help, in the sense that it would make this particular
failure scenario impossible.

But I think that something else must be going on here, and it would be
worth getting to the bottom of it. The particular deadlock we are seeing
here should be impossible, regardless of how nonsensical the LastTickSemaphore
stuff may be. We are looking at a small section of code evaluated within
a critical section. If semaphores and process scheduling are working
correctly, it should be impossible for two different processes to deadlock
on that section.
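
For comparison, this is what correct behavior looks like in a small Python sketch (an analogy only, with hypothetical names, not the Squeak scheduler): several threads repeatedly enter a short critical section guarded by a semaphore used as a mutex. If the semaphore and scheduling work as intended, at most one thread is ever inside and every thread finishes.

```python
import threading

guard = threading.Semaphore(1)   # semaphore used as a mutex
inside = 0                       # threads currently in the critical section
max_inside = 0                   # highest value of `inside` ever observed
completed = 0                    # total critical sections executed

def worker():
    global inside, max_inside, completed
    for _ in range(200):
        with guard:              # wait on entry, signal on exit
            inside += 1
            max_inside = max(max_inside, inside)
            inside -= 1
            completed += 1

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join(timeout=10)

all_finished = not any(t.is_alive() for t in threads)
```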

I recall that Andreas made an important fix to process scheduling perhaps
a couple of years ago, but I can't remember the details. I wonder if our
SqueakSource images may be lacking that fix?

Also, Chris Muller pointed out that he has seen similar symptoms related
to Seaside:

  This might be a problem I think I observed with using Seaside's
  #returnResponse: inside a Mutex's critical: block.  The block is
  entered, the Sema waited but never resignaled, leaving all subsequent
  processes stuck waiting.
       
So I am concerned about the following: How is it possible for a semaphore
that is used privately by a small section of uncomplicated code to ever
get itself into a state where it has missed a signal and no longer
functions as a mutex? In normal operation this never happens, but is
there some scenario related to Seaside operations, socket timeouts,
process scheduling, or image save and restart that might lead to this
condition?

BTW, squeaksource.com seems to be working nicely since I signalled
that semaphore yesterday to break things loose. I uploaded a few packages
today without problems. But the problem will be back, I am certain
of that.

Dave



Re: [Box-Admins] Re: [squeak-dev] SqueakSource.com home page (was: Fix for OSProcess - Where to commit?)

Frank Shearar-3
On 15 November 2013 18:11, David T. Lewis <[hidden email]> wrote:

> Also, Chris Muller pointed out that he has seen similar symptoms related
> to Seaside:
>
>   This might be a problem I think I observed with using Seaside's
>   #returnResponse: inside a Mutex's critical: block.  The block is
>   entered, the Sema waited but never resignaled, leaving all subsequent
>   processes stuck waiting.
>
> So I am concerned about the following: How is it possible for a semaphore
> that is used privately by a small section of uncomplicated code to ever
> get itself into a state where it has missed a signal and no longer
> functions as a mutex? In normal operation this never happens, but is
> there some scenario related to Seaside operations, socket timeouts,
> process scheduling, or image save and restart that might lead to this
> condition?

I really don't know if this is anything more than a wild shot in the
dark, but Seaside does (or used to) perform stack slicing. It was
famed as being based on continuations. So anyway, is this Seaside
installation using continuations? I'm pretty sure Seaside's
continuations play correctly with #ensure: and stuff (being based on
resumable exceptions), but you never know...?
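
To illustrate the kind of interaction I am wondering about, here is a loose analogy in Python. A generator is not a Seaside continuation, and the names are hypothetical; this is only a sketch of what happens when a computation is suspended while inside a critical section. The semaphore stays taken until the suspended frame is either resumed or unwound.

```python
import threading

sema = threading.Semaphore(1)

def handler_with_suspension():
    """Hypothetical handler that gets suspended inside the critical section."""
    sema.acquire()
    try:
        yield "suspended inside critical section"
    finally:
        sema.release()         # only runs on resume-to-end or unwinding

def is_free():
    """Probe without blocking; put the signal back if we took it."""
    if sema.acquire(blocking=False):
        sema.release()
        return True
    return False

captured = handler_with_suspension()
next(captured)                 # run up to the suspension point
free_while_suspended = is_free()

captured.close()               # unwinding the suspended frame runs the finally
free_after_unwind = is_free()
```

If the suspended computation were simply kept around and never resumed or unwound, the semaphore in this sketch would stay at zero, which is the symptom Dave describes.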

frank
