[squeak-dev] website down?

[squeak-dev] website down?

Jason Rogers-4
squeak.org is down, and my message to the webteam was returned as well
with the following error:

This is an automatically generated Delivery Status Notification

Delivery to the following recipient failed permanently:

    [hidden email]

Technical details of permanent failure:
Google tried to deliver your message, but it was rejected by the
recipient domain. We recommend contacting the other email provider for
further information about the cause of this error. The error that the
other server returned was: 553 553 sorry, that domain isn't in my list
of allowed rcpthosts (#5.7.1) (state 14).

--
Jason Rogers

"I am crucified with Christ: nevertheless I live;
yet not I, but Christ liveth in me: and the life
which I now live in the flesh I live by the faith of
the Son of God, who loved me, and gave
himself for me."
    Galatians 2:20

Re: [squeak-dev] website down?

Janko Mivšek
Hi Jason,

Jason Rogers wrote:
> squeak.org is down,

It is up again. We will restart the image later because there is some
strange Socket behaviour in that image that causes the socket to stop
receiving new web requests. Just restarting the HTTP part of Swazoo
solves that problem. For a while...

This image has been running for months, which is why a restart could
help here. But finding the real cause of this trouble would be just
perfect! Does anyone have an idea what the problem could be?

> and my message to the webteam was returned as well
> with the following error:
>
> This is an automatically generated Delivery Status Notification
>
> Delivery to the following recipient failed permanently:
>
>     [hidden email]

The right address is [hidden email]!

Best regards
Janko

Re: [squeak-dev] website down?

David T. Lewis
On Tue, May 26, 2009 at 05:24:09PM +0200, Janko Mivšek wrote:
>
> It is up again. We will restart the image later because there is some
> strange Socket behaviour in that image that causes the socket to stop
> receiving new web requests. Just restarting the HTTP part of Swazoo
> solves that problem. For a while...
>
> This image has been running for months, which is why a restart could
> help here. But finding the real cause of this trouble would be just
> perfect! Does anyone have an idea what the problem could be?

Socket leak? Keep an eye on the contents of /proc/<squeakpid>/fd/ and see
if open file descriptors are accumulating. If so, you will reach a limit
(usually 1024) after which no new connections will be possible.

Dave
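
The check Dave describes can be scripted. The following is only a minimal sketch, assuming a Linux host with /proc mounted; the function name fd_summary is hypothetical, and the pid to pass in is whatever process id the Squeak VM is running under.

```python
import os


def fd_summary(pid):
    """Return (total_fds, socket_fds) for a process, read from /proc/<pid>/fd.

    Linux only. Requires permission to read the target process's fd directory.
    """
    fd_dir = f"/proc/{pid}/fd"
    total = 0
    sockets = 0
    for name in os.listdir(fd_dir):
        total += 1
        try:
            # Each entry is a symlink; sockets read as "socket:[inode]".
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listing and reading it.
            continue
        if target.startswith("socket:"):
            sockets += 1
    return total, sockets


if __name__ == "__main__":
    # Inspecting our own process here; substitute the VM's pid in practice.
    total, sockets = fd_summary(os.getpid())
    print(f"open descriptors: {total}, of which sockets: {sockets}")
```

Running this periodically and logging the two numbers would show whether the socket count climbs steadily toward the limit.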


Re: [squeak-dev] website down?

Janko Mivšek
In reply to this post by David T. Lewis
Hi David,

David T. Lewis wrote:
> Janko Mivšek wrote:

>> This image has been running for months, which is why a restart could
>> help here. But finding the real cause of this trouble would be just
>> perfect! Does anyone have an idea what the problem could be?

> Socket leak? Keep an eye on the contents of /proc/<squeakpid>/fd/ and see
> if open file descriptors are accumulating. If so, you will reach a limit
> (usually 1024) after which no new connections will be possible.

Yep, that's it. I just counted: 1024, which dropped to 764 after the
Swazoo restart.

So, first question: how to close those remaining 764 open sockets
without restarting the image,

second: how to avoid opening so many unclosed sockets (some kind of
attack?),

third: how to raise the 1024 limit?

Janko

--
Janko Mivšek
IT Consultant
Eranova d.o.o.
Ljubljana, Slovenia
www.eranova.si
tel:  01 514 22 55
fax:  01 514 22 56
gsm: 031 674 565

Re: [squeak-dev] website down?

David T. Lewis
On Tue, May 26, 2009 at 10:07:01PM +0200, Janko Mivšek wrote:

> Hi David,
>
> David T. Lewis wrote:
> > Janko Mivšek wrote:
>
> >> This image has been running for months, which is why a restart could
> >> help here. But finding the real cause of this trouble would be just
> >> perfect! Does anyone have an idea what the problem could be?
>
> > Socket leak? Keep an eye on the contents of /proc/<squeakpid>/fd/ and see
> > if open file descriptors are accumulating. If so, you will reach a limit
> > (usually 1024) after which no new connections will be possible.
>
> Yep, that's it. I just counted: 1024, which dropped to 764 after the
> Swazoo restart.
>
> So, first question: how to close those remaining 764 open sockets
> without restarting the image,

The socket file descriptors that you see listed in /proc/<squeakpid>/fd/ are
file descriptors that have been opened by the Squeak VM, and that have not been
properly closed when no longer in use. If you can find the objects in the
running image that were responsible for opening these descriptors (possibly
this is "Socket allSubInstances"), then you can probably close them. Otherwise,
if finalizers have been set up, a garbage collection may cause them to be
closed.

> second: how to avoid opening so many unclosed sockets (some kind of
> attack?),

It is not an attack. Most likely it is nothing more than some path in the
code that allows a socket to become unreferenced in the image without having
closed the socket first.
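
The kind of leak described above, and the usual guard against it, can be illustrated in a few lines. This is a sketch in Python rather than Smalltalk, it is not Swazoo's code, and the names handle_request and handler are hypothetical. The point is only that the close happens on every exit path, including the error path.

```python
import socket


def handle_request(conn, handler):
    """Run handler(conn) and close conn on every path, including errors.

    Without the finally clause, an exception raised by the handler would
    leave the connection open with nothing left referring to it.
    """
    try:
        return handler(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    a, b = socket.socketpair()

    def failing_handler(conn):
        raise RuntimeError("simulated failure while serving the request")

    try:
        handle_request(a, failing_handler)
    except RuntimeError:
        pass
    # A closed Python socket reports -1 as its file descriptor.
    print("descriptor after failure:", a.fileno())
    b.close()
```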

> third: how to raise the 1024 limit?

It is best not to do that. If you are "leaking" sockets, it will eventually
fail no matter what the limit is. But if you do want to control it, see the
man page for bash and look for the ulimit command.

Dave
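
For reference, the limit that ulimit controls can also be read and adjusted from inside a process. The sketch below uses Python's standard resource module on a Unix system; the helper names nofile_limits and set_soft_nofile are hypothetical, and the change applies only to the calling process and its children, not to an already running Squeak VM.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def set_soft_nofile(target):
    """Set the soft limit to target, capped at the hard limit.

    Returns the soft limit in effect afterwards. An unprivileged process
    may move its soft limit anywhere up to the hard limit.
    """
    soft, hard = nofile_limits()
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return nofile_limits()[0]


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft limit: {soft}, hard limit: {hard}")
```

As Dave points out, raising the limit only postpones the failure if descriptors are leaking.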


Re: [squeak-dev] website down?

Janko Mivšek
>> David T. Lewis wrote:
>>> Janko Mivšek wrote:

>>>> This image has been running for months, which is why a restart could
>>>> help here. But finding the real cause of this trouble would be just
>>>> perfect! Does anyone have an idea what the problem could be?

>>> Socket leak? Keep an eye on the contents of /proc/<squeakpid>/fd/ and see
>>> if open file descriptors are accumulating. If so, you will reach a limit
>>> (usually 1024) after which no new connections will be possible.

>> Yep, that's it. I just counted: 1024, which dropped to 764 after the
>> Swazoo restart.

The problem is now solved, simply by restarting the image. The number
of open sockets now varies from 20 to 300 and drops back to the low end
every hour when the image is snapshotted.

But the problem will recur in a few months, because we are somehow
leaking sockets, as David said. I didn't manage to find the cause, nor
the offending sockets in the image. But OK, restarting the image every
few months is a good enough solution for now.

Thanks, David, for the help!

Best regards
Janko


--
Janko Mivšek
IT Consultant
Eranova d.o.o.
Ljubljana, Slovenia
www.eranova.si
tel:  01 514 22 55
fax:  01 514 22 56
gsm: 031 674 565