unix SqueakSource socket 'too many open files' problem


unix SqueakSource socket 'too many open files' problem

timrowledge
 
I've been running a squeaksource system on a 64-bit linux server for a year or so and just started having odd problems.

Talking with Chris about it, and looking into the assorted logs, led to the theory that something (unknown, and I can't see how it's possible) caused the image to save and quit at some point, when the expected behaviour is to *not* ever save. That led to some complaints about Magma-related details that don't matter here.

I had much fun (the usual linux user names/permissions stuff) but eventually got a new copy of the squeaksource image running. Yay!

Except after an indeterminate time (several hours, not a whole day) the webpage part of the system became inaccessible even though it had not actually crashed out. The log contained a bunch of lines like this -
3729ab18e70dac 2021-02-24T23:37:53.41687-05:00 CurrentEnvironment:
@40000000603729ab18e71d4c 2acceptHandler: Too many open files
@4000000060379f1127814ddc acceptHandler: aborting server 12 pss=0x14df900
@4000000060379f112782dc4c socketStatus: freeing invalidated pss=0x14df900
@400000006037b1ef0d99c214 acceptHandler: Too many open files
@400000006037b1ef0d9a5e54 acceptHandler: aborting server 12 pss=0x7f2f8c8d4630
@400000006037b1ef0d9c7194 socketStatus: freeing invalidated pss=0x7f2f8c8d4630


... which are errors from the VM socket handlers.

I've not seen this happening before that I recall. Any ideas? Stuff to check? Solutions?

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: TDB: Transfer and Drop Bits



Re: unix SqueakSource socket 'too many open files' problem

David T. Lewis
 
Socket leaks were always a problem on the older squeaksource.com
images, although I don't think it has been happening since I updated
it to a Squeak 5.3 image, and I don't recall that it was ever a
problem on source.squeak.org.

I don't know what the root cause was, and it's certainly possible
that socket leak issues still exist but are not showing up with
the usage or load on our servers.

If you are using image based persistence (as opposed to Magma
back end), then the utility that I used to use for squeaksource.com
is a handy workaround (actually it is still running on our
squeaksource.com image, which is the reason I am not entirely
sure if the socket leak problem is actually gone).

Take a look in the http://www.squeaksource.com/SSImageInit repository
and have a look at the socket leak monitor. Again, to be clear,
you don't want to run this with Magma, but in any case it may
give you an idea of how to handle it.

I do have to note that the error messages you are getting may be
a different issue entirely. The socket leaks I was dealing with
manifested as open file descriptors (visible in /proc/<pid>/fd/*).
If you run out of file descriptors the image will hang, and if
I recall correctly it may have crashed the VM. Possibly that
would lead to the later symptoms you are seeing now.

Dave


On Thu, Feb 25, 2021 at 11:42:23AM -0800, tim Rowledge wrote:

> [...]

Re: unix SqueakSource socket 'too many open files' problem

timrowledge
 
Thanks Dave -
that's all rather scary sounding.

It seems really weird that this has only just started and yet the system has been running happily for over a year. It's a 'PersonalSqueakSource' image built on a very late 5.2/5.3 alpha image.

Are there unix system settings for allowable numbers of sockets etc that might have been changed by a sysadmin? Anything like that?

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: STOP: No Op



Re: unix SqueakSource socket 'too many open files' problem

David T. Lewis
 
On Thu, Feb 25, 2021 at 01:47:16PM -0800, tim Rowledge wrote:
>  
> Thanks Dave -
> that's all rather scary sounding.
>

Scary, no. But hard to debug? Yes.  A file and/or socket handle leak
just means that the image is failing to close connections properly,
possibly because of some completely unrelated error that was not
handled by closing the socket.


> It seems really weird that this has only just started and yet the system has been running happily for over a year. It's a 'PersonalSqueakSource' image built on a very late 5.2 /5.3 alpha image.
>
> Are there unix system settings for allowable numbers of sockets etc that might have been changed by a sysadmin? Anything like that?
>

Yes, I'm sure it's configurable, but it's highly unlikely that a
sysadmin would bother to fool around with something like that.
I think the default is 1000 open file descriptors (including
sockets, pipes, files) per process, and that is way more than
you need for a squeaksource server.

Assuming you have access to the server, the best place to start
is by looking in the directory /proc/<pid>/fd/*, where <pid> is
the process ID of the Squeak VM (you can find it with something
like "$ ps -aef | grep squeak"). If you look at that once a day
for a week, you will be able to see a growing number of file
descriptors hanging around. Eventually it gets to be too many,
and bad things happen.

If you do /not/ see these file descriptors accumulating over
time, then it would indicate some other kind of issue entirely.
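
Dave's once-a-day check is easy to script. A minimal sketch (the "squeak" pgrep pattern and the log path are assumptions; adjust for your setup):

```shell
#!/bin/sh
# Log a timestamped open-fd count for a given pid (the Squeak VM's).
# Run it from cron once a day and watch whether the count trends upward.
fd_count() {
    pid=$1
    count=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l)
    echo "$(date '+%Y-%m-%d %H:%M:%S') pid=$pid open_fds=$count"
}

# e.g. append the Squeak VM's count to a log ("squeak" pattern is a guess):
# fd_count "$(pgrep -o squeak)" >> "$HOME/squeak-fd.log"
```

A steadily growing count across daily samples is the leak signature Dave describes; a count that plateaus suggests something else.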

Good luck and keep us posted.

Dave


Re: unix SqueakSource socket 'too many open files' problem

timrowledge
 


> On 2021-02-25, at 3:17 PM, David T. Lewis <[hidden email]> wrote:
>
> Assuming you have access to the server, the best place to start
> is by looking in the directory /proc/<pid>/fd/* (where <pid> is
> the process ID of the Squeak VM (you can find it with something
> like "$ ps -aef | grep squeak"). If you look at that once a day
> for a week, you will be able to see a growing number of file
> descriptors hanging around. Eventually it gets to be too many,
> and bad things happen.

After an hour or so there are 350 entries in that directory, virtually all sockets. I imagine somehow something has caused an issue that ... well, no idea. Pretty sure it isn't going to last for a week at this rate!
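
To see at a glance what those entries point at, the symlink targets under /proc/<pid>/fd can be tallied by type. A sketch (the "squeak" pgrep pattern in the usage comment is an assumption):

```shell
#!/bin/sh
# Tally a process's open fds by target type (socket, pipe, regular file...).
# ls -l shows each fd as a symlink like "7 -> socket:[25825763]"; strip the
# inode suffix and count how many fds point at each kind of target.
fd_breakdown() {
    ls -l "/proc/$1/fd" \
        | awk 'NR > 1 {print $NF}' \
        | sed -e 's/:\[.*$//' \
        | sort | uniq -c | sort -rn
}

# e.g. for the Squeak VM (the "squeak" pattern is an assumption):
# fd_breakdown "$(pgrep -o squeak)"
```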

I've found a backup copy of the image from before the problem so I think I'll probably install that and see if things improve. Maybe a new VM. Strange stuff.


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Useful Latin Phrases:- Fac me cocleario vomere! = Gag me with a spoon!



Re: unix SqueakSource socket 'too many open files' problem

Tobias Pape
 


> On 26. Feb 2021, at 05:04, tim Rowledge <[hidden email]> wrote:
>
> After an hour or so there are 350 entries in that directory, virtually all sockets. I imagine somehow something has caused an issue that ... well, no idea. Pretty sure it isn't going to last for a week at this rate!

Maybe these are sockets that the other end closed but we did not notice,
so we hang on to them. I saw that quite a few times when caring for
Seaside on Squeak. There was often something strange going on with the
sockets when the conversation between browser and Seaside was already
done, but somehow Squeak hung on to the socket…

Best regards
        -Tobias


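Tobias's theory would show up as sockets stuck in CLOSE_WAIT (peer closed its end, we never called close). A quick system-wide check straight from /proc, with no extra tools; state code 08 is CLOSE_WAIT in the kernel's TCP state table:

```shell
#!/bin/sh
# Count TCP connections stuck in CLOSE_WAIT: the peer has closed its end
# but the local process never called close(). Column 4 of /proc/net/tcp
# is the connection state in hex; 08 means CLOSE_WAIT.
cat /proc/net/tcp /proc/net/tcp6 2>/dev/null \
    | awk '$4 == "08"' \
    | wc -l
```

If that number grows in step with the fd count in /proc/<pid>/fd, the leak is of the kind Tobias describes.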



Re: unix SqueakSource socket 'too many open files' problem

David T. Lewis
 
On Fri, Feb 26, 2021 at 08:40:17AM +0100, Tobias Pape wrote:

> Maybe these are sockets that the other end closed but we did not notice,
> so we hang on to them. I saw that quite a few times when caring for
> Seaside on Squeak [...]

It may be that the socket leak problem on squeaksource.com is "fixed"
now simply because it is so much faster now that it is running on
64-bit Spur, so timeouts and other errors simply do not happen any
more. In other words, an underlying problem may still be there, but
it rarely shows up for squeaksource on the faster VM.

The responsibility for closing the sockets is entirely on the Squeak
side. Regardless of what the client does or does not do, the Squeak
image must always close its socket in order to free the file descriptor.

Tim, one other thing to look out for is the event handling in the VM.
Since last October it has used the Linux-specific epoll mechanism.
If the problem you see now started happening after updating to a
newer VM, then consider this as a possible suspect.

Dave


Re: unix SqueakSource socket 'too many open files' problem

timrowledge
In reply to this post by timrowledge
 


> On 2021-02-25, at 8:04 PM, tim Rowledge <[hidden email]> wrote:
>
> After an hour or so there are 350 entries in that directory, virtually all sockets. I imagine somehow something has caused an issue that ... well, no idea. Pretty sure it isn't going to last for a week at this rate!

This morning there were 976 fds. So I restarted and watched and .. 18. It's sitting there all happy and as if, as we say in the UK, "butter wouldn't melt in its mouth". What, me? Cause problems? How can you be so cruel to a young server?

Maybe a port scanning/attack of some sort flooded it?

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Oxymorons: Living dead



Re: unix SqueakSource socket 'too many open files' problem

Chris Muller-3
 
Still curious as to what VM you're using...

On Fri, Feb 26, 2021 at 2:04 PM tim Rowledge <[hidden email]> wrote:

> [...]



Re: unix SqueakSource socket 'too many open files' problem

David T. Lewis
In reply to this post by timrowledge
 
On Fri, Feb 26, 2021 at 12:04:26PM -0800, tim Rowledge wrote:
>
> This morning there were 976 fds. So I restarted and watched and .. 18. It's sitting there all happy and as if, as we say in the UK, "butter wouldn't melt in its mouth". What, me? Cause problems? How can you be so cruel to a young server?
>
> Maybe a port scanning/attack of some sort flooded it?
>

Keep an eye on it, you probably still have a socket leak.

Here is what I saw on the squeaksource.com server a few minutes ago:

  squeaksourcecom@dan:~$ ls -l /proc/16147/fd
  total 0
  lr-x------ 1 squeaksourcecom www-data 64 Feb 14 03:25 0 -> /dev/null
  l-wx------ 1 squeaksourcecom www-data 64 Feb 14 03:25 1 -> pipe:[11697]
  l-wx------ 1 squeaksourcecom www-data 64 Feb 14 03:25 2 -> pipe:[11697]
  lr-x------ 1 squeaksourcecom www-data 64 Feb 14 03:25 3 -> /srv/squeaksourcecom/SqueakSource/ss/Pharo
  lr-x------ 1 squeaksourcecom www-data 64 Feb 14 03:25 4 -> /srv/squeaksourcecom/SqueakSource/squeaksource.8.image
  lr-x------ 1 squeaksourcecom www-data 64 Feb 14 03:25 6 -> /srv/squeaksourcecom/SqueakSource/SqueakV50.sources
  lrwx------ 1 squeaksourcecom www-data 64 Feb 14 03:25 7 -> socket:[25825763]
  lrwx------ 1 squeaksourcecom www-data 64 Feb 14 03:25 8 -> socket:[25825764]
  lrwx------ 1 squeaksourcecom www-data 64 Feb 14 06:25 9 -> /srv/squeaksourcecom/SqueakSource/squeaksource.8.changes

And then a few minutes later:

  squeaksourcecom@dan:~$ ls -l /proc/16147/fd
  total 0
  lr-x------ 1 squeaksourcecom www-data 64 Feb 14 03:25 0 -> /dev/null
  l-wx------ 1 squeaksourcecom www-data 64 Feb 14 03:25 1 -> pipe:[11697]
  l-wx------ 1 squeaksourcecom www-data 64 Feb 14 03:25 2 -> pipe:[11697]
  lr-x------ 1 squeaksourcecom www-data 64 Feb 14 03:25 3 -> /srv/squeaksourcecom/SqueakSource/ss/metacello
  lr-x------ 1 squeaksourcecom www-data 64 Feb 14 03:25 4 -> /srv/squeaksourcecom/SqueakSource/squeaksource.8.image
  lr-x------ 1 squeaksourcecom www-data 64 Feb 14 03:25 6 -> /srv/squeaksourcecom/SqueakSource/SqueakV50.sources
  lrwx------ 1 squeaksourcecom www-data 64 Feb 14 03:25 7 -> socket:[25825763]
  lrwx------ 1 squeaksourcecom www-data 64 Feb 14 03:25 8 -> socket:[25825764]
  lrwx------ 1 squeaksourcecom www-data 64 Feb 14 06:25 9 -> /srv/squeaksourcecom/SqueakSource/squeaksource.8.changes

If your server is not actively handling requests, you should see similar
socket usage. You'll see more file descriptors when you are using Magma,
but keep an eye on the socket entries, and they should not accumulate
over time.

Dave
 

Re: unix SqueakSource socket 'too many open files' problem

johnmci
 
Historically it has been up to finalization to clean up a socket. Has anyone confirmed that code is there and runs with this VM?


....
John M. McIntosh. Corporate Smalltalk Consulting Ltd https://www.linkedin.com/in/smalltalk

------- Original Message -------
On Friday, February 26, 2021 1:12 PM, David T. Lewis <[hidden email]> wrote:

> [...]



Re: unix SqueakSource socket 'too many open files' problem

timrowledge
In reply to this post by Chris Muller-3
 
Didn't I mention that before? The VM I can only identify as 5.0-201807260206 from the lib/squeak directory name. Levente thinks that's the default one for the original 5.3 alpha package, which is almost certainly what I downloaded to build this system 18 months ago. Sounds plausible to me.

If I have to restart anytime soon I'll try the latest VM and see if it makes any difference. So far today nothing is going wrong though; it has crept up to 20 fds open - 4 sockets right now.


> On 2021-02-26, at 12:20 PM, Chris Muller <[hidden email]> wrote:
>
> Still curious as to what VM you're using...
>
> [...]


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Calm down -- it's only ones and zeros.



Re: unix SqueakSource socket 'too many open files' problem

K K Subbu
In reply to this post by David T. Lewis
 
On 27/02/21 2:42 am, David T. Lewis wrote:
> Here is what I saw on the squeaksource.com server a few minutes ago:
>
>    squeaksourcecom@dan:~$ ls -l /proc/16147/fd

The socket count can be generated directly with:

$ ls -l /proc/$(pgrep squeak)/fd | grep -c socket:

This can then be visually graphed every, say, 3600s with:

bars='======================================================+';
while true; do
        echo $bars | cut -c 1-$(ls -l /proc/$(pgrep squeak)/fd | grep -c socket:)
        sleep 3600
done

If this shows a monotonic growth, then a leak can be suspected.

HTH .. Subbu

Re: unix SqueakSource socket 'too many open files' problem

stes
In reply to this post by timrowledge
 

One thing to check is also

        ulimit -n

the number of open files that the shell reports.

You could change it with ulimit -S -n <n>, where <n> is smaller than the
current limit.

I'm not sure whether it matters for sockets.

But it may be the case that some software was installed on your machine
that changed the profile, and with it the ulimit -n setting.
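
Checking both the soft and the hard limit is quick; a sketch (the pgrep pattern in the comment is hypothetical):

```shell
#!/bin/sh
# Per-process fd limits as a VM launched from this shell would inherit them.
echo "soft limit: $(ulimit -Sn)"    # what actually triggers 'too many open files'
echo "hard limit: $(ulimit -Hn)"    # ceiling a non-root process may raise it to
# For an already-running VM, its real limits are visible in /proc, e.g.:
# grep 'open files' /proc/"$(pgrep -o squeak)"/limits
```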

David Lewis wrote:

> Tim, one other thing to look out for is the event handling in the VM.
> Since last October it now uses the Linux-specific epoll mechanism.
> If the problem you see now started happening after updating to a
> newer VM, then consider this as a possible suspect.

epoll originated on Linux, but it seems implementations exist on other
operating systems.

OpenIndiana has epoll and other facilities like Berkeley BSD kqueue.

https://illumos.org/man/5/epoll

When running the Squeak opensmalltalk configure script it says for me

$ grep HAVE_EPOLL config.h

#define HAVE_EPOLL 1
#define HAVE_EPOLL_PWAIT 1

So basically the configure script logic detects <sys/epoll.h> on
OpenIndiana and enables it.

This is because the ax_have_epoll.m4 M4 file from autoconf was used:

https://www.gnu.org/software/autoconf-archive/ax_have_epoll.html

If I configure with ax_cv_have_epoll=no ax_cv_have_epoll_pwait=no

then

$ grep EPOLL config.h
/* #undef HAVE_EPOLL */
/* #undef HAVE_EPOLL_PWAIT */

So basically, when epoll support was introduced, OpenIndiana
automatically switched to epoll.

So as an experiment you could try building the VM with

   ax_cv_have_epoll=no ax_cv_have_epoll_pwait=no

The way to do that would be to edit the "mvm" script and add the above
to the configure command line, so that git diff reports something like:

-       CFLAGS="$OPT"
+       CFLAGS="$OPT" \
+       ax_cv_have_epoll=no ax_cv_have_epoll_pwait=no

I am not sure whether it is the right way to disable epoll.

Perhaps a configure --without-epoll exists but I don't see it.

Another issue is: how can you see how the VM was built?

In my case, by default on OpenIndiana epoll seems to be enabled now.

$ strings squeak | grep aioEnable
aioEnable(%d, %p, %d): IGNORED
aioEnable(%d): fd too large
aioEnable: descriptor %d already enabled
aioEnable(%d, %p, %d): allocating new epoll_event
aioEnable: malloc event
aioEnable(%d, %p, %d): allocating new epollEventData
...
aioEnable(%d, %p, %d): enable non-blocking io and sigio

... but when I compile with ax_cv_have_epoll=no ax_cv_have_epoll_pwait=no

then

$ strings squeak  | grep aioEnable
aioEnable(%d, %p, %d): IGNORED
aioEnable(%d): fd too large
aioEnable: descriptor %d already enabled
aioEnable(%d): external

To return to the Linux case, it could be a good experiment to build the VM
without epoll, if that is supported, and see if it makes a difference,
either with the method described above or with whatever the recommended
way is (the method I describe may not be the correct one).

Finally, as David T. Lewis also indicates, the problem may have existed
in the past, before epoll, so there is not really an indication that
epoll is involved here.

Basically, my reading of what David T. Lewis wrote is that file
descriptor and socket leaks have always been a problem, with or without
epoll.

Regards,
David Stes


Re: unix SqueakSource socket 'too many open files' problem

timrowledge
In reply to this post by K K Subbu
 


> On 2021-02-26, at 10:32 PM, K K Subbu <[hidden email]> wrote:
>
> On 27/02/21 2:42 am, David T. Lewis wrote:
>> Here is what I saw on the squeaksource.com server a few minutes ago:
>>   squeaksourcecom@dan:~$ ls -l /proc/16147/fd
>
> The socket count can be generated directly with:
>
> $ ls -l /proc/$(pgrep squeak)/fd | grep -c socket:

That's pretty neat, thanks.

>
> This can then be visually graphed every, say, 3600s with:
>
> bars='======================================================+';
> while true; do
> echo $bars | cut -c 1-$(ls -l /proc/$(pgrep squeak)/fd | grep -c socket:)
> sleep 3600
> done
>

Goodness; shell script bargraphs. Whatever next?


> If this shows a monotonic growth, then a leak can be suspected.

I'm pretty sure there is a slow leak, but slow: it's climbed from 19 to 30 over three days.


tim
--
tim Rowledge; [hidden email];
"You're a ghost driving a meat-coated skeleton made from stardust,
riding a rock, hurtling through space. Fear nothing"


Re: unix SqueakSource socket 'too many open files' problem

K K Subbu
 
On 02/03/21 9:38 am, tim Rowledge wrote:
>> If this shows a monotonic growth, then a leak can be suspected.
> I'm pretty sure there is a slow leak; but slow. It's climbed from 19 to 30 over three days
Tim,

Ugh! Slow leaks can be really annoying to diagnose. You may narrow down
the specific leak with ss (socket stats):

$ ss -s

to get a breakdown by tcp/udp ipv4/ipv6 etc. or

$ ss state close-wait

to catch all sockets in the close-wait state. The argument is a filter
condition, and the man page gives examples of various filter expressions.

You may also use the watch command to sample it every day (-n 86400) and
highlight the difference (-d):

$ watch -n 86400 -d ss state close-wait

BTW, an early trick I learned is to keep a small terminal running such
commands in a corner (or second monitor these days). This lets me use
the center of the screen for all active work. Any gross changes in shape
(char graphs ;-)) or highlight will catch the corner of the eye.

HTH .. Subbu