Hi guys!

Long email but... work for hire here! :)

In short: Ron and I (3DICC) have a problem with the Unix VM networking and I am reaching out before burning too many hours on something one of you C-Unix/Socket/VM guys can fix in an afternoon - and earn a buck for your trouble.

In looong: So... we are running a large deployment of Terf (yay!) that is "pushing" the server side VMs a bit. The large load has caused us to experience some issues.

Our Possibly Faulted Analysis So Far
====================================

One of our server side VMs, the one that unfortunately is a crucial resource, locks up and doesn't respond on its most important listening Socket port. The VM does not crash, however. We reboot it, because it's a stressful situation with LOTS of users being affected, so we haven't looked "inside".

Unfortunately the shell script starting the VMs wasn't catching stderr to a log file (argh! - now it does, though, so we will see if we get more info later), so we have missed some info here, but Ron "got lucky" and saw this on his terminal (stderr of the VM going to his terminal instead of a log file):

"errno 9
select: Bad file descriptor"

It took us quite some time before we realized this was indeed Squeak talking, and that it came from inside aio.c - a call from aioPoll(): perror("select")

...ok. Some important background info: before this we hit the default ulimit of 1024 per user (duh!), causing "Too Many Files Open", so we raised it silly high. That did make the system handle itself - but in retrospect we think another issue (related to SAML auth) caused tons of client requests getting spawned from this VM, and that is what made us reach the limit in the first place. It may also have been the factor (= many, many sockets) that in the end caused the errno 9 described above - see below for reasoning.
After perusing the IMHO highly confusing :) Socket code (no offense of course, it's probably trivial to a Unix C-guru), at least we understand that the code uses select() and not a more modern poll() or epoll(). In fact there is also a call to select() in sqUnixSocket.c, but... probably not relevant.

Yeah, epoll() is not portable etc., we know, but frankly we only care about Linux here.

Googling shows us further that select() has issues, I mean, yikes. And the thing I think we might be hitting here is the fact that select() doesn't handle more than 1024 file descriptors (as far as I can understand the writing on the Internet), and to make it worse, it seems able to go bananas if you push it there...

Now... again, the Internet seems to imply that the "usual" cause of errno 9 is doing a select() on an fd that has already been closed. The typical bug causing this is accidentally closing an fd twice - and thus, if you are unlucky, the second close hits an fd number that has already been reused and is therefore open again. Oops.

But it seems unreasonable to think that *such* a bug exists in this code after all these years. And I am simply guessing it's the >1024 fd problem biting us, but yeah, we don't know. And I also guess it was that SAML issue, in combination with raised ulimits, that made us even get over 1024.

Things we are planning:

- Come up with a test case showing it blowing up. Will try to do that next week.
- Start looking at how to use poll() or epoll() instead, because we need to be SOLID here and we can't afford the 1024 limit.

So... anyone interested? Any brilliant thoughts? AFAICT we can rest assured that there is a bug here somewhere, because otherwise we wouldn't be able to get "errno 9 Bad file descriptor", right?

regards, Göran

PS. Googling this in relation to the Squeak VM didn't show any recent hits, only some from Stephen Pair fixing a bug in X11 code etc. |
Hi Both,
On Thu, Oct 9, 2014 at 3:05 PM, Göran Krampe <[hidden email]> wrote:
Cool. This is likely easy to fix. Your image is running out of file descriptors. Track open and close calls, e.g. add logging around at least StandardFileStream>>#primOpen:writable:, AsyncFile>>#primOpen:forWrite:semaIndex:, Socket>>#primAcceptFrom:receiveBufferSize:sendBufSize:semaIndex:readSemaIndex:writeSemaIndex: and their associated close calls, and see what's being opened without being closed. It should be easy to track down, but may be more difficult to fix. Good luck!
best, Eliot
|
On Thu, Oct 09, 2014 at 03:27:05PM -0700, Eliot Miranda wrote:
> Cool. This is likely easy to fix. Your image is running out of file
> descriptors. Track open and close calls ... and see what's being opened
> without being closed. It should be easy to track down, but may be more
> difficult to fix. Good luck!

I agree with what Eliot is saying and would add a few thoughts:

- Don't fix the wrong problem (DFtWP). Unless you have some reason to believe that this server application would realistically need to handle anything close to a thousand concurrent TCP sessions, don't fix it by raising the per-process file handle limit, and don't fix it by reimplementing the socket listening code.

- It is entirely possible that no one before you has ever tried to run a server application with the per-process file handle limit bumped up above the default 1024. So if that configuration does not play nicely with the select() mechanism, you may well be the first to have encountered this as an issue. But see above: don't fix it if it ain't broke.

- Most "out of file descriptor" problems involve resource leaks (as Eliot is suggesting), and in those cases you will see a gradual increase in file descriptors in /proc/<vmpid>/fd/ over time. Eventually you run out of descriptors and something horrible happens.
- Before doing anything else, you must confirm whether this is a resource leak, with file descriptor use continuously increasing (which is what Eliot and I are both assuming to be the case here), or a real resource issue in which your server has a legitimate need to maintain a very large number of TCP connections concurrently. Given that you have a running application with real users, you will probably want to do this with something like a shell script keeping track of the /proc/<pid>/fd/ directory for the running VM. (On squeaksource.com, there is an undiagnosed file handle leak similar to what I think you are experiencing. My kludgy workaround is a process in the image that uses OSProcess to count entries in /proc/<pid>/fd/ and restart the image when the file descriptor situation becomes dire.)

- Finding file (socket) handle leaks is tricky, and if you have customers depending on this, you probably do not have the luxury of fixing it right. Is there any way to periodically restart the server image without causing pain to the customers? If so, consider a kludgy workaround like I did for squeaksource: monitor the VM process for file handle leaks and restart it proactively rather than waiting for a catastrophic failure. You can do this all from within the image; I will dig out my squeaksource hack if you think it may be of any help.

- Sorry to repeat myself, but this is by far the most important point: DFtWP.

Dave
|
Hi guys!

On 10/10/2014 01:28 AM, David T. Lewis wrote:
>> Cool. This is likely easy to fix. Your image is running out of file
>> descriptors. Track open and close calls ... and see what's being
>> opened without being closed.

Aha. Soo... am I understanding this correctly - we are probably leaking fds, and when we go above 1024 this makes select() go bonkers and eventually leads to the "Bad file descriptor" error?

> - Don't fix the wrong problem (DFtWP). Unless you have some reason to
> believe that this server application would realistically have a need to
> handle anything close to a thousand concurrent TCP sessions, don't fix
> it by raising the per-process file handle limit, and don't fix it by
> reimplementing the socket listening code.

We haven't done the exact numbers, but we could probably hit several hundred concurrent at least. 1024 seemed a bit "over the top" though :)

The system in question is meant to serve more than 1000 concurrent users, so we are in fact moving into this territory. We have been up to around 600 so far.

> - It is entirely possible that no one before you has ever tried to run
> a server application with the per-process file handle limit bumped up
> above the default 1024. So if that configuration does not play nicely
> with the select() mechanism, you may well be the first to have
> encountered this as an issue. But see above, don't fix it if it ain't
> broke.

Well, it most probably *is* broke - I mean, I haven't read anywhere that our Socket code is limited to 1024 concurrent sockets and that going above that limit causes the Socket code to stop working? :)

But I agree - I don't want to touch that code if we can simply avoid this bug by making sure we stay below 1024. But it sounds broke to me, nevertheless. ;)

> - Most "out of file descriptor" problems involve resource leaks (as Eliot
> is suggesting), and in those cases you will see a gradual increase in file
> descriptors in /proc/<vmpid>/fd/ over time.

We will start looking at that and other tools too.

> - Sorry to repeat myself but this is by far the most important point: DFtWP.

Sure :). This is why I posted - to get your input. And I have a suspicion that the SAML issue I mentioned may be the code leaking; we will start looking.

regards, Göran |
On Fri, Oct 10, 2014 at 02:09:08AM +0200, G?ran Krampe wrote:
> But it sounds broke to me, nevertheless. ;)

Indeed it probably is.

> Sure :). This is why I posted - to get your input. And I have a
> suspicion that the SAML issue I mentioned may be the code leaking, we
> will start looking.

Cool. Please keep us posted on progress. In the unlikely event that free advice turns out to be worth anything, there is plenty more where that came from ;-)

Dave |
If I recall correctly, socket file handles are managed by the weak object logic. In the past I've already said: "Finalization will run later and likely kill you in a production environment." Given a high rate of socket acquisition, do you know when the socket instance goes away, and when the file descriptor is actually destroyed - via cleanup of the weak object, or via a forced GC cycle?

On Thu, Oct 9, 2014 at 5:21 PM, David T. Lewis <[hidden email]> wrote:
===========================================================================
John M. McIntosh <[hidden email]> https://www.linkedin.com/in/smalltalk =========================================================================== |
In reply to this post by David T. Lewis
On Thu, Oct 9, 2014 at 4:28 PM, David T. Lewis <[hidden email]> wrote:
Great message David. You've nailed it.
best, Eliot
|
In reply to this post by Göran Krampe
On Thu, Oct 9, 2014 at 5:09 PM, Göran Krampe <[hidden email]> wrote:
I'm not sure, but that you're needing to up the per-process file handle limit is worrying. That you should diagnose first. If you solve it (and it's likely to be easy; you can do things like maintain sets of open sockets, etc., and close the least recently used when reaching some high-tide mark), then I suspect the select problems will go away too.

> I agree with what Eliot is saying and would add a few thoughts:

But each connexion could have a few sockets, and then there may be file connexions etc. Best look-see what you have there and sanity-check.
"Broke" is the wrong word. You're running into a soft resource limit that you can raise. But you should only raise it if you know the server code is correct, because raising the limit when incorrect code causes a gradual increase in open file descriptors will simply postpone the inevitable. And the longer the server appears to run healthily the more mysterious and annoying the crash may appear :-).
You may have to go above 1024 when you have >= 1024 users (or 512, or whatever). But you should understand the relationship between connexions and open file descriptors, and know that there are no leaks, before you up the limit.

> But it sounds broke to me, nevertheless. ;)

If it is broke, it is the OS that is broke. But not really. Having a soft limit is a great way to find problems (leaks) while providing flexibility. With no limit, the system runs until catastrophe, probably bringing the OS down with it. This is a bit like Smalltalk catching infinite recursion through the low space condition; by the time the recursion is stopped there's precious little resource left to do anything about it (not to mention that the stack will be huge). So only lift the limit when you know the code is correct and you know you need more resources to serve the anticipated number of clients.

best, Eliot
|
Hi!

On 10/10/2014 02:50 AM, Eliot Miranda wrote:
> "Broke" is the wrong word. You're running into a soft resource limit
> that you can raise. But you should only raise it if you know the server
> code is correct...

No, see below. When I say 1024 I am not referring to the soft limit. I am referring to the fact that select() can not handle more than 1024. Let me quote from the net:

"The limitation comes from the way select() works, most concretely the data type used to represent the list of file descriptors, fd_set. Let's take a look at fd_set in /usr/include/sys/select.h. You will see a definition pretty much like:

typedef struct {
    long int fds_bits[32];
} fd_set;

I removed a bunch of macros to make the code more clear. As you see, you have a static array of 32 long ints. This is just a bit map, where each bit represents a file descriptor number. 32 * sizeof(long int), for 32-bit platforms, is 1024. So, if you do fd_set(&fd, 10), to add file descriptor 10 to an fd_set, it will just set to 1 the 10th bit in the bit map. What happens then if you do fd_set(&fd, 2000)? You guessed right: unpredictable. Maybe some sort of array overflow (fd_set is, at least on my system, implemented using the assembly instruction btsl, bit test and set)."

> If it is broke it is the OS that is broke. But not really. Having a
> soft limit is a great way to find problems (leaks) while providing
> flexibility.

But... I am not sure if you are misunderstanding or not. We did raise the limit, but then we got "Bad file descriptor", not "Too many files open".

What I mean with "broke" is the fact that select() can not handle more than 1024 *regardless* of limits. So I think we can not go above 1024 if we still use select(). This is the bug I am talking about.

regards, Göran |
In reply to this post by David T. Lewis
Hi All!

Thanks for your help!!

Trying to reproduce the error, I ran 1000 clients against our test cluster of 7 servers. The fd count hovers around 250 and never made it above 490 (there were 4 service providers, so the main server should only handle around 250 people).

David: thanks for the pointers on how to find the fd count. Very cool. Since the fd count seemed to be well within proper limits, and went both up and down, I think you are correct that we may be looking at the wrong thing.

Could it be that we are closing a socket twice? (see 0xffa5e0cc M Socket>destroy 0xd9ba02d4: a(n) Socket) Göran mentioned finding something that said on a very busy server you may try closing an already closed socket that has already been reused. On a not-so-busy system you can get away with it, so you may not see this error until you put a very large load on it.

I captured the error. This repeated about 1000 times (no idea why it would repeat so many times):

errno 9
select: Bad file descriptor

Then:

Segmentation fault

Fri Oct 10 02:46:17 2014

Squeak VM version: 4.0-2798 #1 Wed Nov 13 12:31:45 UTC 2013 gcc 4.4.7 [Production ITHB VM]
Built from: CoInterpreter VMMaker.oscog-eem.496 uuid: f0401045-f2f7-470e-9940-3535be9c0334 Nov 13 2013
With: StackToRegisterMappingCogit VMMaker.oscog-eem.496 uuid: f0401045-f2f7-470e-9940-3535be9c0334 Nov 13 2013
Revision: VM: r2798 http://www.squeakvm.org/svn/squeak/branches/Cog Plugins: r2545 http://squeakvm.org/svn/squeak/trunk/platforms/Cross/plugins
Build host: Linux cogvm 2.6.32-358.el6.x86_64 #1 SMP Fri Feb 22 00:31:26 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
plugin path: /home/openqwaq/server/bin/forums/../qwaqvm/plugins [default: /home/openqwaq/server/bin/cogvm/lib/squeak/4.0-2798/]

C stack backtrace & registers:
eax 0xffa191a4 ebx 0xffa190c0 ecx 0xffa19158 edx 0xffa1910c
edi 0xffa18f90 esi 0xffa18f90 ebp 0xffa19028 esp 0xffa19074
eip 0xffa19288
*[0xffa19288]
/home/openqwaq/server/bin/forums/../qwaqvm/lib/squeak/4.0-2798/squeak[0x805e54e]
/home/openqwaq/server/bin/forums/../qwaqvm/lib/squeak/4.0-2798/squeak[0x805e84f]
[0x5ad410]
/home/openqwaq/server/bin/forums/../qwaqvm/lib/squeak/4.0-2798/squeak(sqSocketCloseConnection+0x26)[0x811f0a6]
/home/openqwaq/server/bin/forums/../qwaqvm/lib/squeak/4.0-2798/squeak(sqSocketDestroy+0x40)[0x811fad0]
/home/openqwaq/server/bin/forums/../qwaqvm/lib/squeak/4.0-2798/squeak[0x811c9f8]
[0xd7444848]
/home/openqwaq/server/bin/forums/../qwaqvm/lib/squeak/4.0-2798/squeak(interpret+0x76e)[0x8084eee]
/home/openqwaq/server/bin/forums/../qwaqvm/lib/squeak/4.0-2798/squeak(main+0x2b4)[0x805efa4]
/lib/libc.so.6(__libc_start_main+0xe6)[0x2f4d36]
/home/openqwaq/server/bin/forums/../qwaqvm/lib/squeak/4.0-2798/squeak[0x805bd41]
[0xffa18c80]

Smalltalk stack dump:
0xffa5e0cc M Socket>destroy 0xd9ba02d4: a(n) Socket
0xffa5e0e4 M QServiceProviderClient>destroy 0xd9c33dd4: a(n) QServiceProviderClient
0xffa5e0fc M QServiceProviderClient>connectionClosed 0xd9c33dd4: a(n) QServiceProviderClient
0xffa5e114 M [] in QServiceProviderClient>processRequest 0xd9c33dd4: a(n) QServiceProviderClient
0xffa5e138 M Socket>waitForDataFor:ifClosed:ifTimedOut: 0xd9ba02d4: a(n) Socket
0xffa5e164 M QServiceProviderClient>processRequest 0xd9c33dd4: a(n) QServiceProviderClient
0xffa5e17c M [] in QServiceProviderClient>on:in: 0xd9c33dd4: a(n) QServiceProviderClient
0xffa5e194 M [] in QSafeProcessLoop>run 0xd9c35efc: a(n) QSafeProcessLoop
0xffa5e1b0 M BlockClosure>on:do: 0xda118824: a(n) BlockClosure
0xffa5e1d0 M QSafeProcessLoop>run 0xd9c35efc: a(n) QSafeProcessLoop
0xd9c646dc s [] in QSafeProcessLoop(Object)>fork:at:
0xd9c36020 s [] in BlockClosure>newProcess

Most recent primitives:
primSocketConnectionStatus: primSocketSendDone: primSocket:sendData:startIndex:count: millisecondClockValue primitiveWait millisecondClockValue millisecondClockValue primitiveEnterCriticalSection primitiveEnterCriticalSection at:put: findNextUnwindContextUpTo: tempAt: tempAt:put: tempAt: terminateTo:
primitiveExitCriticalSection findNextUnwindContextUpTo: terminateTo: primitiveExitCriticalSection new: at:put: at:put: updateCrc:from:to:in: updateCrc:from:to:in: bitXor: at:put: at:put: at:put: at:put: at:put: at:put: at:put: at:put: replaceFrom:to:with:startingAt: primitiveEnterCriticalSection at:put: primitiveExitCriticalSection primitiveEnterCriticalSection = at:put: primitiveExitCriticalSection primitiveEnterCriticalSection at:put: primitiveExitCriticalSection replaceFrom:to:with:startingAt: primitiveEnterCriticalSection at:put: primitiveExitCriticalSection primitiveEnterCriticalSection = at:put: primitiveExitCriticalSection primitiveEnterCriticalSection at:put: primitiveExitCriticalSection signal primitiveEnterCriticalSection primitiveEnterCriticalSection findNextUnwindContextUpTo: tempAt: tempAt:put: tempAt: terminateTo: primitiveExitCriticalSection findNextUnwindContextUpTo: terminateTo: at:put: at:put: primitiveExitCriticalSection primSocketConnectionStatus: primSocketSendDone: primSocketConnectionStatus: primSocketSendDone: primSocket:sendData:startIndex:count: millisecondClockValue primitiveWait millisecondClockValue primitiveWait primitiveWait primitiveWait primitiveWait primitiveWait primitiveWait millisecondClockValue millisecondClockValue basicNew: at:put: basicNew: integerAt:put:size:signed: basicNew: primitiveARC4Process:from:to:m:x:y: findSubstring:in:startingAt:matchTable: basicNew stringHash:initialHash: hashBytes:startingWith: basicNew at:put: replaceFrom:to:with:startingAt: replaceFrom:to:with:startingAt: stringHash:initialHash: basicNew stringHash:initialHash: primitiveEnterCriticalSection primitiveEnterCriticalSection at:put: primitiveExitCriticalSection at:put: at:put: signal millisecondClockValue millisecondClockValue primitiveEnterCriticalSection primitiveExitCriticalSection primitiveEnterCriticalSection at:put: findNextUnwindContextUpTo: tempAt: tempAt:put: tempAt: terminateTo: primitiveExitCriticalSection findNextUnwindContextUpTo: 
terminateTo: primitiveExitCriticalSection new: at:put: at:put: updateCrc:from:to:in: updateCrc:from:to:in: bitXor: at:put: at:put: at:put: at:put: at:put: at:put: at:put: at:put: replaceFrom:to:with:startingAt: primitiveEnterCriticalSection at:put: primitiveExitCriticalSection primitiveEnterCriticalSection = at:put: primitiveExitCriticalSection primitiveEnterCriticalSection at:put: primitiveExitCriticalSection replaceFrom:to:with:startingAt: primitiveEnterCriticalSection at:put: primitiveExitCriticalSection primitiveEnterCriticalSection = at:put: primitiveExitCriticalSection primitiveEnterCriticalSection at:put: primitiveExitCriticalSection primitiveEnterCriticalSection primitiveEnterCriticalSection findNextUnwindContextUpTo: tempAt: tempAt:put: tempAt: terminateTo: primitiveExitCriticalSection findNextUnwindContextUpTo: terminateTo: at:put: at:put: primitiveExitCriticalSection primSocketConnectionStatus: primSocketSendDone: primSocketConnectionStatus: primSocketSendDone: primSocket:sendData:startIndex:count: millisecondClockValue primitiveWait basicNew basicNew primitiveEnterCriticalSection new: replaceFrom:to:with:startingAt: at:put: new: replaceFrom:to:with:startingAt: at:put: primitiveExitCriticalSection basicNew basicNew primitiveEnterCriticalSection new: replaceFrom:to:with:startingAt: at:put: new: replaceFrom:to:with:startingAt: at:put: primitiveExitCriticalSection basicNew millisecondClockValue primitiveWait primitiveWait primitiveWait signal primitiveWait signal primitiveWait signal primitiveWait signal primitiveWait signal primitiveWait signal signal primitiveWait primitiveEnterCriticalSection replaceFrom:to:with:startingAt: primSocketConnectionStatus: primitiveExitCriticalSection suspend primSocketReceiveDataAvailable: primSocketReceiveDataAvailable: primSocketConnectionStatus: basicNew: replaceFrom:to:with:startingAt: replaceFrom:to:with:startingAt: basicNew millisecondClockValue = digitMultiply:neg: < basicNew pointsTo: basicNew: 
replaceFrom:to:with:startingAt: replaceFrom:to:with:startingAt: basicNew: replaceFrom:to:with:startingAt: replaceFrom:to:with:startingAt: basicNew: replaceFrom:to:with:startingAt: replaceFrom:to:with:startingAt: primitiveWait at:put: signal signal basicNew primitiveWait at:put: signal signal primSocketConnectionStatus: primSocketDestroy: stack page bytes 4096 available headroom 3300 minimum unused headroom 2164 (Segmentation fault) > From: David T. Lewis > Sent: Thursday, October 09, 2014 8:22 PM > > On Fri, Oct 10, 2014 at 02:09:08AM +0200, G?ran Krampe wrote: > > > > Hi guys! Hi Eliot, hope you are happy and well! > > > > On 10/10/2014 01:28 AM, David T. Lewis wrote: > > >>>Ron and I (3DICC) have a problem with the Unix VM networking and I > > >>>am reaching out before burning too many hours on something one of > > >>>you C-Unix/Socket/VM guys can fix in an afternoon - and earn a buck > > >>>for your trouble. > > >> > > >>Cool. This is likely easy to fix. Your image is running out of > > >>file descriptors. Track open and close calls, e.g. add logging > > >>around at least StandardFileStream>>#primOpen:writable: > > >>, AsyncFile>>#primOpen:forWrite:semaIndex:, > > > >>Socket>>#primAcceptFrom:receiveBufferSize:sendBufSize:semaIndex:rea > dSemaIndex:writeSemaIndex: > > >>and their associated close calls and see what's being opened without > > >>being closed. It shoudl be easy to track=down, but may be more > > >>difficult to fix. > > >> > > >>Good luck! > > > > Aha. Soo... am I understanding this correctly - we are probably > > leaking fds and when we go above 1024 this makes select() go bonkers > > and eventually leads to the "Bad file descriptor" error? > > > > >I agree with what Eliot is saying and would add a few thoughts: > > > > >- Don't fix the wrong problem (DFtWP). 
> > Unless you have some reason to believe that this server application
> > would realistically have a need to handle anything close to a thousand
> > concurrent TCP sessions, don't fix it by raising the per-process file
> > handle limit, and don't fix it by reimplementing the socket listening code.
>
> We haven't done the exact numbers, but we could probably hit several
> hundreds concurrent at least. 1024 seemed a bit "over the top" though :)
>
> The system in question is meant to serve more than 1000 concurrent
> users, so we are in fact moving into this territory. We have been up
> to around 600 so far.
>
> > - It is entirely possible that no one before you has ever tried to
> > run a server application with the per-process file handle limit
> > bumped up above the default 1024. So if that configuration does not
> > play nicely with the select() mechanism, you may well be the first to
> > have encountered this as an issue. But see above, don't fix it if it
> > ain't broke.
>
> Well, it most probably *is* broke - I mean, I haven't read anywhere
> that our Socket code is limited to 1024 concurrent sockets and that
> going above that limit causes the Socket code to stop working? :)
>
> But I agree - I don't want to touch that code if we can simply avoid
> this bug by making sure we stay below 1024.
>
> But it sounds broke to me, nevertheless. ;)

Indeed it probably is.

> > - Most "out of file descriptor" problems involve resource leaks (as
> > Eliot is suggesting), and in those cases you will see a gradual
> > increase in file descriptors in /proc/<vmpid>/fd/ over time.

Thank you for the pointer. That's a pretty cool directory to look at!

> > Eventually you run out of descriptors and something horrible happens.
>
> We will start looking at that and other tools too.
>
> > - Sorry to repeat myself but this is by far the most important point:
> > DFtWP.
>
> Sure :).
> This is why I posted - to get your input. And I have a suspicion
> that the SAML issue I mentioned may be the code leaking; we
> will start looking.
>
> regards, Göran

Cool. Please keep us posted on progress. In the unlikely event that free
advice turns out to be worth anything, there is plenty more where that
came from ;-)

Dave

Ron Teitelbaum
In reply to this post by Göran Krampe
Just my two cents: We ran into the “errno 9 select: Bad file descriptor” error two weeks back while playing around with systemd and socket activation. In our case it was the following:

1. send request to socket -> image starts through socket activation
2. image initializes itself with the special systemd socket handle and keeps track of it (because we want to reuse it after restarting the image)
3. save the image
4. errno 9

What we found then was that when the image was being saved, stderr would be closed and then reopened by the system, as was the systemd socket (which was a different problem). Unfortunately we always assumed a fixed number for the systemd socket, namely 3. Now, when the image reopened stderr it could happen that another socket had taken fd 2, so stderr would end up on fd 3 - and thus we tried to use stderr as a TCP socket.

What I want to say is: just really make sure that you’re not doing something funny with sockets (e.g. a fixed fd number) before you look at the VM.

Good luck!

Cheers,
Max
In reply to this post by Göran Krampe
Disclaimer: I just realized I’m not 100% sure the scenario I described was the actual cause of errno 9. It might have been another thing we did, where we tried to reuse the memory handle for the socket (don’t judge… we duly facepalmed ourselves when we realized how silly that was :) ). Still, the basic idea remains the same: make sure you don’t mess with sockets.
In reply to this post by Ron Teitelbaum
Hi guys!

Long post but an interesting one. Summary for the impatient:

- Ron seems to prove that:
  - We never reach 1024 fds, so no leak.
  - But select() still fails, so we probably have an obscure close() bug with fds.
- And I prove that select() is indeed broken above 1024.
- And some other interesting details, like why "Socket new" creates *two* instances of Socket, etc.

Details follow:

On 10/10/2014 05:56 AM, Ron Teitelbaum wrote:
> Hi All!
>
> Thanks for your help!!
>
> Trying to reproduce the error I ran 1000 clients against our test cluster of
> 7 servers.

Hehe, this is fairly interesting in itself - Ron is firing up a whole slew of machines on EC2 and is in fact running 1000 Terf clients on ... 50 servers? I think, and they actually run under Wine in Ubuntu :)

> The fd size hovers around 250 and never made it above 490 (there were 4
> service providers so the main server should only handle around 250 people).

Very interesting. Thus, this is NOT the 1024 limitation of select(), nor are we leaking. Unfortunately - since it implies a harder bug to find.

> David, Thanks for the pointers on how to find the fd size. Very cool.
>
> Since the fd size seemed to be well within proper limits and went both up
> and down I think you are correct we may be looking at the wrong thing.

Well, David seems to have been both right and wrong - I guessed the wrong thing; it's probably not the 1024 limit of select() that is the *cause* of our problem - but it's not a leak either, more likely an obscure close bug. Right? :)

NOTE: After writing this whole post I am not sure. Given that I can reproduce the errno 9 spewage by simply crossing the 1024 limit (see below), it would still be nice to verify that is NOT what is happening when you throw 1000 users at it. You did check the fd count using David's instructions; can we be sure that we are way below?
Select() *is* broken
====================

Further, just so we don't drop this ball: there *is* a problem with select() if we raise the ulimit above 1024. If I first run this trivial code (Apache answers on localhost:80) with the default limit of 1024:

    | local |
    local := NetNameResolver addressForName: 'localhost' timeout: 20.
    (1 to: 1040) collect: [:n | Socket new connectTo: local port: 80]

And do an inspect on that - it will fail when it gets to 1017 (some descriptors already taken, I presume) due to the default ulimit of 1024 - as expected. It fails a bit inconsistently with different errors, but anyway, nothing really surprising.

So... then I raised my limits so that "ulimit -S" reports 2048 for my user. Then tried again, and behold!!! When I pass 1017 my stderr starts spewing out:

    errno 9 select: Bad file descriptor
    errno 9 select: Bad file descriptor
    errno 9 select: Bad file descriptor
    errno 9 select: Bad file descriptor
    errno 9 select: Bad file descriptor

...continuously, so yes, it fails on every aioPoll(). And in Squeak there is a debugger that pops up about 45 seconds later (timeout) with a connection timeout - also expected - since the next Socket to open doesn't connect: select() isn't working anymore, so Squeak never gets notified about the connection being made.

Now that the errno 9 is spewing continuously, the VM is still running and Squeak "works". But if I try to connect somewhere - I just time out.

Testing in Pharo 2.0 with the Pharo VM
======================================

So let's try in Pharo 2.0. Test code:

    | local |
    local := NetNameResolver addressForName: 'localhost' timeout: 20.
    (1 to: 1017) collect: [:n | Socket new connectTo: local port: 80]

Note that 1016 works, over and over after a Smalltalk garbageCollect, no problem.
But if we try it with 1017 we get a smack in the face:

    ./pharo shared/Pharo2.0.image
    *** buffer overflow detected ***: /home/gokr/pharo/pharo2.0/bin/pharo terminated
    ======= Backtrace: =========
    /lib/i386-linux-gnu/libc.so.6(+0x696de)[0xf757f6de]
    /lib/i386-linux-gnu/libc.so.6(__fortify_fail+0x6b)[0xf761213b]
    /lib/i386-linux-gnu/libc.so.6(+0xfafca)[0xf7610fca]
    /lib/i386-linux-gnu/libc.so.6(+0xfc02a)[0xf761202a]
    /home/gokr/pharo/pharo2.0/bin/pharo(aioEnable+0x31)[0x80978d1]
    /home/gokr/pharo/pharo2.0/bin/pharo(sqSocketConnectToPort+0xd6)[0x80f38b6]
    /home/gokr/pharo/pharo2.0/bin/pharo[0x80f1518]
    [0xb70bd770]
    [0xb716fbd9]
    [0xb716fb16]
    [0xb716d53a]
    [0xb70bd700]
    [0xb70bd5c0]
    ... etc

That's interesting? So trying again with the latest Cog from mirandabanda...

Testing in Pharo 2.0 with Cog from Eliot
========================================

Aha, so here we first need to execute this (too few semaphores by default, only 256, and I think we need about 3 per Socket?):

    VirtualMachine maxExternalSemaphoresSilently: 8192

...and then we get the same behavior as with our oldish Cog, namely it starts spewing out "errno 9 select: Bad file descriptor" over and over and the next Socket connection times out.

So... back to you Eliot - am I still wrong? :) :)

Extra Oddity
============

    Socket allInstances size
    => 2038
    (Socket allInstances select: [:e | e isConnected]) size
    => 2032
    (Socket allInstances collect: [:e | e socketHandle]) asSet size
    => 1019

...what? After some investigation it turns out that "Socket new" will create *two* instances of Socket. This has to do with the WeakRegistry used in the Socket class :)

    executor
        "Return an object which can act as executor for finalization
        of the receiver"
        ^self shallowCopy actAsExecutor

...so each Socket gets a shallow copy of itself as the executor. Ok, that explains it - but IMHO it sure ain't pretty.

> Could be that we are closing a socket twice?
> (see 0xffa5e0cc M
> Socket>destroy 0xd9ba02d4: a(n) Socket) Göran mentioned finding something
> that said on a very busy server you may try closing an already closed socket
> that had already been reused. On a not so busy system you can get away with
> it, so you may not see this error until you put a very large load on it.
>
> I captured the error:
>
> This repeated about 1000 times (no idea why it would repeat so many times):
>
> errno 9
> select: Bad file descriptor

It repeats because we call select() many times per second and the bad (closed) fd is still in the set we pass in, so it barfs every time - and since it barfs it doesn't do its job anymore, so this bad fd is blocking everything; once this hits, the VM goes unresponsive when it comes to file handles. I guess one could "clean out" the bad fd if this happens - but that is of course a hack, and I'm not sure how to do it. Then exactly why the segfault comes, who knows. :)

And oh, the SAML auth that we thought was leaking - no. It just uses WebClient httpGet and doesn't hold onto anything. Some trivial tests show that they should be GCed properly - although yes, John's remark about GC coming "later" is also interesting.

regards, Göran
On 10/10/2014 11:46 AM, Göran Krampe wrote:
> Hi guys!
>
> Long post but an interesting one. Summary for the impatient:
>
> - Ron seems to prove that:
>   - We never reach 1024 fds so no leak.
>   - But select() still fails, so we probably have an obscure close()
>     bug with fds.
>
> - And I prove that select() is indeed broken above 1024.

Aha! It seems Ragnar Hojland Espinosa has already fixed this 11 years ago:

http://lists.squeakfoundation.org/pipermail/squeak-dev/2003-May/057750.html

I also found code from David Shaffer that fixes the listening code so it doesn't miss connections, also interesting:

http://lists.squeakfoundation.org/pipermail/squeak-dev/2005-August/093575.html

...and the patch he mentions:

http://lists.squeakfoundation.org/pipermail/squeak-dev/2005-August/093638.html

regards, Göran
On Fri, Oct 10, 2014 at 01:15:21PM +0200, Göran Krampe wrote:
>
> On 10/10/2014 11:46 AM, Göran Krampe wrote:
> > Hi guys!
> >
> > Long post but an interesting one. Summary for the impatient:
> >
> > - Ron seems to prove that:
> >   - We never reach 1024 fds so no leak.
> >   - But select() still fails, so we probably have an obscure close()
> >     bug with fds.
> >
> > - And I prove that select() is indeed broken above 1024.
>
> Aha!
>
> It seems Ragnar Hojland Espinosa has already fixed this 11 years ago:
>
> http://lists.squeakfoundation.org/pipermail/squeak-dev/2003-May/057750.html
>
> I also found code from David Shaffer that fixes the listening to not
> miss stuff, also interesting:
>
> http://lists.squeakfoundation.org/pipermail/squeak-dev/2005-August/093575.html
>
> ...and the patch he mentions:
>
> http://lists.squeakfoundation.org/pipermail/squeak-dev/2005-August/093638.html
>
> regards, Göran

Nice detective work! I can't look right now, but over the weekend I'll check to make sure we capture these in Mantis and get the fixes into the oscog/trunk repositories as needed.

Dave
In reply to this post by Max Leske
Thanks Max!

Ron

From: [hidden email] [mailto:[hidden email]] On Behalf Of Max Leske
Sent: Friday, October 10, 2014 1:13 AM
To: [hidden email]
Subject: [Vm-dev] Re: Squeak socket problem ... help!
In reply to this post by Göran Krampe
On Fri, Oct 10, 2014 at 01:15:21PM +0200, Göran Krampe wrote:
>
> On 10/10/2014 11:46 AM, Göran Krampe wrote:
> > Hi guys!
> >
> > Long post but an interesting one. Summary for the impatient:
> >
> > - Ron seems to prove that:
> >   - We never reach 1024 fds so no leak.
> >   - But select() still fails, so we probably have an obscure close()
> >     bug with fds.
> >
> > - And I prove that select() is indeed broken above 1024.
>
> Aha!
>
> It seems Ragnar Hojland Espinosa has already fixed this 11 years ago:
>
> http://lists.squeakfoundation.org/pipermail/squeak-dev/2003-May/057750.html

I entered Mantis 7822: "Unix aio VM support for 1024+ descriptors" to track this issue:

http://bugs.squeak.org/view.php?id=7822

> I also found code from David Shaffer that fixes the listening to not
> miss stuff, also interesting:
>
> http://lists.squeakfoundation.org/pipermail/squeak-dev/2005-August/093575.html
>
> ...and the patch he mentions:
>
> http://lists.squeakfoundation.org/pipermail/squeak-dev/2005-August/093638.html

I entered Mantis 7823: "Patch for socket accept problem on unix VM" to track this issue:

http://bugs.squeak.org/view.php?id=7823

Both of these issues will require some follow-up work in order to get them included in the VMs. If further progress is made, please keep the Mantis issues updated and/or report progress to the vm-dev list.

Dave
On Sat, 11 Oct 2014, David T. Lewis wrote:
> On Fri, Oct 10, 2014 at 01:15:21PM +0200, Göran Krampe wrote:
>> On 10/10/2014 11:46 AM, Göran Krampe wrote:
>>> Hi guys!
>>>
>>> Long post but an interesting one. Summary for the impatient:
>>>
>>> - Ron seems to prove that:
>>>   - We never reach 1024 fds so no leak.
>>>   - But select() still fails, so we probably have an obscure close()
>>>     bug with fds.
>>>
>>> - And I prove that select() is indeed broken above 1024.
>>
>> Aha!
>>
>> It seems Ragnar Hojland Espinosa has already fixed this 11 years ago:
>>
>> http://lists.squeakfoundation.org/pipermail/squeak-dev/2003-May/057750.html
>
> I entered Mantis 7822: "Unix aio VM support for 1024+ descriptors" to track
> this issue:
>
> http://bugs.squeak.org/view.php?id=7822

I think this is a great summary of the current status on unix systems:
http://daniel.haxx.se/docs/poll-vs-select.html

I think the best would be to use libevent or libev, but adding support for epoll (on Linux only) is probably a lot easier to do.

>> I also found code from David Shaffer that fixes the listening to not
>> miss stuff, also interesting:
>>
>> http://lists.squeakfoundation.org/pipermail/squeak-dev/2005-August/093575.html
>>
>> ...and the patch he mentions:
>>
>> http://lists.squeakfoundation.org/pipermail/squeak-dev/2005-August/093638.html
>
> I entered Mantis 7823: "Patch for socket accept problem on unix VM" to track
> this issue:
>
> http://bugs.squeak.org/view.php?id=7823

This is pretty interesting. Now I know where those spurious 504 responses come from.

Levente

> Both of these issues will require some follow-up work in order to get them
> included in the VMs. If further progress is made, please keep the Mantis issues
> updated and/or report progress to the vm-dev list.
>
> Dave
Hi Levente!

On 10/12/2014 12:52 PM, Levente Uzonyi wrote:
>>> It seems Ragnar Hojland Espinosa has already fixed this 11 years ago:
>>>
>>> http://lists.squeakfoundation.org/pipermail/squeak-dev/2003-May/057750.html
>>
>> I entered Mantis 7822: "Unix aio VM support for 1024+ descriptors" to
>> track this issue:
>>
>> http://bugs.squeak.org/view.php?id=7822
>
> I think this is a great summary of the current status on unix systems:
> http://daniel.haxx.se/docs/poll-vs-select.html
> I think the best would be to use libevent or libev, but adding support
> for epoll (on linux only) is probably a lot easier thing to do.

And it seems that Ragnar has already done it for epoll (see post below) - I need to find his email and check with him.

I agree on the libev/libevent thing, but as you say, that presumably requires bigger changes. I am not sure it will fix our issue - which seems to be that a socket suddenly gets closed twice (or something) - but who knows.

regards, Göran