I started looking at the Socket failures that I can reproduce and I can't wrap my head around the SocketReadingWritingTest>>setUp. Based on what I know about TCP sockets it doesn't make sense to me. AFAIK to get both ends of a connected TCP connection (lacking a socketpair call) I need 3 sockets. A listening socket, a client socket (doing connect) and finally the server socket that comes out as a result of accept on the listener. So I'd expect to see something like this in the setUp:
    | data input output listener socket1 socket2 process sync |
    Socket initializeNetwork.
    sync := Semaphore new.
    listener := Socket newTCP.
    listener listenOn: 9999.
    process := [ [ socket1 := listener accept ] ensure: [ listener close ]. sync signal ] fork.
    socket2 := Socket newTCP.
    socket2 connectTo: (NetNameResolver localHostAddress) port: 9999.
    sync wait.
    output := socket1 reading.
    input := socket2 writing.

Surprisingly, the above doesn't work, while many of the Socket tests seem to pass for me despite my brain telling me that it can't possibly. Can anyone shed some light on this for me?

Thanks,

Martin
Attached is some code I wrote a while ago based on advice that ConnectionQueue and SocketStream are correct starting points. I am a big believer that network code should do what it is told until it is told to stop; timeouts should be in the hands of the user, application programmer or server administrator, not decided by the socket layer.
________________________________________
From: [hidden email] [[hidden email]] On Behalf Of [hidden email] [[hidden email]]
Sent: Wednesday, January 12, 2011 1:13 PM
To: [hidden email]
Subject: [Pharo-project] Socket question (Was Re: Xtreams up to date)

Attachment: NetworkSmokeTestCase.st (3K)
In reply to this post by mkobetic
Martin
Did you check with Squeak? I have the impression that Andreas fixed some of that code. This is on our (Igor's) to-do list for 1.3 or 1.4, or 1.3 if somebody joins and helps. Noury and Luc started to rewrite the sockets from scratch, but I do not know the status.

Stef
In reply to this post by mkobetic
Martin,
On 12 Jan 2011, at 19:13, [hidden email] wrote:

> Surprisingly, the above doesn't work, while many of the Socket tests seem to pass for me despite my brain telling me that it can't possibly. Can anyone shed some light on this for me ?

I have been playing around a little bit with your example and this works for me (Pharo 1.1.1 + Xtreams):

    | data input output listener socket1 socket2 process sync |
    data := #uninitialized.
    sync := Semaphore new.
    listener := Socket newTCP.
    listener listenOn: 9999 backlogSize: 10.
    process := [ [
        socket1 := listener waitForAcceptFor: 10.
        socket1 waitForDataFor: 10.
        output := socket1 reading.
        data := output rest.
        sync signal ] ensure: [ listener close ] ] fork.
    socket2 := Socket newTCP.
    socket2 connectToHostNamed: 'localhost' port: 9999.
    input := socket2 writing.
    input write: 'Hello!'.
    sync wait.
    data asString

Of course, you still have to close some sockets, but you get the idea.

HTH,

Sven

PS: Yes, Socket (and friends) have some pretty confusing API, you have to stick to what you know that works ;-)
In reply to this post by mkobetic
On Wed, 12 Jan 2011, [hidden email] wrote:
> AFAIK to get both ends of a connected TCP connection (lacking a socketpair call) I need 3 sockets. A listening socket, a client socket (doing connect) and finally the server socket that comes out as a result of accept on the listener.

There's a way to use only 2 sockets and this causes the problem. If you use #listenOn:, then there will be no backlog and the listening socket will accept the connection. If you use #listenOn:backlogSize: instead, then it should work.

Levente
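A minimal sketch of the two-socket arrangement described above, for contrast with the three-socket scripts elsewhere in this thread. It assumes the Squeak/Pharo Socket protocol of that era: #waitForConnectionFor: on the listening socket does not appear anywhere in this thread and is an assumption here, as is the idea that everything can run in a single process. Port 9999 and the 10 second timeouts are simply the values used in the other scripts.

```smalltalk
| server client |
server := Socket newTCP.
"No backlog: the listening socket itself is expected to become
 the connected server-side end, so no separate accept is done."
server listenOn: 9999.
client := Socket newTCP.
client connectTo: #[127 0 0 1] port: 9999.
"Assumption: the listener reports the connection via #waitForConnectionFor:."
server waitForConnectionFor: 10.
[ client sendData: 'Hello!'.
  server waitForDataFor: 10.
  server receiveData ]
    ensure: [ client close. server close ]
```

If this reading is right, it would also explain why calling accept on a socket set up with plain #listenOn: does not behave as expected: there is no backlog queue for accept to draw from.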
In reply to this post by mkobetic
Thanks everyone for your responses. I studied the ConnectionQueue that Wilhelm pointed me to, but that didn't clarify things much for me and I wasn't able to get it working either. Finally Sven's version of the script got me going again. I can already see the first problem in the Xtreams implementation: it uses a non-blocking read in Socket>>readInto:. Simply changing it to a blocking read doesn't seem to fix things, though. Quite the opposite. So now I'm trying to zero in on the exact API to use at the Socket level. To expand on the test script a bit more, this is what I'd like to get going:
    | input output listener process sync |
    sync := Semaphore new.
    listener := Socket newTCP.
    listener listenOn: 9999 backlogSize: 10.
    process := [
        [ input := listener waitForAcceptFor: 10.
          sync signal
        ] ensure: [ listener close ]
    ] fork.
    output := Socket newTCP.
    output connectToHostNamed: 'localhost' port: 9999.
    sync wait.
    [ output sendData: 'Hello!'; close.
      input receiveData.
    ] ensure: [ output close. input close. process terminate ]

Maybe I need some other calls in the last block, but I'm stuck even earlier again. When I interrupt the script above, the debugger highlights the whole last block (the ensure: receiver) and there's just the top level UndefinedObject>>DoIt context. Doesn't make much sense to me but debuggers have bugs as well. I'm guessing the process is stuck on the sync wait before that. But the sync semaphore has excess signals 1. And what is even stranger is that input is still nil. How is that possible? Can someone get the above going?

I realize that I deviated from Sven's version by moving the read out of the accept process, but that shouldn't be a problem, or am I missing something? I'm also wondering if there is a Socket test suite somewhere that already contains tests like the above, that I could mine for examples. I don't see any in the PharoCore1.2 image.

Thanks,

Martin
On Thu, 13 Jan 2011, [hidden email] wrote:
> output connectToHostNamed: 'localhost' port: 9999.

It works fine in my Squeak image. Using localhost may be a problem if your local interface has an IPv6 address. Use 127.0.0.1 instead.

Levente
In reply to this post by mkobetic
"Levente Uzonyi"<[hidden email]> wrote:
> It works fine in my Squeak image. Using localhost may be a problem if your
> local interface has an IPv6 address. Use 127.0.0.1 instead.

Odd. The image I was using must have been corrupt somehow. When I took a fresh Pharo 1.2 image the script worked fine. Thanks for the confirmation!

Martin
In reply to this post by mkobetic
On 13 Jan 2011, at 22:56, [hidden email] wrote:

> Can someone get the above going ?

Yes, the following works as expected for me (Pharo 1.1.1 + latest Xtreams + Cog based VM):

    | input output listener process sync data |
    data := #uninitialized.
    sync := Semaphore new.
    listener := Socket newTCP.
    listener listenOn: 9999 backlogSize: 10.
    process := [
        [ input := listener waitForAcceptFor: 10.
          sync signal
        ] ensure: [ listener close ]
    ] fork.
    output := Socket newTCP.
    output connectToHostNamed: 'localhost' port: 9999.
    sync wait.
    [ output sendData: 'Hello!'; close.
      data := input receiveData.
    ] ensure: [ output close. input close. process terminate ].
    data.

> I realize that I deviated from Sven's version by moving the read out of the accept process, but that shouldn't be a problem, or am I missing something ? I'm also wondering if there is a Socket test suite somewhere that already contains tests like the above, that I could mine for examples. I don't see any in the PharoCore1.2 image.

I have encountered strange things when using sockets from different processes, which is why I moved the code into the block. But it should work, and now it does ;-)

Sven
In reply to this post by mkobetic
OK, with two minor tweaks the socket streams started to behave better for me. The whole suite passes sometimes, although the results often get spoiled by transient failures which suggest a race condition in the transform write stream. I think it's the same one I occasionally see on the VW side, although it's quite rare there. But the fact that it comes up more frequently on the Squeak side should help with figuring it out, since it makes it more reproducible. We'll see.
In reply to this post by mkobetic
BTW, would this snippet be useful as a basic Socket smoke test ? Or is that already covered somewhere ?
    | input output listener process sync |
    sync := Semaphore new.
    listener := Socket newTCP.
    listener listenOn: 9999 backlogSize: 10.
    process := [
        [ input := listener waitForAcceptFor: 10.
          sync signal
        ] ensure: [ listener close ]
    ] fork.
    output := Socket newTCP.
    output connectTo: #[127 0 0 1] port: 9999.
    sync wait.
    [ #( 64 1024 2048 4096 8192 ) do: [ :dataSize || data |
        data := ByteArray new: dataSize.
        1 to: data size do: [ :i | data at: i put: (i - 1) \\ 256 ].
        1 to: 10 do: [ :each | output sendData: data ].
        1 to: 10 do: [ :each || result index |
            index := 1.
            result := ByteArray new: dataSize.
            [ index := index + (input receiveDataInto: result startingAt: index).
              index > dataSize ] whileFalse.
            result = data ifFalse: [ result halt ] ] ]
    ] ensure: [ output close. input close ]
I am really bothered by the timeout on the accept, so much so that I think it has no place in the tests. Of course, one needs a way to clean up after a failure, but that should be done by another thread that waits for a "long" time and then cleans up anything that is left behind after the failure.
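A rough sketch of the cleanup-by-another-thread idea, not an existing test helper. The watchdog process, its name and the 300 second interval are all hypothetical; only #isValid, #destroy and #terminate are taken from code quoted elsewhere in this thread, and Delay forSeconds: is the usual way to sleep a process.

```smalltalk
| listener watchdog |
listener := Socket newTCP.
listener listenOn: 9999 backlogSize: 10.

"Hypothetical watchdog: sleep for a generous interval, then tear down
 the listener if the test left it behind. 300 seconds is arbitrary."
watchdog := [
    (Delay forSeconds: 300) wait.
    listener isValid ifTrue: [ listener destroy ] ] fork.

"... exercise the listener here ..."

"On normal completion, cancel the watchdog and clean up directly."
watchdog terminate.
listener destroy.
```

With this shape the code under test never has to pick a timeout for the accept itself; the only deadline in play is the one the test author chose for the watchdog.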
Servers should be started and stopped and in between, they should wait for and accept connections. Clients should try to connect until told otherwise; timeouts can be reasonable options there, but should not be forced by the sockets system. Only the calling threads should be blocked, of course.

________________________________________
From: [hidden email] [[hidden email]] On Behalf Of [hidden email] [[hidden email]]
Sent: Friday, January 14, 2011 1:12 AM
To: [hidden email]
Subject: Re: [Pharo-project] Socket question (Was Re: Xtreams up to date)

BTW, would this snippet be useful as a basic Socket smoke test ? Or is that already covered somewhere ?
On 14 Jan 2011, at 12:54, Schwab,Wilhelm K wrote:

> I am really bothered by the timeout on the accept, so much so that I think it has no place in the tests.

I don't understand why this would bother you: it is the 'client/user' that decides about the timeout (you could use some very large number for an 'infinite' timeout). You need of course a loop around it. Here is the code from ZnSingleThreadedServer (which started from the Blackfoot example):

    listenLoop
        "We create a listening Socket, then wait for a connection.
        After each connection we also check that the listening Socket is still valid
        - if not we just make a recursive call to this method to start over."

        self initializeServerSocket.
        [ [ serverSocket isValid
                ifFalse: [
                    "will trigger #ifCurtailed: block and destroy socket"
                    ^ self listenLoop ].
            self serveConnectionOn: serverSocket ] repeat ]
            ifCurtailed: [ self releaseServerSocket ]

    serveConnectionOn: listeningSocket
        "We wait up to acceptWaitTimeout seconds for an incoming connection.
        If we get one we wrap it in a SocketStream and #executeOneRequestResponseOn: on it"

        | stream socket |
        socket := (listeningSocket waitForAcceptFor: self acceptWaitTimeout)
            ifNil: [ ^ self log: 'Wait for accept timed out' ].
        stream := ZnNetworkingUtils socketStreamOn: socket.
        [ [ [ self executeOneRequestResponseOn: stream ]
                ensure: [ self log: 'Closing stream'. stream close ] ]
            ifCurtailed: [ self log: 'Destroying socket'. socket destroy ] ]
                forkAt: Processor highIOPriority
                named: 'Zinc HTTP Server Worker'

Each #acceptWaitTimeout seconds, 'Wait for accept timed out' is printed on the log and another accept/wait starts. This looping improves liveliness and responsiveness to other events.

Sven
In reply to this post by mkobetic
"Sven Van Caekenberghe"<[hidden email]> wrote:
> Each #acceptWaitTimeout seconds, 'Wait for accept timed out' is printed on the log and another accept/wait starts. This looping improves liveliness and responsiveness to other events.

Given that accept is generally done on the server side, usually in a background process, I have to say I'd be inclined to side with Wilhelm on this one. I don't see much that the server can do other than go back into another accept loop. You can't predict how long it will take for the next client to connect, so a timeout isn't a sign of an issue. So I don't see much use for a timeout on accept.

I'm not quite sure what you mean by "improving liveliness". If the implementation needs to get out of accept to let other processes respond to events, then I wouldn't call that a feature. But these are really more philosophical points. In practice, wrapping a loop around an accept timeout isn't a big deal either, as long as it works as needed.
In reply to this post by Sven Van Caekenberghe
Sven,
Thanks for the code!

The fact that I have to decide what constitutes an "infinite" delay seems wrong to me. It should just do what I tell it and calmly wait for an event that lets the code move forward (assuming that only the calling thread is blocked). One example: if I know I typed an incorrect address, then any time is wasted. If it's correct and the connection is slow (I downloaded a pdf of a paper earlier this week, and it was miserably slow but worked fine), I probably want the code to keep running without trying to think for itself.

"This looping improves liveliness and responsiveness to other events."

Why does looping on things that should be blocking until events arrive improve response to other events? This seems like a baling wire solution to the problem.

Bill

________________________________________
From: [hidden email] [[hidden email]] On Behalf Of Sven Van Caekenberghe [[hidden email]]
Sent: Friday, January 14, 2011 9:36 AM
To: [hidden email]
Subject: Re: [Pharo-project] Socket question (Was Re: Xtreams up to date)

I don't understand why this would bother you: it is the 'client/user' that decides about the timeout (you could use some very large number for an 'infinite' timeout). You need of course a loop around it.
In reply to this post by mkobetic
Martin, Wilhelm,
On 14 Jan 2011, at 20:31, [hidden email] wrote:

> So I don't see much use for timeout on accept.

On 14 Jan 2011, at 20:45, Schwab,Wilhelm K wrote:

> Why does looping on things that should be blocking until events arrive improve response to other events? This seems like a bailing wire solution to the problem.

Of course it would be better if there was a #waitForAccept variant that waited forever. The fact that you keep looping can be handy for doing periodic tasks like cleanups without starting a new thread. In Java you can set the timeout of a ServerSocket and the accept will fall through as well; it is just an option.

Sven
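For illustration, a wait-forever accept can be approximated by retrying the timed call, which is roughly what the loop discussed above amounts to. This is only a sketch: the variable names, the 10 second interval and the housekeeping placeholder are assumptions, and it relies on #waitForAcceptFor: answering nil on timeout, as the ifNil: in the Zinc code quoted in this thread suggests.

```smalltalk
| listener connection |
listener := Socket newTCP.
listener listenOn: 9999 backlogSize: 10.

"Retry the timed accept until a connection arrives or the listener
 stops being valid. Note that this blocks the calling process until
 a client actually connects."
[ listener isValid
    and: [ (connection := listener waitForAcceptFor: 10) isNil ] ]
        whileTrue: [ "periodic housekeeping (hypothetical) could go here" ].
```

The timeout then only sets how often the loop wakes up; the caller decides whether to give up at all, which seems to be the behaviour both sides of the discussion are after.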