Hi all,
I’m working on the Zinc server adaptor for Seaside 3.1 but I got stuck. Launching the following from Gemtools brings up the server and makes it respond to requests nicely. Trying to do the same from a topaz session though, immediately throws an out-of-mem exception. WAGsZincAdaptor startOn: 8383 If you break the blocking call in Gemtools and run the line above again, you also run out-of-mem after a couple of times (but only after _also_ connecting to the server from your web browser). First question about this is why that does not happen in gem tools? I tried turning the auto commit off but that did not change anything. Following advice in [1], I started investigating from gemtools and got this for the byte sizes in temp obj memory on 75% full: #'ByteArray'->22153440, #'GsMethodDictionary'->3073088, #'GsMethodLookupCache'->2224064, #'GsProcess'->562224, #'String'->415504, #'Array'->340168, #'SocketStream'->322680, #'ExecBlock'->201664, #'VariableContext'->175552, #'SocketStreamSocket'->150696, …0 The process browser shows over 100 threads displayed as follows: (priority=25) ready [oop= …. ] I guess this is where Dale started having nightmares. The stack traces show the out-of-mem always happens in the ZnNetworkingUtils>>setSocketStreamParameters: method. All of this is using the current Zinc version on github for gemstone 3.1 [2] and my version of the Seaside 3.1 port [3] To try, load Seaside from my repo, load the ‘Zinc-Seaside’ group from the baseline and hit the above. Any ideas are welcome Johan _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
Johan, Off hand, not sure why GemTools "works" - I have a feeling that we'll have to characterize this puppy before we know:) Do you have a stack when you sampled the 75% full? With all of the sockets and processes hanging around, someone is spinning out of control producing new processes so a stack might point to the culprit ... Dale From: "Johan Brichau" <[hidden email]> _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
In reply to this post by Johan Brichau-3
One possible difference between GemTools and Topaz is the configuration for GEM_TEMPOBJ_CACHE_SIZE. Can you find the log file for the Gem started by GemTools? Can you see the output from starting Topaz? What is the value used for GEM_TEMPOBJ_CACHE_SIZE in each? James
On Dec 15, 2013, at 1:20 PM, Johan Brichau <[hidden email]> wrote:
_______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
In reply to this post by Dale Henrichs-3
ah...ZnNetworkingUtils>>setSocketStreamParameters: ... I'll try to poke around a bit today:) Dale From: "Dale K. Henrichs" <[hidden email]> _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
In reply to this post by Johan Brichau-3
Johan, I've finally started monkeying with this and the first thing I ran into (running this under tODE and trying to duplicate the GemTools error condition) was an error involving SocketStream and an apparent buffer overflow ... it looks to me that SocketStream>>next has no logic to refill the buffer once it is full ... at least I got a stack in which the ByteArray buffer was accessed at slot 4097 (1 larger than the buffer): There are three other processes floating around as well ... I'll keep digging ... Dale From: "Johan Brichau" <[hidden email]> _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
Here's the stack: 1. OffsetError(AbstractException)>>_signalWith: @5 line 25 2. OffsetError(AbstractException)>>signal @2 line 47 3. ByteArray(Object)>>_error:args: @15 line 11 4. ByteArray(Object)>>_errorIndexOutOfRange: @2 line 6 5. ByteArray>>at: @4 line 13 6. SocketStream>>next @13 line 10 7. ZnLineReader>>processNext @5 line 4 8. ZnLineReader>>nextLine @3 line 3 9. ZnRequestLine>>readFrom: @3 line 3 10. ZnRequestLine class>>readFrom: @3 line 3 11. ZnRequest>>readHeaderFrom: @2 line 2 12. ZnRequest(ZnMessage)>>readBinaryFrom: @2 line 3 13. ZnRequest class(ZnMessage class)>>readBinaryFrom: @3 line 3 14. [] in ExecBlock1(ZnZincServerAdaptor)>>configureServerForBinaryReading @2 line 4 15. [] in ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>readRequest: @3 line 6 16. [] in ExecBlock0(ZnSingleThreadedServer)>>withMaximumEntitySizeDo: @2 line 6 17. [] in ZnMaximumEntitySize class(DynamicVariable class)>>value:during: @3 line 9 18. ZnMaximumEntitySize class(ExecBlock)>>ensure: @2 line 12 19. ZnMaximumEntitySize class(DynamicVariable class)>>value:during: @6 line 10 20. ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>withMaximumEntitySizeDo: @5 line 5 21. ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>readRequest: @2 line 6 22. [] in ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>readRequestSafely: @2 line 4 23. ZnManagingMultiThreadedServer(ExecBlock)>>on:do: @3 line 42 24. ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>readRequestSafely: @3 line 5 25. ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeOneRequestResponseOn: @2 line 7 26. [] in ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeRequestResponseLoopOn: @2 line 10 27. [] in ZnCurrentServer class(DynamicVariable class)>>value:during: @3 line 9 28. ZnCurrentServer class(ExecBlock)>>ensure: @2 line 12 29. ZnCurrentServer class(DynamicVariable class)>>value:during: @6 line 10 30. ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeRequestResponseLoopOn: @4 line 8 31. [] in ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: @2 line 9 32. ZnManagingMultiThreadedServer(ExecBlock)>>ensure: @2 line 12 33. [] in ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: @2 line 10 34. [] in ExecBlock0(ExecBlock)>>ifCurtailed: @2 line 6 35. ExecBlock0(ExecBlock)>>ensure: @2 line 12 36. ZnManagingMultiThreadedServer(ExecBlock)>>ifCurtailed: @3 line 8 37. [] in ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: @2 line 11 38. GsProcess>>_start @7 line 16 39. UndefinedObject(GsNMethod class)>>_gsReturnToC @1 line 1 From: "Dale K. Henrichs" <[hidden email]> _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
Johan, I've collected more info on two processes[1] ... first the one that gets an "index out of range error" and the other process that was sitting in the process list waiting for more input from the socket, but the socket was closed (at leasted #isConnected is false). I hit the out of memory earlier but didn't collect any meaningful info ... I'm assuming that the multi-threaded boy is hitting an error somewhere and spinning away like crazy ... Need to catch the error in progress and perhaps find out from Sven if there is supposed to be a "deadman switch" on this guy to avoid infinite errors .. Dale [1] https://github.com/dalehenrich/todeHome/blob/master/home/seaside3.1.0/stacks From: "Dale K. Henrichs" <[hidden email]> _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
Just a touch more info: ZnMultiThreadedServer>>listenLoop sents #repeat to a block ... if `serverSocket isValid` returns false, the loop is exited but it launches another #listenLoop ... if we get consistent invalid serverSockets, this guy will spin out of control.. Inside the block, ZnMultiThreadedServer>>serveConnectionsOn: is called (right before the repeat). In ZnMultiThreadedServer>>serveConnectionsOn: if #waitForAcceptFor: returns a non-nil response, we are off to the races with a forked process and a return to ZnMultiThreadedServer>>listenLoop which itself spins up a new call to ZnMultiThreadedServer>>serveConnectionsOn: .... Soooo, it looks like SocketStreamSocket>>waitForAcceptFor: had better function correctly .... and serverSocket had better not go into terminal invalidity... Some halts or breakpoints in and around these particular methods should yield some good information ... or point us in a more productive direction. Dale From: "Dale K. Henrichs" <[hidden email]> _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
Hi Dale,
Thanks for these pointers. I will look into it asap. One thing I noticed is that after I merged the latest changes into Seaside3.1 [1], a lot less processes were spawning and it took more time before the system ran out of memory. But it might just have been a coincidence... Johan [1] https://github.com/glassdb/Seaside31/pull/5 On 19 Dec 2013, at 06:38, Dale K. Henrichs <[hidden email]> wrote: > Just a touch more info: > > ZnMultiThreadedServer>>listenLoop sents #repeat to a block ... > > if `serverSocket isValid` returns false, the loop is exited but it launches another > #listenLoop ... if we get consistent invalid serverSockets, this guy will spin out of control.. > > Inside the block, ZnMultiThreadedServer>>serveConnectionsOn: is called (right before the repeat). > > In ZnMultiThreadedServer>>serveConnectionsOn: if #waitForAcceptFor: returns a non-nil response, we are off to the races with a forked process and a return to ZnMultiThreadedServer>>listenLoop which itself spins up a new call to ZnMultiThreadedServer>>serveConnectionsOn: .... > > Soooo, it looks like SocketStreamSocket>>waitForAcceptFor: had better function correctly .... and serverSocket had better not go into terminal invalidity... > > Some halts or breakpoints in and around these particular methods should yield some good information ... or point us in a more productive direction. > > Dale > > From: "Dale K. Henrichs" <[hidden email]> > To: "Johan Brichau" <[hidden email]> > Cc: [hidden email] > Sent: Wednesday, December 18, 2013 8:59:47 PM > Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) > > Johan, > > I've collected more info on two processes[1] ... first the one that gets an "index out of range error" and the other process that was sitting in the process list waiting for more input from the socket, but the socket was closed (at leasted #isConnected is false). > > I hit the out of memory earlier but didn't collect any meaningful info ... I'm assuming that the multi-threaded boy is hitting an error somewhere and spinning away like crazy ... > > Need to catch the error in progress and perhaps find out from Sven if there is supposed to be a "deadman switch" on this guy to avoid infinite errors .. > > Dale > > [1] https://github.com/dalehenrich/todeHome/blob/master/home/seaside3.1.0/stacks > > From: "Dale K. Henrichs" <[hidden email]> > To: "Johan Brichau" <[hidden email]> > Cc: [hidden email] > Sent: Wednesday, December 18, 2013 6:39:13 PM > Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) > > Here's the stack: > > 1. OffsetError(AbstractException)>>_signalWith: @5 line 25 > 2. OffsetError(AbstractException)>>signal @2 line 47 > 3. ByteArray(Object)>>_error:args: @15 line 11 > 4. ByteArray(Object)>>_errorIndexOutOfRange: @2 line 6 > 5. ByteArray>>at: @4 line 13 > 6. SocketStream>>next @13 line 10 > 7. ZnLineReader>>processNext @5 line 4 > 8. ZnLineReader>>nextLine @3 line 3 > 9. ZnRequestLine>>readFrom: @3 line 3 > 10. ZnRequestLine class>>readFrom: @3 line 3 > 11. ZnRequest>>readHeaderFrom: @2 line 2 > 12. ZnRequest(ZnMessage)>>readBinaryFrom: @2 line 3 > 13. ZnRequest class(ZnMessage class)>>readBinaryFrom: @3 line 3 > 14. [] in ExecBlock1(ZnZincServerAdaptor)>>configureServerForBinaryReading @2 line 4 > 15. [] in ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>readRequest: @3 line 6 > 16. [] in ExecBlock0(ZnSingleThreadedServer)>>withMaximumEntitySizeDo: @2 line 6 > 17. [] in ZnMaximumEntitySize class(DynamicVariable class)>>value:during: @3 line 9 > 18. ZnMaximumEntitySize class(ExecBlock)>>ensure: @2 line 12 > 19. ZnMaximumEntitySize class(DynamicVariable class)>>value:during: @6 line 10 > 20. ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>withMaximumEntitySizeDo: @5 line 5 > 21. ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>readRequest: @2 line 6 > 22. [] in ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>readRequestSafely: @2 line 4 > 23. ZnManagingMultiThreadedServer(ExecBlock)>>on:do: @3 line 42 > 24. ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>readRequestSafely: @3 line 5 > 25. ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeOneRequestResponseOn: @2 line 7 > 26. [] in ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeRequestResponseLoopOn: @2 line 10 > 27. [] in ZnCurrentServer class(DynamicVariable class)>>value:during: @3 line 9 > 28. ZnCurrentServer class(ExecBlock)>>ensure: @2 line 12 > 29. ZnCurrentServer class(DynamicVariable class)>>value:during: @6 line 10 > 30. ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeRequestResponseLoopOn: @4 line 8 > 31. [] in ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: @2 line 9 > 32. ZnManagingMultiThreadedServer(ExecBlock)>>ensure: @2 line 12 > 33. [] in ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: @2 line 10 > 34. [] in ExecBlock0(ExecBlock)>>ifCurtailed: @2 line 6 > 35. ExecBlock0(ExecBlock)>>ensure: @2 line 12 > 36. ZnManagingMultiThreadedServer(ExecBlock)>>ifCurtailed: @3 line 8 > 37. [] in ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: @2 line 11 > 38. GsProcess>>_start @7 line 16 > 39. UndefinedObject(GsNMethod class)>>_gsReturnToC @1 line 1 > > > From: "Dale K. Henrichs" <[hidden email]> > To: "Johan Brichau" <[hidden email]> > Cc: [hidden email] > Sent: Wednesday, December 18, 2013 6:26:01 PM > Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) > > Johan, > > I've finally started monkeying with this and the first thing I ran into (running this under tODE and trying to duplicate the GemTools error condition) was an error involving SocketStream and an apparent buffer overflow ... it looks to me that SocketStream>>next has no logic to refill the buffer once it is full ... at least I got a stack in which the ByteArray buffer was accessed at slot 4097 (1 larger than the buffer): > > There are three other processes floating around as well ... I'll keep digging ... > > Dale > > > From: "Johan Brichau" <[hidden email]> > To: [hidden email] > Sent: Sunday, December 15, 2013 1:20:20 PM > Subject: [Glass] problem update: Zinc server with Seaside (3.1) > > Hi all, > > I’m working on the Zinc server adaptor for Seaside 3.1 but I got stuck. > > Launching the following from Gemtools brings up the server and makes it respond to requests nicely. Trying to do the same from a topaz session though, immediately throws an out-of-mem exception. > > WAGsZincAdaptor startOn: 8383 > > If you break the blocking call in Gemtools and run the line above again, you also run out-of-mem after a couple of times (but only after _also_ connecting to the server from your web browser). > First question about this is why that does not happen in gem tools? I tried turning the auto commit off but that did not change anything. > > Following advice in [1], I started investigating from gemtools and got this for the byte sizes in temp obj memory on 75% full: > > #'ByteArray'->22153440, #'GsMethodDictionary'->3073088, #'GsMethodLookupCache'->2224064, #'GsProcess'->562224, #'String'->415504, #'Array'->340168, #'SocketStream'->322680, #'ExecBlock'->201664, #'VariableContext'->175552, #'SocketStreamSocket'->150696, …0 > > The process browser shows over 100 threads displayed as follows: > > (priority=25) ready [oop= …. ] > > I guess this is where Dale started having nightmares. The stack traces show the out-of-mem always happens in the ZnNetworkingUtils>>setSocketStreamParameters: method. > > All of this is using the current Zinc version on github for gemstone 3.1 [2] and my version of the Seaside 3.1 port [3] > To try, load Seaside from my repo, load the ‘Zinc-Seaside’ group from the baseline and hit the above. > > Any ideas are welcome > > Johan > > [1] http://gemstonesoup.wordpress.com/2008/11/19/gemstone-101-managing-out-of-memory-situations/ > [2] https://github.com/glassdb/zinc/tree/gemstone3.1 > [3] https://github.com/jbrichau/Seaside31 > > _______________________________________________ > Glass mailing list > [hidden email] > http://lists.gemtalksystems.com/mailman/listinfo/glass _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
Johan ... I'm also digging more into this ... actually trying to get the logging api working to dump events to the object log ... I'm suspicious about a couple of things, but nothing showing up ...
Dale ----- Original Message ----- | From: "Johan Brichau" <[hidden email]> | To: "Dale K. Henrichs" <[hidden email]> | Cc: [hidden email] | Sent: Thursday, December 19, 2013 7:46:37 AM | Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) | | Hi Dale, | | Thanks for these pointers. | I will look into it asap. | | One thing I noticed is that after I merged the latest changes into | Seaside3.1 [1], a lot less processes were spawning and it took more | time before the system ran out of memory. | But it might just have been a coincidence... | | Johan | | [1] https://github.com/glassdb/Seaside31/pull/5 | | On 19 Dec 2013, at 06:38, Dale K. Henrichs | <[hidden email]> wrote: | | > Just a touch more info: | > | > ZnMultiThreadedServer>>listenLoop sents #repeat to a block ... | > | > if `serverSocket isValid` returns false, the loop is exited but it | > launches another | > #listenLoop ... if we get consistent invalid serverSockets, this | > guy will spin out of control.. | > | > Inside the block, ZnMultiThreadedServer>>serveConnectionsOn: is | > called (right before the repeat). | > | > In ZnMultiThreadedServer>>serveConnectionsOn: if #waitForAcceptFor: | > returns a non-nil response, we are off to the races with a forked | > process and a return to ZnMultiThreadedServer>>listenLoop which | > itself spins up a new call to | > ZnMultiThreadedServer>>serveConnectionsOn: .... | > | > Soooo, it looks like SocketStreamSocket>>waitForAcceptFor: had | > better function correctly .... and serverSocket had better not go | > into terminal invalidity... | > | > Some halts or breakpoints in and around these particular methods | > should yield some good information ... or point us in a more | > productive direction. | > | > Dale | > | > From: "Dale K. Henrichs" <[hidden email]> | > To: "Johan Brichau" <[hidden email]> | > Cc: [hidden email] | > Sent: Wednesday, December 18, 2013 8:59:47 PM | > Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) | > | > Johan, | > | > I've collected more info on two processes[1] ... first the one that | > gets an "index out of range error" and the other process that was | > sitting in the process list waiting for more input from the | > socket, but the socket was closed (at leasted #isConnected is | > false). | > | > I hit the out of memory earlier but didn't collect any meaningful | > info ... I'm assuming that the multi-threaded boy is hitting an | > error somewhere and spinning away like crazy ... | > | > Need to catch the error in progress and perhaps find out from Sven | > if there is supposed to be a "deadman switch" on this guy to avoid | > infinite errors .. | > | > Dale | > | > [1] | > https://github.com/dalehenrich/todeHome/blob/master/home/seaside3.1.0/stacks | > | > From: "Dale K. Henrichs" <[hidden email]> | > To: "Johan Brichau" <[hidden email]> | > Cc: [hidden email] | > Sent: Wednesday, December 18, 2013 6:39:13 PM | > Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) | > | > Here's the stack: | > | > 1. OffsetError(AbstractException)>>_signalWith: @5 line 25 | > 2. OffsetError(AbstractException)>>signal @2 line 47 | > 3. ByteArray(Object)>>_error:args: @15 line 11 | > 4. ByteArray(Object)>>_errorIndexOutOfRange: @2 line 6 | > 5. ByteArray>>at: @4 line 13 | > 6. SocketStream>>next @13 line 10 | > 7. ZnLineReader>>processNext @5 line 4 | > 8. ZnLineReader>>nextLine @3 line 3 | > 9. ZnRequestLine>>readFrom: @3 line 3 | > 10. ZnRequestLine class>>readFrom: @3 line 3 | > 11. ZnRequest>>readHeaderFrom: @2 line 2 | > 12. ZnRequest(ZnMessage)>>readBinaryFrom: @2 line 3 | > 13. ZnRequest class(ZnMessage class)>>readBinaryFrom: @3 line 3 | > 14. [] in | > ExecBlock1(ZnZincServerAdaptor)>>configureServerForBinaryReading | > @2 line 4 | > 15. [] in | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>readRequest: | > @3 line 6 | > 16. [] in | > ExecBlock0(ZnSingleThreadedServer)>>withMaximumEntitySizeDo: @2 | > line 6 | > 17. [] in ZnMaximumEntitySize class(DynamicVariable | > class)>>value:during: @3 line 9 | > 18. ZnMaximumEntitySize class(ExecBlock)>>ensure: @2 line 12 | > 19. ZnMaximumEntitySize class(DynamicVariable class)>>value:during: | > @6 line 10 | > 20. | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>withMaximumEntitySizeDo: | > @5 line 5 | > 21. | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>readRequest: | > @2 line 6 | > 22. [] in | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>readRequestSafely: | > @2 line 4 | > 23. ZnManagingMultiThreadedServer(ExecBlock)>>on:do: @3 line 42 | > 24. | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>readRequestSafely: | > @3 line 5 | > 25. | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeOneRequestResponseOn: | > @2 line 7 | > 26. [] in | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeRequestResponseLoopOn: | > @2 line 10 | > 27. [] in ZnCurrentServer class(DynamicVariable | > class)>>value:during: @3 line 9 | > 28. ZnCurrentServer class(ExecBlock)>>ensure: @2 line 12 | > 29. ZnCurrentServer class(DynamicVariable class)>>value:during: @6 | > line 10 | > 30. | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeRequestResponseLoopOn: | > @4 line 8 | > 31. [] in | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: | > @2 line 9 | > 32. ZnManagingMultiThreadedServer(ExecBlock)>>ensure: @2 line 12 | > 33. [] in | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: | > @2 line 10 | > 34. [] in ExecBlock0(ExecBlock)>>ifCurtailed: @2 line 6 | > 35. ExecBlock0(ExecBlock)>>ensure: @2 line 12 | > 36. ZnManagingMultiThreadedServer(ExecBlock)>>ifCurtailed: @3 line | > 8 | > 37. [] in | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: | > @2 line 11 | > 38. GsProcess>>_start @7 line 16 | > 39. UndefinedObject(GsNMethod class)>>_gsReturnToC @1 line 1 | > | > | > From: "Dale K. Henrichs" <[hidden email]> | > To: "Johan Brichau" <[hidden email]> | > Cc: [hidden email] | > Sent: Wednesday, December 18, 2013 6:26:01 PM | > Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) | > | > Johan, | > | > I've finally started monkeying with this and the first thing I ran | > into (running this under tODE and trying to duplicate the GemTools | > error condition) was an error involving SocketStream and an | > apparent buffer overflow ... it looks to me that | > SocketStream>>next has no logic to refill the buffer once it is | > full ... at least I got a stack in which the ByteArray buffer was | > accessed at slot 4097 (1 larger than the buffer): | > | > There are three other processes floating around as well ... I'll | > keep digging ... | > | > Dale | > | > | > From: "Johan Brichau" <[hidden email]> | > To: [hidden email] | > Sent: Sunday, December 15, 2013 1:20:20 PM | > Subject: [Glass] problem update: Zinc server with Seaside (3.1) | > | > Hi all, | > | > I’m working on the Zinc server adaptor for Seaside 3.1 but I got | > stuck. | > | > Launching the following from Gemtools brings up the server and | > makes it respond to requests nicely. Trying to do the same from a | > topaz session though, immediately throws an out-of-mem exception. | > | > WAGsZincAdaptor startOn: 8383 | > | > If you break the blocking call in Gemtools and run the line above | > again, you also run out-of-mem after a couple of times (but only | > after _also_ connecting to the server from your web browser). | > First question about this is why that does not happen in gem tools? | > I tried turning the auto commit off but that did not change | > anything. | > | > Following advice in [1], I started investigating from gemtools and | > got this for the byte sizes in temp obj memory on 75% full: | > | > #'ByteArray'->22153440, #'GsMethodDictionary'->3073088, | > #'GsMethodLookupCache'->2224064, #'GsProcess'->562224, | > #'String'->415504, #'Array'->340168, #'SocketStream'->322680, | > #'ExecBlock'->201664, #'VariableContext'->175552, | > #'SocketStreamSocket'->150696, …0 | > | > The process browser shows over 100 threads displayed as follows: | > | > (priority=25) ready [oop= …. ] | > | > I guess this is where Dale started having nightmares. The stack | > traces show the out-of-mem always happens in the | > ZnNetworkingUtils>>setSocketStreamParameters: method. | > | > All of this is using the current Zinc version on github for | > gemstone 3.1 [2] and my version of the Seaside 3.1 port [3] | > To try, load Seaside from my repo, load the ‘Zinc-Seaside’ group | > from the baseline and hit the above. | > | > Any ideas are welcome | > | > Johan | > | > [1] | > http://gemstonesoup.wordpress.com/2008/11/19/gemstone-101-managing-out-of-memory-situations/ | > [2] https://github.com/glassdb/zinc/tree/gemstone3.1 | > [3] https://github.com/jbrichau/Seaside31 | > | > _______________________________________________ | > Glass mailing list | > [hidden email] | > http://lists.gemtalksystems.com/mailman/listinfo/glass | | _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
In reply to this post by Johan Brichau-3
Johan,
I've narrowed it down to a problem in SocketStream>>atEnd. Basically SocketStream>>atEnd returns <true> if the socket isConnected (whether or not data has been received) which them causes SocketStream>>next to think that there is data available after the SocketStream>>receiveData call, so lastRead is incremented and we are off to the races. There is a bit more gorey detail here[1]. At a higher level (see logging in [1]) two socket connections are accepted when only one should be (haven't characterized that yet) and the second socket connection has no data and is left open which leads to the SocketStream>>atEnd problem... If you don't see this kind of behavior in Pharo, then perhaps there are differences in how SocketStream>>receiveData ... if SocketStream>>receiveData blocks in Pharo then that would explain why this doesn't happen in Pharo ... I may or may not get farther along this evening, but I think we're getting closer ... Dale [1] https://github.com/dalehenrich/todeHome/blob/master/home/seaside3.1.0/thoughts ----- Original Message ----- | From: "Johan Brichau" <[hidden email]> | To: "Dale K. Henrichs" <[hidden email]> | Cc: [hidden email] | Sent: Thursday, December 19, 2013 7:46:37 AM | Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) | | Hi Dale, | | Thanks for these pointers. | I will look into it asap. | | One thing I noticed is that after I merged the latest changes into | Seaside3.1 [1], a lot less processes were spawning and it took more | time before the system ran out of memory. | But it might just have been a coincidence... | | Johan | | [1] https://github.com/glassdb/Seaside31/pull/5 | | On 19 Dec 2013, at 06:38, Dale K. Henrichs | <[hidden email]> wrote: | | > Just a touch more info: | > | > ZnMultiThreadedServer>>listenLoop sents #repeat to a block ... | > | > if `serverSocket isValid` returns false, the loop is exited but it | > launches another | > #listenLoop ... if we get consistent invalid serverSockets, this | > guy will spin out of control.. | > | > Inside the block, ZnMultiThreadedServer>>serveConnectionsOn: is | > called (right before the repeat). | > | > In ZnMultiThreadedServer>>serveConnectionsOn: if #waitForAcceptFor: | > returns a non-nil response, we are off to the races with a forked | > process and a return to ZnMultiThreadedServer>>listenLoop which | > itself spins up a new call to | > ZnMultiThreadedServer>>serveConnectionsOn: .... | > | > Soooo, it looks like SocketStreamSocket>>waitForAcceptFor: had | > better function correctly .... and serverSocket had better not go | > into terminal invalidity... | > | > Some halts or breakpoints in and around these particular methods | > should yield some good information ... or point us in a more | > productive direction. | > | > Dale | > | > From: "Dale K. Henrichs" <[hidden email]> | > To: "Johan Brichau" <[hidden email]> | > Cc: [hidden email] | > Sent: Wednesday, December 18, 2013 8:59:47 PM | > Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) | > | > Johan, | > | > I've collected more info on two processes[1] ... first the one that | > gets an "index out of range error" and the other process that was | > sitting in the process list waiting for more input from the | > socket, but the socket was closed (at leasted #isConnected is | > false). | > | > I hit the out of memory earlier but didn't collect any meaningful | > info ... I'm assuming that the multi-threaded boy is hitting an | > error somewhere and spinning away like crazy ... | > | > Need to catch the error in progress and perhaps find out from Sven | > if there is supposed to be a "deadman switch" on this guy to avoid | > infinite errors .. | > | > Dale | > | > [1] | > https://github.com/dalehenrich/todeHome/blob/master/home/seaside3.1.0/stacks | > | > From: "Dale K. Henrichs" <[hidden email]> | > To: "Johan Brichau" <[hidden email]> | > Cc: [hidden email] | > Sent: Wednesday, December 18, 2013 6:39:13 PM | > Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) | > | > Here's the stack: | > | > 1. OffsetError(AbstractException)>>_signalWith: @5 line 25 | > 2. OffsetError(AbstractException)>>signal @2 line 47 | > 3. ByteArray(Object)>>_error:args: @15 line 11 | > 4. ByteArray(Object)>>_errorIndexOutOfRange: @2 line 6 | > 5. ByteArray>>at: @4 line 13 | > 6. SocketStream>>next @13 line 10 | > 7. ZnLineReader>>processNext @5 line 4 | > 8. ZnLineReader>>nextLine @3 line 3 | > 9. ZnRequestLine>>readFrom: @3 line 3 | > 10. ZnRequestLine class>>readFrom: @3 line 3 | > 11. ZnRequest>>readHeaderFrom: @2 line 2 | > 12. ZnRequest(ZnMessage)>>readBinaryFrom: @2 line 3 | > 13. ZnRequest class(ZnMessage class)>>readBinaryFrom: @3 line 3 | > 14. [] in | > ExecBlock1(ZnZincServerAdaptor)>>configureServerForBinaryReading | > @2 line 4 | > 15. [] in | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>readRequest: | > @3 line 6 | > 16. [] in | > ExecBlock0(ZnSingleThreadedServer)>>withMaximumEntitySizeDo: @2 | > line 6 | > 17. [] in ZnMaximumEntitySize class(DynamicVariable | > class)>>value:during: @3 line 9 | > 18. ZnMaximumEntitySize class(ExecBlock)>>ensure: @2 line 12 | > 19. ZnMaximumEntitySize class(DynamicVariable class)>>value:during: | > @6 line 10 | > 20. | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>withMaximumEntitySizeDo: | > @5 line 5 | > 21. | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>readRequest: | > @2 line 6 | > 22. [] in | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>readRequestSafely: | > @2 line 4 | > 23. ZnManagingMultiThreadedServer(ExecBlock)>>on:do: @3 line 42 | > 24. | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>readRequestSafely: | > @3 line 5 | > 25. | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeOneRequestResponseOn: | > @2 line 7 | > 26. [] in | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeRequestResponseLoopOn: | > @2 line 10 | > 27. [] in ZnCurrentServer class(DynamicVariable | > class)>>value:during: @3 line 9 | > 28. ZnCurrentServer class(ExecBlock)>>ensure: @2 line 12 | > 29. ZnCurrentServer class(DynamicVariable class)>>value:during: @6 | > line 10 | > 30. | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeRequestResponseLoopOn: | > @4 line 8 | > 31. [] in | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: | > @2 line 9 | > 32. ZnManagingMultiThreadedServer(ExecBlock)>>ensure: @2 line 12 | > 33. [] in | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: | > @2 line 10 | > 34. [] in ExecBlock0(ExecBlock)>>ifCurtailed: @2 line 6 | > 35. ExecBlock0(ExecBlock)>>ensure: @2 line 12 | > 36. ZnManagingMultiThreadedServer(ExecBlock)>>ifCurtailed: @3 line | > 8 | > 37. [] in | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: | > @2 line 11 | > 38. GsProcess>>_start @7 line 16 | > 39. UndefinedObject(GsNMethod class)>>_gsReturnToC @1 line 1 | > | > | > From: "Dale K. Henrichs" <[hidden email]> | > To: "Johan Brichau" <[hidden email]> | > Cc: [hidden email] | > Sent: Wednesday, December 18, 2013 6:26:01 PM | > Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) | > | > Johan, | > | > I've finally started monkeying with this and the first thing I ran | > into (running this under tODE and trying to duplicate the GemTools | > error condition) was an error involving SocketStream and an | > apparent buffer overflow ... it looks to me that | > SocketStream>>next has no logic to refill the buffer once it is | > full ... at least I got a stack in which the ByteArray buffer was | > accessed at slot 4097 (1 larger than the buffer): | > | > There are three other processes floating around as well ... I'll | > keep digging ... | > | > Dale | > | > | > From: "Johan Brichau" <[hidden email]> | > To: [hidden email] | > Sent: Sunday, December 15, 2013 1:20:20 PM | > Subject: [Glass] problem update: Zinc server with Seaside (3.1) | > | > Hi all, | > | > I’m working on the Zinc server adaptor for Seaside 3.1 but I got | > stuck. | > | > Launching the following from Gemtools brings up the server and | > makes it respond to requests nicely. Trying to do the same from a | > topaz session though, immediately throws an out-of-mem exception. | > | > WAGsZincAdaptor startOn: 8383 | > | > If you break the blocking call in Gemtools and run the line above | > again, you also run out-of-mem after a couple of times (but only | > after _also_ connecting to the server from your web browser). | > First question about this is why that does not happen in gem tools? | > I tried turning the auto commit off but that did not change | > anything. | > | > Following advice in [1], I started investigating from gemtools and | > got this for the byte sizes in temp obj memory on 75% full: | > | > #'ByteArray'->22153440, #'GsMethodDictionary'->3073088, | > #'GsMethodLookupCache'->2224064, #'GsProcess'->562224, | > #'String'->415504, #'Array'->340168, #'SocketStream'->322680, | > #'ExecBlock'->201664, #'VariableContext'->175552, | > #'SocketStreamSocket'->150696, …0 | > | > The process browser shows over 100 threads displayed as follows: | > | > (priority=25) ready [oop= …. ] | > | > I guess this is where Dale started having nightmares. The stack | > traces show the out-of-mem always happens in the | > ZnNetworkingUtils>>setSocketStreamParameters: method. | > | > All of this is using the current Zinc version on github for | > gemstone 3.1 [2] and my version of the Seaside 3.1 port [3] | > To try, load Seaside from my repo, load the ‘Zinc-Seaside’ group | > from the baseline and hit the above. | > | > Any ideas are welcome | > | > Johan | > | > [1] | > http://gemstonesoup.wordpress.com/2008/11/19/gemstone-101-managing-out-of-memory-situations/ | > [2] https://github.com/glassdb/zinc/tree/gemstone3.1 | > [3] https://github.com/jbrichau/Seaside31 | > | > _______________________________________________ | > Glass mailing list | > [hidden email] | > http://lists.gemtalksystems.com/mailman/listinfo/glass | | _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
In reply to this post by Johan Brichau-3
I think I've got things working now ... I will push a branch of glassdb/zinc with my changes ... I need to clean things up a bit and test with topaz ... right now its running in tODE ... I'm still getting two connections per request, but it's much more stable now ...
Dale ----- Original Message ----- | From: "Johan Brichau" <[hidden email]> | To: "Dale K. Henrichs" <[hidden email]> | Cc: [hidden email] | Sent: Thursday, December 19, 2013 7:46:37 AM | Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) | | Hi Dale, | | Thanks for these pointers. | I will look into it asap. | | One thing I noticed is that after I merged the latest changes into | Seaside3.1 [1], a lot less processes were spawning and it took more | time before the system ran out of memory. | But it might just have been a coincidence... | | Johan | | [1] https://github.com/glassdb/Seaside31/pull/5 | | On 19 Dec 2013, at 06:38, Dale K. Henrichs | <[hidden email]> wrote: | | > Just a touch more info: | > | > ZnMultiThreadedServer>>listenLoop sents #repeat to a block ... | > | > if `serverSocket isValid` returns false, the loop is exited but it | > launches another | > #listenLoop ... if we get consistent invalid serverSockets, this | > guy will spin out of control.. | > | > Inside the block, ZnMultiThreadedServer>>serveConnectionsOn: is | > called (right before the repeat). | > | > In ZnMultiThreadedServer>>serveConnectionsOn: if #waitForAcceptFor: | > returns a non-nil response, we are off to the races with a forked | > process and a return to ZnMultiThreadedServer>>listenLoop which | > itself spins up a new call to | > ZnMultiThreadedServer>>serveConnectionsOn: .... | > | > Soooo, it looks like SocketStreamSocket>>waitForAcceptFor: had | > better function correctly .... and serverSocket had better not go | > into terminal invalidity... | > | > Some halts or breakpoints in and around these particular methods | > should yield some good information ... or point us in a more | > productive direction. | > | > Dale | > | > From: "Dale K. Henrichs" <[hidden email]> | > To: "Johan Brichau" <[hidden email]> | > Cc: [hidden email] | > Sent: Wednesday, December 18, 2013 8:59:47 PM | > Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) | > | > Johan, | > | > I've collected more info on two processes[1] ... first the one that | > gets an "index out of range error" and the other process that was | > sitting in the process list waiting for more input from the | > socket, but the socket was closed (at leasted #isConnected is | > false). | > | > I hit the out of memory earlier but didn't collect any meaningful | > info ... I'm assuming that the multi-threaded boy is hitting an | > error somewhere and spinning away like crazy ... | > | > Need to catch the error in progress and perhaps find out from Sven | > if there is supposed to be a "deadman switch" on this guy to avoid | > infinite errors .. | > | > Dale | > | > [1] | > https://github.com/dalehenrich/todeHome/blob/master/home/seaside3.1.0/stacks | > | > From: "Dale K. Henrichs" <[hidden email]> | > To: "Johan Brichau" <[hidden email]> | > Cc: [hidden email] | > Sent: Wednesday, December 18, 2013 6:39:13 PM | > Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) | > | > Here's the stack: | > | > 1. OffsetError(AbstractException)>>_signalWith: @5 line 25 | > 2. OffsetError(AbstractException)>>signal @2 line 47 | > 3. ByteArray(Object)>>_error:args: @15 line 11 | > 4. ByteArray(Object)>>_errorIndexOutOfRange: @2 line 6 | > 5. ByteArray>>at: @4 line 13 | > 6. SocketStream>>next @13 line 10 | > 7. ZnLineReader>>processNext @5 line 4 | > 8. ZnLineReader>>nextLine @3 line 3 | > 9. ZnRequestLine>>readFrom: @3 line 3 | > 10. ZnRequestLine class>>readFrom: @3 line 3 | > 11. ZnRequest>>readHeaderFrom: @2 line 2 | > 12. ZnRequest(ZnMessage)>>readBinaryFrom: @2 line 3 | > 13. ZnRequest class(ZnMessage class)>>readBinaryFrom: @3 line 3 | > 14. [] in | > ExecBlock1(ZnZincServerAdaptor)>>configureServerForBinaryReading | > @2 line 4 | > 15. [] in | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>readRequest: | > @3 line 6 | > 16. [] in | > ExecBlock0(ZnSingleThreadedServer)>>withMaximumEntitySizeDo: @2 | > line 6 | > 17. [] in ZnMaximumEntitySize class(DynamicVariable | > class)>>value:during: @3 line 9 | > 18. ZnMaximumEntitySize class(ExecBlock)>>ensure: @2 line 12 | > 19. ZnMaximumEntitySize class(DynamicVariable class)>>value:during: | > @6 line 10 | > 20. | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>withMaximumEntitySizeDo: | > @5 line 5 | > 21. | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>readRequest: | > @2 line 6 | > 22. [] in | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>readRequestSafely: | > @2 line 4 | > 23. ZnManagingMultiThreadedServer(ExecBlock)>>on:do: @3 line 42 | > 24. | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>readRequestSafely: | > @3 line 5 | > 25. | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeOneRequestResponseOn: | > @2 line 7 | > 26. [] in | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeRequestResponseLoopOn: | > @2 line 10 | > 27. [] in ZnCurrentServer class(DynamicVariable | > class)>>value:during: @3 line 9 | > 28. ZnCurrentServer class(ExecBlock)>>ensure: @2 line 12 | > 29. ZnCurrentServer class(DynamicVariable class)>>value:during: @6 | > line 10 | > 30. | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeRequestResponseLoopOn: | > @4 line 8 | > 31. [] in | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: | > @2 line 9 | > 32. ZnManagingMultiThreadedServer(ExecBlock)>>ensure: @2 line 12 | > 33. [] in | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: | > @2 line 10 | > 34. [] in ExecBlock0(ExecBlock)>>ifCurtailed: @2 line 6 | > 35. ExecBlock0(ExecBlock)>>ensure: @2 line 12 | > 36. ZnManagingMultiThreadedServer(ExecBlock)>>ifCurtailed: @3 line | > 8 | > 37. [] in | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: | > @2 line 11 | > 38. GsProcess>>_start @7 line 16 | > 39. UndefinedObject(GsNMethod class)>>_gsReturnToC @1 line 1 | > | > | > From: "Dale K. Henrichs" <[hidden email]> | > To: "Johan Brichau" <[hidden email]> | > Cc: [hidden email] | > Sent: Wednesday, December 18, 2013 6:26:01 PM | > Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) | > | > Johan, | > | > I've finally started monkeying with this and the first thing I ran | > into (running this under tODE and trying to duplicate the GemTools | > error condition) was an error involving SocketStream and an | > apparent buffer overflow ... it looks to me that | > SocketStream>>next has no logic to refill the buffer once it is | > full ... at least I got a stack in which the ByteArray buffer was | > accessed at slot 4097 (1 larger than the buffer): | > | > There are three other processes floating around as well ... I'll | > keep digging ... | > | > Dale | > | > | > From: "Johan Brichau" <[hidden email]> | > To: [hidden email] | > Sent: Sunday, December 15, 2013 1:20:20 PM | > Subject: [Glass] problem update: Zinc server with Seaside (3.1) | > | > Hi all, | > | > I’m working on the Zinc server adaptor for Seaside 3.1 but I got | > stuck. | > | > Launching the following from Gemtools brings up the server and | > makes it respond to requests nicely. Trying to do the same from a | > topaz session though, immediately throws an out-of-mem exception. | > | > WAGsZincAdaptor startOn: 8383 | > | > If you break the blocking call in Gemtools and run the line above | > again, you also run out-of-mem after a couple of times (but only | > after _also_ connecting to the server from your web browser). | > First question about this is why that does not happen in gem tools? | > I tried turning the auto commit off but that did not change | > anything. | > | > Following advice in [1], I started investigating from gemtools and | > got this for the byte sizes in temp obj memory on 75% full: | > | > #'ByteArray'->22153440, #'GsMethodDictionary'->3073088, | > #'GsMethodLookupCache'->2224064, #'GsProcess'->562224, | > #'String'->415504, #'Array'->340168, #'SocketStream'->322680, | > #'ExecBlock'->201664, #'VariableContext'->175552, | > #'SocketStreamSocket'->150696, …0 | > | > The process browser shows over 100 threads displayed as follows: | > | > (priority=25) ready [oop= …. ] | > | > I guess this is where Dale started having nightmares. The stack | > traces show the out-of-mem always happens in the | > ZnNetworkingUtils>>setSocketStreamParameters: method. | > | > All of this is using the current Zinc version on github for | > gemstone 3.1 [2] and my version of the Seaside 3.1 port [3] | > To try, load Seaside from my repo, load the ‘Zinc-Seaside’ group | > from the baseline and hit the above. | > | > Any ideas are welcome | > | > Johan | > | > [1] | > http://gemstonesoup.wordpress.com/2008/11/19/gemstone-101-managing-out-of-memory-situations/ | > [2] https://github.com/glassdb/zinc/tree/gemstone3.1 | > [3] https://github.com/jbrichau/Seaside31 | > | > _______________________________________________ | > Glass mailing list | > [hidden email] | > http://lists.gemtalksystems.com/mailman/listinfo/glass | | _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
Johan,
I moved the package SocketStream to the repo (changed baseline) and commit on socketstream_fix ... let me know if this works for you.. Dale ----- Original Message ----- | From: "Dale K. Henrichs" <[hidden email]> | To: "Johan Brichau" <[hidden email]> | Cc: [hidden email] | Sent: Thursday, December 19, 2013 5:45:26 PM | Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) | | I think I've got things working now ... I will push a branch of | glassdb/zinc with my changes ... I need to clean things up a bit and | test with topaz ... right now its running in tODE ... I'm still | getting two connections per request, but it's much more stable now | ... | | Dale | | ----- Original Message ----- | | From: "Johan Brichau" <[hidden email]> | | To: "Dale K. Henrichs" <[hidden email]> | | Cc: [hidden email] | | Sent: Thursday, December 19, 2013 7:46:37 AM | | Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) | | | | Hi Dale, | | | | Thanks for these pointers. | | I will look into it asap. | | | | One thing I noticed is that after I merged the latest changes into | | Seaside3.1 [1], a lot less processes were spawning and it took more | | time before the system ran out of memory. | | But it might just have been a coincidence... | | | | Johan | | | | [1] https://github.com/glassdb/Seaside31/pull/5 | | | | On 19 Dec 2013, at 06:38, Dale K. Henrichs | | <[hidden email]> wrote: | | | | > Just a touch more info: | | > | | > ZnMultiThreadedServer>>listenLoop sents #repeat to a block ... | | > | | > if `serverSocket isValid` returns false, the loop is exited but | | > it | | > launches another | | > #listenLoop ... if we get consistent invalid serverSockets, this | | > guy will spin out of control.. | | > | | > Inside the block, ZnMultiThreadedServer>>serveConnectionsOn: is | | > called (right before the repeat). | | > | | > In ZnMultiThreadedServer>>serveConnectionsOn: if | | > #waitForAcceptFor: | | > returns a non-nil response, we are off to the races with a forked | | > process and a return to ZnMultiThreadedServer>>listenLoop which | | > itself spins up a new call to | | > ZnMultiThreadedServer>>serveConnectionsOn: .... | | > | | > Soooo, it looks like SocketStreamSocket>>waitForAcceptFor: had | | > better function correctly .... and serverSocket had better not go | | > into terminal invalidity... | | > | | > Some halts or breakpoints in and around these particular methods | | > should yield some good information ... or point us in a more | | > productive direction. | | > | | > Dale | | > | | > From: "Dale K. Henrichs" <[hidden email]> | | > To: "Johan Brichau" <[hidden email]> | | > Cc: [hidden email] | | > Sent: Wednesday, December 18, 2013 8:59:47 PM | | > Subject: Re: [Glass] problem update: Zinc server with Seaside | | > (3.1) | | > | | > Johan, | | > | | > I've collected more info on two processes[1] ... first the one | | > that | | > gets an "index out of range error" and the other process that was | | > sitting in the process list waiting for more input from the | | > socket, but the socket was closed (at leasted #isConnected is | | > false). | | > | | > I hit the out of memory earlier but didn't collect any meaningful | | > info ... I'm assuming that the multi-threaded boy is hitting an | | > error somewhere and spinning away like crazy ... | | > | | > Need to catch the error in progress and perhaps find out from | | > Sven | | > if there is supposed to be a "deadman switch" on this guy to | | > avoid | | > infinite errors .. | | > | | > Dale | | > | | > [1] | | > https://github.com/dalehenrich/todeHome/blob/master/home/seaside3.1.0/stacks | | > | | > From: "Dale K. Henrichs" <[hidden email]> | | > To: "Johan Brichau" <[hidden email]> | | > Cc: [hidden email] | | > Sent: Wednesday, December 18, 2013 6:39:13 PM | | > Subject: Re: [Glass] problem update: Zinc server with Seaside | | > (3.1) | | > | | > Here's the stack: | | > | | > 1. OffsetError(AbstractException)>>_signalWith: @5 line 25 | | > 2. OffsetError(AbstractException)>>signal @2 line 47 | | > 3. ByteArray(Object)>>_error:args: @15 line 11 | | > 4. ByteArray(Object)>>_errorIndexOutOfRange: @2 line 6 | | > 5. ByteArray>>at: @4 line 13 | | > 6. SocketStream>>next @13 line 10 | | > 7. ZnLineReader>>processNext @5 line 4 | | > 8. ZnLineReader>>nextLine @3 line 3 | | > 9. ZnRequestLine>>readFrom: @3 line 3 | | > 10. ZnRequestLine class>>readFrom: @3 line 3 | | > 11. ZnRequest>>readHeaderFrom: @2 line 2 | | > 12. ZnRequest(ZnMessage)>>readBinaryFrom: @2 line 3 | | > 13. ZnRequest class(ZnMessage class)>>readBinaryFrom: @3 line 3 | | > 14. [] in | | > ExecBlock1(ZnZincServerAdaptor)>>configureServerForBinaryReading | | > @2 line 4 | | > 15. [] in | | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>readRequest: | | > @3 line 6 | | > 16. [] in | | > ExecBlock0(ZnSingleThreadedServer)>>withMaximumEntitySizeDo: @2 | | > line 6 | | > 17. [] in ZnMaximumEntitySize class(DynamicVariable | | > class)>>value:during: @3 line 9 | | > 18. ZnMaximumEntitySize class(ExecBlock)>>ensure: @2 line 12 | | > 19. ZnMaximumEntitySize class(DynamicVariable | | > class)>>value:during: | | > @6 line 10 | | > 20. | | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>withMaximumEntitySizeDo: | | > @5 line 5 | | > 21. | | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>readRequest: | | > @2 line 6 | | > 22. [] in | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>readRequestSafely: | | > @2 line 4 | | > 23. ZnManagingMultiThreadedServer(ExecBlock)>>on:do: @3 line 42 | | > 24. | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>readRequestSafely: | | > @3 line 5 | | > 25. | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeOneRequestResponseOn: | | > @2 line 7 | | > 26. [] in | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeRequestResponseLoopOn: | | > @2 line 10 | | > 27. [] in ZnCurrentServer class(DynamicVariable | | > class)>>value:during: @3 line 9 | | > 28. ZnCurrentServer class(ExecBlock)>>ensure: @2 line 12 | | > 29. ZnCurrentServer class(DynamicVariable class)>>value:during: | | > @6 | | > line 10 | | > 30. | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeRequestResponseLoopOn: | | > @4 line 8 | | > 31. [] in | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: | | > @2 line 9 | | > 32. ZnManagingMultiThreadedServer(ExecBlock)>>ensure: @2 line 12 | | > 33. [] in | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: | | > @2 line 10 | | > 34. [] in ExecBlock0(ExecBlock)>>ifCurtailed: @2 line 6 | | > 35. ExecBlock0(ExecBlock)>>ensure: @2 line 12 | | > 36. ZnManagingMultiThreadedServer(ExecBlock)>>ifCurtailed: @3 | | > line | | > 8 | | > 37. [] in | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: | | > @2 line 11 | | > 38. GsProcess>>_start @7 line 16 | | > 39. UndefinedObject(GsNMethod class)>>_gsReturnToC @1 line 1 | | > | | > | | > From: "Dale K. Henrichs" <[hidden email]> | | > To: "Johan Brichau" <[hidden email]> | | > Cc: [hidden email] | | > Sent: Wednesday, December 18, 2013 6:26:01 PM | | > Subject: Re: [Glass] problem update: Zinc server with Seaside | | > (3.1) | | > | | > Johan, | | > | | > I've finally started monkeying with this and the first thing I | | > ran | | > into (running this under tODE and trying to duplicate the | | > GemTools | | > error condition) was an error involving SocketStream and an | | > apparent buffer overflow ... it looks to me that | | > SocketStream>>next has no logic to refill the buffer once it is | | > full ... at least I got a stack in which the ByteArray buffer was | | > accessed at slot 4097 (1 larger than the buffer): | | > | | > There are three other processes floating around as well ... I'll | | > keep digging ... | | > | | > Dale | | > | | > | | > From: "Johan Brichau" <[hidden email]> | | > To: [hidden email] | | > Sent: Sunday, December 15, 2013 1:20:20 PM | | > Subject: [Glass] problem update: Zinc server with Seaside (3.1) | | > | | > Hi all, | | > | | > I’m working on the Zinc server adaptor for Seaside 3.1 but I got | | > stuck. | | > | | > Launching the following from Gemtools brings up the server and | | > makes it respond to requests nicely. Trying to do the same from a | | > topaz session though, immediately throws an out-of-mem exception. | | > | | > WAGsZincAdaptor startOn: 8383 | | > | | > If you break the blocking call in Gemtools and run the line above | | > again, you also run out-of-mem after a couple of times (but only | | > after _also_ connecting to the server from your web browser). | | > First question about this is why that does not happen in gem | | > tools? | | > I tried turning the auto commit off but that did not change | | > anything. | | > | | > Following advice in [1], I started investigating from gemtools | | > and | | > got this for the byte sizes in temp obj memory on 75% full: | | > | | > #'ByteArray'->22153440, #'GsMethodDictionary'->3073088, | | > #'GsMethodLookupCache'->2224064, #'GsProcess'->562224, | | > #'String'->415504, #'Array'->340168, #'SocketStream'->322680, | | > #'ExecBlock'->201664, #'VariableContext'->175552, | | > #'SocketStreamSocket'->150696, …0 | | > | | > The process browser shows over 100 threads displayed as follows: | | > | | > (priority=25) ready [oop= …. ] | | > | | > I guess this is where Dale started having nightmares. The stack | | > traces show the out-of-mem always happens in the | | > ZnNetworkingUtils>>setSocketStreamParameters: method. | | > | | > All of this is using the current Zinc version on github for | | > gemstone 3.1 [2] and my version of the Seaside 3.1 port [3] | | > To try, load Seaside from my repo, load the ‘Zinc-Seaside’ group | | > from the baseline and hit the above. | | > | | > Any ideas are welcome | | > | | > Johan | | > | | > [1] | | > http://gemstonesoup.wordpress.com/2008/11/19/gemstone-101-managing-out-of-memory-situations/ | | > [2] https://github.com/glassdb/zinc/tree/gemstone3.1 | | > [3] https://github.com/jbrichau/Seaside31 | | > | | > _______________________________________________ | | > Glass mailing list | | > [hidden email] | | > http://lists.gemtalksystems.com/mailman/listinfo/glass | | | | | _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
Dale,
Unfortunately I’m getting the same result in topaz and less stable behaviour from gemtools (connections are dropped) Looking into it… Johan On 20 Dec 2013, at 04:31, Dale K. Henrichs <[hidden email]> wrote: > Johan, > > I moved the package SocketStream to the repo (changed baseline) and commit on socketstream_fix ... let me know if this works for you.. > > Dale > > ----- Original Message ----- > | From: "Dale K. Henrichs" <[hidden email]> > | To: "Johan Brichau" <[hidden email]> > | Cc: [hidden email] > | Sent: Thursday, December 19, 2013 5:45:26 PM > | Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) > | > | I think I've got things working now ... I will push a branch of > | glassdb/zinc with my changes ... I need to clean things up a bit and > | test with topaz ... right now its running in tODE ... I'm still > | getting two connections per request, but it's much more stable now > | ... > | > | Dale > | > | ----- Original Message ----- > | | From: "Johan Brichau" <[hidden email]> > | | To: "Dale K. Henrichs" <[hidden email]> > | | Cc: [hidden email] > | | Sent: Thursday, December 19, 2013 7:46:37 AM > | | Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) > | | > | | Hi Dale, > | | > | | Thanks for these pointers. > | | I will look into it asap. > | | > | | One thing I noticed is that after I merged the latest changes into > | | Seaside3.1 [1], a lot less processes were spawning and it took more > | | time before the system ran out of memory. > | | But it might just have been a coincidence... > | | > | | Johan > | | > | | [1] https://github.com/glassdb/Seaside31/pull/5 > | | > | | On 19 Dec 2013, at 06:38, Dale K. Henrichs > | | <[hidden email]> wrote: > | | > | | > Just a touch more info: > | | > > | | > ZnMultiThreadedServer>>listenLoop sents #repeat to a block ... > | | > > | | > if `serverSocket isValid` returns false, the loop is exited but > | | > it > | | > launches another > | | > #listenLoop ... if we get consistent invalid serverSockets, this > | | > guy will spin out of control.. > | | > > | | > Inside the block, ZnMultiThreadedServer>>serveConnectionsOn: is > | | > called (right before the repeat). > | | > > | | > In ZnMultiThreadedServer>>serveConnectionsOn: if > | | > #waitForAcceptFor: > | | > returns a non-nil response, we are off to the races with a forked > | | > process and a return to ZnMultiThreadedServer>>listenLoop which > | | > itself spins up a new call to > | | > ZnMultiThreadedServer>>serveConnectionsOn: .... > | | > > | | > Soooo, it looks like SocketStreamSocket>>waitForAcceptFor: had > | | > better function correctly .... and serverSocket had better not go > | | > into terminal invalidity... > | | > > | | > Some halts or breakpoints in and around these particular methods > | | > should yield some good information ... or point us in a more > | | > productive direction. > | | > > | | > Dale > | | > > | | > From: "Dale K. Henrichs" <[hidden email]> > | | > To: "Johan Brichau" <[hidden email]> > | | > Cc: [hidden email] > | | > Sent: Wednesday, December 18, 2013 8:59:47 PM > | | > Subject: Re: [Glass] problem update: Zinc server with Seaside > | | > (3.1) > | | > > | | > Johan, > | | > > | | > I've collected more info on two processes[1] ... first the one > | | > that > | | > gets an "index out of range error" and the other process that was > | | > sitting in the process list waiting for more input from the > | | > socket, but the socket was closed (at leasted #isConnected is > | | > false). > | | > > | | > I hit the out of memory earlier but didn't collect any meaningful > | | > info ... I'm assuming that the multi-threaded boy is hitting an > | | > error somewhere and spinning away like crazy ... > | | > > | | > Need to catch the error in progress and perhaps find out from > | | > Sven > | | > if there is supposed to be a "deadman switch" on this guy to > | | > avoid > | | > infinite errors .. > | | > > | | > Dale > | | > > | | > [1] > | | > https://github.com/dalehenrich/todeHome/blob/master/home/seaside3.1.0/stacks > | | > > | | > From: "Dale K. Henrichs" <[hidden email]> > | | > To: "Johan Brichau" <[hidden email]> > | | > Cc: [hidden email] > | | > Sent: Wednesday, December 18, 2013 6:39:13 PM > | | > Subject: Re: [Glass] problem update: Zinc server with Seaside > | | > (3.1) > | | > > | | > Here's the stack: > | | > > | | > 1. OffsetError(AbstractException)>>_signalWith: @5 line 25 > | | > 2. OffsetError(AbstractException)>>signal @2 line 47 > | | > 3. ByteArray(Object)>>_error:args: @15 line 11 > | | > 4. ByteArray(Object)>>_errorIndexOutOfRange: @2 line 6 > | | > 5. ByteArray>>at: @4 line 13 > | | > 6. SocketStream>>next @13 line 10 > | | > 7. ZnLineReader>>processNext @5 line 4 > | | > 8. ZnLineReader>>nextLine @3 line 3 > | | > 9. ZnRequestLine>>readFrom: @3 line 3 > | | > 10. ZnRequestLine class>>readFrom: @3 line 3 > | | > 11. ZnRequest>>readHeaderFrom: @2 line 2 > | | > 12. ZnRequest(ZnMessage)>>readBinaryFrom: @2 line 3 > | | > 13. ZnRequest class(ZnMessage class)>>readBinaryFrom: @3 line 3 > | | > 14. [] in > | | > ExecBlock1(ZnZincServerAdaptor)>>configureServerForBinaryReading > | | > @2 line 4 > | | > 15. [] in > | | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>readRequest: > | | > @3 line 6 > | | > 16. [] in > | | > ExecBlock0(ZnSingleThreadedServer)>>withMaximumEntitySizeDo: @2 > | | > line 6 > | | > 17. [] in ZnMaximumEntitySize class(DynamicVariable > | | > class)>>value:during: @3 line 9 > | | > 18. ZnMaximumEntitySize class(ExecBlock)>>ensure: @2 line 12 > | | > 19. ZnMaximumEntitySize class(DynamicVariable > | | > class)>>value:during: > | | > @6 line 10 > | | > 20. > | | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>withMaximumEntitySizeDo: > | | > @5 line 5 > | | > 21. > | | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>readRequest: > | | > @2 line 6 > | | > 22. [] in > | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>readRequestSafely: > | | > @2 line 4 > | | > 23. ZnManagingMultiThreadedServer(ExecBlock)>>on:do: @3 line 42 > | | > 24. > | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>readRequestSafely: > | | > @3 line 5 > | | > 25. > | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeOneRequestResponseOn: > | | > @2 line 7 > | | > 26. [] in > | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeRequestResponseLoopOn: > | | > @2 line 10 > | | > 27. [] in ZnCurrentServer class(DynamicVariable > | | > class)>>value:during: @3 line 9 > | | > 28. ZnCurrentServer class(ExecBlock)>>ensure: @2 line 12 > | | > 29. ZnCurrentServer class(DynamicVariable class)>>value:during: > | | > @6 > | | > line 10 > | | > 30. > | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeRequestResponseLoopOn: > | | > @4 line 8 > | | > 31. [] in > | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: > | | > @2 line 9 > | | > 32. ZnManagingMultiThreadedServer(ExecBlock)>>ensure: @2 line 12 > | | > 33. [] in > | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: > | | > @2 line 10 > | | > 34. [] in ExecBlock0(ExecBlock)>>ifCurtailed: @2 line 6 > | | > 35. ExecBlock0(ExecBlock)>>ensure: @2 line 12 > | | > 36. ZnManagingMultiThreadedServer(ExecBlock)>>ifCurtailed: @3 > | | > line > | | > 8 > | | > 37. [] in > | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: > | | > @2 line 11 > | | > 38. GsProcess>>_start @7 line 16 > | | > 39. UndefinedObject(GsNMethod class)>>_gsReturnToC @1 line 1 > | | > > | | > > | | > From: "Dale K. Henrichs" <[hidden email]> > | | > To: "Johan Brichau" <[hidden email]> > | | > Cc: [hidden email] > | | > Sent: Wednesday, December 18, 2013 6:26:01 PM > | | > Subject: Re: [Glass] problem update: Zinc server with Seaside > | | > (3.1) > | | > > | | > Johan, > | | > > | | > I've finally started monkeying with this and the first thing I > | | > ran > | | > into (running this under tODE and trying to duplicate the > | | > GemTools > | | > error condition) was an error involving SocketStream and an > | | > apparent buffer overflow ... it looks to me that > | | > SocketStream>>next has no logic to refill the buffer once it is > | | > full ... at least I got a stack in which the ByteArray buffer was > | | > accessed at slot 4097 (1 larger than the buffer): > | | > > | | > There are three other processes floating around as well ... I'll > | | > keep digging ... > | | > > | | > Dale > | | > > | | > > | | > From: "Johan Brichau" <[hidden email]> > | | > To: [hidden email] > | | > Sent: Sunday, December 15, 2013 1:20:20 PM > | | > Subject: [Glass] problem update: Zinc server with Seaside (3.1) > | | > > | | > Hi all, > | | > > | | > I’m working on the Zinc server adaptor for Seaside 3.1 but I got > | | > stuck. > | | > > | | > Launching the following from Gemtools brings up the server and > | | > makes it respond to requests nicely. Trying to do the same from a > | | > topaz session though, immediately throws an out-of-mem exception. > | | > > | | > WAGsZincAdaptor startOn: 8383 > | | > > | | > If you break the blocking call in Gemtools and run the line above > | | > again, you also run out-of-mem after a couple of times (but only > | | > after _also_ connecting to the server from your web browser). > | | > First question about this is why that does not happen in gem > | | > tools? > | | > I tried turning the auto commit off but that did not change > | | > anything. > | | > > | | > Following advice in [1], I started investigating from gemtools > | | > and > | | > got this for the byte sizes in temp obj memory on 75% full: > | | > > | | > #'ByteArray'->22153440, #'GsMethodDictionary'->3073088, > | | > #'GsMethodLookupCache'->2224064, #'GsProcess'->562224, > | | > #'String'->415504, #'Array'->340168, #'SocketStream'->322680, > | | > #'ExecBlock'->201664, #'VariableContext'->175552, > | | > #'SocketStreamSocket'->150696, …0 > | | > > | | > The process browser shows over 100 threads displayed as follows: > | | > > | | > (priority=25) ready [oop= …. ] > | | > > | | > I guess this is where Dale started having nightmares. The stack > | | > traces show the out-of-mem always happens in the > | | > ZnNetworkingUtils>>setSocketStreamParameters: method. > | | > > | | > All of this is using the current Zinc version on github for > | | > gemstone 3.1 [2] and my version of the Seaside 3.1 port [3] > | | > To try, load Seaside from my repo, load the ‘Zinc-Seaside’ group > | | > from the baseline and hit the above. > | | > > | | > Any ideas are welcome > | | > > | | > Johan > | | > > | | > [1] > | | > http://gemstonesoup.wordpress.com/2008/11/19/gemstone-101-managing-out-of-memory-situations/ > | | > [2] https://github.com/glassdb/zinc/tree/gemstone3.1 > | | > [3] https://github.com/jbrichau/Seaside31 > | | > > | | > _______________________________________________ > | | > Glass mailing list > | | > [hidden email] > | | > http://lists.gemtalksystems.com/mailman/listinfo/glass > | | > | | > | _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
Ha, a second try does yield better results. The start in topaz does not have the out-of-mem.
I do see a connection being dropped every other request…. but there is progress! Don’t know what made it fail similarly the first time. The only change I did between this email and the one below is a restart of the stone... Johan On 21 Dec 2013, at 10:42, Johan Brichau <[hidden email]> wrote: > Dale, > > Unfortunately I’m getting the same result in topaz and less stable behaviour from gemtools (connections are dropped) > Looking into it… > > Johan > > On 20 Dec 2013, at 04:31, Dale K. Henrichs <[hidden email]> wrote: > >> Johan, >> >> I moved the package SocketStream to the repo (changed baseline) and commit on socketstream_fix ... let me know if this works for you.. >> >> Dale >> >> ----- Original Message ----- >> | From: "Dale K. Henrichs" <[hidden email]> >> | To: "Johan Brichau" <[hidden email]> >> | Cc: [hidden email] >> | Sent: Thursday, December 19, 2013 5:45:26 PM >> | Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) >> | >> | I think I've got things working now ... I will push a branch of >> | glassdb/zinc with my changes ... I need to clean things up a bit and >> | test with topaz ... right now its running in tODE ... I'm still >> | getting two connections per request, but it's much more stable now >> | ... >> | >> | Dale >> | >> | ----- Original Message ----- >> | | From: "Johan Brichau" <[hidden email]> >> | | To: "Dale K. Henrichs" <[hidden email]> >> | | Cc: [hidden email] >> | | Sent: Thursday, December 19, 2013 7:46:37 AM >> | | Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1) >> | | >> | | Hi Dale, >> | | >> | | Thanks for these pointers. >> | | I will look into it asap. >> | | >> | | One thing I noticed is that after I merged the latest changes into >> | | Seaside3.1 [1], a lot less processes were spawning and it took more >> | | time before the system ran out of memory. >> | | But it might just have been a coincidence... >> | | >> | | Johan >> | | >> | | [1] https://github.com/glassdb/Seaside31/pull/5 >> | | >> | | On 19 Dec 2013, at 06:38, Dale K. Henrichs >> | | <[hidden email]> wrote: >> | | >> | | > Just a touch more info: >> | | > >> | | > ZnMultiThreadedServer>>listenLoop sents #repeat to a block ... >> | | > >> | | > if `serverSocket isValid` returns false, the loop is exited but >> | | > it >> | | > launches another >> | | > #listenLoop ... if we get consistent invalid serverSockets, this >> | | > guy will spin out of control.. >> | | > >> | | > Inside the block, ZnMultiThreadedServer>>serveConnectionsOn: is >> | | > called (right before the repeat). >> | | > >> | | > In ZnMultiThreadedServer>>serveConnectionsOn: if >> | | > #waitForAcceptFor: >> | | > returns a non-nil response, we are off to the races with a forked >> | | > process and a return to ZnMultiThreadedServer>>listenLoop which >> | | > itself spins up a new call to >> | | > ZnMultiThreadedServer>>serveConnectionsOn: .... >> | | > >> | | > Soooo, it looks like SocketStreamSocket>>waitForAcceptFor: had >> | | > better function correctly .... and serverSocket had better not go >> | | > into terminal invalidity... >> | | > >> | | > Some halts or breakpoints in and around these particular methods >> | | > should yield some good information ... or point us in a more >> | | > productive direction. >> | | > >> | | > Dale >> | | > >> | | > From: "Dale K. Henrichs" <[hidden email]> >> | | > To: "Johan Brichau" <[hidden email]> >> | | > Cc: [hidden email] >> | | > Sent: Wednesday, December 18, 2013 8:59:47 PM >> | | > Subject: Re: [Glass] problem update: Zinc server with Seaside >> | | > (3.1) >> | | > >> | | > Johan, >> | | > >> | | > I've collected more info on two processes[1] ... first the one >> | | > that >> | | > gets an "index out of range error" and the other process that was >> | | > sitting in the process list waiting for more input from the >> | | > socket, but the socket was closed (at leasted #isConnected is >> | | > false). >> | | > >> | | > I hit the out of memory earlier but didn't collect any meaningful >> | | > info ... I'm assuming that the multi-threaded boy is hitting an >> | | > error somewhere and spinning away like crazy ... >> | | > >> | | > Need to catch the error in progress and perhaps find out from >> | | > Sven >> | | > if there is supposed to be a "deadman switch" on this guy to >> | | > avoid >> | | > infinite errors .. >> | | > >> | | > Dale >> | | > >> | | > [1] >> | | > https://github.com/dalehenrich/todeHome/blob/master/home/seaside3.1.0/stacks >> | | > >> | | > From: "Dale K. Henrichs" <[hidden email]> >> | | > To: "Johan Brichau" <[hidden email]> >> | | > Cc: [hidden email] >> | | > Sent: Wednesday, December 18, 2013 6:39:13 PM >> | | > Subject: Re: [Glass] problem update: Zinc server with Seaside >> | | > (3.1) >> | | > >> | | > Here's the stack: >> | | > >> | | > 1. OffsetError(AbstractException)>>_signalWith: @5 line 25 >> | | > 2. OffsetError(AbstractException)>>signal @2 line 47 >> | | > 3. ByteArray(Object)>>_error:args: @15 line 11 >> | | > 4. ByteArray(Object)>>_errorIndexOutOfRange: @2 line 6 >> | | > 5. ByteArray>>at: @4 line 13 >> | | > 6. SocketStream>>next @13 line 10 >> | | > 7. ZnLineReader>>processNext @5 line 4 >> | | > 8. ZnLineReader>>nextLine @3 line 3 >> | | > 9. ZnRequestLine>>readFrom: @3 line 3 >> | | > 10. ZnRequestLine class>>readFrom: @3 line 3 >> | | > 11. ZnRequest>>readHeaderFrom: @2 line 2 >> | | > 12. ZnRequest(ZnMessage)>>readBinaryFrom: @2 line 3 >> | | > 13. ZnRequest class(ZnMessage class)>>readBinaryFrom: @3 line 3 >> | | > 14. [] in >> | | > ExecBlock1(ZnZincServerAdaptor)>>configureServerForBinaryReading >> | | > @2 line 4 >> | | > 15. [] in >> | | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>readRequest: >> | | > @3 line 6 >> | | > 16. [] in >> | | > ExecBlock0(ZnSingleThreadedServer)>>withMaximumEntitySizeDo: @2 >> | | > line 6 >> | | > 17. [] in ZnMaximumEntitySize class(DynamicVariable >> | | > class)>>value:during: @3 line 9 >> | | > 18. ZnMaximumEntitySize class(ExecBlock)>>ensure: @2 line 12 >> | | > 19. ZnMaximumEntitySize class(DynamicVariable >> | | > class)>>value:during: >> | | > @6 line 10 >> | | > 20. >> | | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>withMaximumEntitySizeDo: >> | | > @5 line 5 >> | | > 21. >> | | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>readRequest: >> | | > @2 line 6 >> | | > 22. [] in >> | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>readRequestSafely: >> | | > @2 line 4 >> | | > 23. ZnManagingMultiThreadedServer(ExecBlock)>>on:do: @3 line 42 >> | | > 24. >> | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>readRequestSafely: >> | | > @3 line 5 >> | | > 25. >> | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeOneRequestResponseOn: >> | | > @2 line 7 >> | | > 26. [] in >> | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeRequestResponseLoopOn: >> | | > @2 line 10 >> | | > 27. [] in ZnCurrentServer class(DynamicVariable >> | | > class)>>value:during: @3 line 9 >> | | > 28. ZnCurrentServer class(ExecBlock)>>ensure: @2 line 12 >> | | > 29. ZnCurrentServer class(DynamicVariable class)>>value:during: >> | | > @6 >> | | > line 10 >> | | > 30. >> | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeRequestResponseLoopOn: >> | | > @4 line 8 >> | | > 31. [] in >> | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: >> | | > @2 line 9 >> | | > 32. ZnManagingMultiThreadedServer(ExecBlock)>>ensure: @2 line 12 >> | | > 33. [] in >> | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: >> | | > @2 line 10 >> | | > 34. [] in ExecBlock0(ExecBlock)>>ifCurtailed: @2 line 6 >> | | > 35. ExecBlock0(ExecBlock)>>ensure: @2 line 12 >> | | > 36. ZnManagingMultiThreadedServer(ExecBlock)>>ifCurtailed: @3 >> | | > line >> | | > 8 >> | | > 37. [] in >> | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn: >> | | > @2 line 11 >> | | > 38. GsProcess>>_start @7 line 16 >> | | > 39. UndefinedObject(GsNMethod class)>>_gsReturnToC @1 line 1 >> | | > >> | | > >> | | > From: "Dale K. Henrichs" <[hidden email]> >> | | > To: "Johan Brichau" <[hidden email]> >> | | > Cc: [hidden email] >> | | > Sent: Wednesday, December 18, 2013 6:26:01 PM >> | | > Subject: Re: [Glass] problem update: Zinc server with Seaside >> | | > (3.1) >> | | > >> | | > Johan, >> | | > >> | | > I've finally started monkeying with this and the first thing I >> | | > ran >> | | > into (running this under tODE and trying to duplicate the >> | | > GemTools >> | | > error condition) was an error involving SocketStream and an >> | | > apparent buffer overflow ... it looks to me that >> | | > SocketStream>>next has no logic to refill the buffer once it is >> | | > full ... at least I got a stack in which the ByteArray buffer was >> | | > accessed at slot 4097 (1 larger than the buffer): >> | | > >> | | > There are three other processes floating around as well ... I'll >> | | > keep digging ... >> | | > >> | | > Dale >> | | > >> | | > >> | | > From: "Johan Brichau" <[hidden email]> >> | | > To: [hidden email] >> | | > Sent: Sunday, December 15, 2013 1:20:20 PM >> | | > Subject: [Glass] problem update: Zinc server with Seaside (3.1) >> | | > >> | | > Hi all, >> | | > >> | | > I’m working on the Zinc server adaptor for Seaside 3.1 but I got >> | | > stuck. >> | | > >> | | > Launching the following from Gemtools brings up the server and >> | | > makes it respond to requests nicely. Trying to do the same from a >> | | > topaz session though, immediately throws an out-of-mem exception. >> | | > >> | | > WAGsZincAdaptor startOn: 8383 >> | | > >> | | > If you break the blocking call in Gemtools and run the line above >> | | > again, you also run out-of-mem after a couple of times (but only >> | | > after _also_ connecting to the server from your web browser). >> | | > First question about this is why that does not happen in gem >> | | > tools? >> | | > I tried turning the auto commit off but that did not change >> | | > anything. >> | | > >> | | > Following advice in [1], I started investigating from gemtools >> | | > and >> | | > got this for the byte sizes in temp obj memory on 75% full: >> | | > >> | | > #'ByteArray'->22153440, #'GsMethodDictionary'->3073088, >> | | > #'GsMethodLookupCache'->2224064, #'GsProcess'->562224, >> | | > #'String'->415504, #'Array'->340168, #'SocketStream'->322680, >> | | > #'ExecBlock'->201664, #'VariableContext'->175552, >> | | > #'SocketStreamSocket'->150696, …0 >> | | > >> | | > The process browser shows over 100 threads displayed as follows: >> | | > >> | | > (priority=25) ready [oop= …. ] >> | | > >> | | > I guess this is where Dale started having nightmares. The stack >> | | > traces show the out-of-mem always happens in the >> | | > ZnNetworkingUtils>>setSocketStreamParameters: method. >> | | > >> | | > All of this is using the current Zinc version on github for >> | | > gemstone 3.1 [2] and my version of the Seaside 3.1 port [3] >> | | > To try, load Seaside from my repo, load the ‘Zinc-Seaside’ group >> | | > from the baseline and hit the above. >> | | > >> | | > Any ideas are welcome >> | | > >> | | > Johan >> | | > >> | | > [1] >> | | > http://gemstonesoup.wordpress.com/2008/11/19/gemstone-101-managing-out-of-memory-situations/ >> | | > [2] https://github.com/glassdb/zinc/tree/gemstone3.1 >> | | > [3] https://github.com/jbrichau/Seaside31 >> | | > >> | | > _______________________________________________ >> | | > Glass mailing list >> | | > [hidden email] >> | | > http://lists.gemtalksystems.com/mailman/listinfo/glass >> | | >> | | >> | > _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
Obviously, once I turn on the logging, the dropped connections disappear too…. (see log below) I’m wondering about the ZnUnknownHttpMethod before closing the stream. It tries to read from an empty stream actually… But we are definitely making progress… I’m off Xmas shopping today but will look at this in the evening.. Johan topaz 1> topaz 1> topaz 1> topaz 1> Zinc Server started on port 8383 --transcript--2013-12-21 03:03:1.1953476905822754E+01 455751 I Starting ZnManagingMultiThreadedServer HTTP port 8383 --transcript--2013-12-21 03:03:1.1953932046890259E+01 455683 D Initializing server socket --transcript--'new stream 116656385' --transcript--2013-12-21 03:03:1.8871202945709229E+01 455692 D Executing request/response loop --transcript--2013-12-21 03:03:1.8872347116470337E+01 455692 I Read aZnRequest(GET /welcome) --transcript--2013-12-21 03:03:1.8899705886840820E+01 455692 T GET /welcome 200 4733B 27ms --transcript--2013-12-21 03:03:1.8900135993957520E+01 455692 I Wrote aZnResponse(200 OK text/html;charset=utf-8 4733B) --transcript--2013-12-21 03:03:1.8900440931320190E+01 455692 D a ZnUnknownHttpMethod occurred (error 2710) while reading request --transcript--2013-12-21 03:03:1.8900584936141968E+01 455692 D Closing stream 116656385 On 21 Dec 2013, at 10:51, Johan Brichau <[hidden email]> wrote: Ha, a second try does yield better results. The start in topaz does not have the out-of-mem. _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
Yes I was seeing the second odd (null) connection, too and didn't figure out where that was coming from ... Didn't try things in topaz ... I never did characterize the out-of-memory cause either ... I just focused on solving the spinning in empty buffer problem ... there are probably some remaining gotchas hanging around ... Dale From: "Johan Brichau" <[hidden email]> _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
Dale,
It’s becoming quite a crazy issue… I’m now executing the Zinc adaptor perfectly _without_ any of the changes to SocketStream. I even did a complete load of Seaside31 and Zinc in a virgin image (without the fixes for SocketStream) and it works. I got there because I was fiddling around to get to the origin of the second connection. To me, the original implementation of #atEnd is correct because #receiveData throws a timeout exception if it does not receive any data from the socket. However, I noticed that this does not happen in the Gs implementation of SocketStream because the timeout is set to 0. But now I’m unable to test my theory because the original implementation works… So, there probably is still something fishy going on because we both had different kinds of errors before with the same codebase. Johan On 22 Dec 2013, at 05:10, Dale K. Henrichs <[hidden email]> wrote:
_______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
Johan, The implementation of atEnd only becomes important when you hit one of "null connections" if you get one of those, I think that SocketStream misbehaves .... the socket is alive, says that it will not block on read and then proceeds to return no data ... this could very easily be "an illegal combination", but if it occurs things go crazy ... If you are able to avoid the "null connection" (my guess) then SocketStream works fine ... So I think we need to understand where these "null connections" are coming from and make sure that they don't reappear. Can you confirm that you are not seeing the "null connections" as well ... or are you surviving through them? Dale From: "Johan Brichau" <[hidden email]> _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
Dale,
With the new implementation of #atEnd, I see an error: --transcript--2013-12-22 10:39:3.5306130886077881E+01 302143 D Executing request/response loop --transcript--2013-12-22 10:39:3.5307526111602783E+01 302143 I Read aZnRequest(GET /welcome) --transcript--2013-12-22 10:39:3.5339780092239380E+01 302143 T GET /welcome 200 4733B 32ms --transcript--2013-12-22 10:39:3.5340270996093750E+01 302143 I Wrote aZnResponse(200 OK text/html;charset=utf-8 4733B) --transcript--2013-12-22 10:39:3.5341017007827759E+01 302143 D a ZnUnknownHttpMethod occurred (error 2710) while reading request --transcript--2013-12-22 10:39:3.5341190099716187E+01 302143 D Closing stream With the old implementation of #atEnd, everything works (again) and I also do not have the error: --transcript--2013-12-22 10:46:2.9496814012527466E+01 694893 D Executing request/response loop --transcript--2013-12-22 10:46:2.9498058080673218E+01 694893 I Read aZnRequest(GET /welcome) --transcript--2013-12-22 10:46:2.9528563976287842E+01 694893 T GET /welcome 200 4733B 30ms --transcript--2013-12-22 10:46:2.9529000997543335E+01 694893 I Wrote aZnResponse(200 OK text/html;charset=utf-8 4733B) --transcript--2013-12-22 10:46:5.9529083013534546E+01 694893 D a ConnectionTimedOut occurred (error 2710), Data receive timed out. while reading request --transcript--2013-12-22 10:46:5.9529268026351929E+01 694893 D Closing stream The only problem is… this codebase used to misbehave instantly. So… I agree that there is a condition that does not occur anymore but I cannot characterize it. In any case, I think the failsafe code in #next to check if there is anything in the buffer after the receiveData should be sufficient because the change to #atEnd also changes its semantics and this appears to yield different results (see above). I compared with the behaviour in Pharo and there I’m seeing the same log as the second one above. Johan On 22 Dec 2013, at 18:30, Dale K. Henrichs <[hidden email]> wrote:
_______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
Free forum by Nabble | Edit this page |