This seems to be turning into a different topic. I tried to change this and noticed that I was not able to make it work.

Some of my topaz/ZnServer tasks handle only HTTP calls with special URLs. It can happen that these calls are spread over a long period of time (days or so), and in my client simulation tool they were not made at all.

As I understand the ZnServer, GsSocketStream and GsSocket code, the ZnServer tasks are then simply blocked at a low level in the virtual machine until a request comes in ... I would be surprised if they really woke up when they get a SigAbort ...

The whole logic in

    ZnServer>>serveConnectionOn: listeningSocket
        ...
        socket := listeningSocket waitForAcceptFor: self acceptWaitTimeout.
        ...

does not work: the timeout is not considered, and at the lowest level that API call says: I'm blocking.

Therefore the periodicTasks are not executed periodically, but ONLY when a request comes in on that listening socket. That seems strange ...

Marten Feldtmann via Glass <[hidden email]> wrote on 24 October 2016 at 16:43:
_______________________________________________
Glass mailing list
[hidden email]
http://lists.gemtalksystems.com/mailman/listinfo/glass
And I think that this can be fixed by:

    SocketStreamSocket>>waitForConnectionFor: timeout ifTimedOut: timeoutBlock

Marten Feldtmann via Glass <[hidden email]> wrote on 24 October 2016 at 22:23:
In reply to this post by GLASS mailing list
Marten,

Are you reading code and drawing these conclusions, or are you in a debugger, or ???

Unless I'm mistaken, the method that calls ZnServer>>serveConnectionOn: is in an infinite repeat loop, and it looks like by default the accept will block for 5 minutes; then the timeout should fire and a new accept request will be launched ... which will give the SigAbort code a chance to run ...

So if you are not seeing the accept call kick out after 5 minutes (300 seconds = 5 minutes, right? :) then perhaps there is a bug ... is that what is happening for you?

And it looks like GsSocket>>waitForAcceptFor: calls GsSocket>>waitForConnectionFor:ifTimedOut: ...

Digging a bit more, I see that SocketStreamSocket>>waitForAcceptFor: calls SocketStreamSocket>>waitForConnectionFor:ifTimedOut:, which is wired to calling `self accept`, which does ignore the timeout ...

Since GsSocket implements waitForConnectionFor:ifTimedOut:, I think that SocketStreamSocket>>waitForConnectionFor:ifTimedOut: should do:

    ^self underlyingSocket waitForConnectionFor: timeout ifTimedOut: timeoutBlock

Could you try that out and confirm ... presumably the SocketStreamSocket code dates back to GemStone 2.4.x, when perhaps GsSocket>>waitForConnectionFor:ifTimedOut: wasn't implemented?

Good catch ... let me know if it works ...

Dale

On 10/24/2016 01:23 PM, Marten Feldtmann via Glass wrote:
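The loop Dale describes (block in accept for a bounded time, then come back around so that periodic work gets a chance to run) can be sketched in a language-neutral way. What follows is only an illustration in Python, not the Zinc or GemStone code; the names `serve`, `periodic_task`, `handle` and `max_rounds` are made up for this sketch, and only the standard `socket` module calls are real.

```python
import socket


def serve(listening_socket, accept_wait_timeout, periodic_task, handle, max_rounds):
    """Accept connections, waking up at least every accept_wait_timeout seconds.

    max_rounds only exists so the sketch terminates; a real server loops forever.
    """
    # Without a timeout, accept() blocks until a client connects, so the
    # periodic task below would only run when a request happens to arrive.
    listening_socket.settimeout(accept_wait_timeout)
    for _ in range(max_rounds):
        try:
            connection, _address = listening_socket.accept()
        except socket.timeout:
            connection = None  # no request arrived within the timeout
        periodic_task()  # runs every round, whether or not a request came in
        if connection is not None:
            handle(connection)
            connection.close()
```

The point of the sketch is the `settimeout` call: if the timeout never reaches the call that actually blocks, the loop is still there but it only advances when a client connects, which matches the behaviour Marten reports.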
Hmm,

a) GsSocket>>waitForConnectionFor:ifTimedOut: does not consider the timeoutBlock at all and has a different logic.

b) the call to SocketStreamSocket>>waitForConnectionFor:ifTimedOut: is expected to return an instance of SocketStreamSocket and not an instance of GsSocket ... and ZnServer works on instances of SocketStreamSocket.

Marten

Dale Henrichs via Glass <[hidden email]> wrote on 25 October 2016 at 00:04:
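Marten's two objections suggest the shape a fix would need: the wrapper passes the timeout down to the low-level socket, evaluates the timeout block itself, and wraps whatever was accepted in its own class. The Python sketch below only illustrates that shape. `LowLevelSocket` and `StreamSocket` are hypothetical stand-ins for GsSocket and SocketStreamSocket, and the assumption that the low-level wait answers `None` on timeout is mine, not something stated in the thread.

```python
class LowLevelSocket:
    """Hypothetical stand-in for the low-level socket (GsSocket in the thread)."""

    def __init__(self, pending=None):
        self.pending = pending  # a connection that is "already waiting", if any

    def wait_for_connection(self, timeout_seconds):
        # A real implementation would block for up to timeout_seconds.
        # This stand-in answers immediately: a connection, or None on timeout.
        accepted, self.pending = self.pending, None
        return accepted


class StreamSocket:
    """Hypothetical stand-in for the wrapper (SocketStreamSocket in the thread)."""

    def __init__(self, underlying):
        self.underlying = underlying

    def wait_for_connection(self, timeout_seconds, if_timed_out):
        # (a) The low-level call knows nothing about the timeout block,
        #     so the wrapper evaluates it when nothing was accepted.
        accepted = self.underlying.wait_for_connection(timeout_seconds)
        if accepted is None:
            return if_timed_out()
        # (b) Callers expect the wrapper class, not the low-level one.
        return StreamSocket(accepted)
```

A caller that only ever sees `StreamSocket` instances keeps working, while the blocking call underneath now honours the timeout.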
In reply to this post by GLASS mailing list
[...] instance for GsSocket--*zodiac-gemstone-core and for Zodiac, GsSocket>>waitForConnectionFor:ifTimedOut: is implemented ... so the real fix is to move these methods into SocketStream and/or make SocketStream obsolete ... since I know I didn't create a SecureSocketStream class, the need for SocketStream is probably fading away ...

Created an issue[1] for moving all of these methods into zinc and this issue[2] for your original problem.

[1] https://github.com/GsDevKit/zinc/issues/81
[2] https://github.com/GsDevKit/zinc/issues/82

On 10/24/2016 02:22 PM, Marten Feldtmann via Glass wrote:
In reply to this post by GLASS mailing list
Thanks, Dale
> I think that if you make continuations non-persistent, then you will need to
> make sure that you have session affinity[1] ... this could cost in needing
> to have enough gems to handle the number of concurrent "long term" sessions
> you have ... I did experiments many years ago and I think that if you have a
> known number of users/browser sessions, then this approach could work ...
> let me know if there is interest here and I can blow the dust off of that
> work and give you some pointers/help.

We prefer to load balance over sessions because we have some requests that take a few seconds to complete. We do not want one session to hold up another when there are gems idle.

OTOH, we could reconsider and perhaps start a new gem for each session. Our gems eat a lot of RAM though, which we need to sort out.

Yes, we can run some of these requests in the background. But they creep up on us and we don't realise there's a problem until users complain that their requests time out.

> I've also done some work with notTranlogged session data ... back when I was
> doing scaling trying to get to 10,000 requests per second (hit 7000) I noticed
> the growth in tranlogs and a notTranlogged feature was subsequently added to
> GemStone ... until now I haven't heard about users really needing this
> feature -- even now it is in a stress testing context -- but a year ago I
> took some time to do some experiments with notTranlogged session
> state for Seaside and the work is on GitHub[2]. I was able to get my
> NotTranlogged experiment to function, but I never did any comparison tests
> to find out how much would actually be saved ...

I'm not there yet. I think that it is manageable at this stage. 100GB disk space is not overly expensive. Other things to do now.

Just a thought: I wonder if the not tranlogged will work with a hot standby. Should be ok? Have to be sure that one can recover when you lose the extent. That would be the major test.

> [2] https://github.com/dalehenrich/Seaside/tree/notTranlogged
>
>> We load balance across gemstone sessions, which may influence what you do.
>>
>> On Mon, Oct 24, 2016 at 10:48 AM, Marten Feldtmann via Glass
>> <[hidden email]> wrote:
>>>
>>> Ok, I have to make this perhaps a little bit clearer:
>>>
>>> In my test cases I have 200 to 400 clients (written in Python) playing
>>> their role as client simulators. They do their requests via http/rest
>>> against Gemstone/s against 8 responding gem/zinc tasks (running with
>>> different configurations). They produce in total 10-20 events per second
>>> against the database. These events are changing or adding data - no
>>> deletion at this point.
>>>
>>> In terms of speed Gemstone has no problem answering the requests (even
>>> with only one answering task it would work).
>>>
>>> The current database file is around 64 GB in size.
>>>
>>> The size of the uncompressed backup file is around 4 GB.
>>>
>>> As I mentioned in my older postings: this system produces 1GB transaction
>>> log files within 4-5 minutes. Therefore we have around 100GB transaction
>>> data each day ... wow.
>>>
>>> This seems to indicate that not the extent files will be a problem, but
>>> that the management of the transaction files is far more difficult to
>>> handle (and I was not aware of this :-))).
>>>
>>> Question to the more experienced Gemstone/S developers: is this ok (a
>>> general pattern)?
>>>
>>> Marten Feldtmann via Glass <[hidden email]> wrote on 22 October 2016
>>> at 22:08:
>>>
>>> One of the most surprising things I noticed with my application is the
>>> tremendous storage need of my transaction logs.
>>>
>>> Whenever I test my application (with around 200 clients) I get one
>>> transaction log file within 4 minutes - each with a size of 1 GB. That
>>> would lead to 120GB transaction logs within one day with 8 hours working
>>> time.
>>>
>>> And of course I do full transaction logging ... the database size after
>>> doing a backup is 3.4 GB. Normally the clients do not do so much work -
>>> but they send lots of events ...
>>>
>>> I looked at the inventory and I did not find any strange things ... any
>>> ideas how to find out why such an amount of space is needed for the
>>> transaction logs?
>>>
>>> How do classes like Set and Array, and changes to them, result in
>>> transaction log changes? I prefer to use Arrays at the moment ...
On 10/24/16 9:20 PM, Otto Behrens wrote:
> We prefer to load balance over sessions because we have some requests
> that take a few seconds to complete. We do not want one session to
> hold up another when there are gems idle.

[...] the request if it was "running a long operation" ... things get interesting when all of the gems are running long transactions though :) If you are happy with what you are doing, there's no need to change ...

> OTOH, we could reconsider and perhaps start a new gem for each
> session. Our gems eat a lot of RAM though, which we need to sort out.

Yeah, that's the big one ... there's a hard cap on the number of users if you go this route ...

> Yes, we can run some of these requests in the background. But they
> creep up on us and we don't realise there's a problem until users
> complain that their requests time out.
>
> I'm not there yet. I think that it is manageable at this stage. 100GB
> disk space is not overly expensive. Other things to do now.

[...] starting to look into notTranlogged so that Seaside is ready when you need it ... Like I said, I didn't check into the tranlog yield in my experiments, but I had gone as far as I could without starting to make structural changes to Seaside ... which might turn out to be necessary ...

> Just a thought: I wonder if the not tranlogged will work with a hot
> standby. Should be ok? Have to be sure that one can recover when you
> lose the extent. That would be the major test.

notTranlogged objects are persisted in the main disk but not written to tranlogs ... so the implication is that notTranlogged objects are not present upon recovery --- but then Seaside session state is not needed if the stone crashes ... so if it's not written to the tranlog, it will not be sent to the hot standby, which should be fine ... Between disk and hot standby, the notTranlogged might be worth having in our pockets.
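As a quick check of the transaction log volumes quoted in this exchange (one 1 GB log every 4 to 5 minutes over an 8 hour working day), the arithmetic can be written out. The function name and parameters below are made up for this sketch; the inputs are the figures from the thread.

```python
def tranlog_gb_per_day(log_size_gb, minutes_per_log, working_hours):
    """Transaction log volume for one working day, in GB."""
    logs_per_day = working_hours * 60 / minutes_per_log
    return log_size_gb * logs_per_day


# One 1 GB log every 4 minutes over 8 working hours.
print(tranlog_gb_per_day(1, 4, 8))  # 120.0
# One 1 GB log every 5 minutes over 8 working hours.
print(tranlog_gb_per_day(1, 5, 8))  # 96.0
```

Both the "120GB" and the "around 100GB each day" figures fall inside this 96 to 120 GB range, so the two estimates in the thread are consistent with each other.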
In reply to this post by GLASS mailing list
Wow, after making this change I was - for the first time - able to make a full backup (nothing blocked) while the crash tests were running. Even though I must do more tests, this is a wonderful success.

Marten
In reply to this post by GLASS mailing list
Hey Dale,

any first hints on how to check the content of the tranlog files?

Marten
In reply to this post by GLASS mailing list
The fix you made would reduce the accumulation of transactions that gs needs to resolve views and potential conflicts through. The accumulation would have been high when some gems are very active with commits while others are mostly waiting (and holding an old view). It was the mostly-idle gems causing accumulation.

You may notice: I'm big on performance tuning, so here is another hint. Gs is also aided by having many gems starting with the same view point rather than each having separate views. Change your timeout time so that it expires at a time interval that the others would also target. Rather than a fixed 15 second timeout, compute a timeout from current system time that other gems might also compute to timeout on.

A 15 second target interval might be a bit aggressive depending on your commit rates. Perhaps target expiry at a 30 second interval to further improve odds that gems will timeout-abort at the same moment.

Paul Baumann

On Oct 26, 2016 7:36 AM, "Marten Feldtmann via Glass" <[hidden email]> wrote:
That means something like: ZnServer>>acceptWaitTimeout
Paul Baumann <[hidden email]> wrote on 26 October 2016 at 17:42:
Ok, I notice that seconds are not enough ... the timeout has to be calculated in ms - but ok, not that difficult. Thanks, I'll implement this ...

Marten

Marten Feldtmann via Glass <[hidden email]> wrote on 26 October 2016 at 19:39:
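Paul's hint, together with Marten's remark about millisecond resolution, amounts to computing the timeout as the time remaining until the next shared boundary on the clock, instead of using a fixed duration. The sketch below illustrates that calculation in Python. It is not the ZnServer>>acceptWaitTimeout implementation; the function name, the 30 second interval and the assumption that all gems read the same system clock are mine.

```python
import time


def aligned_timeout_ms(interval_ms, now_ms=None):
    """Milliseconds until the next multiple of interval_ms on the wall clock.

    Every process that reads the same clock computes the same expiry moment,
    no matter when it started waiting.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    # Time already elapsed in the current interval is now_ms % interval_ms.
    # Exactly on a boundary, this waits one full interval.
    return interval_ms - (now_ms % interval_ms)


# Two gems that start waiting at different times inside one 30 s interval:
first = 31_000 + aligned_timeout_ms(30_000, now_ms=31_000)
second = 45_000 + aligned_timeout_ms(30_000, now_ms=45_000)
print(first, second)  # 60000 60000
```

Working in milliseconds matters here because rounding the clock to whole seconds would let expiry moments drift apart by up to a second. Whether the value then has to be converted back to seconds before it is handed to the accept call depends on what that call expects, which the thread does not say.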
In reply to this post by GLASS mailing list
Personally I have not run the scripts, so I don't have any hints without research ... on the other hand an HR will connect you directly with the person who does have experience :)

I'm busy today, so if you are willing to wait, I will try to find the time to figure out the formula ...

On 10/26/16 4:46 AM, Marten Feldtmann via Glass wrote:
In reply to this post by GLASS mailing list
If these are changes that deserve to be shared (including the bugfix), a pull request with the relevant fixes would be much appreciated ... there are only so many hours in the day for me to work, and the whole point of using GitHub is to make it much easier for users to contribute ...

Personally I would rather spend the time teaching you guys how to submit pull requests than have to fix all of the outstanding bugs all by myself :)

Dale

On 10/26/16 10:50 AM, Marten Feldtmann via Glass wrote:
In reply to this post by GLASS mailing list
Look in the product tree: ./bin/printlogs ./doc/man1/printlogs.1 Like Dale, I have not used these. But between the help printed by running the script and the man page, you should be able to get started.