Transaction Logs so large ....


[3.3.1]: Waiting for a connection is blocked ... no timeout at all is considered

GLASS mailing list

This seems to be turning into a different topic: I tried to change this and noticed that I was not able to make it work.


Some of my topaz/ZnServer tasks handle only http calls with special URLs. It can happen that these calls arrive only rarely, with gaps of days or so; in my client simulation tool they were not called at all.

As I understand the ZnServer, GsSocketStream and GsSocket code, the ZnServer tasks are then simply blocked at a low level in the virtual machine until a request comes in ... I would be surprised if they really woke up when they get a SigAbort ...

The whole logic in ZnServer>>serveConnectionOn: listeningSocket:

...

socket := listeningSocket waitForAcceptFor: self acceptWaitTimeout.

...

does not work: the timeout is not considered, and at the lowest level that API call says: I'm blocking.

Therefore the periodicTasks are not executed periodically, but ONLY when a request comes in on that listening socket.

Seems to be strange ...



Marten Feldtmann via Glass <[hidden email]> wrote on 24 October 2016 at 16:43:

That's very interesting.

This means that I should introduce my own subclass of ZnServer, and its #serveConnectionOn: (or #noteAcceptWaitTimedOut) should be rewritten to make an empty transaction (with #beginTransaction and #abortTransaction) on a periodic basis (every 15 seconds?).



 


_______________________________________________
Glass mailing list
[hidden email]
http://lists.gemtalksystems.com/mailman/listinfo/glass

Re: [3.3.1]: Waiting for a connection is blocked ... no timeout at all is considered

GLASS mailing list

And I think that this can be fixed by:

SocketStreamSocket>>waitForConnectionFor: timeout ifTimedOut: timeoutBlock

waitForConnectionFor: timeout ifTimedOut: timeoutBlock
    "Wait up to timeout seconds for an incoming connection. Answer a SocketStreamSocket wrapping the accepted socket, or the value of timeoutBlock if none arrives in time."

    | aGsSocketOrNil |

    aGsSocketOrNil := self underlyingSocket acceptTimeoutMs: timeout * 1000.
    aGsSocketOrNil isNil ifTrue:[^timeoutBlock value].
    ^ SocketStreamSocket onNativeclientSocket: aGsSocketOrNil for: self
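
The idea behind this fix, an accept that gives up after a timeout so that periodic work gets a chance to run, can be sketched outside GemStone. The following is only an illustration in Python under stated assumptions: `serve_with_periodic_tasks`, `periodic_task`, `handle` and `max_rounds` are hypothetical names, and nothing here is the ZnServer or GsSocket API.

```python
import socket


def serve_with_periodic_tasks(listener, accept_timeout, periodic_task, handle, max_rounds):
    """Accept connections, running periodic_task whenever the accept times out.

    All names are hypothetical; this mirrors the idea of the fix, not ZnServer itself.
    """
    listener.settimeout(accept_timeout)  # seconds; accept() now raises on timeout
    for _ in range(max_rounds):
        try:
            conn, _addr = listener.accept()
        except socket.timeout:
            # No client arrived in time: this is the hook for periodic work
            periodic_task()
            continue
        with conn:
            handle(conn)


if __name__ == "__main__":
    ticks = []
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # any free local port
        listener.listen()
        serve_with_periodic_tasks(listener, 0.05, lambda: ticks.append(1), lambda c: None, 3)
    print(len(ticks))  # one tick per timed-out accept
```

Without the timeout, the loop would sit in `accept()` until a client connects, which is the behaviour described above for the idle tasks.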


Re: [3.3.1]: Waiting for a connection is blocked ... no timeout at all is considered

GLASS mailing list

Marten,

Are you reading code and drawing these conclusions or are you in a debugger or ???

Unless I'm mistaken the method that calls ZnServer>>serveConnectionOn: is in an infinite repeat loop and it looks like by default the accept will block for 5 minutes, then the timeout should fire and a new accept request will be launched ... which will give the SigAbort code a chance to run ...

So if you are not seeing the accept call kick out after 5 minutes (300 seconds = 5 minutes, right? :) then perhaps there is a bug ... is that what is happening for you?

And it looks like GsSocket>>waitForAcceptFor: calls GsSocket>>waitForConnectionFor:ifTimedOut:..

Digging a bit more, I see that SocketStreamSocket>>waitForAcceptFor: calls SocketStreamSocket>>waitForConnectionFor:ifTimedOut:, which is wired to calling `self accept`, which does ignore the timeout ...  Since GsSocket implements waitForConnectionFor:ifTimedOut:, I think that SocketStreamSocket>>waitForConnectionFor:ifTimedOut: should do:

  ^self underlyingSocket waitForConnectionFor: timeout ifTimedOut: timeoutBlock

Could you try that out and confirm ... presumably the SocketStreamSocket code dates back to GemStone 2.4.x, when perhaps GsSocket>>waitForConnectionFor:ifTimedOut: wasn't implemented?

Good catch ... let me know if it works ...

Dale


Re: [3.3.1]: Waiting for a connection is blocked ... no timeout at all is considered

GLASS mailing list

Hmm,

a) GsServer>>waitForConnectionFor:ifTimeout: does not consider timeoutBlock at all and has a different logic.

b) the call SocketStreamSocket>>waitForConnectionFor:ifTimedOut: is expected to return an instance of SocketStreamSocket and not an instance of GsSocket ... and ZnServer works on instances of SocketStreamSocket


Marten



Re: [3.3.1]: Waiting for a connection is blocked ... no timeout at all is considered

GLASS mailing list


Whoops my previous guess was a bit wrong ... and it looks like I extended the GsSocket protocol when I ported Zodiac to GemStone and I'm sitting in an image with Zodiac installed ... sorry about that:

instance for GsSocket--*zodiac-gemstone-core
--------------------
closeAndDestroy
connectTo:port:waitForConnectionFor:
dataAvailable
isValid
listenOn:backlogSize:
localPort
readInto:startingAt:for:
readyForRead
receiveDataSignallingTimeout:into:startingAt:
sendData:count:
sendSomeData:startIndex:count:
waitForAcceptFor:
waitForConnectionFor:ifTimedOut:
waitForDataFor:
waitForDataFor:ifClosed:ifTimedOut:
waitForNonBlockingReadActivityUpToMs:
writeFrom:startingAt:for:

and for Zodiac, GsSocket>>waitForConnectionFor:ifTimedOut: is implemented ... so the real fix is to move these methods into SocketStream and/or make SocketStream obsolete ... since I know I didn't create a SecureSocketStream class, the need for SocketStream is probably fading away ...

Created an issue[1] for moving all of these methods into zinc and this issue[2] for your original problem

[1] https://github.com/GsDevKit/zinc/issues/81

[2] https://github.com/GsDevKit/zinc/issues/82


Re: Transaction Logs so large ....

GLASS mailing list
Thanks, Dale

> I think that if you make continuations non-persistent, then you will need to
> make sure that you have session affinity[1] ... this could cost in needing
> to have enough gems to handle the number of concurrent "long term" sessions
> you have ... I did experiments many years ago and I think that if you have a
> known number of users/browser sessions, then this approach could work ...
> let me know if there is interest here and I can blow the dust off of that
> work and give you some pointers/help.

We prefer to load balance over sessions because we have some requests
that take a few seconds to complete. We do not want one session to
hold up another when there are gems idle.

OTOH, we could reconsider and perhaps start a new gem for each
session. Our gems eat a lot of RAM though, which we need to sort out.

Yes, we can run some of these requests in the background. But they
creep up on us and we don't realise there's a problem until users
complain that their requests time out.

> I've also done some work with notTranlogged session data ... back when I was
> doing scaling, trying to get to 10,000 requests per second (hit 7000), I noticed
> the growth in tranlogs and a notTranlogged feature was subsequently added to
> GemStone ... until now I haven't heard about users really needing this
> feature -- even now it is in a stress testing context -- but a year ago I
> took some time to do some experiments with notTranlogged session
> state for Seaside and the work is on github[2]. I was able to get my
> NotTranlogged experiment to function, but I never did any comparison tests
> to find out how much would actually be saved ...

I'm not there yet. I think that it is manageable at this stage. 100GB
disk space is not overly expensive. Other things to do now.

Just a thought: I wonder if the not tranlogged will work with a hot
standby. Should be ok? Have to be sure that one can recover when you
lose the extent. That would be the major test.

> [2] https://github.com/dalehenrich/Seaside/tree/notTranlogged
>
>>
>> We load balance across gemstone sessions, which may influence what you do.
>>
>> On Mon, Oct 24, 2016 at 10:48 AM, Marten Feldtmann via Glass
>> <[hidden email]> wrote:
>>>
>>> Ok, I have to make this perhaps a little bit clearer:
>>>
>>> In my test cases I have 200 to 400 clients (written in Python) playing their
>>> role as client simulators. They send their requests via http/rest to
>>> Gemstone/S, where 8 gem/zinc tasks (running with different configurations)
>>> respond. In total they produce 10 to 20 events per second against the
>>> database. These events change or add data; there is no deletion at this point.
>>>
>>> In terms of speed Gemstone has no problem answering the requests (it would
>>> work even with only one answering task).
>>>
>>> The current database file is around 64 GB in size.
>>>
>>> The size of the uncompressed backup file is around 4 GB.
>>>
>>> As I mentioned in my older postings: this system produces 1 GB transaction
>>> log files within 4-5 minutes. Therefore we have around 100 GB of transaction
>>> data each day ... wow.
>>>
>>> This seems to indicate that not the extent files will be the problem, but
>>> that the management of the transaction files is far more difficult to handle
>>> (and I was not aware of this :-))).
>>>
>>> Question to the more experienced Gemstone/S developers: is this ok (a
>>> general pattern)?
>>>
>>>
>>> Marten Feldtmann via Glass <[hidden email]> wrote on 22 October 2016 at 22:08:
>>>
>>>
>>> One of the most surprising things I noticed with my application is the
>>> tremendous storage need of my transaction logs.
>>>
>>> Whenever I test my application (with around 200 clients) I get one
>>> transaction log file within 4 minutes, each with a size of 1 GB. That would
>>> lead to 120 GB of transaction logs within one day with 8 hours of working time.
>>>
>>>
>>> And of course I do full transaction logging ... the database size after
>>> doing a backup is 3.4 GB. Normally the clients do not do so much work, but
>>> they send lots of events ...
>>>
>>> I looked at the inventory and I did not find any strange things ... any
>>> ideas how to find out why such an amount of space is needed for the
>>> transaction logs?
>>>
>>> How do classes like Set and Array, and changes to them, result in
>>> transaction log changes? I prefer to use Arrays at the moment ...
>>>
>>>
>>>
>>>
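
The volume figures quoted above (one 1 GB transaction log every 4 to 5 minutes) can be checked with a few lines of arithmetic. This is just a worked example in Python; the function name is made up for illustration, and the 24-hour case is a hypothetical extrapolation that nobody in the thread reported.

```python
def tranlog_gb_per_period(gb_per_log, minutes_per_log, hours):
    """Transaction log volume in GB produced over `hours` of activity."""
    logs = hours * 60 / minutes_per_log  # number of log files written
    return logs * gb_per_log


if __name__ == "__main__":
    # Figures quoted in the thread: one 1 GB log every 4 to 5 minutes
    print(tranlog_gb_per_period(1, 4, 8))   # 120.0 (8-hour day, 4-minute logs)
    print(tranlog_gb_per_period(1, 5, 8))   # 96.0  (8-hour day, 5-minute logs)
    print(tranlog_gb_per_period(1, 4, 24))  # 360.0 (hypothetical round-the-clock load)
```

So the "120 GB" and "around 100 GB" numbers are both consistent with an 8-hour working day at the reported log rate.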

Re: Transaction Logs so large ....

GLASS mailing list


On 10/24/16 9:20 PM, Otto Behrens wrote:

> Thanks, Dale
>
>> I think that if you make continuations non-persistent, then you will need to
>> make sure that you have session affinity[1] ... this could cost in needing
>> to have enough gems to handle the number of concurrent "long term" sessions
>> you have ... I did experiments many years ago and I think that if you have a
>> known number of users/browser sessions, then this approach could work ...
>> let me know if there is interest here and I can blow the dust off of that
>> work and give you some pointers/help.
> We prefer to load balance over sessions because we have some requests
> that take a few seconds to complete. We do not want one session to
> hold up another when there are gems idle.
I had done some work on "self load balancing", where a gem would bounce
the request if it was "running a long operation" ... things get
interesting when all of the gems are running long transactions though :)

If you are happy with what you are doing, there's no need to change ...
>
> OTOH, we could reconsider and perhaps start a new gem for each
> session. Our gems eat a lot of RAM though, which we need to sort out.
Yeah that's the big one ... there's a hard cap on number of users if you
go this route ...

>
> Yes, we can run some of these requests in the background. But they
> creep up on us and we don't realise there's a problem until users
> complain that their requests time out.
>
>> I've also done some work with notTranlogged session data ... back when I was
>> doing scaling, trying to get to 10,000 requests per second (hit 7000), I noticed
>> the growth in tranlogs and a notTranlogged feature was subsequently added to
>> GemStone ... until now I haven't heard about users really needing this
>> feature -- even now it is in a stress testing context -- but a year ago I
>> took some time to do some experiments with notTranlogged session
>> state for Seaside and the work is on github[2]. I was able to get my
>> NotTranlogged experiment to function, but I never did any comparison tests
>> to find out how much would actually be saved ...
> I'm not there yet. I think that it is manageable at this stage. 100GB
> disk space is not overly expensive. Other things to do now.
Yeah, it didn't seem to be a big problem yet ... but it might be worth
starting to look into notTranlogged so that Seaside is ready when you
need it ... Like I said, I didn't check the tranlog yield in my
experiments, but I had gone as far as I could without starting to make
structural changes to Seaside ... which might turn out to be necessary ...
>
> Just a thought: I wonder if the not tranlogged will work with a hot
> standby. Should be ok? Have to be sure that one can recover when you
> lose the extent. That would be the major test.
notTranlogged objects are persisted on the main disk but not written to
tranlogs ... so the implication is that notTranlogged objects are not
present upon recovery --- but then Seaside session state is not needed
if the stone crashes ... so if it's not written to the tranlog, it will
not be sent to the hot standby, which should be fine ...

Between disk and hot standby, notTranlogged might be worth having in
our pockets.

>

Re: [3.3.1]: Waiting for a connection is blocked ... no timeout at all is considered

GLASS mailing list

Wow,

after making this change I was, for the first time, able to make a full backup (nothing blocked) while the crash tests were running. Even though I must do more tests, this is a wonderful success.



Marten



 


Re: Transaction Logs so large ....

GLASS mailing list

Hey Dale,

any first hints on how to check the content of the tranlog files?


Marten



Re: [3.3.1]: Waiting for a connection is blocked ... no timeout at all is considered

GLASS mailing list

The fix you made would reduce the accumulation of transactions that gs needs to resolve views and potential conflicts through. The accumulation would have been high when some gems are very active with commits while others are mostly waiting (and holding an old view). It was the mostly-idle gems causing accumulation.

You may notice:
- Quicker continueTransaction
- shorter login time
- far fewer gems receive SigAbort
- cr backlogs resolve quicker
- fewer transactions since last checkpoint
- extents updated more frequently
- backups are more reliable
- restores require fewer tranlog files

I'm big on performance tuning, here is another hint. Gs is also aided by having many gems starting with the same view point rather than each having separate views. Change your timeout time so that it expires at a time interval that the others would also target. Rather than a fixed 15 second timeout, compute a timeout from current system time that other gems might also compute to timeout on. A 15 second target interval might be a bit aggressive depending on your commit rates. Perhaps target expiry at a 30 second interval to further improve odds that gems will timeout-abort at the same moment.

Paul Baumann



Re: [3.3.1]: Waiting for a connection is blocked ... no timeout at all is considered

GLASS mailing list

That means something like:


ZnServer>>acceptWaitTimeout

acceptWaitTimeout
    "How many seconds to wait for a server socket listening for an accept?
     Answer the seconds remaining until the next multiple of the wished interval,
     so that gems using the same interval time out at the same moment."

    | utcSeconds2001 wishedIntervallInSeconds |

    wishedIntervallInSeconds := 30.
    utcSeconds2001 := DateAndTime now asSeconds asInteger.
    ^(utcSeconds2001 / wishedIntervallInSeconds) truncated * wishedIntervallInSeconds + wishedIntervallInSeconds - utcSeconds2001



Re: [3.3.1]: Waiting for a connection is blocked ... no timeout at all is considered

GLASS mailing list

Ok, I notice that seconds are not enough ... the timeout has to be calculated in milliseconds, but ok, that is not difficult.

Thanks, I'll implement this ..

Marten
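
A millisecond version of the aligned timeout might look like the sketch below. It is written in Python purely for illustration and is not the Smalltalk that ended up in ZnServer; `ms_until_next_boundary` and the example timestamps are hypothetical. It assumes that all gems read the same clock with the same epoch, since otherwise their boundaries would not coincide.

```python
import time


def ms_until_next_boundary(now_ms, interval_ms=30_000):
    """Milliseconds from now_ms to the next multiple of interval_ms.

    Answers a full interval when now_ms is exactly on a boundary,
    which matches the seconds-based formula posted earlier in the thread.
    """
    return interval_ms - (now_ms % interval_ms)


if __name__ == "__main__":
    # Two hypothetical gems that start waiting at different moments
    for start in (1_477_503_547_250, 1_477_503_561_900):
        wait = ms_until_next_boundary(start)
        print(start + wait)  # both lines show the same expiry instant

    # With the real clock:
    print(ms_until_next_boundary(int(time.time() * 1000)))
```

Using the remainder directly avoids the divide, truncate and multiply steps while giving the same result for non-negative timestamps.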

Marten Feldtmann via Glass <[hidden email]> hat am 26. Oktober 2016 um 19:39 geschrieben:

That means something like:


ZnServer>>acceptWaitTimeout

acceptWaitTimeout
    "How many seconds to wait for a server socket listening for an accept ?"

        | utcSeconds2001 wishedIntervallInSeconds |

        wishedIntervallInSeconds := 30.
        utcSeconds2001 := DateAndTime now asSeconds asInteger.
        ^(utcSeconds2001 / wishedIntervallInSeconds) truncated * wishedIntervallInSeconds + wishedIntervallInSeconds - utcSeconds2001



_______________________________________________
Glass mailing list
[hidden email]
http://lists.gemtalksystems.com/mailman/listinfo/glass

Re: Transaction Logs so large ....

GLASS mailing list
In reply to this post by GLASS mailing list

Personally, I have not run the scripts, so I don't have any hints without research. On the other hand, an HR will connect you directly with the person who does have experience :)

I'm busy today so if you are willing to wait, I will try to find the time to figure out the formula...

Dale

On 10/26/16 4:46 AM, Marten Feldtmann via Glass wrote:

Hey Dale,

any first hints how to check the content of the translog files ?


Marten



_______________________________________________
Glass mailing list
[hidden email]
http://lists.gemtalksystems.com/mailman/listinfo/glass

Re: [3.3.1]: Waiting for a connection is blocked ... no timeout at all is considered

GLASS mailing list
In reply to this post by GLASS mailing list

If these are changes that deserve to be shared (including the bugfix), a pull request with the relevant fixes would be much appreciated. There are only so many hours in the day for me to work, and the whole point of using GitHub is to make it much easier for users to contribute.

Personally, I would rather spend the time teaching you guys how to submit pull requests than have to fix all of the outstanding bugs by myself :)

Dale


_______________________________________________
Glass mailing list
[hidden email]
http://lists.gemtalksystems.com/mailman/listinfo/glass

Re: Transaction Logs so large ....

Richard Sargent
Administrator
In reply to this post by GLASS mailing list
GLASS mailing list wrote
Hey Dale, any first hints how to check the content of the translog files? Marten

Look in the product tree:
./bin/printlogs
./doc/man1/printlogs.1


Like Dale, I have not used these. But between the help printed by running the script and the man page, you should be able to get started.



 

_______________________________________________
Glass mailing list
[hidden email]
http://lists.gemtalksystems.com/mailman/listinfo/glass