CannotWriteData errors in P3 and Seaside

Esteban A. Maringolo
Hi all, Sven ;-)

I'm having erratic P3 errors in a recent application I wrote using
Pharo, Seaside and Glorp with P3 as driver.

Each Seaside session has a GlorpSession, which in turn has a
P3Connection in its accessor. I don't know why, but sometimes the
P3Connection socket is closed, and then when trying to read from the
database, it cannot write the query to the P3 socket and an exception is
raised, which isn't handled by the P3DatabaseDriver (by automatically
trying to reconnect?).

I don't know if I'm doing something wrong. I plan to migrate to the
GlorpPooledDatabaseAccessor and also use the P3ConnectionPool, but I
want to be sure that the current setup works, or whether I'm exceeding
some limit or timeout that causes the connection to be closed.

Regards!


Esteban A. Maringolo

Re: CannotWriteData errors in P3 and Seaside

Esteban A. Maringolo
To add to this hard-to-reproduce issue, it only happens in production
(where else?) when running within a Docker Swarm where PostgreSQL is
inside the Swarm, so there might be something there fiddling with the
network, or I don't really know :-/

Regards.

Esteban A. Maringolo

Re: CannotWriteData errors in P3 and Seaside

Sven Van Caekenberghe-2
In reply to this post by Esteban A. Maringolo
Hi Esteban,

I have a web app with P3 under Seaside in production and it works fine. But that is without Glorp or any connection pooling.

You say the connection seems closed; maybe the closing got triggered by your app somehow? How do you clean up expired sessions? How do you handle logouts?

P3 does normally reconnect automatically, IIRC.

You could try to enable logging in P3Client, that is a recent addition. It should show you what happens to your connections.

Sven

Re: CannotWriteData errors in P3 and Seaside

Esteban A. Maringolo
My Seaside session isn't closing the connection, except when it is
unregistered, so this seems to be something else that I don't know about.

I saw there is logging, and I need to set it up in general (including
Fuel serialized stacks).
From what I found on the web, apparently a keepalive is needed that
is not in place.
What disturbs me is that it doesn't happen in development, which makes things harder.

Regards!

Esteban A. Maringolo

Re: CannotWriteData errors in P3 and Seaside

Esteban A. Maringolo
Looking at the Postgres side of the log, I find that the connection was
reset by the other side (that is, Pharo).

The reason for that is still unknown to me, since I don't do anything
to close it (that I'm aware of).

golfware_database.1.9sl4bt9j6cv5@gw    | 2020-08-10 20:02:35.939 UTC [132] LOG:  could not receive data from client: Connection reset by peer
golfware_database.1.9sl4bt9j6cv5@gw    | 2020-08-10 20:06:58.083 UTC [139] LOG:  could not receive data from client: Connection reset by peer
golfware_database.1.9sl4bt9j6cv5@gw    | 2020-08-10 20:06:58.083 UTC [137] LOG:  could not receive data from client: Connection reset by peer
golfware_database.1.9sl4bt9j6cv5@gw    | 2020-08-10 20:33:20.163 UTC [166] LOG:  could not receive data from client: Connection reset by peer
golfware_database.1.9sl4bt9j6cv5@gw    | 2020-08-10 20:35:22.019 UTC [168] LOG:  could not receive data from client: Connection reset by peer
golfware_database.1.9sl4bt9j6cv5@gw    | 2020-08-10 20:37:33.091 UTC [177] LOG:  could not receive data from client: Connection reset by peer

I'll have to keep looking.

Best regards!

Esteban A. Maringolo


Re: CannotWriteData errors in P3 and Seaside

Esteban A. Maringolo
I kept looking into this, and still haven't found what might be causing it.

However, as a stopgap until I find a solution, I tried to
run a "healthcheck" to be sure that the GlorpSession has an active
connection, and I found that a P3Client reports itself as connected even
when it's not.

I connected to a PostgreSQL database running in a Docker container,
stopped the container, and the driver continued to report itself as
connected, even long after the server was stopped (the timeout is
the default 10 seconds).

The ZdcSocketStream reports both ends of the socket as connected, when
the server side certainly isn't.

I noticed also that in P3Client>>connect it calls #ensureOpen, and it
takes the socket as open when it is not, so as soon as it tries to
flush the data written to it, a `CannotWriteData` exception is
signaled.

In my case I'm developing on Pharo 8 on Windows with PostgreSQL
running in Docker on WSL2 (Ubuntu), but the server is a 100%
Linux deployment.

Any ideas?

Esteban A. Maringolo

Re: CannotWriteData errors in P3 and Seaside

Sven Van Caekenberghe-2
IIUC a socket stream does not automatically/automagically know that the state of the connection changed, unless/until it tries to use it (read or write to it, wait for data, ...).

I would recommend using #isWorking to actually test if a connection is good, if you would need to do that. That does an actual query.
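
A minimal sketch of such a check, assuming P3 is loaded in the image. The connection URL is a placeholder, and the error handling around the check is an assumption: a query on a dead socket may signal an error instead of answering false, so the check is wrapped in a handler.

```smalltalk
"Sketch only. The URL is a placeholder; #url:, #isWorking, #close and
#connect are P3Client messages, the wrapping logic is an assumption."
| client alive |
client := P3Client new url: 'psql://user:password@localhost:5432/database'.

"#isWorking runs a real query, so a dead socket shows up here,
either as false or as a signaled error."
alive := [ client isWorking ] on: Error do: [ :error | false ].

alive ifFalse: [
	"Discard the stale socket, ignoring errors from the dead connection."
	[ client close ] on: Error do: [ :error | nil ].
	"Open a fresh connection to the server."
	client connect ].
```

The point of the sketch is that the decision is based on an actual round trip to the server, not on the socket's own idea of its state.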

Did you enable P3 logging as I suggested? What did you learn?

It feels as if in your particular setup the server eagerly closes connections that are open for too long. P3 was written under the assumption that that does not happen (it does not for me).

Re: CannotWriteData errors in P3 and Seaside

Esteban A. Maringolo
Hi,

I'm replying to the list as well... because the last two mails were
sent only to our personal addresses.

On Thu, Aug 20, 2020 at 11:55 AM Sven Van Caekenberghe <[hidden email]> wrote:

> > On 20 Aug 2020, at 15:31, Esteban Maringolo <[hidden email]> wrote:
> >
> > Hi Sven,
> >
> > If a socketstream doesn't know the state of the connection, then what
> > is the #socketIsConnected method for? In particular the
> > #isOtherEndClosed test.
> >
> > ZdcAbstractSocketStream>>#socketIsConnected
> >  ^socket isConnected and: [ socket isOtherEndClosed not ]
>
> I don't know what is going on inside Socket, I just stated my opinion.

Maybe there is something to investigate here?

> With logging enabled, I can do the following:
>
> $ grep P3 server-2020-08-20.log | grep CONNECT | tail -n 20
> <snip>
> 2020-08-20 14:43:06 [P3] 30513 DISCONNECTING psql://client-xyz:hiddenpassword@client-xyz-db:5432/client-xyz
> 2020-08-20 14:44:06 [P3] 30516 CONNECTED psql://client-xyz:hiddenpassword@client-xyz-db:5432/client-xyz
> 2020-08-20 14:44:06 [P3] 30516 DISCONNECTING psql://client-xyz:hiddenpassword@client-xyz-db:5432/client-xyz
> 2020-08-20 14:44:06 [P3] 30517 CONNECTED psql://client-xyz:hiddenpassword@client-xyz-db:5432/client-xyz
> 2020-08-20 14:44:06 [P3] 30517 DISCONNECTING psql://client-xyz:hiddenpassword@client-xyz-db:5432/client-xyz
>
> The number after [P3] is the session identifier (backend process id) of that connection. You should see each one being opened and closed in pairs.

Yes, I noticed the pid, and compared it with what I had on the
pg_stat_activity table.

I don't get the CONNECTED log because there is no way to set the
logging in the P3DatabaseDriver before it creates (and connects) the
P3Client.
Maybe there could be a setting on P3Client class to set verbosity
globally? Or at the P3DatabaseDriver instead.

Summarizing... I'm pretty confident that P3 works correctly, and so does
the PG server.
At this point I'm narrowing down what might be causing this. It's an
issue that only happens to me in production, and I don't have better
instrumentation in place to debug it.

Again, thanks for the support.

Regards.

Re: CannotWriteData errors in P3 and Seaside

Sven Van Caekenberghe-2


> On 20 Aug 2020, at 19:51, Esteban Maringolo <[hidden email]> wrote:
>
> Hi,
>
> I'm replying to the list as well... because the last two mails got
> replied to our personal addresses.

Oh, I did not notice that, that was not my intention.

> On Thu, Aug 20, 2020 at 11:55 AM Sven Van Caekenberghe <[hidden email]> wrote:
>>> On 20 Aug 2020, at 15:31, Esteban Maringolo <[hidden email]> wrote:
>>>
>>> Hi Sven,
>>>
>>> If a socketstream doesn't know the state of the connection, then what
>>> is the #socketIsConnected method for? In particular the
>>> #isOtherEndClosed test.
>>>
>>> ZdcAbstractSocketStream>>#socketIsConnected
>>> ^socket isConnected and: [ socket isOtherEndClosed not ]
>>
>> I don't know what is going on inside Socket, I just stated my opinion.
>
> Maybe there is something to investigate here?

Is it not so that in Docker all network connections in/out/between instances are mediated by some management software? I even thought it was nginx. Maybe I am totally wrong here.

>> With logging enabled, I can do the following:
>>
>> $ grep P3 server-2020-08-20.log | grep CONNECT | tail -n 20
>> <snip>
>> 2020-08-20 14:43:06 [P3] 30513 DISCONNECTING psql://client-xyz:hiddenpassword@client-xyz-db:5432/client-xyz
>> 2020-08-20 14:44:06 [P3] 30516 CONNECTED psql://client-xyz:hiddenpassword@client-xyz-db:5432/client-xyz
>> 2020-08-20 14:44:06 [P3] 30516 DISCONNECTING psql://client-xyz:hiddenpassword@client-xyz-db:5432/client-xyz
>> 2020-08-20 14:44:06 [P3] 30517 CONNECTED psql://client-xyz:hiddenpassword@client-xyz-db:5432/client-xyz
>> 2020-08-20 14:44:06 [P3] 30517 DISCONNECTING psql://client-xyz:hiddenpassword@client-xyz-db:5432/client-xyz
>>
>> The number after [P3] is the session identifier (backend process id) of that connection. You should see each one being opened and closed in pairs.
>
> Yes, I noticed the pid, and compared it with what I had on the
> pg_stat_activity table.

Right.

> I don't get the CONNECTED log because there is no way to set the
> logging in the P3DatabaseDriver before it creates (and connects) the
> P3Client.
> Maybe there could be a setting on P3Client class to set verbosity
> globally? Or at the P3DatabaseDriver instead.

What if you change

P3DatabaseDriver >> connect: aLogin
        connection := self connectionClass new.
        connection
                host: aLogin host;
                port: aLogin port asInteger;
                database: aLogin databaseName;
                user: aLogin username;
                password: aLogin password.
        connection connect

by inserting

         verbose: true

before the last statement.

You could also make a subclass of P3DatabaseDriver.
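
Applied to the method quoted above, the change could look like the following. This is only a sketch of the suggestion: it adds verbose: true to the cascade so that logging is switched on before the connection is made, and everything else is the method as quoted.

```smalltalk
P3DatabaseDriver >> connect: aLogin
	"Same method as quoted above, with logging switched on before connecting."
	connection := self connectionClass new.
	connection
		host: aLogin host;
		port: aLogin port asInteger;
		database: aLogin databaseName;
		user: aLogin username;
		password: aLogin password;
		verbose: true.
	connection connect
```

With the subclass approach, the same body would go into an overriding #connect: method, leaving the original driver class untouched.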

Re: CannotWriteData errors in P3 and Seaside

Stéphane Ducasse
Hi guys,

Let us know if we need to do something on our side, because we want to support business use.

S. 

--------------------------------------------
Stéphane Ducasse
03 59 35 87 52
Assistant: Aurore Dalle 
FAX 03 59 57 78 50
TEL 03 59 35 86 16
S. Ducasse - Inria
40, avenue Halley, 
Parc Scientifique de la Haute Borne, Bât.A, Park Plaza
Villeneuve d'Ascq 59650
France

Re: CannotWriteData errors in P3 and Seaside

Esteban A. Maringolo
In reply to this post by Sven Van Caekenberghe-2
Hi all,

On Thu, Aug 20, 2020 at 3:02 PM Sven Van Caekenberghe <[hidden email]> wrote:

> >> I don't know what is going on inside Socket, I just stated my opinion.
> > Maybe there is something to investigate here?

> Is it not so that in Docker all network connections in/out/between instances are mediated by some management software ?
> I even thought it was nginx. Maybe I am totally wrong here.

No, it is not that, but inside a Swarm all containers run in an
"overlay network" (basically a VPN) that is independent of the host
network; this way you can distribute containers among different hosts.

All the packets are routed by Docker itself, and apparently there is
an issue there: if a connection is idle for a certain time, it
silently stops routing packets, leaving both sides of the connection
unaware of it.
https://github.com/moby/moby/issues/31208

> What if you change
>
> P3DatabaseDriver >> connect: aLogin

> by inserting
>
>          verbose: true
>
> before the last statement.
>
> You could also make a subclass of P3DatabaseDriver.

I could, but that would only log one extra CONNECT entry, which is not of much use.


> > At this point I'm factoring out what might be causing this. It's an
> > issue that only happens to me in production, and I don't have a better
> > instrumentation in place to debug it.

I think I found the culprit; now I need to find out how to bypass or set up
the network to avoid these situations.

With luck, this will be solved soon. :-)

Regards.

Re: CannotWriteData errors in P3 and Seaside

Esteban A. Maringolo
Hi all,

Good news. :-)

On Thu, Aug 20, 2020 at 6:36 PM Esteban Maringolo <[hidden email]> wrote:

> > Is it not so that in Docker all network connections in/out/between instances are mediated by some management software ?
> > I even thought it was nginx. Maybe I am totally wrong here.
>
> No, it is not that, but inside a Swarm all containers run in a
> "overlay network" (basically a VPN) that is independent of the host
> network, this way you can distribute containers among different hosts.

> All the packets are routed by Docker itself, and apparently there is
> an issue there: if a connection is idle for a certain time, it
> silently stops routing packets, leaving both sides of the connection
> unaware of it.

I was finally able to pinpoint what was causing this issue, which gave
me a lot of headaches.

It was indeed the "overlay network" [1] of my Docker Swarm [2]
deployment. The "mesh" router was terminating idle connections after
10 minutes (I noticed this exact timing some days ago).

So configuring the PostgreSQL server container to use the host network
instead of the overlay made all my connections stable again. No more
silent drops. No more GlorpDatabaseReadError. And I now have some
additional health checks in place that will help me when there is an
actual issue with the connections.

Maybe there is another way to make it work within the overlay, but
that's not something I'm interested in doing right now.
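
For reference, one possible way to stay inside the overlay would be an application-level keepalive: a background process that runs a trivial query more often than the idle timeout. The following is an untested sketch, not something that was tried in this deployment. The URL, the 5-minute interval and the mutex are assumptions; the mutex is there on the assumption that a single P3Client should not be used by two processes at the same time.

```smalltalk
"Untested sketch. URL and interval are placeholders; the interval only
needs to be shorter than the idle timeout (10 minutes in this thread)."
| client mutex keepAlive |
client := P3Client new url: 'psql://user:password@localhost:5432/database'.
mutex := Mutex new.

keepAlive := [
	[
		(Delay forSeconds: 300) wait.
		"Run a trivial query so the connection never sits idle for long.
		Errors are swallowed here; a real setup would log or reconnect."
		mutex critical: [
			[ client isWorking ] on: Error do: [ :error | false ] ] ] repeat ]
	forkAt: Processor userBackgroundPriority
	named: 'P3 keepalive'.

"Application queries on the same client go through the same mutex:
mutex critical: [ client query: 'SELECT 1' ].
Stop the background process when the client is no longer needed:
keepAlive terminate."
```

Whether periodic traffic like this is enough to keep the overlay from dropping the connection is not confirmed anywhere in this thread.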

So P3's behavior was okay (again), and PostgreSQL was not doing anything odd
either; the problem was what was sitting in between.

Best regards,

[1] https://docs.docker.com/network/overlay/
[2] I have a Swarm with 1 nginx for static content, 1 PostgreSQL and 1
Traefik as an HTTP gateway for several replicas of a Seaside
application.

Re: CannotWriteData errors in P3 and Seaside

Sven Van Caekenberghe-2
In reply to this post by Esteban A. Maringolo
Hi Esteban,

That is good to hear!

Sven

Re: CannotWriteData errors in P3 and Seaside

Stéphane Ducasse
+1, and I like it when Pharo is not the problem :)

S.
