Hi all, Sven ;-)
I'm having erratic P3 errors in a recent application I wrote using Pharo, Seaside and Glorp with P3 as driver.

Each Seaside session has a GlorpSession, which in turn has a P3Connection in its accessor. I don't know why, but sometimes the P3Connection socket is closed, and then, when trying to read from the database, it cannot write the query to the P3 socket and an exception is raised. The exception isn't handled by the P3DatabaseDriver (by automatically trying to reconnect?).

I don't know if I'm doing something wrong. I plan to migrate to the GlorpPooledDatabaseAccessor and also use the P3ConnectionPool, but I want to be sure that the current setup works, or whether I'm exceeding some limit or timeout that causes the connection to be closed.

Regards!

Esteban A. Maringolo
To add to this hard-to-reproduce issue: it only happens in production (where else?), when running within a Docker Swarm where PostgreSQL is inside the Swarm. So there might be something there fiddling with the network, or I don't really know :-/

Regards.

Esteban A. Maringolo
Hi Esteban,
I have a web app with P3 under Seaside in production and it works fine. But that is without Glorp and without any connection pooling.

You say the connection seems closed. Maybe the closing got triggered by your app somehow? How do you clean up expired sessions? How do you handle logouts?

P3 does normally reconnect automatically, IIRC.

You could try to enable logging in P3Client; that is a recent addition. It should show you what happens to your connections.

Sven
My Seaside session isn't closing the connection, except when it is unregistered, so this seems to be something else that I don't know about.

I saw there is logging, and I need to set it up in general (including Fuel-serialized stacks). From what I found on the web, apparently a keepalive is needed, and I don't have one in place. What disturbs me is that it doesn't happen in development, which makes things harder.

Regards!

Esteban A. Maringolo
Looking at the Postgres side of the log, I find that the connection was reset from the other side (that is, Pharo). The reason for that is still unknown to me, since I don't do anything (that I'm aware of).

golfware_database.1.9sl4bt9j6cv5@gw | 2020-08-10 20:02:35.939 UTC [132] LOG: could not receive data from client: Connection reset by peer
golfware_database.1.9sl4bt9j6cv5@gw | 2020-08-10 20:06:58.083 UTC [139] LOG: could not receive data from client: Connection reset by peer
golfware_database.1.9sl4bt9j6cv5@gw | 2020-08-10 20:06:58.083 UTC [137] LOG: could not receive data from client: Connection reset by peer
golfware_database.1.9sl4bt9j6cv5@gw | 2020-08-10 20:33:20.163 UTC [166] LOG: could not receive data from client: Connection reset by peer
golfware_database.1.9sl4bt9j6cv5@gw | 2020-08-10 20:35:22.019 UTC [168] LOG: could not receive data from client: Connection reset by peer
golfware_database.1.9sl4bt9j6cv5@gw | 2020-08-10 20:37:33.091 UTC [177] LOG: could not receive data from client: Connection reset by peer

I'll have to keep looking.

Best regards!

Esteban A. Maringolo
I kept looking into this, and still haven't found what might be causing it.

However, while trying to "salvage" the situation until I find a solution, I wanted to run a "healthcheck" to be sure that the GlorpSession has an active connection, and I found that a P3Client reports as connected even when it's not.

I connected to a PostgreSQL database running in a Docker container, stopped the container, and the driver continued to report as connected, even long after the server was stopped (the timeout is the default 10 seconds).

The ZdcSocketStream reports both ends of the socket as connected, when the server side certainly isn't.

I also noticed that P3Client>>connect calls #ensureOpen, which takes the socket as open when it is not, so as soon as it tries to flush the data written to it, a `CannotWriteData` exception is signaled.

In my case I'm developing on Pharo 8 on Windows with PostgreSQL running in Docker on WSL2 (Ubuntu), but on the server it is a 100% Linux deployment.

Any ideas?

Esteban A. Maringolo
IIUC a socket stream does not automatically/automagically know that the state of the connection changed, unless/until it tries to use it (read or write to it, wait for data, ...).

I would recommend using #isWorking to actually test whether a connection is good, if you need to do that. It runs an actual query.

Did you enable P3 logging as I suggested? What did you learn?

It feels as if in your particular setup the server eagerly closes connections that are open for too long. P3 was written under the assumption that that does not happen (it does not for me).
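The #isWorking suggestion above could be turned into a small health check along these lines. This is only a sketch: the connection URL is a placeholder, and the recovery policy (close, reconnect, test once more) is an assumption, not something P3 or Glorp prescribes. With Glorp, the P3Client lives inside the session's accessor, and how to reach it depends on your setup.

```smalltalk
"A sketch only: the URL is a placeholder and the recovery policy is an assumption."
| client healthy |
client := P3Client new url: 'psql://user:password@localhost:5432/database'.

"#isWorking runs a real query, so a dead socket shows up as an error or a false answer."
healthy := [ client isWorking ]
	on: Error
	do: [ :error | false ].

healthy ifFalse: [
	"Drop the stale socket, ignoring any error raised while closing it."
	[ client close ] on: Error do: [ :error | nil ].
	"Open a fresh connection and test it again; errors now propagate to the caller."
	client connect.
	healthy := client isWorking ].
healthy
```

Because the check sends a query, it costs a round trip to the server, so it is probably better run at the start of a unit of work than before every statement.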
Hi,
I'm replying to the list as well, because the last two mails went to our personal addresses.

On Thu, Aug 20, 2020 at 11:55 AM Sven Van Caekenberghe <[hidden email]> wrote:
> > On 20 Aug 2020, at 15:31, Esteban Maringolo <[hidden email]> wrote:
> >
> > Hi Sven,
> >
> > If a socketstream doesn't know the state of the connection, then what
> > is the #socketIsConnected method for? In particular the
> > #isOtherEndClosed test.
> >
> > ZdcAbstractSocketStream>>#socketIsConnected
> >     ^socket isConnected and: [ socket isOtherEndClosed not ]
>
> I don't know what is going on inside Socket, I just stated my opinion.

Maybe there is something to investigate here?

> With logging enabled, I can do the following:
>
> $ grep P3 server-2020-08-20.log | grep CONNECT | tail -n 20
> <snip>
> 2020-08-20 14:43:06 [P3] 30513 DISCONNECTING psql://client-xyz:hiddenpassword@client-xyz-db:5432/client-xyz
> 2020-08-20 14:44:06 [P3] 30516 CONNECTED psql://client-xyz:hiddenpassword@client-xyz-db:5432/client-xyz
> 2020-08-20 14:44:06 [P3] 30516 DISCONNECTING psql://client-xyz:hiddenpassword@client-xyz-db:5432/client-xyz
> 2020-08-20 14:44:06 [P3] 30517 CONNECTED psql://client-xyz:hiddenpassword@client-xyz-db:5432/client-xyz
> 2020-08-20 14:44:06 [P3] 30517 DISCONNECTING psql://client-xyz:hiddenpassword@client-xyz-db:5432/client-xyz
>
> The number after [P3] is the session identifier (backend process id) of that connection. You should see each one being opened and closed in pairs.

Yes, I noticed the pid, and compared it with what I had in pg_stat_activity.

I don't get the CONNECTED log entry because there is no way to enable logging in the P3DatabaseDriver before it creates (and connects) the P3Client. Maybe there could be a setting on the P3Client class to set verbosity globally? Or on the P3DatabaseDriver instead.

Summarizing: I'm pretty confident that P3 works correctly, and so does the PG server. At this point I'm factoring out what might be causing this. It's an issue that only happens to me in production, and I don't have better instrumentation in place to debug it.

Again, thanks for the support.

Regards.
> On 20 Aug 2020, at 19:51, Esteban Maringolo <[hidden email]> wrote:
>
> I'm replying to the list as well... because the last two mails got
> replied to our personal addresses.

Oh, I did not notice that, that was not my intention.

>> I don't know what is going on inside Socket, I just stated my opinion.
>
> Maybe there is something to investigate here?

Is it not so that in Docker all network connections in/out/between instances are mediated by some management software? I even thought it was nginx. Maybe I am totally wrong here.

>> The number after [P3] is the session identifier (backend process id) of that connection. You should see each one being opened and closed in pairs.
>
> Yes, I noticed the pid, and compared it with what I had on the
> pg_stat_activity table.

Right.

> I don't get the CONNECTED log because there is no way to set the
> logging in the P3DatabaseDriver before it creates (and connects) the
> P3Client.
> Maybe there could be a setting on P3Client class to set verbosity
> globally? Or at the P3DatabaseDriver instead.

What if you change

P3DatabaseDriver >> connect: aLogin
    connection := self connectionClass new.
    connection
        host: aLogin host;
        port: aLogin port asInteger;
        database: aLogin databaseName;
        user: aLogin username;
        password: aLogin password.
    connection connect

by inserting

    verbose: true

before the last statement.

You could also make a subclass of P3DatabaseDriver.
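The subclass variant Sven mentions might look roughly like the sketch below. The class name and package are hypothetical, and the method body is simply the #connect: listing quoted in this message with `verbose: true` added to the cascade. The real P3DatabaseDriver>>connect: in your image may differ, so compare against it before relying on this.

```smalltalk
"Hypothetical subclass: the class name and package are made up for illustration."
P3DatabaseDriver subclass: #VerboseP3DatabaseDriver
	instanceVariableNames: ''
	classVariableNames: ''
	package: 'MyApp-Database'.

"Compile the following as a method of VerboseP3DatabaseDriver
 (written here in Class >> selector notation, as in the thread).
 The body repeats the quoted #connect: and only adds verbose: true."
VerboseP3DatabaseDriver >> connect: aLogin
	connection := self connectionClass new.
	connection
		host: aLogin host;
		port: aLogin port asInteger;
		database: aLogin databaseName;
		user: aLogin username;
		password: aLogin password;
		verbose: true.
	"Logging is switched on before connecting, so the CONNECTED entry is logged too."
	connection connect
```

How the Glorp accessor is told to use the subclass instead of P3DatabaseDriver depends on how the application configures its driver, which the thread does not show.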
Hi guys,

Let us know if we have to do something from our side, because we want to support business.

S.
-------------------------------------------- Stéphane Ducasse 03 59 35 87 52 Assistant: Aurore Dalle FAX 03 59 57 78 50 TEL 03 59 35 86 16 S. Ducasse - Inria 40, avenue Halley, Parc Scientifique de la Haute Borne, Bât.A, Park Plaza Villeneuve d'Ascq 59650 France |
Hi all,
On Thu, Aug 20, 2020 at 3:02 PM Sven Van Caekenberghe <[hidden email]> wrote:
> Is it not so that in Docker all network connections in/out/between instances are mediated by some management software?
> I even thought it was nginx. Maybe I am totally wrong here.

No, it is not that. Inside a Swarm all containers run in an "overlay network" (basically a VPN) that is independent of the host network; this way you can distribute containers among different hosts.

All the packets are routed by Docker itself, and apparently there is an issue there: if a connection is idle for a certain time, it silently stops routing packets, leaving both sides of the connection unaware of it.
https://github.com/moby/moby/issues/31208

> What if you change
>
> P3DatabaseDriver >> connect: aLogin
>
> by inserting
>
> verbose: true
>
> before the last statement.
>
> You could also make a subclass of P3DatabaseDriver.

I could, but that would only log one extra CONNECT entry, which is not of much use.

> > At this point I'm factoring out what might be causing this. It's an
> > issue that only happens to me in production, and I don't have a better
> > instrumentation in place to debug it.

I think I found the culprit; now I need to find out how to bypass it or set up the network to avoid these situations.

Luckily this will be solved soon. :-)

Regards.
Hi all,
Good news. :-)

I finally was able to pinpoint what was causing this issue, which gave me a lot of headaches.

It was indeed the "overlay network" [1] of my Docker Swarm [2] deployment. The "mesh" router was terminating idle connections after 10 minutes (I noticed this exact timing some days ago).

So configuring the PostgreSQL server container to use the host network instead of the overlay made all my connections stable again. No more silent drops. No more GlorpDatabaseReadError. And now I have some additional health checks in place that will help me when there is an actual issue with the connections.

Maybe there is another way to make it work within the overlay, but that's not something I'm interested in doing right now.

So P3's behavior was okay (again), and PostgreSQL was not doing anything odd either; the problem was what was sitting in between.

Best regards,

[1] https://docs.docker.com/network/overlay/
[2] I have a Swarm with 1 nginx for static content, 1 PostgreSQL and 1 Traefik as an HTTP gateway for several replicas of a Seaside application.
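For anyone who has to stay on the overlay network, one untested idea is an application-level keepalive that sends a trivial query well before the 10-minute idle cutoff mentioned above. The sketch below is an assumption, not what was deployed here: the URL, the 5-minute interval and the process name are illustrative, and the mutex only helps if the application guards its own use of the same client with it.

```smalltalk
"Hypothetical keepalive loop: the URL, interval and names are illustrative assumptions."
| client mutex keepalive |
client := P3Client new url: 'psql://user:password@database-host:5432/database'.
mutex := Mutex new.

keepalive := [
	[
		"Wait 5 minutes, comfortably below the 10-minute idle cutoff."
		(Delay forSeconds: 5 * 60) wait.
		"Serialize access so the ping never interleaves with another query on this client."
		mutex critical: [
			[ client isWorking ]
				on: Error
				do: [ :error |
					Transcript show: 'P3 keepalive failed: ' , error description; cr ] ] ] repeat ]
	forkAt: Processor userBackgroundPriority
	named: 'P3 keepalive'.

"Later, when the connection is no longer needed:"
keepalive terminate.
client close
```

With one connection per Seaside session this means one background process per session, so a shared connection pool with a single keepalive process may scale better; that trade-off is outside what the thread covers.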
Hi Esteban,
That is good to hear!

Sven
+1, and I like it when Pharo is not the problem :)
S.