From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(at)2ndquadrant(dot)com>
Subject: Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away
Date: 2020-07-13 04:43:19
Message-ID: CAExHW5vn_JpBOB8Xr1y=w=W1U9Um6UX-X_GPS2+Vd15vBwNrfA@mail.gmail.com
Lists: pgsql-hackers
On Wed, Jul 8, 2020 at 6:10 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> I couldn't think of a way to add a test case to the existing
> postgres_fdw regression test suite that automates the scenario of the
> remote backend getting killed.
You could get the remote backend's PID using PQbackendPID() and then
kill it by calling pg_terminate_backend(), automating the scenario of
the remote backend being killed.
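
A possible shape for such a test, sketched in SQL against the loopback
server and ft1 foreign table that the existing postgres_fdw regression
suite already defines (the application_name value 'fdw_retry_check' is
illustrative, chosen only so the remote backend can be singled out in
pg_stat_activity):

```sql
-- Tag the foreign server's connections so they are identifiable remotely.
ALTER SERVER loopback OPTIONS (ADD application_name 'fdw_retry_check');

-- Establish and cache a connection to the remote server.
SELECT 1 FROM ft1 LIMIT 1;

-- Terminate the remote backend serving that cached connection.
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE application_name = 'fdw_retry_check';

-- With retry logic in place, this should succeed on a fresh connection
-- rather than failing on the dead cached one.
SELECT 1 FROM ft1 LIMIT 1;
```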
>
> I would like to thank Ashutosh Bapat (ashutosh(dot)bapat(dot)oss(at)gmail(dot)com)
> for the suggestion to fix this and the review of my initial patch
> attached in [2]. I tried to address the review comments provided on my
> initial patch [3].
>
> For one of Ashutosh's review comments from [3], "the fact that the
> same connection may be used by multiple plan nodes", I tried a few
> use cases where a single query joins two foreign tables on the same
> remote server, so essentially the same connection is used for
> multiple plan nodes. In this case we avoid retrying for the second
> GetConnection() request for the second foreign table with the check
> entry->xact_depth <= 0: xact_depth becomes 1 after the first
> GetConnection() and the first remote query, so we don't hit the retry
> logic and seem to be safe here. Please advise if I'm missing
> something here.
>
> Request the community to consider the patch for further review if the
> overall idea seems beneficial.
I think this idea will be generally useful in combination with your
work on dropping stale connections, which uses idle_connection_timeout
or something like that on the remote server.
About the patch: it seems we could just catch the error from
begin_remote_xact() in GetConnection() and retry the connection if the
error is "bad connection". Retrying using PQreset() might be better
than always calling PQconnect*.
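
Roughly, the suggestion above could look like the following in
GetConnection(). This is only a sketch, not the patch itself: the field
names follow contrib/postgres_fdw/connection.c, and whether
configure_remote_session() needs to be repeated after a reset is an
assumption to be verified against the actual code.

```c
PG_TRY();
{
	begin_remote_xact(entry);
}
PG_CATCH();
{
	/*
	 * Retry only if the cached connection has gone bad and no remote
	 * transaction is in progress (entry->xact_depth <= 0), so that work
	 * already done on the connection by other plan nodes is never lost.
	 */
	if (entry->conn == NULL ||
		PQstatus(entry->conn) != CONNECTION_OK ||
		entry->xact_depth > 0)
		PG_RE_THROW();

	/* PQreset() reconnects with the same connection parameters. */
	PQreset(entry->conn);
	if (PQstatus(entry->conn) != CONNECTION_OK)
		PG_RE_THROW();

	configure_remote_session(entry->conn);
	begin_remote_xact(entry);
}
PG_END_TRY();
```

Using PQreset() keeps the existing PGconn's saved connection options,
so there is no need to rebuild the keyword/value arrays that the
initial PQconnect* call was given.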
>
> [1] https://www.postgresql.org/message-id/CAExHW5t21B_XPQy_hownm1Qq%3DhMrgOhX%2B8gDj3YEKFvpk%3DVRgw%40mail.gmail.com
> [2] https://www.postgresql.org/message-id/CALj2ACXp6DQ3iLGx5g%2BLgVtGwC4F6K9WzKQJpyR4FfdydQzC_g%40mail.gmail.com
> [3] https://www.postgresql.org/message-id/CAExHW5u3Gyv6Q1BEr6zMg0t%2B59e8c4KMfKVrV3Z%3D4UKKjJ19nQ%40mail.gmail.com
>
> With Regards,
> Bharath Rupireddy.
> EnterpriseDB: http://www.enterprisedb.com
--
Best Wishes,
Ashutosh Bapat