Re: pgsql: postgres_fdw: reestablish new connection if cached one is detect

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>, Fujii Masao <fujii(at)postgresql(dot)org>
Cc: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: postgres_fdw: reestablish new connection if cached one is detect
Date: 2020-10-07 03:54:13
Message-ID: 14cd32c8-9532-47b8-77d4-b81eb3b192ea@oss.nttdata.com
Lists: pgsql-committers

On 2020/10/07 11:13, Michael Paquier wrote:
> Hi Fujii-san,
>
> On Tue, Oct 06, 2020 at 01:52:55AM +0000, Fujii Masao wrote:
>> postgres_fdw: reestablish new connection if cached one is detected as broken.
>>
>> In postgres_fdw, once remote connections are established, they are cached
>> and re-used for subsequent queries and transactions. There can be some
>> cases where those cached connections are unavailable, for example,
>> by the restart of remote server. In these cases, previously an error was
>> reported and the query accessing to remote server failed if new remote
>> transaction failed to start because the cached connection was broken.
>>
>> This commit improves postgres_fdw so that new connection is remade
>> if broken connection is detected when starting new remote transaction.
>> This is useful to avoid unnecessary failure of queries when connection is
>> broken but can be reestablished.
>
> lorikeet is telling that the test introduced by this commit is
> unstable:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lorikeet&dt=2020-10-06%2008%3A28%3A36

Thanks for letting me know this!

>
> Some details:
> BEGIN;
> SELECT 1 FROM ft1 LIMIT 1;
> - ?column?
> -----------
> - 1
> -(1 row)
> -
> +ERROR: could not receive data from server: Software caused connection abort
> +CONTEXT: remote SQL command: START TRANSACTION ISOLATION LEVEL REPEATABLE READ

This error means that a new connection was successfully reestablished
after the cached connection was terminated, and that the connection
error above then occurred when issuing the "START TRANSACTION" command on
that new connection. There seem to be no suspicious, relevant messages in
the log file, so I'm not yet sure why this error happened.
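
To make the sequence concrete, here is a minimal Python sketch of the
behavior described above. It is only an illustration, not the actual
postgres_fdw C code: FakeRemote, ConnectionBroken and begin_remote_xact
are hypothetical names, and the "retry only once" rule is an assumption
drawn from the description in this thread.

```python
class ConnectionBroken(Exception):
    """Stands in for a libpq-level connection failure (hypothetical)."""


class FakeRemote:
    """Hypothetical remote server stub; fails the first `failures` commands."""

    def __init__(self, failures):
        self.failures = failures
        self.connects = 0  # number of reconnections performed

    def connect(self):
        self.connects += 1
        return self

    def execute(self, sql):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionBroken("could not receive data from server")
        return "OK"


def begin_remote_xact(remote, cache):
    """Start a remote transaction, reconnecting at most once (assumed)."""
    sql = "START TRANSACTION ISOLATION LEVEL REPEATABLE READ"
    if cache.get("conn") is None:
        cache["conn"] = remote.connect()
    try:
        return cache["conn"].execute(sql)
    except ConnectionBroken:
        # The cached connection looks broken: discard it and reconnect once.
        cache["conn"] = remote.connect()
        # A failure on the new connection is not retried, so it reaches
        # the caller as an error.
        return cache["conn"].execute(sql)
```

With one failure the retry hides the broken cached connection. With two
consecutive failures the second one is raised to the caller, which matches
the pattern seen on lorikeet: the reconnect succeeds, but START TRANSACTION
fails on the new connection.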

Per the previous discussion at [1], lorikeet sometimes seems to cause
connection-related failures in the regression tests. So the cause of the
error that we faced today may also be lorikeet itself.

[1]
https://www.postgresql.org/message-id/CA+hUKGL3Son9iAeqgjPbXCpU_6hhZhw9X24uNO14mOC4bG0cCA@mail.gmail.com

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
