RE: [Proposal] Add foreign-server health checks infrastructure

From: "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, 'Önder Kalacı' <onderkalaci(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Shinya11(dot)Kato(at)oss(dot)nttdata(dot)com" <Shinya11(dot)Kato(at)oss(dot)nttdata(dot)com>, "zyu(at)yugabyte(dot)com" <zyu(at)yugabyte(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Subject: RE: [Proposal] Add foreign-server health checks infrastructure
Date: 2022-10-17 12:25:21
Message-ID: TYAPR01MB5866E2788F327896D2AF4E42F5299@TYAPR01MB5866.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Dear Osumi-san,

> I mainly followed the steps there and
> replaced the command "SELECT" for the remote table at 6-9 with "INSERT"
> command.
> Then, after waiting for few seconds, the "COMMIT" succeeded like below output,
> even after the server stop of the worker side.

> Additionally, the last reference "SELECT" which failed above can succeed,
> if I restart the worker server before the "SELECT" command to the remote table.
> This means the transaction looks successful but the data isn't there ?
> Could you please have a look at this issue ?

Good catch. This occurred because we disconnected from the crashed server.

Previously, the failed server was disconnected in pgfdw_connection_check().
That was fine as long as the transaction (or statement) was certainly canceled.
But currently the statement might not be canceled, because QueryCancelPending might be cleared [1].

If we fail to cancel the statement and reach pgfdw_xact_callback(),
the connection is not used because entry->conn is NULL.
That's why your "COMMIT" succeeded.
However, the backend did not send the "COMMIT" command to the server,
so the inserted data vanished on the remote server.

I now understand that we should not call disconnect_pg_server(entry->conn) even if we detect the disconnection.
It should be handled in the Xact callback. This will be fixed in the next version.
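
To illustrate the failure mode described above, here is a minimal toy model in C. It is only a sketch and not the postgres_fdw code: ToyConn, ToyEntry, toy_connection_check() and toy_xact_callback_commit() are hypothetical stand-ins, and I assume only what is stated above, namely that the commit callback skips an entry whose conn is NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a remote connection (hypothetical, not PGconn). */
typedef struct ToyConn
{
    bool alive;         /* false once the remote server has gone away */
    int  commits_sent;  /* number of COMMITs delivered to the remote */
} ToyConn;

/* Toy stand-in for a connection cache entry (hypothetical). */
typedef struct ToyEntry
{
    ToyConn *conn;       /* NULL means "nothing to finish on this entry" */
    int      xact_depth; /* > 0 while a remote transaction is open */
} ToyEntry;

/*
 * Health check. With disconnect_on_failure = true it models the old
 * behaviour (dropping the connection as soon as the failure is detected);
 * with false it leaves the entry alone for the commit callback.
 */
static void
toy_connection_check(ToyEntry *entry, bool disconnect_on_failure)
{
    assert(entry != NULL);
    if (entry->conn != NULL && !entry->conn->alive && disconnect_on_failure)
        entry->conn = NULL;     /* models disconnect_pg_server() */
}

/* Commit-time callback: returns true if the local COMMIT may proceed. */
static bool
toy_xact_callback_commit(ToyEntry *entry)
{
    if (entry->conn == NULL)
        return true;            /* entry is skipped, no COMMIT is sent */
    if (entry->xact_depth == 0)
        return true;            /* no remote transaction is open */
    if (!entry->conn->alive)
        return false;           /* sending COMMIT fails, local xact aborts */
    entry->conn->commits_sent++;
    entry->xact_depth = 0;
    return true;
}
```

In this model, if the check drops the connection and the statement is not canceled, toy_xact_callback_commit() reports success without sending anything, which corresponds to the "COMMIT succeeded but the data is not there" symptom. If the check leaves entry->conn alone, the callback notices the dead connection and the commit fails instead.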

Moreover, I noticed that we should enable the timer even if QueryCancelMessage is not NULL.
If we do not turn it on here, the check will never be enabled again.
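
The timer point can be sketched in the same toy style. This assumes the check is driven by a one-shot timeout that must be re-armed each time it fires. ToyTimer and toy_check_fired() are hypothetical names, and the pending_cancel_message parameter only stands in for the "QueryCancelMessage is not NULL" condition; the real patch may be structured differently.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy one-shot timer (hypothetical): firing disarms it. */
typedef struct ToyTimer
{
    bool armed;  /* true while a future check is scheduled */
    int  fires;  /* how many times the check has run */
} ToyTimer;

/*
 * Called when the timeout fires. With rearm_always = false it models the
 * problem: returning early while a cancel message is pending leaves the
 * timer disarmed forever. With rearm_always = true the timer is re-armed
 * on every path.
 */
static void
toy_check_fired(ToyTimer *timer, const char *pending_cancel_message,
                bool rearm_always)
{
    timer->armed = false;       /* a one-shot timeout has just fired */
    timer->fires++;

    if (pending_cancel_message != NULL && !rearm_always)
        return;                 /* early exit: nothing re-arms the timer */

    timer->armed = true;        /* schedule the next check */
}
```

With rearm_always = false and a pending message, the timer stays disarmed after the call, so no later check can ever run. Re-arming unconditionally avoids that.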

[1]: https://www.postgresql.org/message-id/TYAPR01MB5866CE34430424A588F60FC2F55D9%40TYAPR01MB5866.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
