From: | "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com> |
---|---|
To: | 'Shinya Kato' <Shinya11(dot)Kato(at)oss(dot)nttdata(dot)com> |
Cc: | "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | RE: [Proposal] Add foreign-server health checks infrastructure |
Date: | 2021-11-18 12:43:01 |
Message-ID: | TYAPR01MB58662CD4FD98AA475B3D10F9F59B9@TYAPR01MB5866.jpnprd01.prod.outlook.com |
Lists: | pgsql-hackers |
Dear Kato-san,
Thank you for your interest!
> > I also want you to review the postgres_fdw part,
> > but I think it should not be attached because cfbot cannot understand
> > such a dependency
> > and will throw build error. Do you know how to deal with them in this
> > case?
>
> I don't know how to deal with them, but I hope you will attach the PoC,
> as it may be easier to review.
OK, I attached the PoC along with the dependent patches. Please see the zip file.
The add_helth_check_... patch was written by me, and the other two patches are
just copied from [1].
In the new callback function, ConnectionHash is scanned sequentially and
WaitEventSetWait() is called for the WL_SOCKET_CLOSED socket event.
This event is added by the dependent patches.
===
How to use
===
Here is a brief explanation of how to use it.
1. Start two postmaster processes. One is the coordinator, and the other is the worker.
2. Set remote_servers_connection_check_interval to a non-zero value on the coordinator.
3. Create tables in the worker DB cluster.
4. Create a foreign server, user mapping, and foreign table on the coordinator.
5. Connect to the coordinator via psql.
6. Open a transaction and access the foreign table.
7. Run "pg_ctl stop" against the worker DB cluster.
8. Execute some commands that do not access a foreign table.
9. Finally, the following output is shown:
ERROR: Postgres foreign server XXX might be down.
===
Examples for some steps
===
3. at worker
```
postgres=# \d
List of relations
Schema | Name | Type | Owner
--------+--------+-------+--------
public | remote | table | hayato
(1 row)
```
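For reference, the worker-side table in step 3 could be created with something like the following. This is only a sketch: the column type is an assumption (the later output only shows a column named id holding the value 1).

```sql
-- Run on the worker. The integer type is assumed for illustration;
-- only the column name "id" and the value 1 appear in the session output.
CREATE TABLE remote (id integer);
INSERT INTO remote VALUES (1);
```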
4. at coordinator
```
postgres=# select * from pg_foreign_server ;
oid | srvname | srvowner | srvfdw | srvtype | srvversion | srvacl | srvoptions
-------+---------+----------+--------+---------+------------+--------+-----------------------------
16406 | remote | 10 | 16402 | | | | {port=5433,dbname=postgres}
(1 row)
postgres=# select * from pg_user_mapping ;
oid | umuser | umserver | umoptions
-------+--------+----------+---------------
16407 | 10 | 16406 | {user=hayato}
(1 row)
postgres=# \d
List of relations
Schema | Name | Type | Owner
--------+--------+---------------+--------
public | local | table | hayato
public | remote | foreign table | hayato
(2 rows)
```
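The catalog contents above could be produced by statements along these lines. This is a minimal sketch, not the exact commands that were run: the option values are taken from srvoptions and umoptions, while the column definitions and the use of CURRENT_USER for the mapping are assumptions.

```sql
-- Run on the coordinator. Assumes postgres_fdw is installed and that the
-- worker listens on port 5433, as shown in srvoptions.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER remote
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (port '5433', dbname 'postgres');

-- Assumes the session user is the one the mapping should belong to.
CREATE USER MAPPING FOR CURRENT_USER
    SERVER remote
    OPTIONS (user 'hayato');

-- With no table_name option, postgres_fdw looks up a remote table with the
-- same name as the foreign table, i.e. "remote" on the worker.
-- The column type is an assumption.
CREATE FOREIGN TABLE remote (id integer) SERVER remote;

-- Definition of the ordinary table is not shown in the session; assumed here.
CREATE TABLE local (id integer);
```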
6-9. at coordinator
```
postgres=# begin;
BEGIN
postgres=*# select * from remote ;
id
----
1
(1 row)
postgres=*# select * from local ;
ERROR: Postgres foreign server remote might be down.
postgres=!#
```
Note that some keepalive settings are needed
if you want to detect events such as a disconnected cable.
As far as I understand, the following parameters need to be set as server options:
* keepalives_idle
* keepalives_count
* keepalives_interval
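As an illustration, these libpq parameters could be added to the foreign server as shown below. The server name matches the example above, and the numeric values are arbitrary placeholders rather than recommendations.

```sql
-- Run on the coordinator. Values are illustrative only:
-- idle and interval are in seconds, count is the number of lost probes.
ALTER SERVER remote OPTIONS (
    ADD keepalives_idle '5',
    ADD keepalives_interval '2',
    ADD keepalives_count '3'
);
```

Connections that are already cached keep their old settings, so a new session is probably the easiest way to pick up the change.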
[1]: https://commitfest.postgresql.org/35/3098/
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachment | Content-Type | Size |
---|---|---|
patches.zip | application/x-zip-compressed | 6.1 KB |
v01_add_checking_infrastracture.patch | application/octet-stream | 11.9 KB |