RE: speed up a logical replica setup

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, 'Euler Taveira' <euler(at)eulerto(dot)com>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
Subject: RE: speed up a logical replica setup
Date: 2024-02-13 12:58:23
Message-ID: TYCPR01MB120776F5C9824042DD8BB1307F54F2@TYCPR01MB12077.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear hackers,

I've replied for trackability.

> Further comments for v17.
>
> 01.
> This program assumes that the target server has same major version with this.
> Because the target server would be restarted by same version's pg_ctl command.
> I felt it should be ensured by reading the PG_VERSION.

Still investigating.

> 02.
> pg_upgrade checked the version of using executables, like pg_ctl, postgres, and
> pg_resetwal. I felt it should be as well.

Still investigating.

> 03. get_bin_directory
> ```
> if (find_my_exec(path, full_path) < 0)
> {
> pg_log_error("The program \"%s\" is needed by %s but was not
> found in the\n"
> "same directory as \"%s\".\n",
> "pg_ctl", progname, full_path);
> ```
>
> s/"pg_ctl"/progname

The message was updated.

> 04.
> Missing canonicalize_path()?

I found find_my_exec() calls canonicalize_path(). No need to do.

> 05.
> Assuming that the target server is a cascade standby, i.e., it has a role as
> another primary. In this case, I thought the child node would not work. Because
> pg_createsubcriber runs pg_resetwal and all WAL files would be discarded at that
> time. I have not tested, but should the program detect it and exit earlier?

Still investigating.

> 06.
> wait_for_end_recovery() waits forever even if the standby has been disconnected
> from the primary, right? should we check the status of the replication via
> pg_stat_wal_receiver?

Still investigating.

> 07.
> The cleanup function has couple of bugs.
>
> * If subscriptions have been created on the database, the function also tries to
> drop a publication. But it leads an ERROR because it has been already dropped.
> See setup_subscriber().
> * If the subscription has been created, drop_replication_slot() leads an ERROR.
> Because the subscriber tried to drop the subscription while executing DROP
> SUBSCRIPTION.

Only drop_publication() was removed.

> 08.
> I found that all messages (ERROR, WARNING, INFO, etc...) would output to stderr,
> but I felt it should be on stdout. Is there a reason? pg_dump outputs messages to
> stderr, but the motivation might be to avoid confusion with dumps.

Still investigating.

> 09.
> I'm not sure the cleanup for subscriber is really needed. Assuming that there
> are two databases, e.g., pg1 pg2 , and we fail to create a subscription on pg2.
> This can happen when the subscription which has the same name has been
> already
> created on the primary server.
> In this case a subscirption pn pg1 would be removed. But what is a next step?
> Since a timelineID on the standby server is larger than the primary (note that
> the standby has been promoted once), we cannot resume the physical replication
> as-is. IIUC the easiest method to retry is removing a cluster once and restarting
> from pg_basebackup. If so, no need to cleanup the standby because it is
> corrupted.
> We just say "Please remove the cluster and recreate again".

I still think it should be, but not done yet.

New patch can be available in [1].

[1]: https://www.postgresql.org/message-id/TYCPR01MB12077A6BB424A025F04A8243DF54F2%40TYCPR01MB12077.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Hayato Kuroda (Fujitsu) 2024-02-13 13:03:13 RE: speed up a logical replica setup
Previous Message Hayato Kuroda (Fujitsu) 2024-02-13 12:57:25 RE: speed up a logical replica setup