Re: [PoC] pg_upgrade: allow to upgrade publisher node

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-09-14 03:44:40
Message-ID: CAFiTN-uMgwknh2yz_+QA+3ode+w9-hr2ZDj1u1KddS+x56HiWA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 13, 2023 at 7:22 PM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> Dear Amit,
>
> Thank you for reviewing! Before making a patch, I can reply to the important point.
>
> > 1. One thing to note is that if the user checks whether the old cluster is
> > upgradable with the --check option and then tries to upgrade, that will also
> > fail, because during the --check run there would be at least one
> > additional shutdown checkpoint WAL record, and then in the next run the slots'
> > positions won't match. Note, I am saying this in the context of using the
> > --check option with a not-running old cluster. Won't that be surprising
> > to users? One possibility is that we document such behaviour, and the
> > other is that we go back to the WAL reading design, where we can ignore
> > known WAL records like shutdown checkpoint, XLOG_RUNNING_XACTS, etc.
>
> Good catch, we had never considered the case where --check is executed for a
> stopped cluster. You are right, the old cluster is turned on and off during the
> check, which generates a SHUTDOWN_CHECKPOINT record. As a result, confirmed_flush is
> behind the latest checkpoint LSN.

Good catch.

> Here are other approaches we came up with:
>
> 1. Add a WARNING message when --check is executed and slots are checked.
> We can say something like:
>
> ```
> ...
> Checking for valid logical replication slots
> WARNING: this check generated WALs
> Next pg_upgrade would fail.
> Please ensure again that all WALs are replicated.
> ...
> ```

IMHO --check is a very common command that users run before the
actual upgrade, so issuing such a WARNING might not be good: what
options does the user have then? Do they need to restart the cluster
again in order to stream the new WAL, and then shut it down again? I
don't think that is really acceptable. Maybe, as discussed in the
past, we can provide an option to skip the slot checking, and during
the --check run we can give a WARNING suggesting that it is better to
use --skip-slot-checking for the main upgrade, since the slots have
already been checked. This could still be okay for the user.
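
To make the sequence concrete, here is a toy Python model of the problem and of the skip option. It is only a sketch under stated assumptions, not pg_upgrade code: LSNs are plain integers, `OldCluster`, `check_slot` and `run_pg_upgrade` are hypothetical names, `skip_slot_checking` stands in for the proposed --skip-slot-checking option, and a slot is assumed to pass when its confirmed_flush is not behind the latest checkpoint LSN.

```python
from dataclasses import dataclass
from enum import Enum


class SlotCheck(Enum):
    OK = "ok"
    SKIPPED = "skipped"
    FAILED = "failed"


@dataclass
class OldCluster:
    """Toy model of a stopped old cluster; LSNs are plain integers here."""
    latest_checkpoint_lsn: int
    confirmed_flush_lsn: int

    def start_and_stop(self, shutdown_checkpoint_size=1):
        # Starting and stopping the server writes a shutdown checkpoint,
        # which advances the WAL position while the slot stays where it was.
        self.latest_checkpoint_lsn += shutdown_checkpoint_size


def check_slot(cluster, skip_slot_checking=False):
    # Hypothetical check: the slot must not be behind the latest checkpoint.
    if skip_slot_checking:
        return SlotCheck.SKIPPED
    if cluster.confirmed_flush_lsn < cluster.latest_checkpoint_lsn:
        return SlotCheck.FAILED
    return SlotCheck.OK


def run_pg_upgrade(cluster, check_only=False, skip_slot_checking=False):
    """Model one pg_upgrade run against a stopped old cluster."""
    warnings = []
    result = check_slot(cluster, skip_slot_checking)
    if check_only and result is SlotCheck.OK:
        # The WARNING suggested above, pointing the user at the skip option.
        warnings.append(
            "WARNING: this check generates WAL; consider "
            "--skip-slot-checking for the actual upgrade"
        )
    # The run itself turns the old cluster on and off.
    cluster.start_and_stop()
    return result, warnings


if __name__ == "__main__":
    cluster = OldCluster(latest_checkpoint_lsn=100, confirmed_flush_lsn=100)
    print(run_pg_upgrade(cluster, check_only=True))           # --check run
    print(run_pg_upgrade(cluster))                            # plain upgrade
    print(run_pg_upgrade(cluster, skip_slot_checking=True))   # with skip option
```

In this model the --check run passes, the following plain upgrade fails because the slot is now behind the shutdown checkpoint written by the --check run, and the upgrade with the skip option goes through without looking at the slot.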

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
