Re: [PoC] pg_upgrade: allow to upgrade publisher node

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-08-03 10:26:58
Message-ID: CAA4eK1+CD82Kssy+iqpETPKYUh9AmNORF+3iGfNXgxKxqL3T6g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 2, 2023 at 1:43 PM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> Thank you for giving comments! PSA new version patchset.
>
> > 1. Do we really need 0001 patch after the latest change proposed by
> > Vignesh in the 0004 patch?
>
> I removed 0001 patch and revived old patch which serializes slots at shutdown.
> This is because the problem which slots are not serialized to disk still remain [1]
> and then confirmed_flush becomes behind, even if we implement the approach.
>

So, IIUC, you are talking about a patch with the below commit message.
[PATCH v18 2/4] Always persist to disk logical slots during a
shutdown checkpoint.

It's entirely possible for a logical slot to have a confirmed_flush_lsn higher
than the last value saved on disk while not being marked as dirty. It's
currently not a problem to lose that value during a clean shutdown / restart
cycle, but a later patch adding support for pg_upgrade of publications and
logical slots will rely on that value being properly persisted to disk.

As per this commit message, this patch should be numbered as 1 but you
have placed it as 2 after the main upgrade patch?

> > 2.
> > + if (dopt.logical_slots_only)
> > + {
> > + if (!dopt.binary_upgrade)
> > + pg_fatal("options --logical-replication-slots-only requires option
> > --binary-upgrade");
> > +
> > + if (dopt.dataOnly)
> > + pg_fatal("options --logical-replication-slots-only and
> > -a/--data-only cannot be used together");
> > +
> > + if (dopt.schemaOnly)
> > + pg_fatal("options --logical-replication-slots-only and
> > -s/--schema-only cannot be used together");
> >
> > Can you please explain why the patch imposes these restrictions? I
> > guess the binary_upgrade is because you want this option to be used
> > for the upgrade. Do we want to avoid giving any other option with
> > logical_slots, if so, are the above checks sufficient and why?
>
> Regarding the --binary-upgrade, the motivation is same as you expected. I covered
> up the --logical-replication-slots-only option from users, so it should not be
> used not for upgrade. Additionaly, this option is not shown in help and document.
>
> As for -{data|schema}-only options, I removed restrictions.
> Firstly I set as excluded because it may be confused - as discussed at [2], slots
> must be dumped after all the pg_resetwal is done and at that time all the definitions
> are already dumped. to avoid duplicated definitions, we must ensure only slots are
> written in the output file. I thought this requirement contradict descirptions of
> these options (Dump only the A, not B).
> But after considering more, I thought this might not be needed because it was not
> opened to users - no one would be confused by using both them.
> (Restriction for -c is also removed for the same motivation)
>

I see inconsistent behavior here with the patch. If I use "pg_dump.exe
--schema-only --logical-replication-slots-only --binary-upgrade
postgres" then I get only a dump of slots without any schema. When I
use "pg_dump.exe --data-only --logical-replication-slots-only
--binary-upgrade postgres" then neither table data nor slots. When I
use "pg_dump.exe --create --logical-replication-slots-only
--binary-upgrade postgres" then it returns the error "pg_dump: error:
role with OID 10 does not exist".

Now, I tried using --binary-upgrade with some other option like
"pg_dump.exe --create --binary-upgrade postgres" and then I got a dump
with all required objects with support for binary-upgrade.

I think your thought here is that this new option won't be usable
directly with pg_dump but we should study whether we allow to support
other options with --binary-upgrade for in-place upgrade utilities
other than pg_upgrade.

>
> > 4.
> > + /*
> > + * Check that all logical replication slots have reached the current WAL
> > + * position.
> > + */
> > + res = executeQueryOrDie(conn,
> > + "SELECT slot_name FROM pg_catalog.pg_replication_slots "
> > + "WHERE (SELECT count(record_type) "
> > + " FROM pg_catalog.pg_get_wal_records_content(confirmed_flush_lsn,
> > pg_catalog.pg_current_wal_insert_lsn()) "
> > + " WHERE record_type != 'CHECKPOINT_SHUTDOWN') <> 0 "
> > + "AND temporary = false AND wal_status IN ('reserved', 'extended');");
> >
> > I think this can unnecessarily lead to reading a lot of WAL data if
> > the confirmed_flush_lsn for a slot is too much behind. Can we think of
> > improving this by passing the number of records to read which in this
> > case should be 1?
>
> I checked and pg_wal_record_info() seemed to be used for the purpose. I tried to
> move the functionality to core.
>

But I don't see how it addresses my concern about reading too many
records. If the confirmed_flush_lsn is too much behind, it will also
try to read all the remaining WAL for such slots.

> But this function raise an ERROR when there is no valid record after the specified
> lsn. This means that the pg_upgrade fails if logical slots has caught up the current
> WAL location. IIUC DBA must do following steps:
>
> 1. shutdown old publisher
> 2. disable the subscription once <- this is mandatory, otherwise the walsender may
> send the record during the upgrade and confirmed_lsn may point the SHUTDOWN_CHECKPOINT
> 3. do pg_upgrade <- pg_get_wal_record_content() may raise an ERROR if 2. was skipped
>

But we have already seen that we write shutdown_checkpoint record only
after logical walsender is shut down. So, how above is possible?

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Guo 2023-08-03 10:43:52 Re: Oversight in reparameterize_path_by_child leading to executor crash
Previous Message Laetitia Avrot 2023-08-03 10:06:11 Re: Adding a pg_servername() function