Re: [PoC] pg_upgrade: allow to upgrade publisher node

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-08-07 08:31:50
Message-ID: CAD21AoBvEchXqeUnYFN78RYd4pa+=sR-hcKs2bFRLqqPgYvBaQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 7, 2023 at 12:54 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Sun, Aug 6, 2023 at 6:02 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Wed, Aug 2, 2023 at 5:13 PM Hayato Kuroda (Fujitsu)
> > <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> > >
> > > > 4.
> > > > + /*
> > > > + * Check that all logical replication slots have reached the current WAL
> > > > + * position.
> > > > + */
> > > > + res = executeQueryOrDie(conn,
> > > > + "SELECT slot_name FROM pg_catalog.pg_replication_slots "
> > > > + "WHERE (SELECT count(record_type) "
> > > > + " FROM pg_catalog.pg_get_wal_records_content(confirmed_flush_lsn,
> > > > pg_catalog.pg_current_wal_insert_lsn()) "
> > > > + " WHERE record_type != 'CHECKPOINT_SHUTDOWN') <> 0 "
> > > > + "AND temporary = false AND wal_status IN ('reserved', 'extended');");
> > > >
> > > > I think this can unnecessarily lead to reading a lot of WAL data if
> > > > the confirmed_flush_lsn for a slot is too much behind. Can we think of
> > > > improving this by passing the number of records to read which in this
> > > > case should be 1?
> > >
> > > I checked and pg_wal_record_info() seemed to be used for the purpose. I tried to
> > > move the functionality to core.
> >
> > IIUC the above query checks if the WAL record written at the slot's
> > confirmed_flush_lsn is a CHECKPOINT_SHUTDOWN, but there is no check if
> > this WAL record is the latest record.
> >
>
> Yeah, I also think there should be some way to ensure this. How about
> passing the number of records to read to this API? Actually, that will
> address my other concern as well where the current API can lead to
> reading an unbounded number of records if the confirmed_flush_lsn
> location is far behind the CHECKPOINT_SHUTDOWN. Do you have any better
> ideas to address it?

It makes sense to me to limit the number of WAL records to read. But
as I mentioned below, if there is a chance of any WAL activity during
the upgrade, I'm not sure what limit to set.

>
> > Therefore, I think it's quite
> > possible that slot's confirmed_flush_lsn points to previous
> > CHECKPOINT_SHUTDOWN, for example, in cases where the subscription was
> > disabled after the publisher shut down and then some changes are made
> > on the publisher. We might want to add that check too but it would not
> > work. Because some WAL records could be written (e.g., by autovacuums)
> > during pg_upgrade before checking the slot's confirmed_flush_lsn.
> >
>
> I think autovacuum is not enabled during the upgrade. See comment "Use
> -b to disable autovacuum." in start_postmaster().

Right, thanks.

> However, I am not
> sure if there can't be any additional WAL from checkpointer or
> bgwriter. Checkpointer has a code that ensures that if there is no
> important WAL activity then it would be skipped. Similarly, bgwriter
> also doesn't LOG xl_running_xacts unless there is an important
> activity.

WAL records for hint bit updates could be generated even in upgrading mode?

> I feel if there is a chance of any WAL activity during the
> upgrade, we need to either change the check to ensure such WAL records
> are expected or document the same in some way.

Yes, but how does it work with the above idea of limiting the number
of WAL records to read? If XLOG_FPI_FOR_HINT can still be generated in
the upgrade mode, we cannot predict how many such records are
generated after the latest CHECKPOINT_SHUTDOWN.

I'm not really sure we should always perform the slot's
confirmed_flush_lsn check by default in the first place. With this
check, the upgrade won't be able to proceed if there is any logical
slot that is not used by logical replication (or something streaming
the changes using walsender), right? For example, if a user uses a
program that periodically consumes the changes from the logical slot,
the slot would not be able to pass the check even if the user executed
pg_logical_slot_get_changes() just before shutdown. The backend
process who consumes the changes is always terminated before the
shutdown checkpoint. On the other hand, I think there are cases where
the user can ensure that no meaningful WAL records are generated after
the last pg_logical_slot_get_changes(). I'm concerned that this check
might make upgrading such cases cumbersome unnecessarily.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Drouvot, Bertrand 2023-08-07 08:34:43 Fix badly generated entries in wait_event_names.txt
Previous Message Drouvot, Bertrand 2023-08-07 08:23:27 Re: WIP: new system catalog pg_wait_event