Re: [PoC] pg_upgrade: allow to upgrade publisher node

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-08-04 10:59:45
Message-ID: CAA4eK1+8btsYhNQvw6QJ4iTw1wFhkFXXABT=ED1eHFvtekRanQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 2, 2023 at 1:43 PM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> > 3.
> > + /*
> > + * Get replication slots.
> > + *
> > + * XXX: Which information must be extracted from old node? Currently three
> > + * attributes are extracted because they are used by
> > + * pg_create_logical_replication_slot().
> > + */
> > + appendPQExpBufferStr(query,
> > + "SELECT slot_name, plugin, two_phase "
> > + "FROM pg_catalog.pg_replication_slots "
> > + "WHERE database = current_database() AND temporary = false "
> > + "AND wal_status IN ('reserved', 'extended');");
> >
> > Why are we ignoring the slots that have wal status as WALAVAIL_REMOVED
> > or WALAVAIL_UNRESERVED? I think the slots where wal status is
> > WALAVAIL_REMOVED, the corresponding slots are invalidated at some
> > point. I think such slots can't be used for decoding but these will be
> > dropped along with the subscription or when a user does it manually.
> > So, if we don't copy such slots after the upgrade then there could be
> > a problem in dropping the corresponding subscription. If we don't want
> > to copy over such slots then we need to provide instructions on what
> > users should do in such cases. OTOH, if we want to copy over such
> > slots then we need to find a way to invalidate such slots after copy.
> > Either way, this needs more analysis.
>
> I considered again here. At least WALAVAIL_UNRESERVED should be supported because
> the slot is still usable. It can return reserved or extended.
>
> As for WALAVAIL_REMOVED, I don't think it should be so that I added a description
> to the document.
>
> This feature re-create slots which have same name/plugins as old ones, not replicate
> its state. So if we copy them as-is slots become usable again. If subscribers refer
> the slot and then connect again at that time, changes between 'WALAVAIL_REMOVED'
> may be lost.
>
> Based on above slots must be copied as WALAVAIL_REMOVED, but as you said, we do
> not have a way to control that. the status is calculated by using restart_lsn,
> but there are no function to modify directly.
>
> One approach is adding an SQL funciton which set restart_lsn to aritrary value
> (or 0/0, invalid), but it seems dangerous.
>

So, we have three options here (a) As you have done in the patch,
document this limitation and request user to perform some manual steps
to drop the subscription; (b) don't allow upgrade to proceed if there
are invalid slots in the old cluster; (c) provide a new function like
pg_copy_logical_replication_slot_contents() where we copy the required
contents like invalid status(ReplicationSlotInvalidationCause), etc.

Personally, I would prefer (b) because it will minimize the steps
required to perform by the user after the upgrade and looks cleaner
solution.

Thoughts?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message shveta malik 2023-08-04 11:32:41 Re: Synchronizing slots from primary to standby
Previous Message Dmitry Dolgov 2023-08-04 10:47:38 Re: [RFC] Clang plugin for catching suspicious typedef casting