RE: [PoC] pg_upgrade: allow to upgrade publisher node

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Julien Rouhaud' <rjuju123(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: RE: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-04-07 09:40:14
Message-ID: TYAPR01MB5866A4CE796F7528DCD549EBF5969@TYAPR01MB5866.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Julien,

Thank you for giving comments!

> As I mentioned in my original thread, I'm not very familiar with that code, but
> I'm a bit worried about "all the changes generated on publisher must be send
> and applied". Is that a hard requirement for the feature to work reliably?

I think the requirement is needed because the existing WALs on old node cannot be
transported on new instance. The WAL hole from confirmed_flush to current position
could not be filled by newer instance.

> If
> yes, how does this work if some subscriber node isn't connected when the
> publisher node is stopped? I guess you could add a check in pg_upgrade to make
> sure that all logical slot are indeed caught up and fail if that's not the case
> rather than assuming that a clean shutdown implies it. It would be good to
> cover that in the TAP test, and also cover some corner cases, like any new row
> added on the publisher node after the pg_upgrade but before the subscriber is
> reconnected is also replicated as expected.

Hmm, good point. Current patch could not be handled the case because walsenders
for the such slots do not exist. I have tested your approach, however, I found that
CHECKPOINT_SHUTDOWN record were generated twice when publisher was
shutted down and started. It led that the confirmed_lsn of slots always was behind
from WAL insert location and failed to upgrade every time.
Now I do not have good idea to solve it... Do anyone have for this?

> Agreed, but then shouldn't the option be named "--logical-slots-only" or
> something like that, same for all internal function names?

Seems right. Will be fixed in next version. Maybe "--logical-replication-slots-only"
will be used, per Peter's suggestion [1].

[1]: https://www.postgresql.org/message-id/CAHut%2BPvpBsyxj9SrB1ZZ9gP7r1AA5QoTYjpzMcVSjQO2xQy7aw%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2023-04-07 09:55:41 Re: [PoC] Improve dead tuple storage for lazy vacuum
Previous Message Amit Kapila 2023-04-07 09:37:38 Re: Initial Schema Sync for Logical Replication