From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
Cc: | Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: [PoC] pg_upgrade: allow to upgrade publisher node |
Date: | 2023-07-17 12:49:44 |
Message-ID: | CAA4eK1KRDcsyFBkwwv4obMup8Q0HzTU6+YfP8Kk2izoNvSvmkA@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, Jun 30, 2023 at 7:29 PM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> I have analyzed this further and concluded that there is no difference between a
> manual and a shutdown checkpoint.
>
> The difference was whether the CHECKPOINT record has been decoded or not.
> The overall workflow of this test was:
>
> 1. do INSERT
> (2. do CHECKPOINT)
> (3. decode CHECKPOINT record)
> 4. receive feedback message from standby
> 5. do shutdown CHECKPOINT
>
> At step 3, the walsender decoded that WAL and set candidate_xmin_lsn. The stack trace was:
> standby_decode()->SnapBuildProcessRunningXacts()->LogicalIncreaseXminForSlot().
>
> At step 4, the confirmed_flush of the slot was updated, but ReplicationSlotSave()
> was executed only when slot->candidate_xmin_lsn had a valid LSN. If steps 2 and
> 3 are missed, the dirty flag is not set and the change remains only in memory.
>
> Finally, the CHECKPOINT was executed at step 5. If steps 2 and 3 are missed and
> the patch from Julien is not applied, the updated value will be discarded. This
> is what I observed. The patch forces the logical slot to be saved at the shutdown
> checkpoint, so the confirmed_flush is saved to disk at step 5.
>
I see your point, but there are comments in walsender.c which indicate
that we also wait for step-5 to get replicated. See [1] and the comments
atop walsender.c. If this is true, then we don't need a special check
like the one you have in patch 0003, or at least it doesn't seem to be
required in all cases.
[1] -
/*
* When SIGUSR2 arrives, we send any outstanding logs up to the
* shutdown checkpoint record (i.e., the latest record), wait for
* them to be replicated to the standby, and exit. ...
*/
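To make the quoted analysis easier to follow, the "saved only when dirty" behavior can be modeled with a small standalone C sketch. This is a toy model under stated assumptions, not PostgreSQL source: ToySlot, the toy_* functions, the force_save flag, and the LSN values are all hypothetical names chosen for illustration. It only mirrors the workflow described above (steps 3 to 5).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model of a logical slot; not the real ReplicationSlot. */
typedef struct ToySlot
{
    uint64_t confirmed_flush;         /* in-memory value */
    uint64_t candidate_xmin_lsn;      /* 0 stands in for an invalid LSN */
    uint64_t on_disk_confirmed_flush; /* what a restart would read back */
    bool     dirty;
} ToySlot;

/* Write the in-memory state to "disk" and clear the dirty flag. */
static void toy_save(ToySlot *s)
{
    s->on_disk_confirmed_flush = s->confirmed_flush;
    s->dirty = false;
}

/* Step 3: decoding the checkpoint's running-xacts record sets a candidate. */
static void toy_decode_running_xacts(ToySlot *s, uint64_t lsn)
{
    s->candidate_xmin_lsn = lsn;
}

/* Step 4: feedback always updates memory, but saves only with a valid candidate. */
static void toy_confirm_flush(ToySlot *s, uint64_t lsn)
{
    s->confirmed_flush = lsn;
    if (s->candidate_xmin_lsn != 0 && s->candidate_xmin_lsn <= lsn)
    {
        s->candidate_xmin_lsn = 0;
        s->dirty = true;
        toy_save(s);
    }
}

/* Step 5: the shutdown checkpoint writes the slot only if dirty or forced. */
static void toy_shutdown_checkpoint(ToySlot *s, bool force_save)
{
    if (s->dirty || force_save)
        toy_save(s);
}

/* Run steps 3-5 and return the confirmed_flush that survives on "disk". */
uint64_t toy_run(bool decode_checkpoint, bool force_save)
{
    ToySlot s = {100, 0, 100, false};

    if (decode_checkpoint)
        toy_decode_running_xacts(&s, 150);
    toy_confirm_flush(&s, 200);
    toy_shutdown_checkpoint(&s, force_save);
    return s.on_disk_confirmed_flush;
}
```

In this model, toy_run(false, false) leaves the stale value 100 on disk, which corresponds to the discarded update reported above, while toy_run(true, false) and toy_run(false, true) both persist 200. The force_save flag stands in for the behavior attributed to Julien's patch.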
--
With Regards,
Amit Kapila.