From: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | RE: Conflict detection for update_deleted in logical replication |
Date: | 2024-11-14 02:54:29 |
Message-ID: | OS3PR01MB5718888648059FEDB3BF18F3945B2@OS3PR01MB5718.jpnprd01.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wednesday, November 13, 2024 8:35 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
>
> I've started to review these patch series. I've reviewed only 0001 patch for now
> but let me share some comments:
Thanks for the comment !
>
> ---
> + if (*phase == DTR_WAIT_FOR_WALSENDER_WAL_POS)
> + {
> + Assert(xid_advance_attempt_time);
>
> What is this assertion for? If we want to check here that we have sent a request
> message for the publisher, I think it's clearer if we have
> "Assert(xid_advance_attempt_time > 0)". I'm not sure we really need this
> assertion though since it's never false once we set xid_advance_attempt_time.
I agree that the assert is not useful, so removed in this version.
>
> ---
> + /*
> + * Do not return here because the apply worker might
> have already
> + * applied all changes up to remote_lsn. Instead,
> proceed to the
> + * next phase to check if we can immediately advance
> the transaction
> + * ID.
> + */
> + *phase = DTR_WAIT_FOR_LOCAL_FLUSH;
> + }
>
> If we always proceed to the next phase, is this phase really necessary? IIUC
> even if we jump to DTR_WAIT_FOR_LOCAL_FLUSH phase after
> DTR_REQUEST_WALSENDER_WAL_POS and have a check if we received the
> remote WAL position in DTR_WAIT_FOR_LOCAL_FLUSH phase, it would work
> fine.
Agreed. I removed this separate phase.
Although the DeadTupleRetainPhase enum only has 2 items now, I didn't remove
the enum in this version, as it could make the code easier to read and we may extent
the phases later if we accept the approach in 0001_2.
>
> ---
> + /*
> + * Reaching here means the remote WAL position has
> been received, and
> + * all transactions up to that position on the
> publisher have been
> + * applied and flushed locally. So, now we can advance the
> + * non-removable transaction ID.
> + */
> + SpinLockAcquire(&MyLogicalRepWorker->relmutex);
> + MyLogicalRepWorker->oldest_nonremovable_xid =
> candidate_xid;
> + SpinLockRelease(&MyLogicalRepWorker->relmutex);
>
> How about adding a debug log message showing new
> oldest_nonremovable_xid and related LSN for making the debug/investigation
> easier? For example,
>
> elog(LOG, "confirmed remote flush up to %X/%X: new
> oldest_nonremovable_xid %u",
> LSN_FORMAT_ARGS(*remote_lsn),
> XidFromFullTransactionId(candidate_xid));
Added.
>
> ---
> + /*
> + * Exit early if the user has disabled sending messages to the
> + * publisher.
> + */
> + if (wal_receiver_status_interval <= 0)
> + return;
>
> In send_feedback(), we send a feedback message if the publisher requests,
> even if wal_receiver_status_interval is 0. On the other hand, the above codes
> mean that we don't send a WAL position request and therefore never update
> oldest_nonremovable_xid if wal_receiver_status_interval is 0. I'm concerned it
> could be a pitfall for users.
It was intended to mirror the style of hot standby feedback (as implemented in
XLogWalRcvSendHSFeedback()) where the message is sent at most once per
wal_receiver_status_interval. I think the usage of the new message in the patch
is more similar to the hot standby feedback which are used to advance xmin of a
replication slot.
I also documented this risk in the detect_update_deleted option patch(0004).
Would it be sufficient to you ?
>
> ---
> % git show | grep update_delete
> This set of patches aims to support the detection of update_deleted
> conflicts,
> transactions with earlier timestamps than the DELETE, detecting
> update_delete
> We assume that the appropriate resolution for update_deleted conflicts, to
> that when detecting the update_deleted conflict, and the remote update
> has a
> + * to detect update_deleted conflicts.
> + * update_deleted is necessary, as the UPDATEs in remote transactions
> should be
> + * to allow for the detection of update_delete conflicts when
> + applying
>
> There are mixed 'update_delete' and 'update_deleted' in the commit message
> and the codes. I think it's better to use 'update_deleted'.
Thanks for searching. I have replaced them to update_deleted.
Attach the V9 patch set which addressed above comments.
Best Regards,
Hou zj
Attachment | Content-Type | Size |
---|---|---|
v9-0004-Add-a-detect_update_deleted-option-to-subscriptio.patch | application/octet-stream | 65.2 KB |
v9-0001-Maintain-the-oldest-non-removeable-tranasction-ID.patch | application/octet-stream | 19.9 KB |
v9-0001_2-Maintain-the-oldest-non-removeable-tranasction-I.patch.txt | text/plain | 24.7 KB |
v9-0005-Add-a-tap-test-to-verify-the-new-slot-xmin-mechan.patch | application/octet-stream | 6.9 KB |
v9-0002-Maintain-the-replication-slot-in-logical-launcher.patch | application/octet-stream | 11.4 KB |
v9-0003-Support-the-conflict-detection-for-update_deleted.patch | application/octet-stream | 21.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Hayato Kuroda (Fujitsu) | 2024-11-14 02:56:45 | RE: Fix for pageinspect bug in PG 17 |
Previous Message | jian he | 2024-11-14 02:21:16 | Re: altering a column's collation leaves an invalid foreign key |