From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
Cc: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Conflict detection for update_deleted in logical replication |
Date: | 2024-12-15 01:38:44 |
Message-ID: | CAA4eK1LPCd32iwmYqS4AU1bsEDFs9+HhzXvvYRFRCzKyJ1Cs0Q@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Dec 11, 2024 at 2:32 PM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> Attach the V16 patch set which addressed above comments.
>
Review Comments
================
1.
+/*
+ * Workhorse for the RCI_WAIT_FOR_LOCAL_FLUSH phase.
+ */
+static void
+wait_for_local_flush(RetainConflictInfoData *data)
+{
...
+ /*
+ * Do not attempt to advance the non-removable transaction ID when table
+ * sync is in progress. During this time, changes from a single
+ * transaction may be applied by multiple table sync workers corresponding
+ * to the target tables. In this case, confirming the apply and flush
+ * progress across all table sync workers is complex and not worth the
+ * effort.
+ */
+ if (!AllTablesyncsReady())
+ return;
How is it ensured that new tables are not added to subscription via
refresh immediately after this check that are not yet in ready state?
I mean if it happens immediately after this check then the problem
described by comment can happen and we may end up advancing the
non-removable-xid incorrectly. If this is safe, then please update
comments to reflect the same.
2.
+ /*
+ * Reset all data fields except those used to determine the timing for the
+ * next round of transaction ID advancement.
+ */
+ memset(data, 0, offsetof(RetainConflictInfoData, candidate_xid_time));
Wouldn't it be better to initialize each field separately? With the
current code, adding new fields at the end of the structure
RetainConflictInfoData will be difficult.
3.
+ * TODO: The remote flush location (last_flushpos) is currently not updated
+ * during change application, making it impossible to satisfy the condition of
+ * the final phase (RCI_WAIT_FOR_LOCAL_FLUSH) for advancing the transaction ID.
+ * Consider updating the remote flush position in the final phase to enable
+ * advancement during change application.
+ */
+static inline bool
+can_advance_nonremovable_xid(RetainConflictInfoData *data)
I think we don't need this TODO here as there is XXX comment in
wait_for_local_flush() which has the same information.
4.
@@ -2314,6 +2316,10 @@ ProcessStandbyMessage(void)
ProcessStandbyHSFeedbackMessage();
break;
+ case 'S':
+ ProcessStandbyPSRequestMessage();
+ break;
Why do we use the capital message name 'S' when other messages in
ProcessStandbyMessage() are all small cases? I see that walsender
already handles 'S' message in HandleUploadManifestPacket() though it
won't conflict with this case. But still, shouldn't we use a different
message here?
5. The apply worker needs to at least twice get the publisher status
message to advance oldest_nonremovable_xid once. It then uses the
remote_lsn of the last such message to ensure that it has been applied
locally. Such a remote_lsn could be a much later value than required
leading to delay in advancing oldest_nonremovable_xid. How about if
while first time processing the publisher_status message on walsender,
we get the latest_transaction_in_commit by having a function
GetLatestTransactionIdInCommit() instead of
GetOldestTransactionIdInCommit() and then simply wait till that proc
has written commit WAL (aka wait till it clears
DELAY_CHKPT_IN_COMMIT)? Then get the latest LSN wrote and send that to
apply worker waiting for the publisher_status message. If this is
feasible then we should be able to advance oldest_nonremovable_xid
with just one publisher_status message. Won't that be an improvement
over current? If so, we can even further try to improve it by just
using commit_LSN of the transaction returned by
GetLatestTransactionIdInCommit(). One idea is that we can try to use
MyProc->waitLSN which we are using in synchronous replication for our
purpose. See SyncRepWaitForLSN.
6. Attached, a few minor comment updates.
--
With Regards,
Amit Kapila.
Attachment | Content-Type | Size |
---|---|---|
v16-0001-amit.1.patch.txt | text/plain | 1.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Noah Misch | 2024-12-15 02:27:01 | Reject Windows command lines not convertible to system locale |
Previous Message | Michael Paquier | 2024-12-15 01:34:07 | Re: pg_combinebackup PITR comparison test fix |