From: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
---|---|
To: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | RE: Conflict detection for update_deleted in logical replication |
Date: | 2024-12-25 02:43:02 |
Message-ID: | OS0PR01MB5716E0EC8DC4F06C390E7418940C2@OS0PR01MB5716.jpnprd01.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Monday, December 23, 2024 2:15 PM Kuroda, Hayato/黒田 隼人 <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> Dear Hou,
>
> Thanks for updating the patch. Few comments:
Thanks for the comments!
> 02. ErrorOnReservedSlotName()
>
> Currently the function is callsed from three points -
> create_physical_replication_slot(),
> create_logical_replication_slot() and CreateReplicationSlot().
> Can we move them to the ReplicationSlotCreate(), or combine into
> ReplicationSlotValidateName()?
I am not sure because moving the check into these functions because that would
prevent the launcher from creating the slot as well unless we add a new
parameter for these functions, but I am not sure if it's worth it at this
stage.
>
> 03. advance_conflict_slot_xmin()
>
> ```
> Assert(TransactionIdIsValid(MyReplicationSlot->data.xmin));
> ```
>
> Assuming the case that the launcher crashed just after
> ReplicationSlotCreate(CONFLICT_DETECTION_SLOT).
> After the restart, the slot can be acquired since
> SearchNamedReplicationSlot(CONFLICT_DETECTION_SLOT)
> is true, but the process would fail the assert because data.xmin is still invalid.
>
> I think we should re-create the slot when the xmin is invalid. Thought?
After thinking more, the standard approach to me would be to mark the slot as
EPHEMERAL during creation and persist it after initializing, so changed like
that.
> 05. check_remote_recovery()
>
> Can we add a test case related with this?
I think the code path is already tested, and I am a bit unsure if we want to setup
a standby to test the ERROR case, so didn't add this.
---
Attach the new version patch set which addressed all other comments.
Based on some off-list discussions with Sawada-san and Amit, it would be better
if the apply worker can avoid reporting an ERROR if the publisher's clock's
lags behind that of the subscriber, so I implemented a new 0007 patch to allow
the apply worker to wait for the clock skew to pass and then send a new request
to the publisher for the latest status. The implementation is as follows:
Since we have the time (reply_time) on the walsender when it confirms that all
the committing transactions have finished, it means any subsequent transactions
on the publisher should be assigned a commit timestamp later then reply_time.
And the (candidate_xid_time) when it determines the oldest active xid. Any old
transactions on the publisher that have finished should have a commit timestamp
earlier than the candidate_xid_time.
The apply worker can compare the candidate_xid_time with reply_time. If
candidate_xid_time is less than the reply_time, then it's OK to advance the xid
immdidately. If candidate_xid_time is greater than reply_time, it means the
clock of publisher is behind that of the subscriber, so the apply worker can
wait for the skew to pass before advancing the xid.
Since this is considered as an improvement, we can focus on this after
pushing the main patches.
Best Regards,
Hou zj
Attachment | Content-Type | Size |
---|---|---|
v18-0007-Wait-for-publisher-s-clock-to-catch-up-before-pr.patch | application/octet-stream | 8.5 KB |
v18-0001-Maintain-the-oldest-non-removeable-tranasction-I.patch | application/octet-stream | 35.0 KB |
v18-0002-Dynamically-adjust-xid-advancement-interval.patch | application/octet-stream | 4.8 KB |
v18-0003-Maintain-the-replication-slot-in-logical-launche.patch | application/octet-stream | 11.9 KB |
v18-0004-Support-the-conflict-detection-for-update_delete.patch | application/octet-stream | 21.6 KB |
v18-0005-Add-a-detect_update_deleted-option-to-subscripti.patch | application/octet-stream | 73.8 KB |
v18-0006-Add-a-tap-test-to-verify-the-new-slot-xmin-mecha.patch | application/octet-stream | 7.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2024-12-25 03:00:59 | Re: Infinite loop in XLogPageRead() on standby |
Previous Message | Andrei Lepikhov | 2024-12-25 02:33:57 | Re: ERROR: corrupt MVNDistinct entry |