From: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
---|---|
To: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
Cc: | vignesh C <vignesh21(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | RE: Conflict detection for update_deleted in logical replication |
Date: | 2025-04-24 12:41:15 |
Message-ID: | TYAPR01MB572428A7A4A875C069567B8E94852@TYAPR01MB5724.jpnprd01.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Apr 17, 2025 at 12:19 PM shveta malik wrote:
>
> On Wed, Apr 16, 2025 at 10:30 AM shveta malik <shveta(dot)malik(at)gmail(dot)com>
> wrote:
> >
> > On Wed, Mar 26, 2025 at 4:17 PM Zhijie Hou (Fujitsu)
> > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > >
> > > Here's a rebased version of the patch series.
> > >
> >
>
> Thanks Hou-San for the patches. I am going through this long thread and
> patches. One doubt I have is whenever there is a chance of conflict-slot update
> (either xmin or possibility of its invalidation), apply-worker gives a wake-up call
> to the launcher (ApplyLauncherWakeup). Shouldn't that suffice to wake-up
> launcher irrespective of its nap-time? Do we actually need to introduce
> MIN/MAX_NAPTIME_PER_SLOT_UPDATE in the launcher and the logic
> around it?
Thanks for reviewing. After rethinking, I agree that the wakeup is
sufficient, so I removed the nap-time logic in this version.
> Few comments for patch004:
> Config.sgml:
> 1)
> + <para>
> + Maximum duration (in milliseconds) for which conflict
> + information can be retained for conflict detection by the apply worker.
> + The default value is <literal>0</literal>, indicating that conflict
> + information is retained until it is no longer needed for detection
> + purposes.
> + </para>
>
> IIUC, the above is not entirely accurate. Suppose the subscriber manages to
> catch up and sets oldest_nonremovable_xid to 100, which is then updated in
> slot. After this, the apply worker takes a nap and begins a new xid update cycle.
> Now, let’s say the next candidate_xid is 200, but this time the subscriber fails
> to keep up and exceeds max_conflict_retention_duration. As a result, it sets
> oldest_nonremovable_xid to InvalidTransactionId, and the launcher skips
> updating the slot’s xmin.
If the time exceeds the max_conflict_retention_duration, the launcher would
Invalidate the slot, instead of skipping updating it. So the conflict info(e.g.,
dead tuples) would not be retained anymore.
> However, the previous xmin value (100) is still there
> in the slot, causing its data to be retained beyond the
> max_conflict_retention_duration. The xid 200 which actually honors
> max_conflict_retention_duration was never marked for retention. If my
> understanding is correct, then the documentation doesn’t fully capture this
> scenario.
As mentioned above, the strategy here is to invalidate the slot.
>
> 2)
> + The replication slot
> + <quote><literal>pg_conflict_detection</literal></quote> that
> used to
> + retain conflict information will be invalidated if all apply workers
> + associated with the subscription, where
>
> Subscription --> subscriptions
>
> 3)
> Name stop_conflict_retention in MyLogicalRepWorker is confusing. Shall it be
> stop_conflict_info_retention?
Changed.
Here is V30 patch set includes the following changes:
* Addressed above comments.
* Added the retention timeout check in wait_for_local_flush(), as suggested by Nisha[1].
* Improved upgrade codes and added a test for upgrade of retain_conflict_info option,
as suggested by Kuroda-san[2].
[1] https://www.postgresql.org/message-id/CABdArM4Ft8q3dZv4Bqw%3DrbS5_LFMXDJMRr3vC8a_KMCX1qatpg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/OSCPR01MB14966269726272F2F2B2BD3B0F5B22%40OSCPR01MB14966.jpnprd01.prod.outlook.com
Best Regards,
Hou zj
Attachment | Content-Type | Size |
---|---|---|
v30-0001-Maintain-the-oldest-non-removeable-tranasction-I.patch | application/octet-stream | 40.9 KB |
v30-0002-Maintain-the-replication-slot-in-logical-launche.patch | application/octet-stream | 20.2 KB |
v30-0003-Add-a-retain_conflict_info-option-to-subscriptio.patch | application/octet-stream | 83.4 KB |
v30-0004-Introduce-a-new-GUC-max_conflict_retention_durat.patch | application/octet-stream | 29.0 KB |
v30-0005-Re-create-the-replication-slot-if-the-conflict-r.patch | application/octet-stream | 10.7 KB |
v30-0006-Add-a-tap-test-to-verify-the-management-of-the-n.patch | application/octet-stream | 6.7 KB |
v30-0007-Support-the-conflict-detection-for-update_delete.patch | application/octet-stream | 23.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | torikoshia | 2025-04-24 12:48:57 | Re: RFC: Logging plan of the running query |
Previous Message | Bruce Momjian | 2025-04-24 12:37:56 | Re: pg_upgrade-breaking release |