RE: Conflict detection for update_deleted in logical replication

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Conflict detection for update_deleted in logical replication
Date: 2024-12-02 11:33:38
Message-ID: OS0PR01MB5716775DB315563D8749581D94352@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Monday, December 2, 2024 12:14 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Fri, Nov 29, 2024 at 4:05 PM Hayato Kuroda (Fujitsu)
> <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> >
> > 02. maybe_advance_nonremovable_xid
> >
> > ```
> > + case RCI_REQUEST_PUBLISHER_STATUS:
> > + request_publisher_status(data);
> > + break;
> > ```
> >
> > I think the part is not reachable because the transit
> >
> RCI_REQUEST_PUBLISHER_STATUS->RCI_WAIT_FOR_PUBLISHER_STATU
> S is done in
> > get_candidate_xid()->request_publisher_status().
> > Can we remove this?
> >
>
> After changing phase to RCI_REQUEST_PUBLISHER_STATUS, we directly
> invoke request_publisher_status, and similarly, after changing phase to
> RCI_WAIT_FOR_LOCAL_FLUSH, we call wait_for_local_flush. Won't it be
> better that in both cases and other similar cases, we instead invoke
> maybe_advance_nonremovable_xid()? This will make
> maybe_advance_nonremovable_xid() the only function with the knowledge to
> take action based on phase rather than spreading the knowledge of
> phase-related actions to various functions. Then we should also add a
> comment at the end in request_publisher_status() where we change the
> phase but don't do anything. The comment should explain the reason for the
> same.

Agreed.

>
> One more point, it seems on a busy server, the patch won't be able to advance
> nonremovable_xid. We should call
> maybe_advance_nonremovable_xid() at all the places where we call
> send_feedback() and additionally, we should also call it after applying some
> threshold number (say 100) of messages. The latter is to avoid the cases where
> we won't invoke the required functionality on a busy server with a large value of
> sender/receiver timeouts.

Right. I think instead of adding a threshold, we could directly check if it's
time to start next round of xid advancement attempt. Since we already get the
current timestamp in the loop of receiving msg, it would not be expensive to
check the time difference. I have used this approach in V13 and will test the
performance for it.

Here is the V13 patch set which addressed this and pending comments
in [1][2].

[1] https://www.postgresql.org/message-id/TYAPR01MB5692A6AC6EF2E22A2C9D099AF52A2%40TYAPR01MB5692.jpnprd01.prod.outlook.com
[2] https://www.postgresql.org/message-id/CAA4eK1L-J-iJJ6gpzUc9wJY6eGfBbCDmh%2B5fHUZoixnoFbJSNg%40mail.gmail.com

Best Regards,
Hou zj

Attachment Content-Type Size
v13-0003-Support-the-conflict-detection-for-update_delete.patch application/octet-stream 21.3 KB
v13-0001-Maintain-the-oldest-non-removeable-tranasction-I.patch application/octet-stream 32.8 KB
v13-0004-Add-a-detect_update_deleted-option-to-subscripti.patch application/octet-stream 73.5 KB
v13-0005-Add-a-tap-test-to-verify-the-new-slot-xmin-mecha.patch application/octet-stream 7.5 KB
v13-0002-Maintain-the-replication-slot-in-logical-launche.patch application/octet-stream 11.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Christofides 2024-12-02 11:41:48 Re: Proposals for EXPLAIN: rename ANALYZE to EXECUTE and extend VERBOSE
Previous Message Peter Eisentraut 2024-12-02 11:26:47 Re: More CppAsString2() in psql's describe.c