RE: Conflict detection for update_deleted in logical replication

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Conflict detection for update_deleted in logical replication
Date: 2024-12-06 07:58:23
Message-ID: OS0PR01MB5716318675006A02A0A34AFD94312@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thursday, December 5, 2024 6:00 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Dec 4, 2024 at 4:29 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
> >
> > 1.
> > + if (can_advance_nonremovable_xid(&data, last_recv_timestamp))
> > + maybe_advance_nonremovable_xid(&data);
> >
> > In can_advance_nonremovable_xid(), we determine whether to advance the
> > oldest xid based on 'last_recv_timestamp' and then again in
> > maybe_advance_nonremovable_xid()->get_candidate_xid(), we compare it
> > with the current time. How does that make sense? Shall we use
> > 'last_recv_timestamp' directly in get_candidate_xid() as that will
> > avoid the additional time check in can_advance_nonremovable_xid()?

Agreed.

I added a field to RetainConflictInfoData to store the last_recv_timestamp and
am using it directly in get_candidate_xid(). If no message has been received,
we can still obtain the current timestamp.

> >
> > 2.
> > + TimestampTz next_attempt_time; /* when to attemp to advance the xid
> > +during
> > + * change application */
> > +} RetainConflictInfoData;
> >
> > This new variable introduced in this version is not used in the patch.
> > Any reason or just a leftover?

It was a leftover, removed.

> >
> > Apart from the above, I have made a few updates in the comments in the
> > attached. Please include those after review.
> >
>
> A few more comments:
> 1.
> +static void
> +wait_for_local_flush(RetainConflictInfoData *data)
> {
> ...
> +
> + data->phase = RCI_GET_CANDIDATE_XID;
> +
> + maybe_advance_nonremovable_xid(data);
> +}
>
> Isn't it better to reset all the fields of data before the next round of
> GET_CANDIDATE_XID phase? If we do that then we don't need to reset
> data->remote_lsn = InvalidXLogRecPtr; and data->last_phase_at =
> InvalidFullTransactionId; individually in request_publisher_status() and
> get_candidate_xid() respectively. Also, it looks clean and logical to me unless I
> am missing something.

The remote_lsn was used to determine whether a status is received, so was reset
each time in request_publisher_status. To make it more straightforward, I added
a new function parameter 'status_received', which would be set to true when
calling maybe_advance_nonremovable_xid() on receving the status. After this
change, there is no need to reset the remote_lsn.

>
> 2.
> + /*
> + * Issue a warning if there is a detected clock skew between the
> + publisher
> + * and subscriber.
> + *
> + * XXX Consider waiting for the publisher's clock to catch up with the
> + * subscriber's before proceeding to the next phase.
> + */
> + if (TimestampDifferenceExceeds(data->reply_time,
> + data->candidate_xid_time, 0))
> + ereport(WARNING,
> + errmsg("non-removable transaction ID may be advanced prematurely"),
> + errdetail("The clock on the publisher is behind that of the
> + subscriber."));
>
> Shouldn't this be an ERROR as this will lead to the removal of rows required to
> detect update_delete conflict?

Agreed. Changed to ERROR.

>
> Apart from the above, I have made a few more updates in the comments in the
> attached. Please include those after review.

Thanks, I have checked and merged.

Here is V15 patch set which addressed above comments. I also fixed a bug that
the table sync worker mis-sent the request to publisher.

Best Regards,
Hou zj

Attachment Content-Type Size
v15-0005-Add-a-tap-test-to-verify-the-new-slot-xmin-mecha.patch application/octet-stream 7.5 KB
v15-0001-Maintain-the-oldest-non-removeable-tranasction-I.patch application/octet-stream 34.4 KB
v15-0002-Maintain-the-replication-slot-in-logical-launche.patch application/octet-stream 11.9 KB
v15-0003-Support-the-conflict-detection-for-update_delete.patch application/octet-stream 21.3 KB
v15-0004-Add-a-detect_update_deleted-option-to-subscripti.patch application/octet-stream 73.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrei Lepikhov 2024-12-06 08:13:00 Re: Considering fractional paths in Append node
Previous Message Kirill Reshke 2024-12-06 07:30:56 Re: Use streaming read API in pgstattuple.