RE: Conflict Detection and Resolution

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Conflict Detection and Resolution
Date: 2024-07-08 04:32:03
Message-ID: OS0PR01MB571610127E08CAAF0650B6FA94DA2@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Hi,

I looked into how to detect and resolve update_deleted conflicts and came
up with one idea: maintain an xmin in the logical slot to preserve the
dead rows, and support last_timestamp_win resolution for update_deleted
to maintain data consistency.

Here are the details of the xmin idea and the resolution of update_deleted:

1. How to preserve the dead row so that we can detect the update_deleted
conflict correctly. (In the following explanation, let's assume a
multi-master setup with nodes A and B.)

To preserve the dead row on Node A, I think we could maintain the "xmin"
in the logical replication slot on Node A to prevent VACUUM from
removing the dead row in the user table. The walsender that acquires the
slot is responsible for advancing the xmin. (Note that I am exploring the
xmin idea because it could be more efficient than using commit_timestamp,
and the logic could be simpler as we are already maintaining catalog_xmin
in the logical slot and xmin in the physical slot.)

- Strategy for advancing xmin:

The xmin can be advanced if a) a transaction (xid:1000) has been flushed
to the remote node (Node B in this case), *AND* b) on Node B, the local
transactions that happened before applying the remote transaction
(xid:1000) have also been sent and flushed to Node A.
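
To make the two conditions concrete, here is a minimal Python sketch of
the check, not PostgreSQL code. The record type, the field names and the
function name are all hypothetical, and LSNs are modelled as plain
integers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RemoteApplyInfo:
    """Hypothetical record Node A keeps for one of its own transactions."""
    xid: int
    commit_lsn_on_a: int            # end of the transaction in Node A's WAL
    b_lsn_at_apply: Optional[int]   # Node B's WAL position when it applied
                                    # the transaction, None if not yet known


def can_advance_xmin_past(txn, a_to_b_flushed_lsn, b_to_a_flushed_lsn):
    """Return True if both conditions from the proposal hold for txn.

    a_to_b_flushed_lsn: position in Node A's WAL confirmed flushed by Node B.
    b_to_a_flushed_lsn: position in Node B's WAL confirmed flushed by Node A.
    """
    # Condition a): the transaction has been flushed to Node B.
    flushed_to_b = a_to_b_flushed_lsn >= txn.commit_lsn_on_a

    # Condition b): Node B's local changes made before it applied the
    # transaction have been sent and flushed to Node A. Without knowing
    # where Node B was when it applied the transaction, stay conservative.
    if txn.b_lsn_at_apply is None:
        return False
    b_changes_flushed_to_a = b_to_a_flushed_lsn >= txn.b_lsn_at_apply

    return flushed_to_b and b_changes_flushed_to_a
```

For a transaction with xid 1000 that ended at LSN 500 on Node A and was
applied when Node B was at LSN 900, the check only passes once Node B has
flushed up to 500 and Node A has flushed Node B's changes up to 900.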

- The implementation:

Condition a) can be achieved with existing code: the walsender can
advance the xmin in the same way as the catalog_xmin.

For condition b), we can add a subscription option (say 'feedback_slot').
The feedback_slot indicates the replication slot that will send changes to
the origin (On Node B, the slot should be subBA). The apply worker will
check the status (confirmed flush LSN) of the feedback_slot and send
feedback to the walsender about the WAL position that has been sent and
flushed via the feedback_slot.

For example, on Node B, we specify the replication slot (subBA) that is
sending changes to Node A. The apply worker on Node B will regularly send
feedback (the WAL position that has been sent to Node A) to Node A.
Node A can then use that position to advance the xmin (similar to
hot_standby_feedback).
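
The feedback flow above can be sketched as a small Python simulation. This
is only an illustration under stated assumptions: the class and method
names are hypothetical, LSNs and xids are plain integers, xid wraparound
is ignored, and every tracked transaction is assumed to already satisfy
condition a).

```python
from collections import deque


class XminTracker:
    """Hypothetical stand-in for the walsender state on Node A."""

    def __init__(self, xmin):
        self.xmin = xmin              # dead rows from xids >= xmin are kept
        self.pending = deque()        # (xid, b_lsn_at_apply), oldest first
        self.b_to_a_flushed_lsn = 0   # latest feedback received from Node B

    def track(self, xid, b_lsn_at_apply):
        """Remember Node B's WAL position when it applied transaction xid."""
        self.pending.append((xid, b_lsn_at_apply))

    def on_feedback(self, flushed_lsn):
        """Handle feedback from Node B's apply worker.

        flushed_lsn is the position in Node B's WAL that has been sent and
        flushed to Node A via the feedback slot.
        """
        # Feedback may arrive late or repeated; never move backwards.
        self.b_to_a_flushed_lsn = max(self.b_to_a_flushed_lsn, flushed_lsn)

        # Release transactions in order while condition b) holds.
        while self.pending and self.pending[0][1] <= self.b_to_a_flushed_lsn:
            xid, _ = self.pending.popleft()
            self.xmin = xid + 1
        return self.xmin


tracker = XminTracker(xmin=1000)
tracker.track(1000, b_lsn_at_apply=900)
tracker.track(1001, b_lsn_at_apply=950)
first = tracker.on_feedback(920)    # only xid 1000 is covered
second = tracker.on_feedback(910)   # stale feedback changes nothing
third = tracker.on_feedback(950)    # now xid 1001 is covered too
```

The xmin only moves forward, and it stops at the first transaction whose
"applied at" position on Node B has not yet been flushed back to Node A.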

2. The resolution for update_deleted

The current design doesn't support 'last_timestamp_win'. But this could be
a problem if update_deleted is detected due to some very old dead row.
Assume the update has the latest timestamp: if we skip the update because
of these very old dead rows, the data would be inconsistent because the
latest updated data would be missing.

The ideal resolution should compare the timestamp of the UPDATE with the
timestamp of the transaction that produced these dead rows. If the UPDATE
is newer, then convert the UPDATE to an INSERT; otherwise, skip the UPDATE.
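
As a minimal sketch of that comparison in Python (the function name and
the returned action strings are hypothetical, not an existing PostgreSQL
API):

```python
from datetime import datetime, timezone


def resolve_update_deleted(update_commit_ts, delete_commit_ts):
    """Pick an action for an update_deleted conflict.

    update_commit_ts: commit timestamp of the remote UPDATE.
    delete_commit_ts: commit timestamp of the transaction that produced
                      the dead row.
    """
    if update_commit_ts > delete_commit_ts:
        # The UPDATE is newer, so its data should survive.
        return "convert_to_insert"
    # The dead row is newer (or the timestamps are equal): skip the UPDATE.
    return "skip"


old_delete = datetime(2024, 7, 1, 12, 0, tzinfo=timezone.utc)
new_update = datetime(2024, 7, 8, 4, 32, tzinfo=timezone.utc)
action = resolve_update_deleted(new_update, old_delete)
```

Here equal timestamps fall into the "skip" branch, following the "if the
UPDATE is newer ... otherwise" wording above; a real resolver would need
an explicit tie-break rule.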

Best Regards,
Hou zj
