Re: Conflict Detection and Resolution

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: Conflict Detection and Resolution
Date: 2024-07-01 06:16:52
Message-ID: CAD21AoDzo8ck57nvRVFWOCsjWBCjQMzqTFLY4cCeFeQZ3V_oQg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Thu, May 23, 2024 at 3:37 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> DELETE
> ================
> Conflict Type:
> ----------------
> delete_missing: An incoming delete is trying to delete a row on a
> target node which does not exist.

IIUC the 'delete_missing' conflict doesn't cover the case where an
incoming delete message is trying to delete a row that has already
been updated locally or by another node. I think in update/delete
conflict situations, we need to resolve the conflicts based on commit
timestamps like we do for update/update and insert/update conflicts.

For example, suppose there are two node-A and node-B and setup
bi-directional replication, and suppose further that both have the row
with id = 1, consider the following sequences:

09:00:00 DELETE ... WHERE id = 1 on node-A.
09:00:05 UPDATE ... WHERE id = 1 on node-B.
09:00:10 node-A received the update message from node-B.
09:00:15 node-B received the delete message from node-A.

At 09:00:10 on node-A, an update_deleted conflict is generated since
the row on node-A is already deleted locally. Suppose that we use
'apply_or_skip' resolution for this conflict, we convert the update
message into an insertion, so node-A now has the row with id = 1. At
09:00:15 on node-B, the incoming delete message is applied and deletes
the row with id = 1, even though the row has already been modified
locally. The node-A and node-B are now inconsistent. This
inconsistency can be avoided by using 'skip' resolution for the
'update_deleted' conflict on node-A, and 'skip' resolution is the
default method for that actually. However, if we handle it as
'update_missing', the 'apply_or_skip' resolution is used by default.

IIUC with the proposed architecture, DELETE always takes precedence
over UPDATE since both 'update_deleted' and 'update_missing' don't use
commit timestamps to resolve the conflicts. As long as that is true, I
think there is no use case for 'apply_or_skip' and 'apply_or_error'
resolutions in update/delete conflict cases. In short, I think we need
something like 'delete_differ' conflict type as well. FYI PGD and
Oracle GoldenGate seem to have this conflict type[1][2].

The 'delete'_differ' conflict type would have at least
'latest_timestamp_wins' resolution. With the timestamp based
resolution method, we would deal with update/delete conflicts as
follows:

09:00:00: DELETE ... WHERE id = 1 on node-A.
09:00:05: UPDATE ... WHERE id = 1 on node-B.
- the updated row doesn't have the origin since it's a local change.
09:00:10: node-A received the update message from node-B.
- the incoming update message has the origin of node-B whereas the
local row is already removed locally.
- 'update_deleted' conflict is generated.
- do the insert of the new row instead, because the commit
timestamp of UPDATE is newer than DELETE's one.
09:00:15: node-B received the delete message from node-A.
- the incoming delete message has the origin of node-B whereas the
(updated) row doesn't have the origin.
- 'update_differ' conflict is generated.
- discard DELETE, because the commit timestamp of UPDATE is newer
than DELETE' one.ard DELETE, because the commit timestamp of UPDATE is
newer than DELETE' one.

As a result, both nodes have the new version row.

Regards,

[1] https://www.enterprisedb.com/docs/pgd/latest/consistency/conflicts/#updatedelete-conflicts
[2] https://docs.oracle.com/goldengate/c1230/gg-winux/GWUAD/configuring-conflict-detection-and-resolution.htm
(see DELETEROWEXISTS conflict type)

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bertrand Drouvot 2024-07-01 06:42:46 Surround CheckRelation[Oid]LockedByMe() with USE_ASSERT_CHECKING
Previous Message Hayato Kuroda (Fujitsu) 2024-07-01 06:14:30 RE: pg_createsubscriber: drop pre-existing subscriptions from the converted node