Re: Conflict Detection and Resolution

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Conflict Detection and Resolution
Date: 2024-05-27 05:48:53
Message-ID: CAJpy0uBWBEveM8LO2b7wNZ47raZ9tVJw3D2_WXd8-b6LSqP6HA@mail.gmail.com
Lists: pgsql-hackers

On Sat, May 25, 2024 at 2:39 AM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> On 5/23/24 08:36, shveta malik wrote:
> > Hello hackers,
> >
> > Please find the proposal for Conflict Detection and Resolution (CDR)
> > for Logical replication.
> > <Thanks to Nisha, Hou-San, and Amit who helped in figuring out the
> > below details.>
> >
> > Introduction
> > ================
> > In case the node is subscribed to multiple providers, or when local
> > writes happen on a subscriber, conflicts can arise for the incoming
> > changes. CDR is the mechanism to automatically detect and resolve
> > these conflicts depending on the application and configurations.
> > CDR is not applicable for the initial table sync. If locally, there
> > exists conflicting data on the table, the table sync worker will fail.
> > Please find the details on CDR in apply worker for INSERT, UPDATE and
> > DELETE operations:
> >
>
> Which architecture are you aiming for? Here you talk about multiple
> providers, but the wiki page mentions active-active. I'm not sure how
> much this matters, but it might.

Currently, we are targeting the multiple-providers case, but ideally it
should work for active-active as well. During further discussion and the
implementation phase, if we find cases that do not work in a
straightforward way for active-active, our primary focus will remain on
implementing it for the multiple-providers architecture first.
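
To be concrete about what I mean by multiple providers, a minimal
(hypothetical) setup would be one subscriber node pulling from two
publisher nodes, with both streams able to touch the same table; the
node names, connection strings, and table name below are made up:

-- on node A (publisher 1)
CREATE PUBLICATION pub_a FOR TABLE tab;

-- on node B (publisher 2)
CREATE PUBLICATION pub_b FOR TABLE tab;

-- on node C (subscriber), subscribed to both providers
CREATE SUBSCRIPTION sub_a CONNECTION 'host=nodeA dbname=postgres' PUBLICATION pub_a;
CREATE SUBSCRIPTION sub_b CONNECTION 'host=nodeB dbname=postgres' PUBLICATION pub_b;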

>
> Also, what kind of consistency you expect from this? Because none of
> these simple conflict resolution methods can give you the regular
> consistency models we're used to, AFAICS.

Could you please explain this a little more?

>
> > INSERT
> > ================
> > To resolve INSERT conflict on subscriber, it is important to find out
> > the conflicting row (if any) before we attempt an insertion. The
> > indexes or search preference for the same will be:
> > First check for replica identity (RI) index.
> > - if not found, check for the primary key (PK) index.
> > - if not found, then check for unique indexes (individual ones or
> > added by unique constraints)
> > - if unique index also not found, skip CDR
> >
> > Note: if no RI index, PK, or unique index is found but
> > REPLICA_IDENTITY_FULL is defined, CDR will still be skipped.
> > The reason being that even though a row can be identified with
> > REPLICA_IDENTITY_FULL, such tables are allowed to have duplicate
> > rows. Hence, we should not go for conflict detection in such a case.
> >
>
> It's not clear to me why would REPLICA_IDENTITY_FULL mean the table is
> allowed to have duplicate values? It just means the upstream is sending
> the whole original row, there can still be a PK/UNIQUE index on both the
> publisher and subscriber.

Yes, right. Sorry for the confusion. I meant the same, i.e. in the
absence of an RI index, PK, or unique index, tables can have duplicates.
So even with replica identity (FULL in this case) but without a
unique/primary-key index, CDR will be skipped for INSERT.
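
Just to illustrate with an assumed example (table and values made up): a
table with REPLICA IDENTITY FULL but no PK or unique index can
legitimately hold duplicate rows, so there is no key on which
insert_exists could be detected:

CREATE TABLE tab (id int, val text);    -- no PK, no unique index
ALTER TABLE tab REPLICA IDENTITY FULL;
INSERT INTO tab VALUES (1, 'a');
INSERT INTO tab VALUES (1, 'a');        -- duplicates are allowed, so there is
                                        -- no key to raise insert_exists against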

>
> > In case of replica identity ‘nothing’ and in absence of any suitable
> > index (as defined above), CDR will be skipped for INSERT.
> >
> > Conflict Type:
> > ----------------
> > insert_exists: A conflict is detected when the table has the same
> > value for a key column as the new value in the incoming row.
> >
> > Conflict Resolution
> > ----------------
> > a) latest_timestamp_wins: The change with later commit timestamp wins.
> > b) earliest_timestamp_wins: The change with earlier commit timestamp wins.
> > c) apply: Always apply the remote change.
> > d) skip: Remote change is skipped.
> > e) error: Error out on conflict. Replication is stopped, manual
> > action is needed.
> >
>
> Why not to have some support for user-defined conflict resolution
> methods, allowing to do more complex stuff (e.g. merging the rows in
> some way, perhaps even with datatype-specific behavior)?

Initially, for the sake of simplicity, we are targeting support for
built-in resolvers. But we do plan to work on user-defined resolvers as
well; we shall propose that separately.

>
> > The change will be converted to 'UPDATE' and applied if the decision
> > is in favor of applying remote change.
> >
> > It is important to have commit timestamp info available on subscriber
> > when latest_timestamp_wins or earliest_timestamp_wins method is chosen
> > as resolution method. Thus ‘track_commit_timestamp’ must be enabled
> > on subscriber, in absence of which, configuring the said
> > timestamp-based resolution methods will result in error.
> >
> > Note: If the user has chosen the latest or earliest_timestamp_wins,
> > and the remote and local timestamps are the same, then it will go by
> > system identifier. The change with a higher system identifier will
> > win. This will ensure that the same change is picked on all the nodes.
>
> How is this going to deal with the fact that commit LSN and timestamps
> may not correlate perfectly? That is, commits may happen with LSN1 <
> LSN2 but with T1 > T2.

Are you pointing to the issue where one session/txn takes its
'xactStopTimestamp' earlier but is delayed in inserting its commit record
into the WAL, while another session/txn that takes its timestamp slightly
later manages to insert its record into the WAL sooner, making LSNs and
timestamps out of order? In that scenario the commit timestamp may not
reflect the actual commit order, and thus timestamp-based resolvers may
take wrong decisions. Or do you mean something else?

If this is the problem you are referring to, then I think it needs a fix
on the publisher side. Let me think more about it. Kindly let me know if
you have ideas on how to tackle it.
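
As a side note, the commit timestamps that the timestamp-based resolvers
would compare are the ones made visible by 'track_commit_timestamp'. Just
to illustrate where the local side's timestamp would come from (the table
and key here are made up):

-- requires track_commit_timestamp = on
SELECT pg_xact_commit_timestamp(xmin) AS local_commit_ts, *
FROM tab
WHERE id = 1;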

> >
> > UPDATE
> > ================
> >
> > Conflict Detection Method:
> > --------------------------------
> > Origin conflict detection: The ‘origin’ info is used to detect
> > conflict which can be obtained from commit-timestamp generated for
> > incoming txn at the source node. To compare remote’s origin with the
> > local’s origin, we must have origin information for local txns as well
> > which can be obtained from commit-timestamp after enabling
> > ‘track_commit_timestamp’ locally.
> > The one drawback here is the ‘origin’ information cannot be obtained
> > once the row is frozen and the commit-timestamp info is removed by
> > vacuum. For a frozen row, conflicts cannot be raised, and thus the
> > incoming changes will be applied in all the cases.
> >
> > Conflict Types:
> > ----------------
> > a) update_differ: The origin of an incoming update's key row differs
> > from the local row i.e.; the row has already been updated locally or
> > by different nodes.
> > b) update_missing: The row with the same value as that incoming
> > update's key does not exist. Remote is trying to update a row which
> > does not exist locally.
> > c) update_deleted: The row with the same value as that incoming
> > update's key does not exist. The row is already deleted. This conflict
> > type is generated only if the deleted row is still detectable i.e., it
> > is not removed by VACUUM yet. If the row is removed by VACUUM already,
> > it cannot detect this conflict. It will detect it as update_missing
> > and will follow the default or configured resolver of update_missing
> > itself.
> >
>
> I don't understand the why should update_missing or update_deleted be
> different, especially considering it's not detected reliably. And also
> that even if we happen to find the row the associated TOAST data may
> have already been removed. So why would this matter?

Here, we are trying to tackle the case where the row was 'recently'
deleted, i.e. a concurrent UPDATE and DELETE on pub and sub. The user may
want to opt for a different resolution in such a case, as opposed to the
one where the corresponding row was never present in the first place. The
case where the row was deleted long ago may not fall into this category,
as there is a higher chance it has already been removed by vacuum, and it
can then be considered equivalent to the update_missing case.
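
To make the intended scenario concrete (the table, key, and timings below
are made up), something like:

-- node A (publisher): the row is updated
UPDATE tab SET val = 'new' WHERE pk = 1;

-- node B (subscriber): the same row was deleted locally just before A's
-- update arrives
DELETE FROM tab WHERE pk = 1;

-- When A's UPDATE is applied on B, the row is gone. If the dead tuple has
-- not been removed by vacuum yet, we can report update_deleted; once
-- vacuum removes it, the same situation can only be detected as
-- update_missing.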

Regarding TOAST data for deleted-row cases, we may need to dig more.
Thanks for bringing up this case; let me analyze it further.

>
> > Conflict Resolutions:
> > ----------------
> > a) latest_timestamp_wins: The change with later commit timestamp
> > wins. Can be used for ‘update_differ’.
> > b) earliest_timestamp_wins: The change with earlier commit
> > timestamp wins. Can be used for ‘update_differ’.
> > c) apply: The remote change is always applied. Can be used for
> > ‘update_differ’.
> > d) apply_or_skip: Remote change is converted to INSERT and is
> > applied. If the complete row cannot be constructed from the info
> > provided by the publisher, then the change is skipped. Can be used for
> > ‘update_missing’ or ‘update_deleted’.
> > e) apply_or_error: Remote change is converted to INSERT and is
> > applied. If the complete row cannot be constructed from the info
> > provided by the publisher, then error is raised. Can be used for
> > ‘update_missing’ or ‘update_deleted’.
> > f) skip: Remote change is skipped and local one is retained. Can be
> > used for any conflict type.
> > g) error: Error out of conflict. Replication is stopped, manual
> > action is needed. Can be used for any conflict type.
> >
> > To support UPDATE CDR, the presence of either replica identity Index
> > or primary key is required on target node. Update CDR will not be
> > supported in absence of replica identity index or primary key even
> > though REPLICA IDENTITY FULL is set. Please refer to "UPDATE" in
> > "Noteworthy Scenarios" section in [1] for further details.
> >
> > DELETE
> > ================
> > Conflict Type:
> > ----------------
> > delete_missing: An incoming delete is trying to delete a row on a
> > target node which does not exist.
> >
> > Conflict Resolutions:
> > ----------------
> > a) error : Error out on conflict. Replication is stopped, manual
> > action is needed.
> > b) skip : The remote change is skipped.
> >
> > Configuring Conflict Resolution:
> > ------------------------------------------------
> > There are two parts when it comes to configuring CDR:
> >
> > a) Enabling/Disabling conflict detection.
> > b) Configuring conflict resolvers for different conflict types.
> >
> > Users can sometimes create multiple subscriptions on the same node,
> > subscribing to different tables to improve replication performance by
> > starting multiple apply workers. If the tables in one subscription are
> > less likely to cause conflict, then it is possible that user may want
> > conflict detection disabled for that subscription to avoid detection
> > latency while enabling it for other subscriptions. This generates a
> > requirement to make ‘conflict detection’ configurable per
> > subscription. While the conflict resolver configuration can remain
> > global. All the subscriptions which opt for ‘conflict detection’ will
> > follow global conflict resolver configuration.
> >
> > To implement the above, subscription commands will be changed to have
> > one more parameter 'conflict_resolution=on/off', default will be OFF.
> >
> > To configure global resolvers, new DDL command will be introduced:
> >
> > CONFLICT RESOLVER ON <conflict_type> IS <conflict_resolver>
> >
>
> I very much doubt we want a single global conflict resolver, or even one
> resolver per subscription. It seems like a very table-specific thing.

We thought about this as well. We feel that even if we go for table-based
or subscription-based resolver configuration, there may be use cases
where the user is not interested in configuring resolvers for each table
and would rather set global ones, so we should provide a way to do a
global configuration; that is why we started with the global one. I have
noted your point and would also like to know the opinion of others; we
are open to discussion. We can either opt for one of these two options
(global or table-based) or go for both global and table/subscription-based
configuration.
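
To make the proposal above a bit more tangible, a rough sketch of how the
configuration could look (the syntax is not final, and the connection
string, publication name, and resolver choices below are only examples):

-- per-subscription switch for conflict detection
CREATE SUBSCRIPTION sub1 CONNECTION 'host=nodeA dbname=postgres'
    PUBLICATION pub1 WITH (conflict_resolution = on);

-- global resolver configuration via the proposed DDL
CONFLICT RESOLVER ON insert_exists IS latest_timestamp_wins;
CONFLICT RESOLVER ON update_missing IS apply_or_skip;
CONFLICT RESOLVER ON delete_missing IS skip;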

>
> Also, doesn't all this whole design ignore the concurrency between
> publishers? Isn't this problematic considering the commit timestamps may
> go backwards (for a given publisher), which means the conflict
> resolution is not deterministic (as it depends on how exactly it
> interleaves)?
>
>
> > -------------------------
> >
> > Apart from the above three main operations and resolver configuration,
> > there are more conflict types like primary-key updates, multiple
> > unique constraints etc and some special scenarios to be considered.
> > Complete design details can be found in [1].
> >
> > [1]: https://wiki.postgresql.org/wiki/Conflict_Detection_and_Resolution
> >
>
> Hmmm, not sure it's good to have a "complete" design on wiki, and only
> some subset posted to the mailing list. I haven't compared what the
> differences are, though.

It would have been difficult to cover all the details in an email
(including examples and corner-case scenarios), so we thought it would be
better to document everything on the wiki page for the time being. We can
keep discussing the design and all the scenarios on an as-needed basis
(before the implementation phase of that part), so eventually everything
will come up in email on -hackers. With our first patch, we plan to
provide everything in a README as well.

thanks
Shveta
