Re: Conflict detection for update_deleted in logical replication

From: vignesh C <vignesh21(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Conflict detection for update_deleted in logical replication
Date: 2025-01-08 11:03:07
Message-ID: CALDaNm3+x+ZRE9wynMofvc_ZxsLXKHpyS7LYxnP-a2=RQE-4uA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 7 Jan 2025 at 18:04, Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Friday, January 3, 2025 1:53 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Wed, Dec 25, 2024 at 8:13 AM Zhijie Hou (Fujitsu)
> > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > >
> > > Attach the new version patch set which addressed all other comments.
> > >
> >
> > Some more miscellaneous comments:
>
> Thanks for the comments!
>
> > =============================
> > 1.
> > @@ -1431,9 +1431,9 @@ RecordTransactionCommit(void)
> > * modifying it. This makes checkpoint's determination of which xacts
> > * are delaying the checkpoint a bit fuzzy, but it doesn't matter.
> > */
> > - Assert((MyProc->delayChkptFlags & DELAY_CHKPT_START) == 0);
> > + Assert((MyProc->delayChkptFlags & DELAY_CHKPT_IN_COMMIT) == 0);
> > START_CRIT_SECTION();
> > - MyProc->delayChkptFlags |= DELAY_CHKPT_START;
> > + MyProc->delayChkptFlags |= DELAY_CHKPT_IN_COMMIT;
> >
> > /*
> > * Insert the commit XLOG record.
> > @@ -1536,7 +1536,7 @@ RecordTransactionCommit(void)
> > */
> > if (markXidCommitted)
> > {
> > - MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
> > + MyProc->delayChkptFlags &= ~DELAY_CHKPT_IN_COMMIT;
> > END_CRIT_SECTION();
> >
> > The comments related to this change should be updated in EndPrepare()
> > and RecordTransactionCommitPrepared(). They still refer to the
> > DELAY_CHKPT_START flag. We should update the comments explaining why
> > a
> > similar change is not required for prepare or commit_prepare, if there
> > is one.
>
> After considering more, I think we need to use the new flag in
> RecordTransactionCommitPrepared() as well, because it is assigned a commit
> timestamp and would be replicated as normal transaction if sub's two_phase is
> not enabled.
>
> > 3.
> > +FindMostRecentlyDeletedTupleInfo(Relation rel, TupleTableSlot *searchslot,
> > + TransactionId *delete_xid,
> > + RepOriginId *delete_origin,
> > + TimestampTz *delete_time)
> > ...
> > ...
> > + /* Try to find the tuple */
> > + while (table_scan_getnextslot(scan, ForwardScanDirection, scanslot))
> > + {
> > + bool dead = false;
> > + TransactionId xmax;
> > + TimestampTz localts;
> > + RepOriginId localorigin;
> > +
> > + if (!tuples_equal(scanslot, searchslot, eq, indexbitmap))
> > + continue;
> > +
> > + tuple = ExecFetchSlotHeapTuple(scanslot, false, NULL);
> > + buf = hslot->buffer;
> > +
> > + LockBuffer(buf, BUFFER_LOCK_SHARE);
> > +
> > + if (HeapTupleSatisfiesVacuum(tuple, oldestXmin, buf) ==
> > HEAPTUPLE_RECENTLY_DEAD)
> > + dead = true;
> > +
> > + LockBuffer(buf, BUFFER_LOCK_UNLOCK);
> > +
> > + if (!dead)
> > + continue;
> >
> > Why do we need to check only for HEAPTUPLE_RECENTLY_DEAD and not
> > HEAPTUPLE_DEAD? IIUC, we came here because we couldn't find the live
> > tuple, now whether the tuple is DEAD or RECENTLY_DEAD, why should it
> > matter to detect update_delete conflict?
>
> The HEAPTUPLE_DEAD could indicate tuples whose inserting transaction was
> aborted, in which case we could not get the commit timestamp or origin for the
> transaction. Or it could indicate tuples deleted by a transaction older than
> oldestXmin(we would take the new replication slot's xmin into account when
> computing this value), which means any subsequent transaction would have commit
> timestamp later than that old delete transaction, so I think it's OK to ignore
> this dead tuple and even detect update_missing because the resolution is to
> apply the subsequent UPDATEs anyway (assuming we are using last update win
> strategy). I added some comments along these lines in the patch.
>
> >
> > 5.
> > +
> > + <varlistentry
> > id="sql-createsubscription-params-with-detect-update-deleted">
> > + <term><literal>detect_update_deleted</literal>
> > (<type>boolean</type>)</term>
> > + <listitem>
> > + <para>
> > + Specifies whether the detection of <xref
> > linkend="conflict-update-deleted"/>
> > + is enabled. The default is <literal>false</literal>. If set to
> > + true, the dead tuples on the subscriber that are still useful for
> > + detecting <xref linkend="conflict-update-deleted"/>
> > + are retained,
> >
> > One of the purposes of retaining dead tuples is to detect
> > update_delete conflict. But, I also see the following in 0001's commit
> > message: "Since the mechanism relies on a single replication slot, it
> > not only assists in retaining dead tuples but also preserves commit
> > timestamps and origin data. These information will be displayed in the
> > additional logs generated for logical replication conflicts.
> > Furthermore, the preserved commit timestamps and origin data are
> > essential for consistently detecting update_origin_differs conflicts."
> > which indicates there are other cases where retaining dead tuples can
> > help. So, I was thinking about whether to name this new option as
> > retain_dead_tuples or something along those lines?
>
> I used the retain_conflict_info in this version as it looks more general and we
> are already using similar name in patch(RetainConflictInfoData), but we can
> change it later if people have better ideas.
>
> Attached the V19 patch which addressed comments in [1][2][3][4][5][6][7].

Consider a LR setup with retain_conflict_info=true for a table t1:
Publisher:
insert into t1 values(1);
-- Have a open transaction before delete operation in subscriber
begin;

Subscriber:
-- delete the record that was replicated
delete from t1;

-- Now commit the transaction in publisher
Publisher:
update t1 set c1 = 2;
commit;

In normal case update_deleted conflict is detected
2025-01-08 15:41:38.529 IST [112744] LOG: conflict detected on
relation "public.t1": conflict=update_deleted
2025-01-08 15:41:38.529 IST [112744] DETAIL: The row to be updated
was deleted locally in transaction 751 at 2025-01-08
15:41:29.811566+05:30.
Remote tuple (2); replica identity full (1).
2025-01-08 15:41:38.529 IST [112744] CONTEXT: processing remote data
for replication origin "pg_16387" during message type "UPDATE" for
replication target relation "public.t1" in transaction 747, finished
at 0/16FBCA0

Now execute the same above case by having a presetup to consume all
the replication slots in the system by executing
pg_create_logical_replication_slot before the subscription is created,
in this case the conflict is not detected correctly.
2025-01-08 15:39:17.931 IST [112551] LOG: conflict detected on
relation "public.t1": conflict=update_missing
2025-01-08 15:39:17.931 IST [112551] DETAIL: Could not find the row
to be updated.
Remote tuple (2); replica identity full (1).
2025-01-08 15:39:17.931 IST [112551] CONTEXT: processing remote data
for replication origin "pg_16387" during message type "UPDATE" for
replication target relation "public.t1" in transaction 747, finished
at 0/16FBC68
2025-01-08 15:39:18.266 IST [112582] ERROR: all replication slots are in use
2025-01-08 15:39:18.266 IST [112582] HINT: Free one or increase
"max_replication_slots".

This is because even though we say create subscription is successful,
the launcher has not yet created the replication slot.

There are few observations from this test:
1) Create subscription does not wait for the slot to be created by the
launcher and starts applying the changes. Should create a subscription
wait till the slot is created by the launcher process.
2) Currently launcher is exiting continuously and trying to create
replication slots. Should the launcher wait for
wal_retrieve_retry_interval configuration before trying to create the
slot instead of filling the logs continuously.
3) If we try to create a similar subscription with
retain_conflict_info and disable_on_error option and there is an error
in replication slot creation, currently the subscription does not get
disabled. Should we consider disable_on_error for these cases and
disable the subscription if we are not able to create the slots.

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bertrand Drouvot 2025-01-08 11:11:59 Re: per backend WAL statistics
Previous Message Zhijie Hou (Fujitsu) 2025-01-08 11:00:24 RE: Conflict detection for update_deleted in logical replication