Re: Conflict detection for update_deleted in logical replication

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Conflict detection for update_deleted in logical replication
Date: 2025-02-10 04:56:07
Message-ID: CAFiTN-t47aOsghd669jVKVvpSA9nKKE0JpEjGD79KfxoqfnmJw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 7, 2025 at 11:17 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Feb 7, 2025 at 2:18 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > I'd like to confirm what users would expect of this
> > max_conflict_retention_duration option and that it works as expected. IIUC
> > users would want to use this option when they want to balance between
> > the reliable update_deleted conflict detection and the performance. I
> > think they want to detect update_deleted reliably as much as possible
> > but, at the same time, would like to avoid a huge performance dip
> > caused by it. IOW, once the apply lag becomes larger than the limit,
> > they would expect to prioritize the performance (recovery) over the
> > reliable update_deleted conflict detection.
> >
>
> Yes, this understanding is correct.
>
> > With the subscription-level max_conflict_retention_duration, users can
> > set it to '5min' to a subscription, SUB1, while not setting it to
> > another subscription, SUB2, (assuming here that both subscriptions set
> > retain_conflict_info = true). This setting works fine if SUB2 could
> > easily catch up while SUB1 is delaying, because in this case, SUB1
> > would stop updating its xmin when delaying for 5 min or longer so the
> > slot's xmin can advance based only on SUB2's xmin. Which is good
> > because it ultimately allows vacuum to remove dead tuples and
> > contributes to better performance. On the other hand, in cases where
> > SUB2 is as delayed as or more than SUB1, even if SUB1 stopped updating
> > its xmin, the slot's xmin would not be able to advance. IIUC
> > pg_conflict_detection slot won't be invalidated as long as there is at
> > least one subscription that sets retain_conflict_info = true and
> > doesn't set max_conflict_retention_duration, even if other
> > subscriptions set max_conflict_retention_duration.
> >

That seems like a valid point.
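
To make the concern concrete, here is a rough model of the behavior
described above. It is only a sketch, not code from the patch: the
function name and the xmin values are hypothetical, and it ignores
transaction ID wraparound. It assumes the slot's xmin is the oldest xmin
among the subscriptions that are still retaining conflict information.

```python
from typing import Dict, Optional


def slot_xmin(subscription_xmins: Dict[str, Optional[int]]) -> Optional[int]:
    """Oldest xmin among subscriptions that still retain conflict info.

    None for a subscription means it has stopped updating its xmin, for
    example after exceeding its max_conflict_retention_duration.
    Simplification: plain integer comparison, no XID wraparound handling.
    """
    retaining = [x for x in subscription_xmins.values() if x is not None]
    return min(retaining) if retaining else None


# SUB1 has given up after its 5min limit; SUB2 has no limit.
# If SUB2 has caught up, the slot's xmin can advance to SUB2's xmin.
sub2_caught_up = slot_xmin({"SUB1": None, "SUB2": 1000})

# If SUB2 is as delayed as SUB1 was, the slot's xmin stays where it was,
# even though SUB1 stopped retaining.
sub2_delayed = slot_xmin({"SUB1": None, "SUB2": 500})
```

In the second case SUB1's limit has no visible effect on dead tuple
removal, which is the behavior being questioned.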

>
> > I'm not really sure that these behaviors are the expected behavior of
> > users who set max_conflict_retention_duration to some subscriptions.
> > Or I might have set the wrong expectation or assumption on this
> > parameter. I'm fine with a subscription-level
> > max_conflict_retention_duration if it's clear this option works as
> > expected by users who want to use it.
> >
>
> It seems you are not convinced to provide this parameter at the
> subscription level and anyway providing it as GUC will simplify the
> implementation and it would probably be easier for users to configure
> rather than giving it at the subscription level for all subscriptions
> that have retain_conflict_info set to true. I guess in the future
> we can provide it at the subscription level as well if there is a
> clear use case for it. Does that make sense to you?

Would it make sense to introduce a GUC parameter for this value, where
subscribers can override it for their specific subscriptions, but
only up to the limit set by the GUC? This would allow flexibility in
certain cases: subscribers could opt to wait for a shorter duration
than the GUC value if needed.

Although a concrete use case isn't immediately clear, consider a
hypothetical scenario: Suppose a subscriber connected to Node1 must
wait for a long period to avoid an incorrect conflict decision. In such
cases, it would rely on the default high value set by the GUC.
However, since Node1 is generally responsive and rarely has
long-running transactions, this long wait would only be necessary in
rare cases. On the other hand, a subscriber pulling from Node2 may not
require as stringent conflict detection. If Node2 frequently has
long-running transactions, waiting too long could lead to excessive
delays.

The idea here is that the Node1 subscriber can wait for the full
max_conflict_retention_duration set by the GUC when necessary, while
the Node2 subscriber can choose a shorter wait time to avoid
unnecessary delays caused by frequent long transactions.
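
Roughly, the rule I have in mind could be sketched as below. This is
only an illustration under my own assumptions, not a proposed
implementation: the function names and the 30min/5min values are made
up, and it does not model any special "no limit" setting.

```python
from typing import Optional


def effective_retention_ms(guc_limit_ms: int,
                           sub_override_ms: Optional[int]) -> int:
    """Retention duration a subscription actually uses.

    The GUC acts as the upper bound. A subscription may ask for a
    shorter duration, but a larger request is capped at the GUC value.
    """
    if sub_override_ms is None:
        return guc_limit_ms
    return min(sub_override_ms, guc_limit_ms)


def should_stop_retaining(apply_lag_ms: int,
                          guc_limit_ms: int,
                          sub_override_ms: Optional[int] = None) -> bool:
    """True once the apply lag exceeds the subscription's effective limit."""
    return apply_lag_ms > effective_retention_ms(guc_limit_ms, sub_override_ms)


MINUTE_MS = 60 * 1000
guc_limit = 30 * MINUTE_MS  # hypothetical GUC value

# Node1 subscriber: no override, so it waits up to the full GUC value.
node1_limit = effective_retention_ms(guc_limit, None)

# Node2 subscriber: opts for a shorter 5min wait.
node2_limit = effective_retention_ms(guc_limit, 5 * MINUTE_MS)

# With a 10min apply lag, only the Node2 subscriber stops retaining.
node1_stops = should_stop_retaining(10 * MINUTE_MS, guc_limit)
node2_stops = should_stop_retaining(10 * MINUTE_MS, guc_limit, 5 * MINUTE_MS)
```

A subscription asking for more than the GUC value would simply get the
GUC value, so the GUC stays the single place to bound retention.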

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
