Re: Conflict detection for update_deleted in logical replication

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Conflict detection for update_deleted in logical replication
Date: 2025-02-06 20:47:52
Message-ID: CAD21AoCbjVTjejQxBkyo9kop2HMw85wSJqpB=JapsSE+Kw_iRg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 4, 2025 at 10:30 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Feb 5, 2025 at 6:00 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Fri, Jan 31, 2025 at 9:07 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > >
> > > > I was not sure of the point of
> > > > making the max_conflict_retention_duration a per-subscription
> > > > parameter.
> > > >
> > >
> > > The idea is to keep it at the same level as the other related
> > > parameter 'retain_conflict_info'. It could be useful for cases where
> > > publishers are from two different nodes (NP1 and NP2) and we have
> > > separate subscriptions for both nodes. Now, it is possible that users
> > > won't expect conflicts on the tables from one of the nodes NP1 then
> > > she could choose to enable 'retain_conflict_info' and
> > > 'max_conflict_retention_duration' only for the subscription pointing
> > > to publisher NP2.
> > >
> > > Now, say the publisher node that can generate conflicts (NP2) has
> > > fewer writes and the corresponding apply worker could easily catch up
> > > and almost always be in sync with the publisher. In contrast, the
> > > other node that has no conflicts has a large number of writes. In such
> > > cases, giving new options at the subscription level will be helpful.
> > >
> > > If we want to provide it at the global level, then the performance or
> > > dead tuple control may not be any better than the current patch but
> > > won't allow the provision for the above kinds of cases. Second, adding
> > > two new GUCs is another thing I want to prevent. But OTOH, the
> > > implementation could be slightly simpler if we provide these options
> > > as GUC though I am not completely sure of that point. Having said
> > > that, I am open to changing it to a non-subscription level. Do you
> > > think it would be better to provide one or both of these parameters as
> > > GUCs or do you have something else in mind?
> >
> > It makes sense to me to have the retain_conflict_info as a
> > subscription-level parameter. I was thinking of making only
> > max_conflict_retention_duration a global parameter, but I might be
> > missing something. With a subscription-level
> > max_conflict_retention_duration, how can users choose the setting
> > values for each subscription, and is there a case that can be covered
> > only by a subscription-level max_conflict_retention_duration?
> >
>
> Users can configure depending on the workload of the publisher
> considering the publishers are different nodes as explained in my
> previous response. Also, I think it will help in resolutions where the
> worker for which the duration for updating the worker_level xmin has
> not exceeded the max_conflict_retention_duration can reliably detect
> update_deleted. Then this parameter will only be required for
> subscriptions that have enabled retain_conflict_info. I am not
> completely sure if these are reasons enough to keep at the
> subscription level but OTOH Dilip also seems to favor keeping
> max_conflict_retention_duration at subscription-level.

I'd like to confirm what users would expect of this
max_conflict_retention_duration option and whether it works as they
expect. IIUC users would want to use this option to balance reliable
update_deleted conflict detection against performance. I think they
want to detect update_deleted as reliably as possible but, at the same
time, would like to avoid a huge performance dip caused by it. IOW,
once the apply lag becomes larger than the limit, they would expect
performance (recovery) to be prioritized over reliable update_deleted
conflict detection.

With the subscription-level max_conflict_retention_duration, users can
set it to '5min' for one subscription, SUB1, while leaving it unset for
another subscription, SUB2 (assuming here that both subscriptions set
retain_conflict_info = true). This setting works fine if SUB2 can
easily catch up while SUB1 is delayed, because in this case SUB1
would stop updating its xmin once it has been delayed for 5 min or
longer, so the slot's xmin can advance based only on SUB2's xmin. That
is good because it ultimately allows vacuum to remove dead tuples and
contributes to better performance. On the other hand, in cases where
SUB2 is delayed as much as or more than SUB1, even if SUB1 stopped
updating its xmin, the slot's xmin would not be able to advance. IIUC
the pg_conflict_detection slot won't be invalidated as long as there is
at least one subscription that sets retain_conflict_info = true and
doesn't set max_conflict_retention_duration, even if other
subscriptions set max_conflict_retention_duration.
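
As a rough illustration of the two cases above, here is a toy Python
model. It is not code from the patch: it assumes the slot's xmin is
simply the minimum xmin over the subscriptions that are still
retaining, and that a subscription stops retaining once its delay
reaches its max_conflict_retention_duration. The Sub class, its
fields, and slot_xmin() are hypothetical names used only for this
sketch, and the xid values are made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sub:
    """Hypothetical stand-in for a subscription with retain_conflict_info = true."""
    name: str
    xmin: int                                   # oldest xid this worker still needs
    lag_secs: float                             # how long the worker has been delayed
    max_retention_secs: Optional[float] = None  # None means the option is not set

    def retains(self) -> bool:
        # Assumption: a subscription without a limit always retains; one with
        # a limit stops retaining once its delay reaches that limit.
        return self.max_retention_secs is None or self.lag_secs < self.max_retention_secs


def slot_xmin(subs: List[Sub]) -> Optional[int]:
    """Minimum xmin over subscriptions still retaining; None if there are none."""
    held = [s.xmin for s in subs if s.retains()]
    return min(held) if held else None


# SUB1 has been delayed for 10 min with a 5 min limit, so it no longer counts.
sub1 = Sub("SUB1", xmin=100, lag_secs=600, max_retention_secs=300)

# Case 1: SUB2 keeps up, so the slot's xmin follows SUB2 and vacuum can proceed.
case1 = slot_xmin([sub1, Sub("SUB2", xmin=900, lag_secs=1)])

# Case 2: SUB2 has no limit and is delayed even more than SUB1, so the slot's
# xmin stays pinned at SUB2's old xmin despite SUB1's limit.
case2 = slot_xmin([sub1, Sub("SUB2", xmin=90, lag_secs=900)])

print(case1, case2)
```

In this model the limit on SUB1 only helps in case 1. In case 2 dead
tuples keep accumulating because SUB2, which has no limit, still holds
the slot's xmin back.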

I'm not really sure that these behaviors are what users who set
max_conflict_retention_duration on some subscriptions would expect.
Or I might have the wrong expectation or assumption about this
parameter. I'm fine with a subscription-level
max_conflict_retention_duration if it's clear that this option works as
expected by users who want to use it.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
