From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Conflict detection for update_deleted in logical replication |
Date: | 2025-02-05 06:30:32 |
Message-ID: | CAA4eK1Jix5UEb6P4s85NBgG7q1_n4o1_Xn3R7bym87QS+i=0+w@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Feb 5, 2025 at 6:00 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Fri, Jan 31, 2025 at 9:07 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > >
> > > I was not sure of the point of
> > > making the max_conflict_retention_duration a per-subscription
> > > parameter.
> > >
> >
> > The idea is to keep it at the same level as the other related
> > parameter 'retain_conflict_info'. It could be useful for cases where
> > publishers are from two different nodes (NP1 and NP2) and we have
> > separate subscriptions for both nodes. Now, it is possible that users
> > won't expect conflicts on the tables from one of the nodes NP1 then
> > she could choose to enable 'retain_conflict_info' and
> > 'max_conflict_retention_duration' only for the subscription pointing
> > to publisher NP2.
> >
> > Now, say the publisher node that can generate conflicts (NP2) has
> > fewer writes and the corresponding apply worker could easily catch up
> > and almost always be in sync with the publisher. In contrast, the
> > other node that has no conflicts has a large number of writes. In such
> > cases, giving new options at the subscription level will be helpful.
> >
> > If we want to provide it at the global level, then the performance or
> > dead tuple control may not be any better than the current patch but
> > won't allow the provision for the above kinds of cases. Second, adding
> > two new GUCs is another thing I want to prevent. But OTOH, the
> > implementation could be slightly simpler if we provide these options
> > as GUC though I am not completely sure of that point. Having said
> > that, I am open to changing it to a non-subscription level. Do you
> > think it would be better to provide one or both of these parameters as
> > GUCs or do you have something else in mind?
>
> It makes sense to me to have the retain_conflict_info as a
> subscription-level parameter. I was thinking of making only
> max_conflict_retention_duration a global parameter, but I might be
> missing something. With a subscription-level
> max_conflict_retention_duration, how can users choose the setting
> values for each subscription, and is there a case that can be covered
> only by a subscription-level max_conflict_retention_duration?
>
Users can choose the value based on the workload of each publisher,
given that the publishers are different nodes, as explained in my
previous response. Also, I think it will help with conflict
resolution: a worker whose time taken to update the worker_level xmin
has not exceeded max_conflict_retention_duration can still reliably
detect update_deleted. Additionally, this parameter will only be
required for subscriptions that have enabled retain_conflict_info. I
am not completely sure these reasons are enough to keep it at the
subscription level, but OTOH Dilip also seems to favor keeping
max_conflict_retention_duration at the subscription level.
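
To make the two-publisher case concrete, here is a minimal sketch in
Python of the reasoning above. It is only a toy model, not the patch:
the Subscription class, the can_detect_update_deleted() helper, and the
use of seconds as the unit are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Subscription:
    """Toy model of a subscription; not PostgreSQL's catalog layout."""
    name: str
    retain_conflict_info: bool
    # Hypothetical unit: seconds. None means "no limit configured".
    max_conflict_retention_duration: Optional[float] = None


def can_detect_update_deleted(sub: Subscription,
                              secs_updating_xmin: float) -> bool:
    """Return True while the worker is assumed to still retain the
    information needed to detect update_deleted reliably.

    Assumption: retention applies only when retain_conflict_info is
    enabled and the time spent updating the worker-level xmin has not
    exceeded the subscription's own limit.
    """
    if not sub.retain_conflict_info:
        return False
    limit = sub.max_conflict_retention_duration
    return limit is None or secs_updating_xmin <= limit


# NP1: many writes, no conflicts expected, so retention is left off.
sub_np1 = Subscription("sub_np1", retain_conflict_info=False)
# NP2: few writes, conflicts possible, so retention is on with a limit.
sub_np2 = Subscription("sub_np2", retain_conflict_info=True,
                       max_conflict_retention_duration=60.0)
```

With a single global limit, the busy NP1 subscription and the quiet NP2
subscription would have to share one value; in this sketch each
subscription carries its own, and only sub_np2 needs one at all.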
--
With Regards,
Amit Kapila.