From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Conflict detection for update_deleted in logical replication
Date: 2025-01-15 04:08:12
Message-ID: CAA4eK1L4b8z+gGHYJCMKxvwOvDh5YCYpYY18Xq3_AmN5YTGznQ@mail.gmail.com
Lists: pgsql-hackers
On Wed, Jan 15, 2025 at 5:57 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Mon, Jan 13, 2025 at 8:39 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > As of now, I can't think of a way to throttle the publisher when the
> > apply worker lags. Basically, we need some way to throttle (reduce the
> > speed of backends) when the apply worker is lagging behind a threshold
> > margin. Can you think of some way? I thought that if users notice
> > frequent invalidation of the launcher's slot due to max_lag, they can
> > rebalance their workload on the publisher.
>
> I don't have any ideas other than invalidating the launcher's slot
> when the apply lag is huge. We can think of invalidating the
> launcher's slot for reasons such as replay lag, the age of the
> slot's xmin, and the duration.
>
Right, this is exactly where we are heading. I think we can add
reasons step-wise. For example, as a first step, we can invalidate the
slot due to replay lag. Then we can gradually add other reasons as
well.
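
To make that first step concrete, here is a minimal sketch (not
PostgreSQL source; names like max_lag_bytes are assumptions for
illustration) of the kind of check the launcher could run to decide
whether replay lag has crossed the threshold:

```python
# Hypothetical sketch of a replay-lag threshold check. LSNs are modeled
# as plain byte offsets; in a real implementation these would come from
# the publisher's insert position and the apply worker's replay position.

def exceeds_max_lag(publisher_lsn: int, replayed_lsn: int,
                    max_lag_bytes: int) -> bool:
    """Return True when the apply worker lags the publisher by more than
    max_lag_bytes, making the launcher's slot a candidate for
    invalidation."""
    return (publisher_lsn - replayed_lsn) > max_lag_bytes
```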
One thing that needs more discussion is the exact way to invalidate a
slot. I mentioned a couple of ideas in my previous email, which I am
repeating here: "If we just invalidate the slot, users can check the
status of the slot and need to disable/enable retain_conflict_info
again to start retaining the required information. This would be
required because we can't allow system slots (slots created
internally) to be created by users. The other way could be that
instead of invalidating the slot, we directly drop/re-create the slot
or increase its xmin. If we choose to advance the slot automatically
without user intervention, we need to let users know via LOG and/or
via information in the view."
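
The three alternatives can be sketched as follows (purely illustrative
Python, not PostgreSQL APIs; the slot is modeled as a dict):

```python
from enum import Enum, auto

class SlotAction(Enum):
    INVALIDATE = auto()      # user must toggle retain_conflict_info to resume
    DROP_RECREATE = auto()   # system rebuilds the slot automatically
    ADVANCE_XMIN = auto()    # slot survives; xmin jumps forward, LOG emitted

def handle_lagging_slot(action: SlotAction, slot: dict,
                        current_xmin: int) -> dict:
    """Apply one of the discussed responses to a slot whose retention
    has exceeded the limit."""
    if action is SlotAction.INVALIDATE:
        slot["invalidated"] = True
    elif action is SlotAction.DROP_RECREATE:
        slot = {"invalidated": False, "xmin": current_xmin}
    elif action is SlotAction.ADVANCE_XMIN:
        slot["xmin"] = current_xmin
        print("LOG: advanced conflict-detection slot xmin automatically")
    return slot
```

The tradeoff the text describes shows up directly: only the first
option requires user intervention, while the latter two must surface
what happened via LOG or a view.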
> >
> > > The max_lag idea sounds interesting for the case
> > > where the subscriber is much behind. Probably we can visit this idea
> > > as a new feature after completing this feature.
> > >
> >
> > Sure, but what will be our answer to users for cases where the
> > performance tanks due to bloat accumulation? The tests show that once
> > the apply_lag becomes large, it becomes almost impossible for the
> > apply worker to catch up (or it takes a very long time) and advance
> > the slot's xmin. Users can disable retain_conflict_info to bring back
> > the performance and get rid of the bloat, but I thought it would be
> > easier for users if we had some knob so that they don't need to wait
> > until the bloat/performance problem actually happens.
>
> Probably retaining dead tuples based on a time duration or their age
> might be another solution, though it would increase the risk of not
> being able to detect update_deleted conflicts. I think in any case, as
> long as we accumulate dead tuples to detect update_deleted conflicts,
> there will be a tradeoff between reliably detecting update_deleted
> conflicts and performance.
>
Right, and users have an option for it. Say, if they set max_lag to -1
(or some special value), we won't invalidate the slot, so the
update_deleted conflict can be detected with complete reliability. At
this stage, it is okay if this information is LOGGED and displayed via
a system view. We need more thought while working on the CONFLICT
RESOLUTION patch; for example, we may need to additionally display a
WARNING or ERROR if the remote tuple's commit_time is earlier than the
last time the slot was invalidated. I don't want to go into a detailed
discussion at this point but just wanted you to know that we will need
additional work for the resolution of update_deleted conflicts to
avoid inconsistency.
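
The reliability check hinted at above could look roughly like this
(a sketch only; timestamps are plain floats here, and the function
name is an assumption, not part of any patch):

```python
from typing import Optional

def detection_is_reliable(remote_commit_ts: float,
                          last_invalidation_ts: Optional[float]) -> bool:
    """None means the slot was never invalidated (e.g. max_lag = -1),
    so update_deleted detection stays fully reliable."""
    if last_invalidation_ts is None:
        return True
    # Dead tuples needed for commits earlier than the last invalidation
    # may already have been vacuumed away, so a WARNING or ERROR would
    # be warranted for such remote changes.
    return remote_commit_ts >= last_invalidation_ts
```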
> As for detecting update_deleted conflicts, we probably don't need the
> whole tuple data of deleted tuples. It would be sufficient if we can
> check the XIDs of deleted tuples to get their origins and commit
> timestamps. Probably the same is true for the old versions of updated
> tuples in terms of detecting update_origin_differ conflicts. If my
> understanding is right, we can probably remove only the tuple data of
> dead tuples that are older than an xmin horizon (excluding the
> launcher's xmin), while leaving the heap tuple header, which can
> minimize the table bloat.
>
I am afraid that is not possible because even to detect the conflict,
we first need to find the matching tuple on the subscriber node. If a
replica_identity or primary_key is present in the table, we could try
to save just that along with the transaction info, but that won't be
simple either. And if neither the RI nor a primary_key is there, we
need the entire tuple to match. We would need a concept of tombstone
tables (or call it a dead-rows store) where the old data is stored
reliably until we no longer need it. We discussed that idea briefly
before [1][2] and decided to move forward with retaining the dead
tuples, based on the theory that we already use similar ideas in other
places.
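
To illustrate why the full tuple data matters, here is a toy sketch
(rows modeled as dicts; not how PostgreSQL matches tuples internally)
of matching an incoming change with and without a replica identity:

```python
def find_matching_row(rows, incoming, key_cols=None):
    """Locate the subscriber row matching an incoming remote change.

    With a replica identity / primary key (key_cols), matching needs
    only the key columns; without one, the entire tuple must be
    compared, which is why the whole old tuple has to be retained."""
    for row in rows:
        if key_cols is not None:
            if all(row[c] == incoming[c] for c in key_cols):
                return row
        elif row == incoming:  # no RI/PK: whole-tuple comparison
            return row
    return None
```

With only the heap tuple header retained, the `row == incoming` branch
would have nothing to compare against, so the old row could never be
found in the no-RI case.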
BTW, a related point to note is that we need to retain the
conflict_info even to detect origin_differ conflicts with complete
reliability. We need only the commit_ts information for that case. See
the analysis in [3].
[1] - https://www.postgresql.org/message-id/CAJpy0uCov4JfZJeOvY0O21_gk9bcgNUDp4jf8%2BBbMp%2BEAv8cVQ%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/e4cdb849-d647-4acf-aabe-7049ae170fbf%40enterprisedb.com
[3] - https://www.postgresql.org/message-id/OSCPR01MB14966F6B816880165E387758AF5112%40OSCPR01MB14966.jpnprd01.prod.outlook.com
--
With Regards,
Amit Kapila.