Re: Skipping logical replication transactions on subscriber side

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Skipping logical replication transactions on subscriber side
Date: 2021-06-01 05:28:10
Message-ID: CAA4eK1LgtDyayec1FBJ9MUfPmUxVFR8_Umj+xvUnnUrkt_hs7Q@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jun 1, 2021 at 10:07 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Tue, Jun 1, 2021 at 1:01 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Tue, Jun 1, 2021 at 12:55 AM Peter Eisentraut
> > <peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:
> > >
> > > On 27.05.21 12:04, Amit Kapila wrote:
> > > >>> Also, I am thinking that instead of a stat view, do we need
> > > >>> to consider having a system table (pg_replication_conflicts or
> > > >>> something like that) for this because what if stats information is
> > > >>> lost (say either due to crash or due to udp packet loss), can we rely
> > > >>> on stats view for this?
> > > >> Yeah, it seems better to use a catalog.
> > > >>
> > > > Okay.
> > >
> > > Could you store it in shared memory? You don't need it to be crash safe,
> > > since the subscription will just run into the same error again after
> > > restart. You just don't want it to be lost, like with the statistics
> > > collector.
> > >
> >
> > But, won't that be costly in cases where we have errors in the
> > processing of very large transactions? Subscription has to process all
> > the data before it gets an error.
>
> I had the same concern. Particularly, the approach we currently
> discussed is to skip the transaction based on the information written
> by the worker rather than require the user to specify the XID.
>

Yeah, but I was imagining that the user still needs to specify
something to indicate that we need to skip it; otherwise, we might try
to skip a transaction that the user wants to resolve themselves rather
than expecting us to skip it. Another point is that if we don't store
this information persistently, how will we prevent a user from
specifying some random XID that has not even errored after a restart?
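
To make the second point concrete, here is a rough sketch (in Python,
not PostgreSQL code) of validating a user-specified XID against
persisted error information. Everything in it is an assumption made for
illustration: the ConflictCatalog class, its method names, and the JSON
file standing in for a catalog are all hypothetical and do not
correspond to any existing or proposed API.

```python
import json
import os


class ConflictCatalog:
    """Toy stand-in for a persistent catalog of failed transactions.

    Hypothetical: a JSON file plays the role of the catalog so that the
    recorded error survives a "restart" (a new instance on the same path).
    """

    def __init__(self, path):
        self.path = path
        self.rows = {}
        if os.path.exists(path):
            with open(path) as f:
                self.rows = json.load(f)

    def _flush(self):
        # Write the whole table back; a real catalog would be transactional.
        with open(self.path, "w") as f:
            json.dump(self.rows, f)

    def record_error(self, subname, xid):
        # What the apply worker would do when a transaction fails.
        self.rows[subname] = {"failed_xid": xid, "skip_xid": None}
        self._flush()

    def request_skip(self, subname, xid):
        # What the user-facing command would do: only an XID with a
        # recorded error may be marked for skipping.
        row = self.rows.get(subname)
        if row is None or row["failed_xid"] != xid:
            raise ValueError(f"XID {xid} has no recorded error for {subname!r}")
        row["skip_xid"] = xid
        self._flush()

    def should_skip(self, subname, xid):
        # What the apply worker would check before applying a transaction.
        row = self.rows.get(subname)
        return row is not None and row["skip_xid"] == xid
```

Because the failed XID is reloaded from disk, a new ConflictCatalog
instance on the same path can still reject a random XID and accept the
one that actually errored, without the worker having to hit the error
again first.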

> Therefore, we will always require the worker to process the same large
> transaction after the restart in order to skip the transaction.
>
> > I think we can even imagine this
> > feature to be extended to use commitLSN as a skip candidate in which
> > case we can even avoid getting the data of that transaction from the
> > publisher. So if this information is persistent, the user can even set
> > the skip identifier after the restart before the publisher can send
> > all the data.
>
> Another possible benefit of writing it to a catalog is that we can
> replicate it to the physical standbys. If we have failover slots in
> the future, the physical standby server also can resolve the conflict
> without processing a possibly large transaction.
>

Makes sense.

> > I think the XID (or say another identifier like commitLSN) which we
> > want to use for skipping the transaction as specified by the user has
> > to be stored in the catalog because otherwise, after the restart we
> > won't remember it and the user won't know that he needs to set it
> > again. Now, say we have multiple skip identifiers (XIDs, commitLSN,
> > ..), isn't it better to store all conflict-related information in a
> > separate catalog like pg_subscription_conflict or something like that.
> > I think it might be also better to later extend it for auto conflict
> > resolution where the user can specify auto conflict resolution info
> > for a subscription. Is it better to store all such information in
> > pg_subscription or have a separate catalog? It is possible that even
> > if we have a separate catalog for conflict info, we might not want to
> > store error info there.
>
> Just to be clear, we need to store only the conflict-related
> information that cannot be resolved without manual intervention,
> right? That is, conflicts cause an error, exiting the workers. In
> general, replication conflicts include also conflicts that don’t cause
> an error. I think that those conflicts don’t necessarily need to be
> stored in the catalog and don’t require manual intervention.
>

Yeah, I think we want to record the error cases, but which other
conflicts are you talking about here that don't lead to any sort of
error?

--
With Regards,
Amit Kapila.
