Re: Skipping logical replication transactions on subscriber side

From:	Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc:	Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Skipping logical replication transactions on subscriber side
Date:	2021-06-17 06:24:03
Message-ID:	CAD21AoDYLyhkGOzRh8JYoMjoe39y9teAidS+bRHHsfoDZ1RicA@mail.gmail.com
Lists:	pgsql-hackers

On Wed, Jun 16, 2021 at 6:05 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Jun 15, 2021 at 6:13 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Wed, Jun 2, 2021 at 3:07 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Tue, Jun 1, 2021 at 9:05 PM Peter Eisentraut
> > > <peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:
> > > >
> > > > On 01.06.21 06:01, Amit Kapila wrote:
> > > > > But, won't that be costly in cases where we have errors in the
> > > > > processing of very large transactions? Subscription has to process all
> > > > > the data before it gets an error. I think we can even imagine this
> > > > > feature to be extended to use commitLSN as a skip candidate in which
> > > > > case we can even avoid getting the data of that transaction from the
> > > > > publisher. So if this information is persistent, the user can even set
> > > > > the skip identifier after the restart before the publisher can send
> > > > > all the data.
> > > >
> > > > At least in current practice, skipping parts of the logical replication
> > > > stream on the subscriber is a rare, emergency-level operation when
> > > > something that shouldn't have happened happened. So it doesn't really
> > > > matter how costly it is. It's not going to be more costly than the
> > > > error happening in the first place. All you'd need is one shared memory
> > > > slot per subscription to store a xid to skip.
> > > >
> > >
> > > Leaving aside the performance point, how can we do by just storing
> > > skip identifier (XID/commitLSN) in shared_memory? How will the apply
> > > worker know about that information after restart? Do you expect the
> > > user to set it again, if so, I think users might not like that? Also,
> > > how will we prohibit users to give some identifier other than for
> > > failed transactions, and if users provide that what should be our
> > > action? Without that, if users provide XID of some in-progress
> > > transaction, we might need to do more work (rollback) than just
> > > skipping it.
> >
> > I think the simplest solution would be to have a fixed-size array on
> > the shared memory to store information of skipping transactions on the
> > particular subscription. Given that this feature is meant to be a
> > repair tool in emergency cases, 32 or 64 entries seem enough.
> >
>
> IIUC, here you are talking about xids specified by the user to skip?

Yes. I think we need to store pairs of subid and xid.

> If so, then how will you get that information after the restart, and
> why you need 32 or 64 entries for it?

That information doesn't survive a restart. I expect that a DBA would
use this tool to fix the subscription on the spot. Once the
subscription has skipped the transaction, the entry for it is cleared.
So I think we don't need to hold many entries, and the information
doesn't necessarily need to be durable. Your idea below of storing
that information in ReplicationState seems better to me.
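
To make the fixed-size, non-durable idea concrete, here is a minimal standalone sketch in C. It is not PostgreSQL code: the names (SkipTable, skip_table_set, skip_table_consume), the plain uint32_t stand-ins for the subscription OID and XID, the choice of one pending XID per subscription, and the value 0 as an "empty slot" marker are all assumptions made for illustration. Locking, which real shared memory would need, is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SKIP_ENTRIES 32   /* assumption: the thread suggests 32 or 64 */
#define INVALID_SUBID 0       /* hypothetical marker for an empty slot */

/* One (subid, xid) pair; uint32_t stands in for Oid and TransactionId. */
typedef struct SkipEntry {
    uint32_t subid;
    uint32_t xid;
} SkipEntry;

/* Fixed-size table; contents are lost on restart by design. */
typedef struct SkipTable {
    SkipEntry entries[MAX_SKIP_ENTRIES];
} SkipTable;

/* Mark every slot as empty. */
void skip_table_init(SkipTable *t)
{
    for (size_t i = 0; i < MAX_SKIP_ENTRIES; i++) {
        t->entries[i].subid = INVALID_SUBID;
        t->entries[i].xid = 0;
    }
}

/*
 * Register an XID to skip for a subscription. An existing entry for the
 * same subscription is overwritten. Returns false if subid is invalid or
 * the table is full.
 */
bool skip_table_set(SkipTable *t, uint32_t subid, uint32_t xid)
{
    SkipEntry *free_slot = NULL;

    if (subid == INVALID_SUBID)
        return false;

    for (size_t i = 0; i < MAX_SKIP_ENTRIES; i++) {
        SkipEntry *e = &t->entries[i];

        if (e->subid == subid) {
            e->xid = xid;
            return true;
        }
        if (e->subid == INVALID_SUBID && free_slot == NULL)
            free_slot = e;
    }

    if (free_slot == NULL)
        return false;

    free_slot->subid = subid;
    free_slot->xid = xid;
    return true;
}

/*
 * Called when the apply side starts a transaction: returns true if this
 * (subid, xid) pair was registered, and clears the entry so that it is
 * used only once.
 */
bool skip_table_consume(SkipTable *t, uint32_t subid, uint32_t xid)
{
    for (size_t i = 0; i < MAX_SKIP_ENTRIES; i++) {
        SkipEntry *e = &t->entries[i];

        if (e->subid == subid && e->xid == xid) {
            e->subid = INVALID_SUBID;
            e->xid = 0;
            return true;
        }
    }
    return false;
}
```

Clearing the slot on a match mirrors the point above that the entry goes away once the transaction has been skipped, which is why a small table is enough.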

>
> >
> > Anyway, it seems to me that we need to consider the user interface
> > first, especially how and what the user specifies the transaction to
> > skip. My current feeling is that specifying XID is intuitive and
> > flexible but the user needs to have 2 steps: checks XID and then
> > specifies it, and there is a risk that the user mistakenly specifies a
> > wrong XID. On the other hand, the idea of specifying to skip the first
> > transaction doesn’t require the user to check and specify XID but is
> > less flexible, and “the first” transaction might be ambiguous for the
> > user.
> >
>
> I see your point in allowing to specify First N transactions but OTOH,
> I am slightly afraid that it might lead to skipping some useful
> transactions which will make replica out-of-sync.

Agreed.

It might be better to skip only the first transaction.

> BTW, is there any
> data point for the user to check how many transactions it can skip?
> Normally, we won't be able to proceed till we resolve/skip the
> transaction that is generating an error. One possibility could be that
> we provide some *superuser* functions like
> pg_logical_replication_skip_xact()/pg_logical_replication_reset_skip_xact()
> which takes subscription name/id and xid as input parameters. Then, I
> think we can store this information in ReplicationState and probably
> try to map to originid from subscription name/id to retrieve that
> info. We can probably document that the effects of these functions
> won't last after the restart.

ReplicationState seems a reasonable place to store that information.

> Now, if this function is used by super
> users then we can probably trust that they provide the XIDs that we
> can trust to be skipped but OTOH making a restriction to allow these
> functions to be used by superusers might restrict the usage of this
> repair tool.

If we specify the subscription id or name, maybe we could also allow
the owner of the subscription to perform that operation?
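
As a rough illustration of that permission rule (superuser or subscription owner), here is a standalone C sketch. Nothing here is settled in the thread or taken from PostgreSQL: the Role and Subscription structs and the function may_set_skip_xid are hypothetical names used only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a database role. */
typedef struct Role {
    uint32_t oid;
    bool     is_superuser;
} Role;

/* Hypothetical stand-in for a subscription and its owning role. */
typedef struct Subscription {
    uint32_t oid;
    uint32_t owner_oid;
} Subscription;

/*
 * One possible rule: a role may set or reset the skip XID for a
 * subscription if it is a superuser or owns that subscription.
 */
bool may_set_skip_xid(const Role *role, const Subscription *sub)
{
    return role->is_superuser || role->oid == sub->owner_oid;
}
```

Because the check is keyed on the subscription, it only works if the function takes the subscription id or name as input, as proposed above.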

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Re: Skipping logical replication transactions on subscriber side at 2021-06-16 09:05:08 from Amit Kapila

Responses

Re: Skipping logical replication transactions on subscriber side at 2021-06-17 09:20:21 from Masahiko Sawada
