Re: Skipping logical replication transactions on subscriber side

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, Alexey Lesovsky <lesovsky(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Skipping logical replication transactions on subscriber side
Date: 2021-12-15 00:38:34
Message-ID: CAD21AoDVjxg-JKNOQPRoKSsmpJVOXJemadPHedXGWbecJwxY3g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 14, 2021 at 8:24 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Dec 14, 2021 at 3:41 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Tue, Dec 14, 2021 at 2:36 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > > >
> > > > > I agree with this theory. Can we reflect this in comments so that in
> > > > > the future we know why we didn't pursue this direction?
> > > >
> > > > I might be missing something here, but for streaming, transaction
> > > > users can decide whether they wants to skip or not only once we start
> > > > applying no? I mean only once we start applying the changes we can
> > > > get some errors and by that time we must be having all the changes for
> > > > the transaction.
> > > >
> > >
> > > That is right and as per my understanding, the patch is trying to
> > > accomplish the same.
> > >
> > > > So I do not understand the point we are trying to
> > > > discuss here?
> > > >
> > >
> > > The point is that whether we can skip the changes while streaming
> > > itself like when we get the changes and write to a stream file. Now,
> > > it is possible that streams from multiple transactions can be
> > > interleaved and users can change the skip_xid in between. It is not
> > > that we can't handle this but that would require a more complex design
> > > and it doesn't seem worth it because we can anyway skip the changes
> > > while applying as you mentioned in the previous paragraph.
> >
> > Actually, I was trying to understand the use case for skipping while
> > streaming. Actually, during streaming we are not doing any database
> > operation that means this will not generate any error.
> >
>
> Say, there is an error the first time when we start to apply changes
> for such a transaction. So, such a transaction will be streamed again.
> Say, the user has set the skip_xid before we stream a second time, so
> this time, we can skip it either during the stream phase or apply
> phase. I think the patch is skipping it during apply phase.
> Sawada-San, please confirm if my understanding is correct?

My understanding is the same. The patch doesn't skip the streaming
phase but starts skipping when starting to apply changes. That is, we
receive streamed changes and write them to the stream file anyway
regardless of skip_xid. When receiving the stream-commit message, we
check whether or not we skip this transaction, and if so we apply all
messages in the stream file other than all data modification messages.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2021-12-15 01:02:24 Re: generalized conveyor belt storage
Previous Message David Zhang 2021-12-15 00:21:17 Re: Question about 001_stream_rep.pl recovery test