From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, Alexey Lesovsky <lesovsky(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Skipping logical replication transactions on subscriber side |
Date: | 2022-01-11 08:20:39 |
Message-ID: | CAD21AoBL02vLhdfQk8qcM9sgY8aRujopw16qvkZ5e_+oQp3THQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Jan 11, 2022 at 3:12 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Jan 11, 2022 at 8:52 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Mon, Jan 10, 2022 at 8:50 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > I was thinking what if we don't advance origin explicitly in this
> > > case? Actually, that will be no different than the transactions where
> > > the apply worker doesn't apply any change because the initial sync is
> > > in progress (see should_apply_changes_for_rel()) or we have received
> > > an empty transaction. In those cases also, the origin lsn won't be
> > > advanced even though we acknowledge the advanced last_received
> > > location because of keep_alive messages. Now, it is possible after the
> > > restart we send the old start_lsn location because the replication
> > > origin was not updated before restart but we handle that case in the
> > > server by starting from the last confirmed location. See below code:
> > >
> > > CreateDecodingContext()
> > > {
> > > ..
> > > else if (start_lsn < slot->data.confirmed_flush)
> > > ..
> >
> > Good point. Probably one minor thing that is different from the
> > transaction where the apply worker applied an empty transaction is a
> > case where the server restarts/crashes before sending an
> > acknowledgment of the flush location. That is, in the case of the
> > empty transaction, the publisher sends an empty transaction again. On
> > the other hand in the case of skipping the transaction, a non-empty
> > transaction will be sent again but skip_xid is already changed or
> > cleared, therefore the user will have to specify skip_xid again. If we
> > write replication origin WAL record to advance the origin lsn, it
> > reduces the possibility of that. But I think it’s a very minor case so
> > we won’t need to deal with that.
> >
>
> Yeah, in the worst case, it will lead to conflict again and the user
> needs to set the xid again.

On second thought, the same is true for other cases, for example,
preparing the transaction and clearing skip_xid while handling a
prepare message. Currently we don't clear skip_xid while handling a
prepare message; we clear it while handling the commit/rollback
prepared message instead, in order to avoid the worst case. If we did
both while handling a prepare message and the server crashed between
the two steps, skip_xid would be cleared and the transaction would be
resent, which is identical to the worst case above. Therefore, if we
accept this situation because of its low probability, we can probably
do the same for the other cases too, which makes the patch simpler,
especially for the prepare and commit/rollback-prepared cases. What do
you think?

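
To make the crash window concrete, here is a minimal sketch in C. It is
not code from the patch: SubscriberState, CrashPoint, progress_saved and
both functions are hypothetical names, and the model assumes only what
is described above, namely that clearing skip_xid and making the
transaction's progress durable are two separate steps and that a
transaction whose progress was not saved is resent after restart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;
#define INVALID_XID ((Xid) 0)

/* Hypothetical durable subscriber state; not PostgreSQL's catalog layout. */
typedef struct SubscriberState
{
    Xid  skip_xid;        /* xid the user asked to skip, or INVALID_XID */
    bool progress_saved;  /* true once the transaction will not be resent */
} SubscriberState;

typedef enum CrashPoint
{
    NO_CRASH,
    CRASH_BETWEEN_STEPS
} CrashPoint;

/*
 * Handle the skipped transaction as two separate durable steps, with an
 * optional simulated crash between them.
 */
static void
handle_skipped_xact(SubscriberState *st, CrashPoint crash)
{
    assert(st != NULL);

    st->skip_xid = INVALID_XID;     /* step 1: clear skip_xid */

    if (crash == CRASH_BETWEEN_STEPS)
        return;                     /* step 2 never becomes durable */

    st->progress_saved = true;      /* step 2: prepare / record progress */
}

/*
 * After restart the transaction is resent unless its progress was saved.
 * A resent transaction conflicts again when skip_xid no longer matches.
 */
static bool
user_must_set_skip_xid_again(const SubscriberState *st, Xid xid)
{
    bool resent = !st->progress_saved;

    return resent && st->skip_xid != xid;
}
```

Without a crash both steps complete and nothing is resent. With a crash
between the steps, skip_xid is already cleared while the transaction is
still resent, so the user has to set skip_xid again, which is the same
worst case as above.
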
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/