From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
Cc: | Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Exit walsender before confirming remote flush in logical replication |
Date: | 2022-12-27 09:25:07 |
Message-ID: | CAA4eK1L+JVwRiy9Z=7DK5B_FGb8hJcsH=nz8BvPFVg82s7zH5g@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Dec 27, 2022 at 2:50 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Dec 27, 2022 at 1:44 PM Hayato Kuroda (Fujitsu)
> <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> >
> > Thanks for checking my proposal!
> >
> > > - * Note that if we determine that there's still more data to send, this
> > > - * function will return control to the caller.
> > > + * Note that if we determine that there's still more data to send or we are in
> > > + * the physical replication more, this function will return control to the
> > > + * caller.
> > >
> > > I think in this comment you meant to say
> > >
> > > 1. "or we are in physical replication mode and all WALs are not yet replicated"
> > > 2. Typo /replication more/replication mode
> >
> > Firstly I considered 2, but I thought 1 seemed to be better.
> > PSA the updated patch.
> >
>
> I think even for logical replication we should check whether there is
> any pending WAL (via pq_is_send_pending()) to be sent. Otherwise, what
> is the point to send the done message? Also, the caller of
> WalSndDone() already has that check which is another reason why I
> can't see why you didn't have the same check in function WalSndDone().
>
> BTW, even after fixing this, I think logical replication will behave
> differently when due to some reason (like time-delayed replication)
> send buffer gets full and walsender is not able to send data. I think
> this will be less of an issue with physical replication because there
> is a separate walreceiver process to flush the WAL which doesn't wait
> but the same is not true for logical replication. Do you have any
> thoughts on this matter?
>
In logical replication, this can happen today as well, even without
time-delayed replication. For example, if the apply worker is waiting to
acquire a lock that is already held by some other backend, we will see
the same behavior. I have not verified this, so you may want to check
it once.
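
To make the concern concrete, here is a toy model of the exit condition
being discussed. It is only a sketch and not the walsender code: the
struct, its fields, and toy_can_send_done() are hypothetical names made
up for illustration. The assumption is that a logical walsender skips
the remote-flush check but still refuses to finish while output is
pending in the send buffer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified sender state; not a PostgreSQL struct. */
typedef struct ToySender
{
    uint64_t sent_lsn;          /* last WAL position handed to the socket */
    uint64_t remote_flush_lsn;  /* last position the receiver reported flushed */
    bool     send_pending;      /* true while unsent bytes sit in the send buffer */
    bool     logical;           /* true for logical, false for physical replication */
} ToySender;

/*
 * Returns true when the toy sender may send its "done" message and exit.
 * Assumption: pending output always blocks the exit; only physical
 * replication additionally waits for the remote flush to catch up.
 */
static bool
toy_can_send_done(const ToySender *s)
{
    if (s->send_pending)
        return false;           /* buffer not drained, nothing to conclude yet */

    if (s->logical)
        return true;            /* no wait for remote flush confirmation */

    return s->sent_lsn == s->remote_flush_lsn;
}
```

In this model, an apply worker that is stuck on a lock stops reading, the
send buffer fills, send_pending stays true, and the sender cannot finish
regardless of whether time-delayed replication is in use.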
--
With Regards,
Amit Kapila.