Re: Exit walsender before confirming remote flush in logical replication

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: ashutosh(dot)bapat(dot)oss(at)gmail(dot)com
Cc: kuroda(dot)hayato(at)fujitsu(dot)com, pgsql-hackers(at)postgresql(dot)org, amit(dot)kapila16(at)gmail(dot)com, ashutosh(dot)bapat(at)enterprisedb(dot)com
Subject: Re: Exit walsender before confirming remote flush in logical replication
Date: 2022-12-23 02:21:54
Message-ID: 20221223.112154.220544589754432382.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Thu, 22 Dec 2022 17:29:34 +0530, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote in
> On Thu, Dec 22, 2022 at 11:16 AM Hayato Kuroda (Fujitsu)
> <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> > In case of logical replication, however, we cannot support the use-case that
> > switches the role publisher <-> subscriber. Suppose same case as above, additional
..
> > Therefore, I think that we can ignore the condition for shutting down the
> > walsender in logical replication.
...
> > This change may be useful for time-delayed logical replication. The walsender
> > waits the shutdown until all changes are applied on subscriber, even if it is
> > delayed. This causes that publisher cannot be stopped if large delay-time is
> > specified.
>
> I think the current behaviour is an artifact of using the same WAL
> sender code for both logical and physical replication.

Yeah, I don't think we do that for the reason of switchover. On the
other hand I think the behavior was intentionally taken over since it
is thought as sensible alone. And I'm afraind that many people already
relies on that behavior.

> I agree with you that the logical WAL sender need not wait for all the
> WAL to be replayed downstream.

Thus I feel that it might be a bit outrageous to get rid of that
bahavior altogether because of a new feature stumbling on it. I'm
fine doing that only in the apply_delay case, though. However, I have
another concern that we are introducing the second exception for
XLogSendLogical in the common path.

I doubt that anyone wants to use synchronous logical replication with
apply_delay since the sender transaction is inevitablly affected back
by that delay.

Thus how about before entering an apply_delay, logrep worker sending a
kind of crafted feedback, which reports commit_data.end_lsn as
flushpos? A little tweak is needed in send_feedback() but seems to
work..

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2022-12-23 02:25:24 Re: Force streaming every change in logical decoding
Previous Message Richard Guo 2022-12-23 02:21:14 Re: Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)