From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
Cc: | kuroda(dot)hayato(at)fujitsu(dot)com, osumi(dot)takamichi(at)fujitsu(dot)com, vignesh21(at)gmail(dot)com, euler(at)eulerto(dot)com, m(dot)melihmutlu(at)gmail(dot)com, andres(at)anarazel(dot)de, marcos(at)f10(dot)com(dot)br, pgsql-hackers(at)postgresql(dot)org, smithpb2250(at)gmail(dot)com |
Subject: | Re: Time delayed LR (WAS Re: logical replication restrictions) |
Date: | 2022-12-15 03:53:12 |
Message-ID: | CAA4eK1Lq+h8qo+rqGU-E+hwJKAHYocV54y4pvou4rLysCgYD-g@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Dec 15, 2022 at 7:16 AM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> At Wed, 14 Dec 2022 10:46:17 +0000, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> wrote in
> > I have implemented and tested that workers wake up per wal_receiver_timeout/2
> > and send keepalive. Basically it works well, but I found two problems.
> > Do you have any good suggestions about them?
> >
> > 1)
> >
> > With this PoC at present, workers calculate the sending interval based on their
> > wal_receiver_timeout, and sending is suppressed when the parameter is set to zero.
> >
> > This means that walsender may time out when wal_sender_timeout
> > on the publisher and wal_receiver_timeout on the subscriber differ.
> > Supposing that wal_sender_timeout is 2min, wal_receiver_timeout is 5min,
>
> It seems to me wal_receiver_status_interval is better for this use.
> It's enough for us to document that "wal_r_s_interval should be
> shorter than wal_sender_timeout/2, especially when a logical replication
> connection is using min_apply_delay. Otherwise you will suffer
> repeated termination of walsender".
>
This sounds reasonable to me.
> > and min_apply_delay is 10min. The worker on the subscriber will wake up every 2.5min and
> > send keepalives, but walsender exits before the message arrives at the publisher.
> >
> > One idea to avoid that is to send the min_apply_delay subscriber option to the publisher
> > and compare them, but it may not be sufficient, because the XXX_timeout GUC parameters
> > could be modified later.
>
> # Anyway, I don't think such asymmetric setup is preferable.
>
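To make the timing in point 1 concrete, here is a minimal sketch of the arithmetic using the numbers quoted above (wal_sender_timeout = 2min, wal_receiver_timeout = 5min). It is a simplified model, not PostgreSQL code: it assumes the delaying worker sends feedback only at its own fixed interval and ignores network latency and any keepalive requests from the walsender. The function names and the 30s interval are hypothetical, chosen only for illustration.

```python
from datetime import timedelta

DISABLED = timedelta(0)  # in PostgreSQL, 0 disables these timeouts/intervals


def walsender_times_out(wal_sender_timeout, feedback_interval):
    """Simplified model: does the gap between two feedback messages
    reach the publisher's timeout while the worker is delaying?"""
    if wal_sender_timeout == DISABLED:
        return False  # the publisher never times out
    if feedback_interval == DISABLED:
        return True   # the subscriber sends no periodic feedback at all
    return feedback_interval >= wal_sender_timeout


def satisfies_suggested_rule(status_interval, wal_sender_timeout):
    """Rule proposed in this thread: the status interval should be
    shorter than wal_sender_timeout/2."""
    if wal_sender_timeout == DISABLED:
        return True
    if status_interval == DISABLED:
        return False
    return status_interval < wal_sender_timeout / 2


# Numbers from the quoted scenario.
wal_sender_timeout = timedelta(minutes=2)
wal_receiver_timeout = timedelta(minutes=5)
poc_wakeup = wal_receiver_timeout / 2  # the PoC wakes up every 2.5min

print(walsender_times_out(wal_sender_timeout, poc_wakeup))  # True
# A hypothetical 30s status interval would satisfy the suggested rule.
print(satisfies_suggested_rule(timedelta(seconds=30), wal_sender_timeout))  # True
```

With the PoC's 2.5min wakeup the model reports a timeout, whereas any interval under 1min would pass the suggested documentation rule for a 2min wal_sender_timeout.
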
> > 2)
> >
> > The issue reported by Vignesh-san[1] still remains. I have already analyzed it [2];
> > the root cause is that the flushed WAL position is not updated and sent to the publisher. Even
> > if workers send keepalive messages to the publisher during the delay, the flushed position
> > cannot be modified.
>
> I didn't look closer but the cause I guess is walsender doesn't die
> until all WAL has been sent, while logical delay chokes replication
> stream.
>
Right, I also think so.
> Allowing walsender to finish ignoring replication status
> wouldn't be great.
>
Yes, that would be ideal. But do you know why that is a must?
> One idea is to let logical workers send delaying
> status.
>
How can that help?
--
With Regards,
Amit Kapila.