From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Peter Smith <smithpb2250(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Henry Hinze <henry(dot)hinze(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
Subject: | Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop |
Date: | 2020-11-18 08:43:13 |
Message-ID: | CAA4eK1KnrJp6MgU8VupdaOx=qmi_T+A4oP9dwn-=0g=EO5Fy-g@mail.gmail.com |
Lists: | pgsql-bugs |
On Wed, Nov 18, 2020 at 11:19 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> On Wed, Nov 18, 2020 at 3:17 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > To cut a long story short, a tablesync worker CAN in fact end up
> > > processing (e.g. apply_dispatch) streaming messages.
> > > So the tablesync worker CAN get into the apply_handle_stream_commit.
> > > And this scenario, albeit rare, will crash.
> > >
> >
> > Thank you for reproducing this issue. Dilip, Peter, is either of you
> > interested in writing a fix for this?
>
> Hi Amit.
>
> FYI - Sorry, I am away/offline for the next 5 days.
>
> However, if this bug still remains unfixed after next Tuesday, I
> can look at it then.
>
Fair enough. Let's see if Dilip or I can get a chance to look into
this before that.
> ---
>
> IIUC there are 2 options:
> a) Disallow streaming for the tablesync worker.
> b) Make streaming work for the tablesync worker.
>
> I prefer option (a) not only because of the KISS principle, but also
> because this is how the tablesync worker was previously thought to
> behave anyway. I expect this fix may be like the code that Dilip
> already posted [1]
> [1] https://www.postgresql.org/message-id/CAFiTN-uUgKpfdbwSGnn3db3mMQAeviOhQvGWE_pC9icZF7VDKg%40mail.gmail.com
>
> OTOH, option (b) fix may or may not be possible (I don't know), but I
> have doubts that it is worthwhile to consider making a special fix for
> a scenario which so far has never been reproduced outside of the
> debugger.
>
I would prefer option (b) unless the fix is not possible due to design
constraints. I don't think it is a good idea to make tablesync workers
behave differently unless we have a reason for doing so.
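
To make the difference between the two options concrete, here is a minimal sketch in C. It is a toy model, not PostgreSQL code and not the patch referenced above: WorkerKind, request_streaming_option_a and request_streaming_option_b are hypothetical names, and it assumes that "disallowing streaming" would mean the tablesync worker never asks the publisher for streamed transactions.

```c
#include <stdbool.h>

/* Hypothetical worker kinds for this toy model (not PostgreSQL's types). */
typedef enum
{
    WORKER_APPLY,
    WORKER_TABLESYNC
} WorkerKind;

/*
 * Option (a): a tablesync worker never requests streaming, so it can never
 * be handed a stream message.  The two worker kinds now behave differently.
 */
bool
request_streaming_option_a(WorkerKind kind, bool subscription_streaming)
{
    return subscription_streaming && kind != WORKER_TABLESYNC;
}

/*
 * Option (b): both worker kinds follow the subscription setting.  This is
 * only safe if the stream handlers themselves work in a tablesync worker.
 */
bool
request_streaming_option_b(WorkerKind kind, bool subscription_streaming)
{
    (void) kind;                /* worker kind deliberately ignored */
    return subscription_streaming;
}
```

With streaming enabled on the subscription, option (a) returns false for WORKER_TABLESYNC and true for WORKER_APPLY, whereas option (b) returns true for both. That is the behavioral difference being discussed here.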
--
With Regards,
Amit Kapila.