From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Henry Hinze <henry(dot)hinze(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop |
Date: | 2020-09-30 21:42:16 |
Message-ID: | 20200930214216.GA5296@alvherre.pgsql |
Lists: | pgsql-bugs |
On 2020-Sep-30, Tom Lane wrote:
> Henry Hinze <henry(dot)hinze(at)gmail(dot)com> writes:
> > I've made an important observation!
> > Since I had the impression this setup was already working with RC1 of PG
> > 13, I re-installed RC1 and did the same test. And it's working fine!
>
> Ugh. So that points the finger at commits 07082b08c/bfb12cd2b,
> which are the only nearby changes between rc1 and 13.0. A quick
> comparison of before-and-after checkouts confirms it.
Oh dear.
> After some digging around, I realize that that commit actually
> resulted in a protocol break. libpqwalreceiver is expecting to
> get an additional CommandComplete message after COPY OUT finishes,
> per libpqrcv_endstreaming(), and it's no longer getting one.
>
> (I have not read the protocol document to see if this is per spec;
> but spec or no, that's what libpqwalreceiver is expecting.)
Yeah, definitely.
The minimal fix seems to be to add an EndReplicationCommand() call in
the T_StartReplicationCmd case. Testing this now ...
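
As a rough illustration of the mismatch described above, here is a toy model of the end-of-stream exchange. It is not the walsender or libpqwalreceiver code: both function names are hypothetical, and the only things taken from the real protocol are the message type bytes (`c` for CopyDone, `C` for CommandComplete, `Z` for ReadyForQuery). It assumes, per the discussion, that the receiver insists on a CommandComplete once the COPY stream has ended.

```python
# Toy model only; function names are hypothetical, not PostgreSQL APIs.
# Message type bytes are those of the frontend/backend protocol:
#   'c' = CopyDone, 'C' = CommandComplete, 'Z' = ReadyForQuery.

def server_end_of_stream(send_command_complete):
    """Return the messages a simulated sender emits when streaming ends."""
    msgs = ["c"]              # CopyDone closes the COPY stream
    if send_command_complete:
        msgs.append("C")      # the message the receiver is waiting for
    msgs.append("Z")          # ReadyForQuery
    return msgs


def client_end_streaming(msgs):
    """Simulated receiver: require CopyDone, CommandComplete, ReadyForQuery in order."""
    it = iter(msgs)
    if next(it, None) != "c":
        return False
    if next(it, None) != "C":
        return False          # missing CommandComplete: treated as a failure
    return next(it, None) == "Z"
```

With `send_command_complete=True` the simulated receiver accepts the sequence; with `False` it rejects it, which loosely mirrors a subscriber that keeps failing and restarting its initial sync.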
> The question that this raises is how the heck did that get past
> our test suites? It seems like the error should have been obvious
> to even the most minimal testing.
... yeah, that's indeed an important question. I'm going to guess that
the TAP suites are too forgiving :-(