From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Peter Eisentraut <peter_e(at)gmx(dot)net>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Steele <david(at)pgmasters(dot)net>, Noah Misch <noah(at)leadboat(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Suraj Kharage <suraj(dot)kharage(at)enterprisedb(dot)com>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Tels <nospam-pg-abuse(at)bloodgate(dot)com>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Subject: | Re: backup manifests and contemporaneous buildfarm failures |
Date: | 2020-04-04 13:20:51 |
Message-ID: | CA+TgmoZORBcBvvGrQnyA4dfM-Pcy0nPmTzKO-hEGFCKjpcEuWA@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, Apr 3, 2020 at 11:06 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2020-04-03 20:48:09 -0400, Robert Haas wrote:
> > 'serinus' is also failing. This is less obviously related:
>
> Hm. Tests passed once since then.
Yeah, but conchuela also failed once in what I think was a similar
way. I suspect the fix I pushed last night
(3e0d80fd8d3dd4f999e0d3aa3e591f480d8ad1fd) may have been enough to
clear this up.
> That already seems suspicious. I checked the following (successful) run
> and I did not see that in the stage's logs.
Yeah, the behavior of the test case doesn't seem to be entirely deterministic.
> I, again, have to say that the amount of stuff that was done as part of
>
> commit 7c4f52409a8c7d85ed169bbbc1f6092274d03920
> Author: Peter Eisentraut <peter_e(at)gmx(dot)net>
> Date: 2017-03-23 08:36:36 -0400
>
> Logical replication support for initial data copy
>
> is insane. Adding support for running sql over replication connections
> and extending CREATE_REPLICATION_SLOT with new options (without even
> mentioning that in the commit message!) as part of a commit described as
> "Logical replication support for initial data copy" shouldn't happen.
I agreed then and still do.
> So I'm a bit confused here. The best approach is probably to try to
> reproduce this by adding an artificial delay into backend shutdown.
I was able to reproduce an assertion failure by starting a
transaction, running a replication command that failed, and then
exiting the backend. 3e0d80fd8d3dd4f999e0d3aa3e591f480d8ad1fd made
that go away. I had wrongly assumed that there was no other way for a
walsender to have a ResourceOwner, and in the face of SQL commands
also being executed by walsenders, that's clearly not true. I'm not
sure *precisely* how that led to the BF failures, but it was really
clear that it was wrong.
> > (I still really dislike the fact that we have this evil hack allowing
> > one connection to mix and match those sets of commands...)
>
> FWIW, I think the opposite. We should get rid of the difference as much
> as possible.
Well, that's another approach. It's OK to have one system and it's OK
to have two systems, but one and a half is not ideal.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company