| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Andres Freund <andres(at)anarazel(dot)de> |
| Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se> |
| Subject: | Re: Adding CI to our tree |
| Date: | 2022-01-19 04:54:12 |
| Message-ID: | 450972.1642568052@sss.pgh.pa.us |
| Lists: | pgsql-hackers |
Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2022-01-18 21:50:07 -0500, Tom Lane wrote:
>> This actually causes parallel check-world to fail altogether on florican's
>> host, because the initial fsync of the recovered primary takes more than 3
>> minutes when there's conflicting I/O traffic, causing pg_ctl to time out.
> Ugh.

I misspoke there: it's the standby that is performing an fsync'd
checkpoint and timing out, during the test's promote-the-standby
step.

This test attempt revealed another problem too: the standby never
shut down, and thus the calling "make" never quit, until I intervened
manually. I'm not sure why. I see that Cluster::promote uses
system_or_bail() to run "pg_ctl promote" ... could it be that
BAIL_OUT causes the normal script END hooks to not get run?
But it seems like we'd have noticed that long ago.

regards, tom lane