From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: recovery modules |
Date: | 2023-03-15 04:13:09 |
Message-ID: | 20230315041309.GA596995@nathanxps13 |
Lists: | pgsql-hackers |
I noticed that the new TAP test for basic_archive was failing
intermittently for cfbot. It looks like the query for checking that the
post-backup WAL is restored sometimes executes before archive recovery is
complete (because hot_standby is on). To fix this, I adjusted the test to
use poll_query_until instead. There are no other changes in v14.
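To illustrate why polling helps here, below is a minimal sketch in Python of the retry-until-expected pattern. It is not the Perl helper from the TAP framework; the names poll_until and FakeStandby, the timeout, and the interval are all hypothetical, and the fake node only simulates a row that becomes visible after a few attempts while recovery is still replaying WAL.

```python
import time
from typing import Callable


def poll_until(run_query: Callable[[], str], expected: str = "t",
               timeout_s: float = 5.0, interval_s: float = 0.1) -> bool:
    """Re-run run_query until it returns `expected` or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if run_query() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


class FakeStandby:
    """Hypothetical stand-in for a node that is still in archive recovery.

    The post-backup row only becomes visible on attempt `ready_after`.
    """

    def __init__(self, ready_after: int) -> None:
        self.calls = 0
        self.ready_after = ready_after

    def row_is_visible(self) -> str:
        self.calls += 1
        return "t" if self.calls >= self.ready_after else "f"


if __name__ == "__main__":
    node = FakeStandby(ready_after=3)
    # A single query issued too early would see "f"; polling keeps retrying.
    print(poll_until(node.row_is_visible, interval_s=0.01))
```

A one-shot check against the same fake node fails on the first call, which is the intermittent failure described above; the polling version tolerates the race at the cost of a bounded wait.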
I first tried to set hot_standby to off on the restored node so that the
query wouldn't run until archive recovery completed. This seemed like it
would work because start() uses "pg_ctl --wait", which has the following
note in the docs:
    Startup is considered complete when the PID file indicates that the
    server is ready to accept connections.
However, that's not what happens when hot_standby is off. In that case,
the postmaster.pid file is updated with PM_STATUS_STANDBY once recovery
starts, which wait_for_postmaster_start() interprets as "ready." I see
this was reported before [0], but that discussion fizzled out. IIUC it was
done this way to avoid infinite waits when hot_standby is off and standby
mode is enabled. I could be missing something obvious, but that doesn't
seem necessary when hot_standby is off and recovery mode is enabled because
recovery should end at some point (never mind the halting problem). I'm
still digging into this and may spin off a new thread if I can conjure up a
proposal.
[0] https://postgr.es/m/CAMkU%3D1wrMqPggnEfszE-c3PPLmKgRK17_qr7tmxBECYEbyV-4Q%40mail.gmail.com
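The behavior described above can be modeled with a small Python sketch. This is a simplified model, not the pg_ctl source: the PmStatus values stand in for the PM_STATUS_* states, their string spellings are assumptions, and stricter_wait_is_done is only one hypothetical reading of the idea floated in this message (no such proposal exists yet). How pg_ctl would learn whether standby mode is enabled is left open.

```python
from enum import Enum


class PmStatus(Enum):
    # Stand-ins for the postmaster status recorded in postmaster.pid;
    # the exact strings here are assumptions for illustration.
    STARTING = "starting"
    STANDBY = "standby"
    READY = "ready"
    STOPPING = "stopping"


def current_wait_is_done(status: PmStatus) -> bool:
    """Model of the behavior described in the email: both READY and
    STANDBY are treated as "startup complete" by the wait loop."""
    return status in (PmStatus.READY, PmStatus.STANDBY)


def stricter_wait_is_done(status: PmStatus, standby_mode: bool) -> bool:
    """Hypothetical alternative: with hot_standby off, keep waiting
    through plain archive recovery, which should end eventually, but
    still stop at STANDBY when standby mode is enabled, because that
    server may never accept connections."""
    if status is PmStatus.READY:
        return True
    if status is PmStatus.STANDBY:
        return standby_mode
    return False


if __name__ == "__main__":
    # hot_standby off, archive recovery in progress, no standby mode.
    print(current_wait_is_done(PmStatus.STANDBY))          # True
    print(stricter_wait_is_done(PmStatus.STANDBY, False))  # False
```

Under the first function, "pg_ctl --wait" returns while recovery is still running, which is why turning hot_standby off did not make the test wait for recovery to finish.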
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachment | Content-Type | Size |
---|---|---|
v14-0001-Move-extra-code-out-of-the-Pre-PostRestoreComman.patch | text/x-diff | 2.1 KB |
v14-0002-Don-t-proc_exit-in-startup-s-SIGTERM-handler-if-.patch | text/x-diff | 4.5 KB |
v14-0003-introduce-routine-for-checking-mutually-exclusiv.patch | text/x-diff | 2.9 KB |
v14-0004-refactor-code-for-restoring-via-shell.patch | text/x-diff | 28.2 KB |
v14-0005-rename-archive-modules.sgml-to-archive-and-resto.patch | text/x-diff | 1.8 KB |
v14-0006-restructure-archive-modules-docs-in-preparation-.patch | text/x-diff | 11.5 KB |
v14-0007-introduce-restore_library.patch | text/x-diff | 70.3 KB |