From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Michael Banck <michael(dot)banck(at)credativ(dot)de> |
Cc: | PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Add a log message on recovery startup before syncing datadir |
Date: | 2020-10-07 08:11:46 |
Message-ID: | CA+hUKGLRY+C5vxPQJ45Vxww-68jF0w9hXm8xhbVdiZPjZC1KqA@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Oct 7, 2020 at 8:58 PM Michael Banck <michael(dot)banck(at)credativ(dot)de> wrote:
> we had a customer incident recently where they needed to do a PITR.
> Their data directory is on a NetApp NFS and they have several hundred
> databases in their instance. The startup sync (i.e. before the message
> "starting archive recovery" appears) took 20 minutes and during the
Nice data point.
> first try[1] they were wondering what's going on because there is just
> one log message ("database system was interrupted; last known up at
> ...") and the postmaster process is in state 'D'. Attaching strace
> revealed that it was syncing files and due to the NFS performance that
> took a long time.
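
For readers who have not looked at what that phase does, here is a rough sketch of the pattern being described: one log line up front, then every file under the data directory is opened and fsync'ed in turn. This is not PostgreSQL's code. The function name `sync_data_directory` and the wording of the log messages are made up for illustration, and it assumes a POSIX system where fsync works on a read-only descriptor.

```python
import logging
import os
import time

log = logging.getLogger("startup_sync_sketch")


def sync_data_directory(datadir):
    """Open and fsync every regular file under datadir.

    Hypothetical illustration only: the name, the log text and the
    traversal order are assumptions, not what the server actually does.
    Returns the number of files that were synced.
    """
    # Say what is about to happen *before* the slow part starts, so that
    # someone watching the log can tell the process is not hung.
    log.info("syncing data directory \"%s\", this may take a while", datadir)

    start = time.monotonic()
    synced = 0
    for root, _dirs, files in os.walk(datadir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.islink(path):
                # Skip symlinks in this sketch; a real implementation has
                # to decide how to treat them (e.g. tablespace links).
                continue
            # One open() plus one fsync() per file: on a slow filesystem
            # such as NFS, this per-file round trip is where time goes.
            fd = os.open(path, os.O_RDONLY)
            try:
                os.fsync(fd)
            finally:
                os.close(fd)
            synced += 1

    log.info("synced %d files in %.1f s", synced, time.monotonic() - start)
    return synced
```

Calling `sync_data_directory("/path/to/datadir")` with logging configured at INFO level prints the first message immediately and the summary only once the walk has finished, which is the visibility that was missing in the incident above.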
No objection to adding a message, but see also this other thread,
about potential ways to get rid of that sync completely, or at least
the phase where you have to open all the files one by one:
Also, maybe of interest for PITR use cases, see this other thread
about relaxing the end-of-recovery checkpoint (well, the patch doesn't
do that yet, but it'd be a small step to not wait for it, based on a
GUC, once the checkpointer is running):