From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com> |
Subject: | Re: Background writer and checkpointer in crash recovery |
Date: | 2021-03-12 22:16:00 |
Message-ID: | CA+hUKGKBngNkTLm1gSmosY6o-azfUvQxALaBSEOjs6KpKYQ9pg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Feb 3, 2021 at 11:11 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I think the way it works right now is stupid and the proposed change
> is going in the right direction. We have ample evidence already that
> handing off fsyncs to a background process is a good idea, and there's
> no reason why that shouldn't be beneficial during crash recovery just
> as it is at other times. But even if it somehow failed to improve
> performance during recovery, there's another good reason to do this,
> which is that it would make the code simpler. Having the pendingOps
> stuff in the startup process in some recovery situations and in the
> checkpointer in other recovery situations makes this harder to reason
> about. As Tom said, the system state where bgwriter and checkpointer
> are not running is an uncommon one, and is probably more likely to
> have (or grow) bugs than the state where they are running.
Yeah, it's a good argument.
> The rat's-nest of logic introduced by the comment "Perform a
> checkpoint to update all our recovery activity to disk." inside
> StartupXLOG() could really do with some simplification. Right now we
> have three cases: CreateEndOfRecoveryRecord(), RequestCheckpoint(),
> and CreateCheckpoint(). Maybe with this change we could get it down to
> just two, since RequestCheckpoint() already knows what to do about
> !IsUnderPostmaster.
True. Done in this version.
Here's a rebase of this patch + Simon's patch to report on stats.
I also have a sketch patch to provide a GUC that turns off the
end-of-recovery checkpoint as mentioned earlier, attached (sharing
mainly because this is one of the stack of patches that Jakub was
testing for his baseback/PITR workloads and he might want to test some
more), but I'm not proposing that for PG14. That idea is tangled up
with the "relfile tombstone" stuff I wrote about elsewhere[1], but I
haven't finished studying the full arsenal of footguns in that area
(it's something like: if we don't wait for end-of-recovery checkpoint
before allowing connections, then we'd better start creating
tombstones in recovery unless the WAL level is high enough to avoid
data eating hazards with unlogged changes and a double crash).
Attachment | Content-Type | Size |
---|---|---|
v2-0001-Run-checkpointer-and-bgworker-in-crash-recovery.patch | text/x-patch | 8.7 KB |
v2-0002-Log-buffer-stats-after-crash-recovery.patch | text/x-patch | 1.8 KB |
v2-0003-Optionally-don-t-wait-for-end-of-recovery-checkpo.patch | text/x-patch | 3.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2021-03-12 22:16:11 | Re: VACUUM (DISABLE_PAGE_SKIPPING on) |
Previous Message | Peter Geoghegan | 2021-03-12 22:08:49 | Re: pg_amcheck contrib application |