From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
Cc: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Flush pgstats file during checkpoints |
Date: | 2024-07-23 03:52:11 |
Message-ID: | Zp8o6_cl0KSgsnvS@paquier.xyz |
Lists: | pgsql-hackers |
On Mon, Jul 22, 2024 at 07:01:41AM +0000, Bertrand Drouvot wrote:
> 1 ===
> Not related with your patch but this comment in the GetRedoRecPtr() function:
>
> * grabbed a WAL insertion lock to read the authoritative value in
> * Insert->RedoRecPtr
>
> sounds weird. Shouldn't that be s/Insert/XLogCtl/?
No, the comment is right. We are retrieving a copy of
Insert->RedoRecPtr here.
> 2 ===
>
> + /* Write the redo LSN, used to cross check the file loaded */
>
> Nit: s/loaded/read/?
WFM.
> 3 ===
>
> + /*
> + * Read the redo LSN stored in the file.
> + */
> + if (!read_chunk_s(fpin, &file_redo) ||
> + file_redo != redo)
> + goto error;
>
> I wonder if it would make sense to have dedicated error messages for
> "file_redo != redo" and for "format_id != PGSTAT_FILE_FORMAT_ID". That would
> make it easier to diagnose why the stats file is discarded.
Yep. This has been itching me quite a bit, and it goes beyond just
the format ID or the redo LSN: it relates to all the read_chunk()
callers. I've taken a shot at this with patch 0001, implemented on
top of the rest. I've also adjusted the redo LSN read to have more
error context, now in 0002.
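
To illustrate the idea of giving each failure its own message, here is a
minimal standalone sketch. It is not the code from the patches: every
name prefixed with demo_ or DEMO_ is hypothetical, the format ID value is
made up, and plain stdio stands in for the backend's file and error
reporting routines. It only shows a header reader that distinguishes a
short read, a wrong format ID, and a redo LSN mismatch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration, not PostgreSQL definitions. */
#define DEMO_FILE_FORMAT_ID 0x01A5BCAFu

typedef uint64_t demo_lsn;

typedef enum
{
    DEMO_STATS_OK,
    DEMO_STATS_SHORT_READ,
    DEMO_STATS_BAD_FORMAT,
    DEMO_STATS_REDO_MISMATCH
} demo_stats_result;

/* Read exactly len bytes, returning false on a short read or error. */
static bool
demo_read_chunk(FILE *fp, void *ptr, size_t len)
{
    return fread(ptr, 1, len, fp) == len;
}

/*
 * Read the file header (format ID, then redo LSN) and compare the redo
 * LSN with the expected one.  On failure, msg describes which check
 * failed, so the reason for discarding the file is visible.
 */
static demo_stats_result
demo_read_header(FILE *fp, demo_lsn expected_redo, char *msg, size_t msglen)
{
    uint32_t    format_id;
    demo_lsn    file_redo;

    assert(fp != NULL && msg != NULL && msglen > 0);

    if (!demo_read_chunk(fp, &format_id, sizeof(format_id)))
    {
        snprintf(msg, msglen, "could not read format ID");
        return DEMO_STATS_SHORT_READ;
    }
    if (format_id != DEMO_FILE_FORMAT_ID)
    {
        snprintf(msg, msglen, "found incorrect format ID %u (expected %u)",
                 (unsigned int) format_id,
                 (unsigned int) DEMO_FILE_FORMAT_ID);
        return DEMO_STATS_BAD_FORMAT;
    }
    if (!demo_read_chunk(fp, &file_redo, sizeof(file_redo)))
    {
        snprintf(msg, msglen, "could not read redo LSN");
        return DEMO_STATS_SHORT_READ;
    }
    if (file_redo != expected_redo)
    {
        snprintf(msg, msglen,
                 "found incorrect redo LSN %X/%X (expected %X/%X)",
                 (unsigned int) (file_redo >> 32),
                 (unsigned int) file_redo,
                 (unsigned int) (expected_redo >> 32),
                 (unsigned int) expected_redo);
        return DEMO_STATS_REDO_MISMATCH;
    }

    msg[0] = '\0';
    return DEMO_STATS_OK;
}
```

A caller would log msg and discard the file whenever the result is not
DEMO_STATS_OK, rather than jumping to a single generic error path.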
> Looking at 0003:
>
> 4 ===
>
> @@ -5638,10 +5634,7 @@ StartupXLOG(void)
> * TODO: With a bit of extra work we could just start with a pgstat file
> * associated with the checkpoint redo location we're starting from.
> */
> - if (didCrash)
> - pgstat_discard_stats();
> - else
> - pgstat_restore_stats(checkPoint.redo);
> + pgstat_restore_stats(checkPoint.redo);
>
> remove the TODO comment?
Pretty sure I've removed that more than one time already, and that
this is a rebase accident. Thanks for noticing.
> 5 ===
>
> + * process) if the stats file has a redo LSN that matches with the .
>
> unfinished sentence?
This is missing a reference to the control file.
> 6 ===
>
> - * Should only be called by the startup process or in single user mode.
> + * This is called by the checkpointer or in single-user mode.
> */
> void
> -pgstat_discard_stats(void)
> +pgstat_flush_stats(XLogRecPtr redo)
> {
>
> Would it make sense to add an Assert in pgstat_flush_stats()? (checking what
> the above comment states).
There is already one in pgstat_write_statsfile(); I'm not sure there is
a point in duplicating the assertion in both.
Attaching a new v4 series, with all these comments addressed.
--
Michael
Attachment | Content-Type | Size |
---|---|---|
v4-0001-Revert-Test-that-vacuum-removes-tuples-older-than.patch | text/x-diff | 12.3 KB |
v4-0002-Add-more-debugging-information-when-reading-stats.patch | text/x-diff | 4.7 KB |
v4-0003-Add-redo-LSN-to-pgstats-file.patch | text/x-diff | 4.7 KB |
v4-0004-Flush-pgstats-file-during-checkpoints.patch | text/x-diff | 10.8 KB |