From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Shawn Debnath <sdn(at)amazon(dot)com> |
Subject: | Re: checkpointer: PANIC: could not fsync file: No such file or directory |
Date: | 2024-12-13 20:55:42 |
Message-ID: | nyj4k7yur5t27rtygvx2i2lrlp6rqfvvhoiiwx4fznynksf2et@4hj2sp42alpe |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 2019-12-13 17:41:56 +1300, Thomas Munro wrote:
> From 9609c9a153232fb2de169bf76158781d354c633b Mon Sep 17 00:00:00 2001
> From: Thomas Munro <tmunro(at)postgresql(dot)org>
> Date: Fri, 13 Dec 2019 17:12:42 +1300
> Subject: [PATCH] Don't use _mdfd_getseg() in mdsyncfiletag().
>
> _mdfd_getseg() opens all segments up to the requested one. That
> causes problems for mdsyncfiletag(), if mdunlinkfork() has
> already unlinked other segment files. Open the file we want
> directly by name instead, if we don't have it already.
>
> The consequence of this bug was a rare panic in the checkpointer,
> made more likely if you saturated the sync request queue so that
> the SYNC_FORGET_REQUEST messages for a given relation were more
> likely to be absorbed in separate cycles by the checkpointer.
>
> Back-patch to 12. Defect in commit 3eb77eba.
>
> Author: Thomas Munro
> Reported-by: Justin Pryzby
> Discussion: https://postgr.es/m/20191119115759.GI30362%40telsasoft.com
> ---
> src/backend/storage/smgr/md.c | 56 ++++++++++++++++++++++++-----------
> 1 file changed, 39 insertions(+), 17 deletions(-)
>
> diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
> index 8a9eaf6430..1a7b91b523 100644
> --- a/src/backend/storage/smgr/md.c
> +++ b/src/backend/storage/smgr/md.c
> @@ -1280,25 +1280,47 @@ int
> mdsyncfiletag(const FileTag *ftag, char *path)
...
> - /* Try to open the requested segment. */
> - v = _mdfd_getseg(reln,
> - ftag->forknum,
> - ftag->segno * (BlockNumber) RELSEG_SIZE,
> - false,
> - EXTENSION_RETURN_NULL | EXTENSION_DONT_CHECK_SIZE);
> - if (v == NULL)
> - return -1;
> -
> - /* Try to fsync the file. */
> - return FileSync(v->mdfd_vfd, WAIT_EVENT_DATA_FILE_SYNC);
> + return result;
> }
This was the only user of EXTENSION_DONT_CHECK_SIZE. It's not generally safe,
as its comment says:
/*
* Allow opening segments which are preceded by segments smaller than
* RELSEG_SIZE, e.g. inactive segments (see above). Note that this breaks
* mdnblocks() and related functionality henceforth - which currently is ok,
* because this is only required in the checkpointer which never uses
* mdnblocks().
*/
But since the above change checkpointer doesn't use EXTENSION_DONT_CHECK_SIZE
anymore.
Seems we should remove this code?
Greetings,
Andres
From | Date | Subject | |
---|---|---|---|
Next Message | Sergei Kornilov | 2024-12-13 21:06:37 | Re: Track the amount of time waiting due to cost_delay |
Previous Message | Vladlen Popolitov | 2024-12-13 20:36:23 | Re: COPY performance on Windows |