Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work
Date: 2022-03-29 22:48:32
Message-ID: 20220329224832.GA560657@nathanxps13
Lists: pgsql-hackers

Thanks for taking a look!

On Thu, Mar 24, 2022 at 01:17:01PM +1300, Thomas Munro wrote:
>     /* we're only handling directories here, skip if it's not ours */
> -   if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
> +   if (lstat(path, &statbuf) != 0)
> +       ereport(ERROR,
> +               (errcode_for_file_access(),
> +                errmsg("could not stat file \"%s\": %m", path)));
> +   else if (!S_ISDIR(statbuf.st_mode))
>         return;
>
> Why is this a good place to silently ignore non-directories?
> StartupReorderBuffer() is already in charge of skipping random
> detritus found in the directory, so would it be better to do "if
> (get_dirent_type(...) != PGFILETYPE_DIR) continue" there, and then
> drop the lstat() stanza from ReorderBufferCleanupSerializedTXNs()
> completely? Then perhaps its ReadDirExtended() should be using ERROR
> instead of INFO, so that missing/non-dir/b0rked directories raise an
> error.

My guess is that this was done because ReorderBufferCleanupSerializedTXNs()
is also called from ReorderBufferAllocate() and ReorderBufferFree().
However, it is odd that we just silently return if the slot path isn't a
directory in those cases. I think we could use get_dirent_type() in
StartupReorderBuffer() as you suggested, and then we could let ReadDir()
ERROR for non-directories for the other callers of
ReorderBufferCleanupSerializedTXNs(). WDYT?
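
For illustration, the shape of that idea can be sketched in standalone C. This is only a rough sketch using plain POSIX calls, not PostgreSQL source: classify_entry() and count_slot_dirs() are hypothetical names standing in for get_dirent_type() and the scan loop in StartupReorderBuffer(), and the fprintf()/return -1 paths stand in for what would be ereport(ERROR) in the backend. The scan loop skips anything that is not a directory, while a failed lstat() or readdir() is reported rather than silently ignored.

```c
/* Illustrative sketch only; names below are hypothetical, not PostgreSQL APIs. */
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical stand-in for the result of get_dirent_type(). */
typedef enum
{
    ENTRY_ERROR,                /* lstat() failed */
    ENTRY_DIR,                  /* a real directory */
    ENTRY_OTHER                 /* regular file, symlink, etc. */
} entry_type;

/* Classify a path with lstat(), distinguishing failure from "not a directory". */
entry_type
classify_entry(const char *path)
{
    struct stat statbuf;

    assert(path != NULL);
    if (lstat(path, &statbuf) != 0)
        return ENTRY_ERROR;
    return S_ISDIR(statbuf.st_mode) ? ENTRY_DIR : ENTRY_OTHER;
}

/*
 * Scan "parent" and count its subdirectories, skipping other entries.
 * Returns -1 (after printing a message) if the directory cannot be opened
 * or read, or if an entry cannot be stat'ed.
 */
int
count_slot_dirs(const char *parent)
{
    DIR        *dir;
    struct dirent *de;
    char        path[4096];
    int         ndirs = 0;

    dir = opendir(parent);
    if (dir == NULL)
    {
        fprintf(stderr, "could not open directory \"%s\": %s\n",
                parent, strerror(errno));
        return -1;
    }

    errno = 0;
    while ((de = readdir(dir)) != NULL)
    {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;

        snprintf(path, sizeof(path), "%s/%s", parent, de->d_name);

        switch (classify_entry(path))
        {
            case ENTRY_ERROR:
                /* stat failure is reported, never treated as "not ours" */
                fprintf(stderr, "could not stat file \"%s\": %s\n",
                        path, strerror(errno));
                closedir(dir);
                return -1;
            case ENTRY_OTHER:
                /* the caller skips random detritus */
                break;
            case ENTRY_DIR:
                /* per-directory cleanup would happen here */
                ndirs++;
                break;
        }
        errno = 0;              /* so a later NULL from readdir() is unambiguous */
    }

    if (errno != 0)
    {
        fprintf(stderr, "could not read directory \"%s\": %s\n",
                parent, strerror(errno));
        closedir(dir);
        return -1;
    }

    closedir(dir);
    return ndirs;
}
```

With the filtering done in the scan loop, the per-directory cleanup routine no longer needs its own lstat() check, and any caller that hands it a missing or non-directory path gets an error instead of a silent return.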

> I don't understand why it's reporting readdir() errors at INFO
> but unlink() errors at ERROR, and as far as I can see the other paths
> that reach this code shouldn't be sending in paths to non-directories
> here unless something is seriously busted and that's ERROR-worthy.

I agree. I'll switch it to ReadDir() in the next revision so that we ERROR
instead of INFO.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
