From: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
---|---|
To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work |
Date: | 2022-02-02 11:49:26 |
Message-ID: | CALj2ACXzMf+8-KoREutAA40MrC4sMEY2P1aLFD_RYysC1M2CZQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Feb 2, 2022 at 5:25 AM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>
> On Mon, Jan 31, 2022 at 10:42:54AM +0530, Bharath Rupireddy wrote:
> > After an off-list discussion with Andreas, proposing here a patch that
> > basically replaces ReadDir call with ReadDirExtended and gets rid of
> > lstat entirely. With this chance, the checkpoint will only care about
> > the snapshot and mapping files and not fail if it finds other files in
> > the directories. Removing lstat enables us to make things faster as we
> > avoid a bunch of extra system calls - one lstat call per each mapping
> > or snapshot file.
>
> I think removing the lstat() is probably reasonable. We currently aren't
> doing proper error checking, and the chances of a non-regular file matching
> the prefix are likely pretty low. In the worst case, we'll LOG or ERROR
> when unlinking or fsyncing fails.
>
> However, I'm not sure about the change to ReadDirExtended(). That might be
> okay for CheckPointSnapBuild(), which is just trying to remove old files,
> but CheckPointLogicalRewriteHeap() is responsible for ensuring that files
> are flushed to disk for the checkpoint. If we stop reading the directory
> after an error and let the checkpoint continue, isn't it possible that some
> mappings files won't be persisted to disk?
Unless I mis-read your above statement, with LOG level in
ReadDirExtended, I don't think we stop reading the files in
CheckPointLogicalRewriteHeap. Am I missing something here?
Since, we also continue in CheckPointLogicalRewriteHeap if we can't
parse/delete some files with the change of "could not parse
filename"/"could not remove file" messages to LOG level
I'm attaching v6, just changed elog(LOG, to ereport(LOG in
CheckPointLogicalRewriteHeap, other things remain the same.
Regards,
Bharath Rupireddy.
Attachment | Content-Type | Size |
---|---|---|
v6-0001-Replace-ReadDir-with-ReadDirExtended.patch | application/octet-stream | 4.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2022-02-02 12:08:37 | Re: Design of pg_stat_subscription_workers vs pgstats |
Previous Message | Teodor Sigaev | 2022-02-02 10:13:29 | Re: CREATE TABLE ( .. STORAGE ..) |