Re: Checking for missing heap/index files

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Checking for missing heap/index files
Date: 2022-10-18 19:14:41
Message-ID: CA+TgmoZHBOZOtQa6T+yGWwKW6R=Ey2pWZLnMQvhjp0x7dC5=JQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 18, 2022 at 2:37 PM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> While I don't think it's really something that should be happening, it's
> definitely something that's been seen with some networked filesystems,
> as reported.

Do you have clear and convincing evidence of this happening on
anything other than CIFS?

> I don't see it as likely to be acceptable, but arranging to not add or
> remove files while the scan is happening would presumably eliminate the
> risk entirely. We've not seen this issue recur in the expire command
> since the change to first completely scan the directory and then go and
> remove the files from it. Perhaps just not removing files during the
> scan would be sufficient which might be more reasonable to do.

I don't think that's a complete non-starter, but I do think it would
be somewhat expensive in some workloads. I hate to make everyone pay
that much for insurance against a shouldn't-happen case. We could make
it optional, but then we're asking users to decide whether or not they
need insurance. Since we don't even know which filesystems are
potentially affected, how is anyone else supposed to know? Worse
still, if you have a corruption event, you're still not going to know
for sure whether this would have fixed it, so you still don't know
whether you should turn on the feature for next time. And if you do
turn it on and don't get corruption again, you don't know whether you
would have had a problem if you hadn't used the feature. It all just
seems like a lot of guesswork that will end up being frustrating to
both users and developers.

Just deciding to cache the results of readdir() in memory is much
cheaper insurance. I think I'd probably be willing to foist that
overhead onto everyone, all the time. As I mentioned before, it could
still hose someone who is right on the brink of a memory disaster, but
that's a much narrower blast radius than putting locking around all
operations that create or remove a file in the same directory as a
relation file. But it's also not a complete fix, which sucks.
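
To make the cache-first idea concrete, here is a minimal sketch of the
pattern, written in Python for brevity rather than the C that PostgreSQL
itself would use. The function names snapshot_dir and remove_matching are
hypothetical and exist only for illustration. The sketch assumes the
whole listing fits in memory, which is exactly the cost discussed above,
and it only keeps our own removals from overlapping with our own scan; it
does nothing about other processes touching the directory.

```python
import os
import tempfile


def snapshot_dir(path):
    """Read every entry name into memory before the caller acts on any.

    The directory handle is closed before this function returns, so no
    scan is in progress when the caller starts creating or removing files.
    Memory use grows with the number of entries in the directory.
    """
    with os.scandir(path) as entries:
        return [entry.name for entry in entries]


def remove_matching(path, predicate):
    """Remove files whose names satisfy predicate, scanning first.

    Hypothetical helper: the scan completes in full, and only then are
    files unlinked, so our own unlinks cannot disturb our own scan.
    """
    names = snapshot_dir(path)
    removed = []
    for name in names:
        if predicate(name):
            os.unlink(os.path.join(path, name))
            removed.append(name)
    return sorted(removed)


if __name__ == "__main__":
    # Usage example on a throwaway directory with made-up file names.
    demo = tempfile.mkdtemp()
    for name in ("16384", "16385", "16385_fsm"):
        open(os.path.join(demo, name), "w").close()
    print(remove_matching(demo, lambda n: n.startswith("16385")))
    print(sorted(os.listdir(demo)))
```

The point of splitting the work into two functions is that the listing is
complete and the directory is closed before the first unlink happens.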

--
Robert Haas
EDB: http://www.enterprisedb.com
