Re: Checking for missing heap/index files

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Checking for missing heap/index files
Date: 2022-10-18 18:37:36
Message-ID: Y07ycEcWop2AMMok@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Tue, Oct 18, 2022 at 12:59 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > There is no text suggesting that it's okay to miss, or to double-return,
> > an entry that is present throughout the scan. So I'd interpret the case
> > you're worried about as "forbidden by POSIX". Of course, it's known that
> > NFS fails to provide POSIX semantics in all cases --- but I don't know
> > if this is one of them.
>
> Yeah, me neither. One problem I see is that, even if the behavior is
> forbidden by POSIX, if it happens in practice on systems people
> actually use, then it's an issue. We even have documentation saying
> that it's OK to use NFS, and a lot of people do -- which IMHO is
> unfortunate, but it's also not clear what the realistic alternatives
> are. It's pretty hard to tell people in 2022 that they are only
> allowed to use PostgreSQL with local storage.
>
> But to put my cards on the table, it's not so much that I am worried
> about this problem myself as that I want to know whether we're going
> to do anything about it as a project, and if so, what, because it
> intersects a patch that I'm working on. So if we want to readdir() in
> one fell swoop and cache the results, I'm going to go write a patch
> for that. If we don't, then I'd like to know whether (a) we think that
> would be theoretically acceptable but not justified by the evidence
> presently available or (b) would be unacceptable due to (b1) the
> potential for increased memory usage or (b2) some other reason.

While I don't think it's something that should be happening, it has
definitely been seen with some networked filesystems, as reported. I
also strongly suspect that on local filesystems something prevents this
from happening, but as mentioned, that doesn't cover all PG use cases.

In pgbackrest, we moved to doing a scan and caching all of the results
in memory to reduce the risk when reading from the PG data dir. We also
reworked our expire code (which removes an older backup from the backup
repository) to do a complete scan before removing any files.

Though I don't see it as likely to be acceptable, arranging to not add
or remove files while the scan is happening would presumably eliminate
the risk entirely. We've not seen this issue recur in the expire command
since the change to completely scan the directory first and only then
remove files from it. Perhaps just not removing files during the scan
would be sufficient, which might be more reasonable to do.

Thanks,

Stephen
