Re: FileFallocate misbehaving on XFS

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Michael Harris <harmic(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FileFallocate misbehaving on XFS
Date: 2024-12-11 16:34:07
Message-ID: 6kdvplm4lermcfcgle3meg2bj7wgzwsym37deiwvpsrvv7pv3l@wsjrnj2wxbew
Lists: pgsql-hackers

Hi,

On 2024-12-10 16:33:06 -0500, Andres Freund wrote:
> Maybe. I think we would have gotten a lot more reports if it were common. I
> know of quite a few very busy installs using xfs.
>
> I think there must be some as-of-yet-unknown condition gating it. E.g. that
> the filesystem has been created a while ago and has some now-on-by-default
> options disabled.
>
>
> > > I think the source of this needs to be debugged further before we try to apply
> > > workarounds in postgres.
> >
> > Why? It seems to me that this has to be a filesystem bug,
>
> Adding workarounds for half-understood problems tends to lead to code that we
> can't evolve in the future, as we a) don't understand b) can't reproduce the
> problem.
>
> Workarounds could also mask some bigger / worse issues. We e.g. have blamed
> ext4 for a bunch of bugs that then turned out to be ours in the past. But we
> didn't look for a long time, because it was convenient to just blame ext4.
>
>
> > and we should almost certainly adopt one of these ideas from Michael Harris:
> >
> > - Providing a way to configure PG not to use posix_fallocate at runtime
>
> I'm not strongly opposed to that. That's testable without access to an
> affected system. I wouldn't want to automatically do that when detecting an
> affected system though, that'll make behaviour way less predictable.
>
>
> > - In the case of posix_fallocate failing with ENOSPC, fall back to
> > FileZero (worst case that will fail as well, in which case we will
> > know that we really are out of space)
>
> I doubt that that's a good idea. What if fallocate failing is an indicator of
> a problem? What if you turn on AIO + DIO and suddenly get a much more
> fragmented file?

One thing that I think we should definitely do is to include more detail in
the error message. mdzeroextend()'s error messages don't include how many
blocks the relation was to be extended by. Neither mdextend() nor
mdzeroextend() include the offset at which the extension failed.

I'm not entirely sure about the phrasing though, since we have a somewhat
confusing mix of blocks and bytes in messages.

Perhaps some of the information should be in an errdetail, but I admit I'm a bit
hesitant about doing so for crucial details. I find that often only the
primary error message is available when debugging problems encountered by
others.

Maybe something like
/* translator: second %s is a function name like FileFallocate() */
could not extend file \"%s\" by %u blocks, from %llu to %llu bytes, using %s: %m
or
could not extend file \"%s\" using %s by %u blocks, from its current size of %u blocks: %m
or
could not extend file \"%s\" using %s by %u blocks/%llu bytes from its current size of %llu bytes: %m

If we want to use errdetail() judiciously, we could go for something like
errmsg("could not extend file \"%s\" by %u blocks, using %s: %m", ...
errdetail("Failed to extend file from %u blocks/%llu bytes to %u blocks/%llu bytes.", ...)
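
To make the errmsg()/errdetail() split above a bit more concrete, here is a
rough standalone sketch in plain C. It is not PostgreSQL code: it builds the
two strings with snprintf() rather than ereport(), substitutes strerror() for
%m, and assumes 8 kB blocks. The names format_extend_error() and SKETCH_BLCKSZ
are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Assumption: 8 kB blocks, matching PostgreSQL's default BLCKSZ. */
#define SKETCH_BLCKSZ 8192ULL

/*
 * Hypothetical helper: fill "msg" with the primary message and "detail"
 * with the block/byte range, mirroring the errmsg()/errdetail() proposal.
 * saved_errno is the errno captured right after the failed extension call.
 */
static void
format_extend_error(char *msg, size_t msglen,
                    char *detail, size_t detaillen,
                    const char *path, const char *funcname,
                    unsigned int curblocks, unsigned int addblocks,
                    int saved_errno)
{
    unsigned int newblocks = curblocks + addblocks;

    assert(msg != NULL && detail != NULL);

    /* Primary message: file, extension size in blocks, function, OS error. */
    snprintf(msg, msglen,
             "could not extend file \"%s\" by %u blocks, using %s: %s",
             path, addblocks, funcname, strerror(saved_errno));

    /* Detail: old and new size, each in both blocks and bytes. */
    snprintf(detail, detaillen,
             "Failed to extend file from %u blocks/%llu bytes to %u blocks/%llu bytes.",
             curblocks, (unsigned long long) curblocks * SKETCH_BLCKSZ,
             newblocks, (unsigned long long) newblocks * SKETCH_BLCKSZ);
}
```

Printing both the block count and the byte offsets avoids having to guess the
block size when only the log line is available.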

I think it might also be good - this is a slightly more complicated project -
to report the amount of free space the filesystem reports when we hit
ENOSPC. I have dealt with cases of the FS transiently filling up way too many
times, and it's always a pain to figure that out.
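
As a rough idea of what the free space reporting could look like, here is a
minimal sketch using POSIX statvfs(). It is standalone C, not a patch; the
helper name describe_free_space() and the message wording are assumptions
made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/statvfs.h>

/*
 * Hypothetical helper: describe the free space of the filesystem that
 * contains "path". Returns 0 on success, -1 if statvfs() fails (errno is
 * then set by statvfs()).
 */
static int
describe_free_space(const char *path, char *buf, size_t buflen)
{
    struct statvfs fs;
    unsigned long long avail_bytes;

    assert(path != NULL && buf != NULL);

    if (statvfs(path, &fs) != 0)
        return -1;

    /* f_bavail: blocks available to unprivileged users, in f_frsize units. */
    avail_bytes = (unsigned long long) fs.f_bavail * fs.f_frsize;

    snprintf(buf, buflen,
             "filesystem reports %llu bytes available (%llu of %llu blocks free)",
             avail_bytes,
             (unsigned long long) fs.f_bfree,
             (unsigned long long) fs.f_blocks);
    return 0;
}
```

A real caller would need to save errno from the failed extension before
calling this, since statvfs() can overwrite it. The numbers are also only a
snapshot: with a transiently full filesystem the space may already be back by
the time statvfs() runs.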

Greetings,

Andres Freund
