From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Christoph Berg <myon(at)debian(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: could not extend file "base/5/3501" with FileFallocate(): Interrupted system call |
Date: | 2023-04-24 22:32:25 |
Message-ID: | 20230424223225.gle3x3sw5m4div2l@awork3.anarazel.de |
Lists: | pgsql-committers pgsql-hackers |
Hi,
On 2023-04-24 10:53:35 +0200, Christoph Berg wrote:
> I'm often seeing PG16 builds erroring out in the pgbench tests:
Interesting!
> I don't think the disk is full since it's always hitting that same
> spot, on some of the builds:
Yea, the EINTR pretty clearly indicates that it's not really out-of-space.
> https://pgdgbuild.dus.dg-i.net/job/postgresql-16-binaries-snapshot/833/
>
> This is overlayfs with tmpfs (upper)/ext4 (lower). Manually running
> that test works though, and the FS seems to support posix_fallocate:
I guess it requires a bunch of memory (?) pressure for this to happen
(triggering blocking during fallocate, opening the window for a signal to
arrive), which likely only happens when running things concurrently.
We obviously can add a retry loop to FileFallocate(), similar to what's
already present e.g. in FileRead(). But I wonder if we shouldn't go a bit
further, and do it for all the fd.c routines where it's remotely plausible
EINTR could be returned? It's a bit silly to add EINTR retries one-by-one to
the functions.
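As a rough illustration of such a retry loop, here is a minimal standalone sketch. It is not the actual fd.c code: the helper name fallocate_retry() is hypothetical, and it works on a plain file descriptor rather than a File. One detail worth keeping in mind is that posix_fallocate() returns the error number directly instead of setting errno, so the loop has to test the return value.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not PostgreSQL's actual code: call posix_fallocate()
 * and retry for as long as it reports EINTR.
 *
 * posix_fallocate() returns the error number directly and does not set
 * errno, so the return value is what gets checked here.
 */
static int
fallocate_retry(int fd, off_t offset, off_t len)
{
    int rc;

    do
    {
        rc = posix_fallocate(fd, offset, len);
    } while (rc == EINTR);

    return rc;      /* 0 on success, otherwise an errno value */
}
```

Any error other than EINTR (for example ENOSPC) is passed straight back to the caller, so a real out-of-space condition would still be reported.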
The following are documented to potentially return EINTR, without fd.c having
code to retry:
- FileWriteback() / pg_flush_data()
- FileSync() / pg_fsync()
- FileFallocate()
- FileTruncate()
With the first two there's the added complication that it's not entirely
obvious whether it'd be better to handle this in File* or pg_*. I'd argue the
latter is a bit more sensible?
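To make the layering question concrete, here is a small sketch of the "handle it in the lower-level routine" variant. Both function names are hypothetical stand-ins (fsync_retry() for a pg_*-style wrapper, file_sync_sketch() for a File*-style caller), and it assumes that simply reissuing fsync() after EINTR is acceptable. Unlike posix_fallocate(), fsync() reports failure as -1 with errno set.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical low-level wrapper (stand-in for a pg_*-level routine):
 * retry fsync() while it fails with EINTR.
 */
static int
fsync_retry(int fd)
{
    int rc;

    do
    {
        rc = fsync(fd);
    } while (rc < 0 && errno == EINTR);

    return rc;      /* 0 on success, -1 with errno set on other failures */
}

/*
 * Hypothetical higher-level caller (stand-in for a File*-level routine).
 * It needs no EINTR handling of its own because the lower layer retries.
 */
static int
file_sync_sketch(int fd)
{
    return fsync_retry(fd);
}
```

With the retry in the lower layer, every caller of the wrapper gets the same behavior, including callers that never go through the higher-level routine.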
Greetings,
Andres Freund