From: | Tomas Vondra <tomas(at)vondra(dot)me> |
---|---|
To: | Michael Harris <harmic(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: FileFallocate misbehaving on XFS |
Date: | 2024-12-09 10:06:13 |
Message-ID: | 8aa1d1d7-645f-404b-a8f8-7c49be9acd27@vondra.me |
Lists: | pgsql-hackers |
On 12/9/24 08:34, Michael Harris wrote:
> Hello PG Hackers
>
> Our application has recently migrated to PG16, and we have experienced
> some failed upgrades. The upgrades are performed using pg_upgrade and
> have failed during the phase where the schema is restored into the new
> cluster, with the following error:
>
> pg_restore: error: could not execute query: ERROR: could not extend
> file "pg_tblspc/16401/PG_16_202307071/17643/1249.1" with
> FileFallocate(): No space left on device
> HINT: Check free disk space.
>
> This has happened multiple times on different servers, and in each
> case there was plenty of free space available.
>
> We found this thread describing similar issues:
>
> https://www.postgresql.org/message-id/flat/AS1PR05MB91059AC8B525910A5FCD6E699F9A2%40AS1PR05MB9105.eurprd05.prod.outlook.com
>
> As is the case in that thread, all of the affected databases are using XFS.
>
> One of my colleagues built postgres from source with
> HAVE_POSIX_FALLOCATE not defined, and using that build he was able to
> complete the pg_upgrade, and then switched to a stock postgres build
> after the upgrade. However, as you might expect, after the upgrade we
> have experienced similar errors during regular operation. We make
> heavy use of COPY, which is mentioned in the above discussion as
> pre-allocating files.
>
> We have seen this on both Rocky Linux 8 (kernel 4.18.0) and Rocky
> Linux 9 (Kernel 5.14.0).
>
> I am wondering if this bug might be related:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1791323
>
>> When given an offset of 0 and a length, fallocate (man 2 fallocate) reports ENOSPC if the size of the file + the length to be allocated is greater than the available space.
>
> There is a reproduction procedure at the bottom of the above Ubuntu
> thread, and using that procedure I get the same results on both kernel
> 4.18.0 and 5.14.0.
> When calling fallocate with offset zero on an existing file, I get
> ENOSPC even if I am only requesting the same amount of space as the
> file already has.
> If I repeat the experiment with ext4 I don't get that behaviour.
>
> On a surface examination of the code paths leading to the
> FileFallocate call, it does not look like it should be trying to
> allocate already allocated space, but I might have missed something
> there.
>
> Is this already being looked into?
>

This sounds more like an XFS bug or quirk, so it's not clear to me what we
could do about it. If the filesystem reports a bogus out-of-space error,
is there anything we can do at all?
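
For anyone who wants to check the behaviour described in the quoted report, here is a minimal sketch (not the procedure from the Ubuntu bug, and not PostgreSQL code). It assumes Linux and Python 3, where os.posix_fallocate() wraps the same posix_fallocate() call that the HAVE_POSIX_FALLOCATE build option refers to. The function name probe_fallocate and the 1 MiB default are made up for illustration. It allocates a file, then asks for the same range again starting at offset 0. To see the reported failure, the target directory would presumably have to be on an XFS filesystem with less free space than the file size plus the requested length.

```python
import errno
import os
import tempfile


def probe_fallocate(directory, size=1 << 20):
    """Allocate `size` bytes in a new file, then request the same range again.

    Returns "ok" if the second request succeeds, otherwise the symbolic
    errno name (for example "ENOSPC") reported for the second request.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # First call: the file really does grow from 0 to `size` bytes.
        os.posix_fallocate(fd, 0, size)
        try:
            # Second call: offset 0, same length. No new space should be
            # needed, because the range is already allocated.
            os.posix_fallocate(fd, 0, size)
        except OSError as exc:
            return errno.errorcode.get(exc.errno, str(exc.errno))
        return "ok"
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    # Point this at a directory on the filesystem under test.
    print(probe_fallocate(tempfile.gettempdir()))
```

On a filesystem with plenty of free space this should print "ok". A result of "ENOSPC" for the second call would match what the report describes.
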

What is not clear to me is why this would affect pg_upgrade at all. We
split the data files into 1GB segments, and the copy/clone/... goes
one by one, so no more than 1GB of "extra" space should be needed at a
time. Surely you have more free space than that on the system?

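
To make the segment arithmetic concrete, here is a small sketch. It assumes the default build settings (8kB blocks, 131072 blocks per segment). Both are compile-time options, so a custom build may differ, and the helper names are made up for illustration. The file in the reported error, "1249.1", would be the second segment (segment number 1) of relfilenode 1249.

```python
BLCKSZ = 8192          # bytes per block (assumed default)
RELSEG_SIZE = 131072   # blocks per segment file (assumed default)

SEGMENT_BYTES = BLCKSZ * RELSEG_SIZE  # 1 GiB with the defaults above


def segment_for_block(blkno):
    """Return (segment number, byte offset within that segment) for a block."""
    segno, blk_in_seg = divmod(blkno, RELSEG_SIZE)
    return segno, blk_in_seg * BLCKSZ


def segment_file_name(relfilenode, segno):
    """First segment has no suffix; later ones are named <relfilenode>.<segno>."""
    return str(relfilenode) if segno == 0 else f"{relfilenode}.{segno}"


if __name__ == "__main__":
    segno, offset = segment_for_block(131072)  # first block past 1 GiB
    print(segment_file_name(1249, segno), offset)
```

With these defaults, extending any single segment can never require more than 1 GiB of new space, which is the upper bound referred to above.
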
regards
--
Tomas Vondra