Re: FileFallocate misbehaving on XFS

From: Andres Freund <andres(at)anarazel(dot)de>
To: Michael Harris <harmic(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Tomas Vondra <tomas(at)vondra(dot)me>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FileFallocate misbehaving on XFS
Date: 2024-12-20 16:39:42
Message-ID: jq6lozj36wseov4tbg5ziduvy7bfj7r3oxmbyifi6yn24dmsyp@4cj5oivz22mj
Lists: pgsql-hackers

Hi,

On 2024-12-19 17:47:13 +1100, Michael Harris wrote:
> I finally managed to get the patched version installed in a production
> database where the error is occurring very regularly.

Thanks!

> Here is a sample of the output:
>
> 2024-12-19 01:08:50 CET [2533222]: LOG: mdzeroextend FileFallocate
> failing with ENOSPC: free space for filesystem containing
> "pg_tblspc/107724/PG_16_202307071/465960/2591590762.15" f_blocks:
> 2683831808, f_bfree: 205006167, f_bavail: 205006167 f_files:
> 1073741376, f_ffree: 1069933796

That's ~700 GB of free space...
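
As a rough cross-check of that figure, here is a minimal sketch that converts the logged counters into bytes. The log does not include the block size (f_frsize), so the 4 KiB value below is an assumption, and `free_space` is a hypothetical helper written only for this illustration.

```python
# Counters copied from the logged statvfs() output quoted above.
F_BLOCKS = 2683831808  # total blocks in the filesystem
F_BFREE = 205006167    # free blocks

# ASSUMPTION: 4 KiB fragment size (f_frsize). The log line does not report it.
ASSUMED_FRSIZE = 4096


def free_space(bfree, blocks, frsize=ASSUMED_FRSIZE):
    """Return (free bytes, free GiB, free fraction of the filesystem)."""
    free_bytes = bfree * frsize
    return free_bytes, free_bytes / 2**30, bfree / blocks


free_bytes, free_gib, free_fraction = free_space(F_BFREE, F_BLOCKS)
print(f"{free_gib:.0f} GiB free, {free_fraction:.1%} of the filesystem")
```

Under that assumption this comes to roughly 780 GiB, or about 7.6% of the filesystem, so the ENOSPC is clearly not caused by the filesystem simply being full.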

It'd be interesting to see filefrag -v for that segment.

> This is a different system to those I previously provided logs from.
> It is also running RHEL8 with a similar configuration to the other
> system.

Given it's a RHEL system, have you raised this as an issue with RH? They
probably have somebody with actual XFS hacking experience on staff.

RH's kernels are *heavily* patched, so it's possible the issue is actually RH
specific.

> I have so far not installed the bpftrace that Jakub suggested before -
> as I say this is a production machine and I am wary of triggering a
> kernel panic or worse (even though it seems like the risk for that
> would be low?). While a kernel stack trace would no doubt be helpful
> to the XFS developers, from a postgres point of view, would that be
> likely to help us decide what to do about this?

Well, I'm personally wary of installing workarounds for a problem I don't
understand and can't reproduce, which might be specific to old filesystems
and/or heavily patched kernels. This clearly is an FS bug.

That said, if we learn that somehow this is a fundamental XFS issue that can
be triggered on every XFS filesystem, with current kernels, it becomes more
reasonable to implement a workaround in PG.

Another thing I've been wondering about is whether we could reduce the
frequency of hitting problems by rounding the number of blocks we extend by up
to the next power of two. That would probably reduce fragmentation, and the
extra space would be quickly used in workloads where we extend by a bunch of
blocks at once, anyway.
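
To make the rounding idea concrete, a minimal sketch follows. It is in Python rather than PostgreSQL's C and is purely illustrative: `round_up_to_power_of_two` is a hypothetical helper, not an existing PostgreSQL function, and nothing here reflects how the extension code actually chooses its sizes.

```python
def round_up_to_power_of_two(nblocks: int) -> int:
    """Return the smallest power of two that is >= nblocks (hypothetical helper)."""
    if nblocks < 1:
        raise ValueError("nblocks must be at least 1")
    # (nblocks - 1).bit_length() is the number of bits needed to represent
    # nblocks - 1, so shifting 1 by that amount gives the next power of two,
    # and leaves exact powers of two unchanged.
    return 1 << (nblocks - 1).bit_length()


# Requested extension sizes (in blocks) and what they would be rounded to.
for requested in (1, 3, 8, 9, 50):
    print(requested, "->", round_up_to_power_of_two(requested))
```

With this rounding a request for 9 blocks would extend by 16, and a request for 50 by 64, so the over-allocation is always less than the requested size itself.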

Greetings,

Andres Freund
