Re: FileFallocate misbehaving on XFS

From: Michael Harris <harmic(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FileFallocate misbehaving on XFS
Date: 2024-12-12 03:14:20
Message-ID: CADofcAU14uhy7OzRPHq+4o+23zQAWu+AszojEEK1ENjyfcDg5A@mail.gmail.com
Lists: pgsql-hackers

Hi Andres

On Thu, 12 Dec 2024 at 10:50, Andres Freund <andres(at)anarazel(dot)de> wrote:
> Just to make sure - you're absolutely certain that you actually have space at
> the time of the errors?

As sure as I can be. The RHEL8 system that I captured output from
yesterday has more than 1.5TB free. I can't see it varying by that much.

It does look as though the filesystem needs to be quite full to provoke
this problem. The systems I have looked at so far have filesystems
that are more than 90% full.

Another interesting detail: the application has a number of ETL
workers going at once. The actual number varies depending on several
factors but might be anywhere from 10 to 150. Each worker has a
single postgres backend that it feeds data to.

At the time of the error, it is not the case that all ETL workers
strike it at once - it looks like a lot of the time only a single
worker is affected, or at most a handful of workers. I can't see for
sure what the other workers were doing at the time, but I would expect
they were all importing data as well.

> If I were to provide you with a patch that showed the amount of free disk
> space at the time of an error, the size of the relation etc, could you
> reproduce the issue with it applied? Or is that unrealistic?

I have not been able to reproduce it on demand, and so far it has only
happened in production systems.

As long as the patch doesn't degrade normal performance it should be
possible to deploy it to one of the systems that is regularly
reporting the error, although it might take a while to get approval to
do that.

Cheers
Mike
