Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows

From:	Michael Paquier <michael(at)paquier(dot)xyz>
To:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, rootcause000(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject:	Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows
Date:	2023-10-04 22:44:14
Message-ID:	ZR3qvrYULJWaUnBK@paquier.xyz
Lists:	pgsql-bugs

On Thu, Oct 05, 2023 at 10:12:27AM +1300, Thomas Munro wrote:
> On Thu, Oct 5, 2023 at 3:26 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I'm too lazy to check the commit log right now, but I think
>> we did implement a fix for that (ie, flush dirty pages even
>> if we anticipate them going away due to truncation). But as
>> Laurenz says, v10 is out of support and possibly didn't get
>> that fix. Even if it did, you'd need to be running one of
>> the last minor releases, because this wasn't very long ago.
>
> This thread seems to be saying otherwise:
>
> https://www.postgresql.org/message-id/flat/2348.1544474335%40sss.pgh.pa.us

Yeah, that's not been solved. I put my mind to this problem for a
few hours last May, just before PGCon, and there was an argument that
marking all the pages as dirty is kind of a waste of resources, as it
would create WAL for data that's going to be gone a little bit later
as an effect of the truncate, leading to an extra burst of I/O,
especially for large truncations. FWIW, I think that I'd be
personally OK with using this method. At least that's safe, simple,
backpatchable and it does not require any new magic. I know that
there are voices that argued against this method, but here we are N
years later, so perhaps we should just do that on HEAD at least.
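
As a rough illustration of why writing the pages out before discarding
them is the safe option, here is a toy Python model. It is not
PostgreSQL code: ToyRelation, vacuum_page, truncate and run are
hypothetical names invented for this sketch, and it assumes a plain
write-back of the emptied pages stands in for the "mark dirty and WAL"
approach.

```python
class ToyRelation:
    """Toy stand-in for a relation: pages on 'disk' plus a buffer cache."""

    def __init__(self, disk_pages):
        self.disk = list(disk_pages)   # page contents as last written to disk
        self.buffers = {}              # page number -> (contents, dirty flag)

    def vacuum_page(self, pageno):
        # Vacuum empties the page in memory only; disk still holds the old rows.
        self.buffers[pageno] = ([], True)

    def truncate(self, new_len, flush_first, truncate_ok):
        # Discard the buffers of the tail pages, optionally writing them first.
        for pageno in [p for p in self.buffers if p >= new_len]:
            contents, dirty = self.buffers.pop(pageno)
            if flush_first and dirty:
                self.disk[pageno] = contents
        if not truncate_ok:
            raise OSError("simulated ftruncate failure")
        del self.disk[new_len:]

    def visible_rows(self):
        # Readers see the buffered copy if there is one, else the disk copy.
        rows = []
        for pageno, on_disk in enumerate(self.disk):
            contents, _ = self.buffers.get(pageno, (on_disk, False))
            rows.extend(contents)
        return rows


def run(flush_first):
    rel = ToyRelation([["a"], ["b"], ["c"]])
    rel.vacuum_page(1)
    rel.vacuum_page(2)
    try:
        rel.truncate(1, flush_first=flush_first, truncate_ok=False)
    except OSError:
        pass  # the error is reported and the session carries on
    return rel.visible_rows()
```

In this model, run(False) gives back all three rows because the stale
disk pages become visible again once the buffers are gone, while
run(True) leaves only the first row even though the truncation failed.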

>> In the end though, the *real* problem here is running on a
>> platform that randomly disallows writes to disk. There's only
>> so much that Postgres can possibly do about unreliability of the
>> underlying platform. I would never run a production database on
>> Windows, because it's just too prone to that sort of BS.
>
> It's surprising that ftruncate() AKA chsize() is able to fail like
> this (I am not a Windows user but AFAIR that sharing stuff obstructs
> stuff like open, unlink, rename, so it surprises me to see it come up
> with ftruncate, since we must already have made it past the open
> stage). Hmm, the documentation is scant, but I know from my attempts
> to use large files that chsize() is probably some kind of wrapper
> around SetEndOfFile() or similar, and that is documented as failing if
> someone has the file mapped. I don't know why someone would have the
> file mapped, though.

(shrug)

> But as for what we should do about it, PANIC (as suggested by several
> people) seems better than corruption, if we're not going to write some
> kind of resilience? How else are we supposed to deal with "this
> shouldn't happen, and if it does we're hosed?"

A PANIC may be OK for this specific syscall and would be better, but
the problematic area is larger than that, as we'd still end up with
corruption as long as there's an ERROR or a FATAL between the moment
the buffers (potentially dirty, holding tuples that are dead in memory
but still live on disk) are discarded and the moment the truncation
fails. Another method discussed is the use of a critical section (I
recall that there were some pallocs in this area, actually, but I have
nothing in my notes about that...).
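
To make the window concrete, here is a small Python sketch of the
critical-section idea. It is only a toy: Panic, critical_section,
truncate_relation and the step names are hypothetical, and the only
PostgreSQL behavior it assumes is that an ERROR raised inside a
critical section is promoted to PANIC.

```python
from contextlib import contextmanager


class Panic(Exception):
    """Toy stand-in for PANIC: the server stops and goes through recovery."""


@contextmanager
def critical_section():
    # Any ordinary error raised inside the block is promoted to Panic.
    try:
        yield
    except Panic:
        raise
    except Exception as exc:
        raise Panic(f"promoted to PANIC: {exc}") from exc


def truncate_relation(fail_at, use_critical_section):
    steps = []  # steps that completed before the failure

    def step(name):
        if name == fail_at:
            raise RuntimeError(f"ERROR during {name}")
        steps.append(name)

    def body():
        step("discard_buffers")
        step("allocate_memory")  # stands in for a palloc that can fail
        step("ftruncate")

    try:
        if use_critical_section:
            with critical_section():
                body()
        else:
            body()
    except Panic:
        return "panic", steps
    except RuntimeError:
        return "error", steps
    return "ok", steps
```

Calling truncate_relation("allocate_memory", False) returns "error"
with only discard_buffers completed, which is the dangerous state: the
buffers are gone, the file was never truncated, and the session keeps
running. With use_critical_section=True the same failure comes back as
"panic" instead.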
--
Michael

In response to

Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows at 2023-10-04 21:12:27 from Thomas Munro

Responses

Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows at 2023-10-05 19:55:58 from Thomas Munro
