Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, rootcause000(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows
Date: 2023-10-23 21:54:46
Message-ID: CA+hUKGK4omtUqvyP9rE2u=LtDUAH8WvzAp+zq9d2JG99ZajxwA@mail.gmail.com
Lists: pgsql-bugs

On Tue, Oct 24, 2023 at 4:06 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> But within mdtruncate(), we've got more than one problem, I think.
> mdnblocks() is a problem because of the reason that you mention, but
> register_dirty_segment() doesn't look totally safe either, because it
> can call RegisterSyncRequest() which, in a standalone backend, can
> call RememberSyncRequest().

But that is already enkludgified thusly:

/*
 * XXX: The checkpointer needs to add entries to the pending ops table
 * when absorbing fsync requests. That is done within a critical
 * section, which isn't usually allowed, but we make an exception. It
 * means that there's a theoretical possibility that you run out of
 * memory while absorbing fsync requests, which leads to a PANIC.
 * Fortunately the hash table is small so that's unlikely to happen in
 * practice.
 */
pendingOpsCxt = AllocSetContextCreate(TopMemoryContext,
                                      "Pending ops context",
                                      ALLOCSET_DEFAULT_SIZES);
MemoryContextAllowInCriticalSection(pendingOpsCxt, true);
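
(For reference, the standalone-backend path you mention lands in exactly
that context: RegisterSyncRequest() sees a backend-local pendingOps table
and calls RememberSyncRequest(), whose hash entries are allocated in
pendingOpsCxt.  Paraphrasing from memory, not verbatim source:

/* inside RegisterSyncRequest() -- simplified, not verbatim */
if (pendingOps != NULL)
{
    /* standalone backend or startup process: fsync state is local */
    RememberSyncRequest(ftag, type);
    return true;
}
/* ... otherwise forward the request to the checkpointer ... */

so the allocation that could fail there is already covered by the
MemoryContextAllowInCriticalSection() exception above.)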

> In general, it seems like it would be a lot nicer if we were doing a
> lot less stuff inside the critical section here. So I think you're
> right that we need some refactoring. Maybe smgr_prepare_truncate() and
> smgr_execute_truncate() or something like that. I wonder if we could
> actually register the dirty segment in the "prepare" phase - is it bad
> if we register a dirty segment before actually dirtying it? And maybe
> even CacheInvalidateSmgr() could be done at that stage? It seems
> pretty clear that dropping the dirty buffers and actually truncating
> the relation on disk need to happen after we've entered the critical
> section, because if we fail after doing the former, we've thrown away
> dirty data in anticipation of performing an operation that didn't
> happen, and if we fail when attempting the latter, primaries and
> standbys diverge and the originally-reported bug on this thread
> happens. But we'd like to move as much other stuff as we can out of
> that critical section.

Hmm, yeah, it seems like that direction would be a nice improvement,
as long as we are sure that the fsync request can't be processed too
soon.
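
To make sure we're picturing the same shape, here's roughly how I'd
imagine the split falling out, with everything that can allocate or
fail pulled into the prepare step.  Purely a sketch, not working code:
the names come from your message, register_dirty_segment_for_fork() is
a made-up placeholder (today register_dirty_segment() is static in md.c
and takes an MdfdVec), and whether CacheInvalidateSmgr() can really be
issued that early is exactly the open question.

void
smgr_prepare_truncate(SMgrRelation reln, ForkNumber *forknum,
                      int nforks, BlockNumber *nblocks)
{
    /* Runs before the critical section: steps that may allocate or fail. */

    /*
     * Queue the fsync requests up front.  Registering a segment as dirty
     * before we actually shrink it should just mean a possibly-redundant
     * fsync later.  (register_dirty_segment_for_fork() is a placeholder.)
     */
    for (int i = 0; i < nforks; i++)
        register_dirty_segment_for_fork(reln, forknum[i]);

    /* Maybe the smgr invalidation can also be sent at this stage? */
    CacheInvalidateSmgr(reln->smgr_rlocator);
}

void
smgr_execute_truncate(SMgrRelation reln, ForkNumber *forknum,
                      int nforks, BlockNumber *nblocks)
{
    /*
     * Runs inside the critical section: once we've dropped the dirty
     * buffers we're committed, so the on-disk truncation must follow.
     */
    DropRelationBuffers(reln, forknum, nforks, nblocks);

    for (int i = 0; i < nforks; i++)
    {
        reln->smgr_cached_nblocks[forknum[i]] = InvalidBlockNumber;
        smgrsw[reln->smgr_which].smgr_truncate(reln, forknum[i], nblocks[i]);
    }
}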
