From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, rootcause000(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows |
Date: | 2024-12-19 15:44:05 |
Message-ID: | CA+TgmoZj9dJhHQokHioXB8dyDZz5gjzyg7SbUrG=At-uShq7yA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On Wed, Dec 11, 2024 at 8:26 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> > I do wonder whether the new smgrtruncatefrom() should be used
> > everywhere instead of just from one of the call sites, but even if
> > that's desirable long-term, doing this much is still a lot better than
> > doing nothing. This data corrupting bug has been known and understood
> > for more than 4 years at this point.
>
> There are only three callers of smgrtruncate() in our tree:
>
> 1. This one.
> 2. Its redo-time counterpart.
> 3. pg_truncate_visibility_map(), in contrib/pg_visibility.
>
> The recovery environment promotes errors of both types to FATAL (the
> "we have no handler" case in errstart()), and the postmaster handles
> startup failure a bit like a PANIC. Or do you think PANIC and more
> similar code would be better?
I think mostly my thought was that if we could remove smgrtruncate()
entirely in favor of smgrtruncatefrom(), or just keep it called
smgrtruncate() but add a mandatory additional argument, that would be
less error-prone than having two versions between which future hackers
must pick.
> I am curious about the non-flushing though. RelationTruncate() also
> doesn't flush if there is no fsm or vm. Perhaps it always flushes in
> practice since VACUUM itself creates those, I'm not entirely sure yet.
I agree that we should make that unconditional. I think your analysis
shows that abandoning the WAL-before-data isn't safe, but even if
there were some doubt about whether your analysis is correct, we
should have a very strong bias against thinking that a rule as
fundamental as WAL-before-data is negotiable.
> PFA the first sketch of a patch to make pg_truncate_visibility_map()
> into... a cut-down version of RelationTruncate(). Still looking into
> the flushing, without which the comments in the patch are entirely
> bogus.
The patch looks to good to me. It seems to me that you could just add
an XLogFlush() call.
--
Robert Haas
EDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2024-12-20 02:53:43 | Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows |
Previous Message | Thomas Munro | 2024-12-19 02:45:47 | Re: Build failure with GCC 15 (defaults to -std=gnu23) |