Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING
Date: 2020-08-28 16:37:17
Message-ID: CA+TgmoZm5x9N1jnr8-U_5LwGtTSpJasewVcY5N1Lh0mhWemSBw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 20, 2020 at 4:30 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> If we really were to do something like this the option would need to be
> called vacuum_allow_making_corruption_worse or such. It needs to be
> *exceedingly* clear that it will likely lead to making everything much
> worse.

I don't really understand this objection. How does letting VACUUM
continue after problems have been detected make anything worse? I
agree that if it does, it shouldn't touch relfrozenxid or relminmxid,
but the patch has been adjusted to work that way. Assuming you don't
touch relfrozenxid or relminmxid, what harm is done if you continue
freezing undamaged tuples and continue removing dead tuples after
finding a bad tuple? You may have already done an arbitrary amount of
that before encountering the damage, and doing it afterward is no
different. Doing the index vacuuming step is different, but I don't
see how that would exacerbate corruption either.
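
The behavior described above can be sketched as a toy model. This is not
PostgreSQL source code: ToyTuple, toy_vacuum, and the continue_on_error
flag are hypothetical names invented for illustration, XID wraparound
arithmetic is ignored, and advancing relfrozenxid straight to the freeze
limit is a simplification. The only point it shows is that a damaged tuple
is left alone, undamaged tuples are still frozen or removed, and
relfrozenxid stays put once any damage has been seen.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class ToyTuple:
    """Hypothetical stand-in for a heap tuple; not a PostgreSQL structure."""
    xmin: int              # inserting transaction ID (plain int, no wraparound)
    dead: bool = False     # True if the tuple is removable
    corrupt: bool = False  # True if the freeze sanity checks would reject it
    frozen: bool = False   # True once the tuple has been frozen


def toy_vacuum(
    tuples: List[ToyTuple],
    freeze_limit: int,
    old_relfrozenxid: int,
    continue_on_error: bool,
) -> Tuple[List[ToyTuple], int, List[str]]:
    """Return (surviving tuples, new relfrozenxid, warnings)."""
    survivors: List[ToyTuple] = []
    warnings: List[str] = []

    for index, tup in enumerate(tuples):
        if tup.corrupt:
            message = f"tuple {index}: xmin {tup.xmin} failed freeze checks"
            if not continue_on_error:
                # Models the ERROR case: the whole pass is abandoned.
                raise RuntimeError(message)
            # Models the WARNING case: record it, leave the tuple untouched.
            warnings.append(message)
            survivors.append(tup)
            continue
        if tup.dead:
            continue  # dead tuples are still removed
        if tup.xmin < freeze_limit:
            tup = replace(tup, frozen=True)  # undamaged old tuples still freeze
        survivors.append(tup)

    # Only advance relfrozenxid when no damage was seen (simplified rule).
    new_relfrozenxid = old_relfrozenxid if warnings else freeze_limit
    return survivors, new_relfrozenxid, warnings


if __name__ == "__main__":
    heap = [
        ToyTuple(xmin=10),
        ToyTuple(xmin=20, dead=True),
        ToyTuple(xmin=30, corrupt=True),
        ToyTuple(xmin=500),
    ]
    kept, relfrozenxid, warns = toy_vacuum(heap, 100, 5, continue_on_error=True)
    print(len(kept), relfrozenxid, warns)
```

With continue_on_error=False the same input raises at the damaged tuple,
so in this model the dead tuple that comes after it would never be
reached, which is the bloat concern raised below.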

The point is that when you make VACUUM fail, you not only don't
advance relfrozenxid/relminmxid, but also don't remove dead tuples. In
the long run, either thing will kill you, but it is not difficult to
have a situation where failing to remove dead tuples kills you a lot
faster. The table can just bloat until performance tanks, and then the
application goes down, even if you still had 100+ million XIDs before
you needed a wraparound vacuum.

Honestly, I wonder why continuing (but without advancing relfrozenxid
or relminmxid) shouldn't be the default behavior. I mean, if it
actually corrupts your data, then it clearly shouldn't be, and
probably shouldn't even be an optional behavior, but I still don't see
why it would do that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
