From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: HOT vs freezing issue causing "cannot freeze committed xmax" |
Date: | 2020-07-24 16:55:14 |
Message-ID: | 20200724165514.dnu5hr4vvgkssf5p@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 2020-07-24 11:06:58 -0400, Robert Haas wrote:
> On Thu, Jul 23, 2020 at 2:10 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > In the case the HOT logic triggers, we'll call
> > heap_prepare_freeze_tuple() even when the tuple is dead.
>
> I think this is very bad. I've always been confused about these
> comments, but I couldn't quite put my finger on the problem. Now I
> think I can: the comments here imagine that we have the option either
> to set tupgone, causing the line pointer to be marked unused by an
> eventual call to lazy_vacuum_page(), or that we can decline to set
> tupgone, which will leave the tuple around to be handled by the next
> vacuum.
Yea. I think the only saving grace is that it's not obvious when the
situation can arise without prior corruption. But even if that's actuall
impossible, it seems extremely fragile. I stared at heap_prune_chain()
for quite a while and couldn't convince myself either way.
> However, we don't really have a choice at all. A choice implies that
> either option is correct, and therefore we can elect the one we
> prefer. But here, it's not just that one option is incorrect, but that
> both options are incorrect. Setting tupgone controls whether or not
> the tuple is considered for freezing. If we DON'T consider freezing
> it, then we might manage to advance relfrozenxid while an older XID
> still exists in the table. If we DO consider freezing it, we will
> correctly conclude that it needs to be frozen, but then the freezing
> code is in an impossible situation, because it has no provision for
> getting rid of tuples, only for keeping them. Its logic assumes that
> whenever we are freezing xmin or xmax we do that in a way that causes
> the tuple to be visible to everyone, but this tuple should be visible
> to no one. So with your changes it now errors out instead of
> corrupting data, but that's just replacing one bad thing (data
> corruption) with another (VACUUM failures).
I suspect that the legitimate cases hitting this branch won't error out,
because then xmin/xmax aren't old enough to need to be frozen.
> I think the actual correct behavior here is to mark the line pointer
> as dead.
That's not trivial, because just doing that naively will break HOT.
> The easiest way to accomplish that is probably to have
> lazy_scan_heap() just emit an extra XLOG_HEAP2_CLEAN record beyond
> whatever HOT-pruning already did, if it's necessary. A better solution
> would probably be to merge HOT-pruning with setting things all-visible
> and have a single function that does both, but that seems a lot more
> invasive, and definitely unsuitable for back-patching.
I suspect that merging pruning and this logic in lazy_scan_heap() really
is the only proper way to solve this kind of issue.
I wonder if, given we don't know if this issue can be hit in a real
database, and given that it already triggers an error, the right way to
deal with this in the back-branches is to emit a more precise error
message. I.e. if we hit this branch, and either xmin/xmax are older than
the cutoff, then we issue a more specific ERROR.
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2020-07-24 17:03:30 | Re: Making CASE error handling less surprising |
Previous Message | Andres Freund | 2020-07-24 16:49:13 | Re: Making CASE error handling less surprising |