Re: New vacuum option to do only freezing

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New vacuum option to do only freezing
Date: 2019-04-16 16:44:34
Message-ID: 31782.1555433074@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

So after thinking about this a bit more ...

ISTM that what we have here is a race condition (ie, tuple changed state
since heap_page_prune), and that ideally we want the code to resolve it
as if no race had happened. That is, either of these behaviors would
be acceptable:

1. Delete the tuple, just as heap_page_prune would've done if it had seen
it DEAD. (Possibly we could implement that by jumping back and doing
heap_page_prune again, but that seems pretty messy and bug-prone.
In any case, if we're not doing index cleanup then this must reduce to
"replace tuple by a dead line pointer", not remove it altogether.)

2. Act as if the tuple were still live, just as would've happened if the
state didn't change till just after we looked instead of just before.

Option #2 is a lot simpler and safer, and can be implemented as I
suggested earlier, assuming we're all good with the assumption that
heap_prepare_freeze_tuple isn't going to do anything bad.

However ... it strikes me that there's yet another assumption in here
that this patch has broken. Namely, notice that the reason we normally
don't get here is that what heap_page_prune does with an already-DEAD
tuple is reduce it to a dead line pointer and then count it in its
return value, which gets added to tups_vacuumed. But then what
lazy_scan_heap's per-tuple loop does is

/*
* DEAD item pointers are to be vacuumed normally; but we don't
* count them in tups_vacuumed, else we'd be double-counting (at
* least in the common case where heap_page_prune() just freed up
* a non-HOT tuple).
*/
if (ItemIdIsDead(itemid))
{
lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
all_visible = false;
continue;
}

When this patch is active, it will *greatly* increase the odds that
we report a misleading tups_vacuumed total, for two different reasons:

* DEAD tuples reduced to dead line pointers during heap_page_prune will be
counted as tups_vacuumed, even though we don't take the further step of
removing the dead line pointer, as always happened before.

* When, after some vacuum cycles with index_cleanup disabled, we finally
do one with index_cleanup enabled, there are going to be a heck of a lot
of old dead line pointers to clean out, which the existing logic won't
count at all. That was only barely tolerable before, and it seems like
this has pushed it over the bounds of silliness. People are going to
be wondering why vacuum reports that it removed zillions of index
entries and no tuples.

I'm thinking that we really need to upgrade vacuum's reporting totals
so that it accounts in some more-honest way for pre-existing dead
line pointers. The patch as it stands has made the reporting even more
confusing, rather than less so.

BTW, the fact that dead line pointers will accumulate without limit
makes me even more dubious of the proposition that this "feature"
will be safe to enable as a reloption in production. I really think
that we ought to restrict it to be a manual VACUUM option, to be
used only when you're desperate to freeze old tuples.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dmitry Belyavsky 2019-04-16 17:28:35 Re: Ltree syntax improvement
Previous Message Tomas Vondra 2019-04-16 16:15:24 Re: Zedstore - compressed in-core columnar storage