From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: New vacuum option to do only freezing |
Date: | 2019-05-07 07:58:45 |
Message-ID: | CAD21AoCfO6fp0N4smf-gDToX4AE8sUrGE-rhVRfJ_zbE2m_Nqg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, May 3, 2019 at 12:09 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Tue, Apr 16, 2019 at 4:00 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > I wrote:
> > > I'm thinking that we really need to upgrade vacuum's reporting totals
> > > so that it accounts in some more-honest way for pre-existing dead
> > > line pointers. The patch as it stands has made the reporting even more
> > > confusing, rather than less so.
> >
> > Here's a couple of ideas about that:
> >
> > 1. Ignore heap_page_prune's activity altogether, on the grounds that
> > it's just random chance that any cleanup done there was done during
> > VACUUM and not some preceding DML operation. Make tups_vacuumed
> > be the count of dead line pointers removed. The advantage of this
> > way is that tups_vacuumed would become independent of previous
> > non-VACUUM pruning activity, as well as preceding index-cleanup-disabled
> > VACUUMs. But maybe it's hiding too much information.
> >
> > 2. Have heap_page_prune count, and add to tups_vacuumed, only HOT
> > tuples that it deleted entirely. The action of replacing a DEAD
> > root tuple with a dead line pointer doesn't count for anything.
> > Then also add the count of dead line pointers removed to tups_vacuumed.
> >
> > Approach #2 is closer to the way we've defined tups_vacuumed up to
> > now, but I think it'd be more realistic in cases where previous
> > pruning or index-cleanup-disabled vacuums have left lots of dead
> > line pointers.
> >
> > I'm not especially wedded to either of these --- any other ideas?
>
> I think it's almost impossible to have clear reporting here with only
> a single counter. There are two clearly-distinct cleanup operations
> going on here: (1) removing tuples from pages, and (2) making dead
> line pointers unused so that they can be reused for new tuples. They
> happen in equal quantity when there are no HOT updates: pruning makes
> dead tuples into dead line pointers, and then index vacuuming allows
> the dead line pointers to be set unused. But if there are HOT
> updates, intermediate tuples in each HOT chain are removed from the
> page but the line pointers are directly set to unused, so VACUUM could
> remove a lot of tuples but not need to make very many dead line
> pointers unused. On the other hand, the opposite could also happen:
> maybe lots of single-page HOT-pruning has happened prior to VACUUM, so
> VACUUM has lots of dead line pointers to make unused but removes very
> few tuples because that's already been done.
>
> For the most part, tups_vacuumed seems to be intending to count #1
> rather than #2. While the comments for heap_page_prune and
> heap_prune_chain are not as clear as might be desirable, it appears to
> me that those functions are counting tuples removed from a page,
> ignoring everything that might happen to line pointers -- so using the
> return value of this function is consistent with wanting to count #1.
> However, there's one place that seems slightly unclear about this,
> namely this comment:
>
> /*
> * DEAD item pointers are to be vacuumed normally; but we don't
> * count them in tups_vacuumed, else we'd be double-counting (at
> * least in the common case where heap_page_prune() just freed up
> * a non-HOT tuple).
> */
>
> If we're counting tuples removed from pages, then it's not merely that
> we would be double-counting, but that we would be counting completely
> the wrong thing. However, as far as I can see, that's just an issue
> with the comments; the code is in all cases counting tuple removals,
> not dead line pointers marked unused.
>
> If I understand correctly, your first proposal amounts to redefining
> tups_vacuumed to count #2 rather than #1, and your second proposal
> amounts to making tups_vacuumed count #1 + #2 rather than #1. I
> suggest a third option: have two counters. tups_vacuum can continue
> to count #1, with just a comment adjustment. And then we can add a
> second counter which is incremented every time lazy_vacuum_page does
> ItemIdSetUnused, which will count #2.
>
> Thoughts?
I agree to have an another counter. Please note that non-HOT-updated
tuples that became dead after hot pruning (that is 'tupgone' tuples)
are changed to unused directly in lazy_page_vacuum. Therefore we would
need not to increment tups_vacuumed in tupgone case if we increment
the new counter in lazy_page_vacuum.
For the contents of vacuum verbose report, it could be worth to
discuss whether reporting the number of deleted line pointers would be
helpful for users. The reporting the number of line pointers we left
might be more helpful for users.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2019-05-07 08:00:24 | Re: copy-past-o comment in lock.h |
Previous Message | David Fetter | 2019-05-07 07:50:47 | Re: New EXPLAIN option: ALL |