Re: decoupling table and index vacuum

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: decoupling table and index vacuum
Date: 2021-04-22 18:30:21
Message-ID: CAH2-Wznb2aOE9SYP8pF6DouTHCO-xbBWHqoA9nkWmHhEjD9kDA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 22, 2021 at 9:15 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > Have you thought about how we would do the scheduling of vacuums for the
> > different indexes? We don't really have useful stats for the number of
> > dead index entries to be expected in an index. It'd not be hard to track
> > how many entries are removed in an index via killtuples, but
> > e.g. estimating how many dead entries there are in a partial index seems
> > quite hard (at least without introducing significant overhead).
>
> No, I don't have any good ideas about that, really. Partial indexes
> seem like a hard problem, and so do GIN indexes or other kinds of
> things where you may have multiple index entries per heap tuple. We
> might have to accept some known-to-be-wrong approximations in such
> cases.

I think that you're both missing very important subtleties here.
Apparently the "quantitative vs qualitative" distinction I like to
make hasn't cleared it up.

You both seem to be assuming that everything would be fine if you
could somehow inexpensively know the total number of undeleted dead
tuples in each index at all times. But I don't think that that's true
at all. I don't mean that it might not be true. What I mean is that
it's usually a meaningless number *on its own*, at least if you assume
that every index is either an nbtree index (or an index that uses some
other index AM that has the same index deletion capabilities).

My mental models for index bloat usually involve imagining an
idealized version of a real world bloated index -- I compare the
empirical reality against an imagined idealized version. I then try to
find optimizations that make the reality approximate the idealized
version. Say a version of the same index in a traditional 2PL database
without MVCC, or in real world Postgres with VACUUM that magically
runs infinitely fast.

Bottom-up index deletion usually leaves a huge number of
undeleted-though-dead index tuples untouched for hours, even when it
works perfectly. 10% - 30% of the index tuples might be
undeleted-though-dead at any given point in time (traditional B-Tree
space utilization math generally ensures that there is about that much
free space on each leaf page if we imagine no version churn/bloat --
we *naturally* have a lot of free space to work with). These are
"Schrodinger's dead index tuples". You could count them
mechanistically, but then you'd be counting index tuples that are
"already dead and deleted" in an important theoretical sense, despite
the fact that they are not yet literally deleted. That's why bottom-up
index deletion frequently avoids 100% of all unnecessary page splits.
The asymmetry that was there all along was just crazy. I merely had
the realization that it was there and could be exploited -- I didn't
create or invent the natural asymmetry.

A similar issue exists within vacuumlazy.c (though that might matter a
lot less). Notice that it counts a recently dead heap tuple in its
new_dead_tuples accounting, even though the workload can probably
delete that tuple in a just-in-time fashion opportunistically. Might
we be better off recognizing that such a heap tuple is already morally
dead and gone, even if that isn't literally true? (That's a harder
argument to make, and I'm not making it right now -- it's just an
example.)

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2021-04-22 18:31:48 Re: posgres 12 bug (partitioned table)
Previous Message Tom Lane 2021-04-22 18:21:05 Re: posgres 12 bug (partitioned table)