Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tv(at)fuzzy(dot)cz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
Date: 2023-01-19 01:49:05
Message-ID: 20230119014905.m36fzqjmozfvyqxh@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2023-01-18 16:19:02 -0800, Peter Geoghegan wrote:
> On Wed, Jan 18, 2023 at 4:02 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > vacuum-no reltuples/n_live_tup n_dead_tup
> > 1 4999976 5000000
> > 2 2500077 5000000
> > 3 1250184 5000000
> > 4 625266 5000000
> > 5 312821 5000000
> > 10 10165 5000000
> >
> > Each vacuum halves reltuples. That's going to screw badly with all kinds of
> > things. Planner costs completely out of whack etc.
>
> I get that that could be a big problem, even relative to the more
> immediate problem of VACUUM just spinning like it does in your test
> case. What do you think we should do about it?

The change made in 7c91a0364fc imo isn't right. We need to fix it. I think
it's correct that now pgstat_report_vacuum() doesn't include recently dead
tuples in livetuples - they're still tracked via deadtuples. But it's wrong
for vacuum's call to vac_update_relstats() to not include recently dead
tuples, at least when we only scanned part of the relation.

I think the right thing would be to not undo the semantic change of
7c91a0364fc as a whole, but instead take recently-dead tuples into account
only in the "Okay, we've covered the corner cases." part, to avoid the
spiraling seen above.

Not super clean, but it seems somewhat fundamental that we'll re-scan pages
full of recently-dead tuples in the near future. If we, in a way, subtract
the recently dead tuples from reltuples in this cycle, we shouldn't do so
again in the next - but not taking recently dead into account, does so.

It's a bit complicated because of the APIs involved. vac_estimate_reltuples()
computes vacrel->new_live_tuples and contains the logic for how to compute the
new reltuples. But we use the ->new_live_tuples both vac_update_relstats(),
where we, imo, should take recently-dead into account for partial scans and
pgstat_report_vacuum where we shouldn't. I guess we would need to add an
output paramter both for "reltuples" and "new_live_tuples".

> What do you think about my idea of focussing on the subset of pages newly
> set all-visible in the VM?

I don't understand it yet.

On 2023-01-18 15:28:19 -0800, Peter Geoghegan wrote:
> Perhaps we should make vac_estimate_reltuples focus on the pages that
> VACUUM newly set all-visible each time (not including all-visible
> pages that got scanned despite being all-visible) -- only that subset
> of scanned_pages seems to be truly relevant. That way you wouldn't be
> able to do stuff like this. We'd have to start explicitly tracking the
> number of pages that were newly set in the VM in vacuumlazy.c to be
> able to do that, but that seems like a good idea anyway.

Can you explain a bit more what you mean with "focus on the pages" means?

> > I wonder if this is part of the reason for the distortion you addressed with
> > 74388a1a / 3097bde7dd1d. I am somewhat doubtful they're right as is. For a
> > large relation 2% of blocks is a significant number of rows, and simply never
> > adjusting reltuples seems quite problematic. At the very least we ought to
> > account for dead tids we removed or such, instead of just freezing reltuples.
>
> As I mentioned, it only kicks in when relpages is *precisely* the same
> as last time (not one block more or one block less), *and* we only
> scanned less than 2% of rel_pages. It's quite possible that that's
> insufficient, but I can't imagine it causing any new problems.

In OLTP workloads relpages will often not change, even if there's lots of
write activity, because there's plenty free space in the relation, and there's
something not-removable on the last page. relpages also won't change if data
is deleted anywhere but the end.

I don't think it's hard to see this causing problems. Set
autovacuum_vacuum_scale_factor to something smaller than 2% or somewhat
frequently vacuum manually. Incrementally delete old data. Unless analyze
saves you - which might not be run or might have a different scale factor or
not be run manually - reltuples will stay exactly the same, despite data
changing substantially.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2023-01-19 01:49:14 Re: Time delayed LR (WAS Re: logical replication restrictions)
Previous Message Peter Smith 2023-01-19 01:41:58 Re: Time delayed LR (WAS Re: logical replication restrictions)