Quick Links

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Peter Geoghegan <pg(at)bowt(dot)ie>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
Date:	2023-01-17 20:14:00
Message-ID:	20230117201400.rt24bc4y6f25az4j@awork3.anarazel.de
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi,

On 2023-01-17 14:57:27 -0500, Robert Haas wrote:
> > In pg_class:
> >
> > - The number of all frozen pages, like we do for all-visible
> >
> > That'd give us a decent upper bound for the amount of work we need to do to
> > increase relfrozenxid. It's important that this is crash safe (thus no
> > pg_stats), and it only needs to be written when we'd likely make other
> > changes to the pg_class row anyway.
>
> I'm not sure how useful this is because a lot of the work is from
> scanning the indexes.

We can increase relfrozenxid before the index scans, unless we ran out of dead
tuple space. We already have code for that in failsafe mode, in some way. But
we really also should increase

> > In pgstats:
> >
> > - The number of dead items, incremented both by the heap scan and
> > opportunistic pruning
> >
> > This would let us estimate how urgently we need to clean up indexes.
>
> I don't think this is true because btree indexes are self-cleaning in
> some scenarios and not in others.

I mainly meant it from the angle of whether need to clean up dead items in the
heap to avoid the table from bloating because we stop using those pages -
which requires index scans. But even for the index scan portion, it'd give us
a better bound than we have today.

We probably should track the number of killed tuples in indexes.

> > - The xid/mxid horizons during the last vacuum
> >
> > - The number of pages with tuples that couldn't removed due to the horizon
> > during the last vacuum
> >
> > - The number of pages with tuples that couldn't be frozen
>
> Not bad to know, but if the horizon we could use advances by 1, we
> can't tell whether that allows pruning nothing additional or another
> billion tuples.

Sure. But it'd be a lot better than scanning it again and again when nothing
has changed because thing still holds back the horizon. We could improve upon
it later by tracking the average or even bins of ages.

> However, where XID age really falls down as a metric is that it
> doesn't tell you what it's going to take to solve the problem. The
> answer, at ten thousand feet, is always vacuum. But how long will that
> vacuum run? We don't know. Do we need the XID horizon to advance
> first, and if so, how far? We don't know. Do we need auto-cancellation
> to be disabled? We don't know. That's where we get into a lot of
> trouble here.

Agreed. I think the metrics I proposed would help some, by at least providing
sensible upper boundaries (for work) and minimal requirements (horizon during
last vacuum).

Greetings,

Andres Freund

In response to

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation at 2023-01-17 19:57:27 from Robert Haas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Bruce Momjian	2023-01-17 20:53:24	Re: document the need to analyze partitioned tables
Previous Message	Bruce Momjian	2023-01-17 20:13:16	Re: cutting down the TODO list thread