Quick Links

Re: More aggressive vacuuming of temporary tables

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: More aggressive vacuuming of temporary tables
Date:	2020-09-08 19:01:29
Message-ID:	20200908190129.bxujcfkxpldljus4@alap3.anarazel.de
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi,

On 2020-08-28 11:46:49 -0400, Tom Lane wrote:
> It strikes me that when we are vacuuming a temporary table (which
> necessarily will be one of our own session), we don't really need
> to care what the global xmin horizon is. If we're not inside a
> user transaction block, then there are no tuples in the table that
> could be in-doubt anymore. Neither are there any snapshots in our
> session that could see any dead tuples. Nor do we give a fig what
> other sessions might think of those tuples. So we could just set
> the xmin cutoff as aggressively as possible, which is to say
> equal to the nextXid counter. While vacuuming a temp table is
> perhaps not something people do very often, I think when they do
> do it they would like us to clean out all the dead tuples not just
> some.

That seems like a good idea.

I've been toying with a patch that introduces more smarts about when a
row is removable, by looking more closely whether a specific row
versions are visible (e.g. in the common case of one old snapshot and
lots of newer rows). But that's orders of magnitude more complicated. So
going for something as simple as this seems like a good idea.

> Hence I propose 0001 attached. 80% of it is just API additions to allow
> passing down the isTopLevel flag so that we can do the right thing in
> the CLUSTER case; the important change is in vacuum_set_xid_limits.
> (For ease of review, I didn't reindent the existing code in
> vacuum_set_xid_limits, but that would be something to do before commit.)

I did wonder for a moment if it could be worthwhile to just implement
this for VACUUM and leave CLUSTER alone, to avoid having to deal with
the toplevel stuff. But I think it's better to be consistent.

But now I do wonder why we need to know whether the command is top level
or not? Why isn't the correct thing to instead look at what the current
backend's xmin is? Seems like you could just replace
*oldestXmin = XidFromFullTransactionId(ReadNextFullTransactionId());
with
*oldestXmin = MyProc->xmin;
Assert(TransactionIdIsValid(*oldestXmin));

and you'd not need to care about whether the CLUSTER is top-level or
not? Without, I think, loosing any aggressiveness in the top-level case?
Perhaps even worth moving this logic into
GetOldestNonRemovableTransactionId().

I know you already pushed this, but I vote for revising it this way if
you don't see an issue?

Greetings,

Andres Freund

In response to

More aggressive vacuuming of temporary tables at 2020-08-28 15:46:49 from Tom Lane

Responses

Re: More aggressive vacuuming of temporary tables at 2020-09-08 19:24:54 from Tom Lane
Re: More aggressive vacuuming of temporary tables at 2020-09-09 14:14:04 from Stephen Frost

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Heikki Linnakangas	2020-09-08 19:05:37	Re: Yet another fast GiST build
Previous Message	Andres Freund	2020-09-08 18:49:36	VACUUM (INTERRUPTIBLE)?