Re: why there is not VACUUM FULL CONCURRENTLY?

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Antonin Houska <ah(at)cybertec(dot)at>
Cc: Junwang Zhao <zhjwpku(at)gmail(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: why there is not VACUUM FULL CONCURRENTLY?
Date: 2024-12-06 13:46:01
Message-ID: 202412061346.vxlwleqysl4i@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2024-Oct-09, Antonin Houska wrote:

> diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
> index 9110938fab..f1008f5013 100644
> --- a/doc/src/sgml/ref/vacuum.sgml
> +++ b/doc/src/sgml/ref/vacuum.sgml

> @@ -61,8 +62,12 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
> <para>
> Without a <replaceable class="parameter">table_and_columns</replaceable>
> list, <command>VACUUM</command> processes every table and materialized view
> - in the current database that the current user has permission to vacuum.
> - With a list, <command>VACUUM</command> processes only those table(s).
> + in the current database that the current user has permission to vacuum. If
> + the <literal>CONCURRENTLY</literal> is specified (see below), tables which
> + have not been clustered yet are silently skipped. With a
> + list, <command>VACUUM</command> processes only those table(s). If
> + the <literal>CONCURRENTLY</literal> is specified, the list may only contain
> + tables which have already been clustered.
> </para>

The idea that VACUUM CONCURRENTLY can only process tables that have been
clustered sounds very strange to me. I don't think such a restriction
would really fly. However, I think this may just be a documentation
mistake; can you please clarify? I am tempted to suggest that VACUUM
CONCURRENTLY should receive a table list; without a list, it should
raise an error. This is not supposed to be a routine maintenance
command that you can run on all your tables, after all. Heck, maybe
don't even accept a partitioned table -- the user can process one
partition at a time, if they need that.

I don't believe in the need for the LOCK_CLUSTER_CONCURRENT define; IMO
the code should just use ShareUpdateExclusiveLock where needed.

In 0001, the new API of make_new_heap() is somewhat bizarre regarding
the output lockmode_new_p parameter. I didn't find any place in the
patch series where we use that to return a different lock level that the
caller gave; the only case were we do something that looks funny is when
a toast table is involved. But I don't think I fully understand what is
going on in that case. I'm likely missing something here, but isn't it
simpler to just state that make_new_heap will obtain a lock on the new
heap, and that the immediately following table_open needn't acquire a
lock (or, in the case of RefreshMatViewByOid, no LockRelationOid is
necessary)?

Anyway, I propose some cosmetic cleanups for 0001 in attachment,
including changing make_new_heap to assume a non-null value of
lockmode_new_p. I didn't go as far as making it no longer a pointer,
but if it can be done, then I suggest we should do that. I didn't try
to apply the next patches in the series after this one.

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/

Attachment Content-Type Size
0001-Minor-code-review.patch.nocfbot text/plain 10.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2024-12-06 13:46:23 Re: Proposal to add a new URL data type.
Previous Message Alexander Borisov 2024-12-06 12:59:48 Re: Proposal to add a new URL data type.