Re: why there is not VACUUM FULL CONCURRENTLY?

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Antonin Houska <ah(at)cybertec(dot)at>
Cc: Junwang Zhao <zhjwpku(at)gmail(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: why there is not VACUUM FULL CONCURRENTLY?
Date: 2025-01-09 19:52:43
Message-ID: 202501091952.df7vrzrundlo@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2025-Jan-09, Antonin Houska wrote:

> It seems you accidentally fixed another problem :-) I was referring to the
> 'lockmode' argument of make_new_heap(). I can try to write a patch for that
> but ...
>
> > Meanwhile the patch 0004 has some seemingly trivial conflicts. If you
> > want to rebase, I'd appreciate that. In the meantime I'll give a look
> > at the next two other API changes.
>
> ... I can apply v06 even though I do have the commit ebd8fc7e47 in my working
> tree. (And the CF bot does not complain (yet?).) Have you removed the
> 'lockmode' argument also from make_new_heap() and forgot to push it? This
> change would probably cause a conflict with v06.

Hmm, I'm not sure what you mean. My changes are just comment updates,
plus I added asserts that relations are locked to match what the
comments say (this made me touch matview.c, including removal of a
pointless lock acquisition there); also in cluster_rel I moved one
initialization to a different place. A diff between your patch and what
I committed is below. But neither of those changes cause the conflict
to apply 0002,3,4 from v06 to master; rather the conflict comes from
older commits to cluster.c. I'm really surprised that you see no
conflicts. Maybe I'm doing something wrong. (I tried "git am", "git
apply -3" and "patch -p1" and they all result in a few rejects.)

> > Maybe we should have a new toplevel command. Some ideas that have been
> > thrown around:
> >
> > - RETABLE (it's like REINDEX, but for tables)
> > - ALTER TABLE <tab> SQUEEZE
> > - SQUEEZE <table>
> > - VACUUM (SQUEEZE)
> > - VACUUM (COMPACT)
> > - MAINTAIN <tab> COMPACT
> > - MAINTAIN <tab> SQUEEZE
>
> I recall that DB2 has REORG command, which also can do clustering [1]

That's an idea, but ... man, is that ugly!

> Regardless the name of the new command, should that also handle the
> non-concurrent cases? In that case we'd probably need to mark CLUSTER and
> VACUUM (FULL) as deprecated.

Hmm, I don't feel a need to change CLUSTER (since it does one pretty
well defined thing), but I would be okay adding a command that does both
things (concurrent and non concurrent, chosen as per options specified),
and redefine VACUUM FULL as "an obsolete alias for New Shiny Command".
I'm not strongly opposed to also aliasing CLUSTER, note, I just think
it's not really necessary.

diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 4c0d6b80d2a..99193f5c886 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -316,7 +316,7 @@ cluster_rel(Relation OldHeap, Oid indexOid, ClusterParams *params)
int save_nestlevel;
bool verbose = ((params->options & CLUOPT_VERBOSE) != 0);
bool recheck = ((params->options & CLUOPT_RECHECK) != 0);
- Relation index = NULL;
+ Relation index;

Assert(CheckRelationLockedByMe(OldHeap, AccessExclusiveLock, false));

@@ -434,10 +434,13 @@ cluster_rel(Relation OldHeap, Oid indexOid, ClusterParams *params)
/* Check heap and index are valid to cluster on */
if (OidIsValid(indexOid))
{
+ /* verify the index is good and lock it */
check_index_is_clusterable(OldHeap, indexOid, AccessExclusiveLock);
- /* Open the index (It should already be locked.) */
+ /* also open it */
index = index_open(indexOid, NoLock);
}
+ else
+ index = NULL;

/*
* Quietly ignore the request if this is a materialized view which has not
@@ -615,12 +618,12 @@ mark_index_clustered(Relation rel, Oid indexOid, bool is_internal)
/*
* rebuild_relation: rebuild an existing relation in index or physical order
*
- * OldHeap: table to rebuild --- must be opened and exclusive-locked!
- * index: index to cluster by, or NULL to rewrite in physical order. Must be
- * opened and locked.
+ * OldHeap: table to rebuild.
+ * index: index to cluster by, or NULL to rewrite in physical order.
*
- * On exit, the heap (and also the index, if one was passed) are closed, but
- * still locked with AccessExclusiveLock.
+ * On entry, heap and index (if one is given) must be open, and
+ * AccessExclusiveLock held on them.
+ * On exit, they are closed, but locks on them are not released.
*/
static void
rebuild_relation(Relation OldHeap, Relation index, bool verbose)
@@ -636,6 +639,9 @@ rebuild_relation(Relation OldHeap, Relation index, bool verbose)
TransactionId frozenXid;
MultiXactId cutoffMulti;

+ Assert(CheckRelationLockedByMe(OldHeap, AccessExclusiveLock, false) &&
+ (index == NULL || CheckRelationLockedByMe(index, AccessExclusiveLock, false)));
+
if (index)
/* Mark the correct index as clustered */
mark_index_clustered(OldHeap, RelationGetRelid(index), true);
@@ -647,18 +653,15 @@ rebuild_relation(Relation OldHeap, Relation index, bool verbose)
/*
* Create the transient table that will receive the re-ordered data.
*
- * NoLock for the old heap because we already have it locked and want to
- * keep unlocking straightforward. The new heap (and its TOAST if one
- * exists) will be locked in AccessExclusiveMode on return. Since others
- * can't see it yet, we do not care.
+ * OldHeap is already locked, so no need to lock it again. make_new_heap
+ * obtains AccessExclusiveLock on the new heap and its toast table.
*/
OIDNewHeap = make_new_heap(tableOid, tableSpace,
accessMethod,
relpersistence,
NoLock);
+ Assert(CheckRelationOidLockedByMe(OIDNewHeap, AccessExclusiveLock, false));
NewHeap = table_open(OIDNewHeap, NoLock);
- /* NewHeap already locked by make_new_heap */
- Assert(CheckRelationLockedByMe(NewHeap, AccessExclusiveLock, false));

/* Copy the heap data into the new table in the desired order */
copy_table_data(NewHeap, OldHeap, index, verbose,
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 4b3d4822872..c12817091ed 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -319,7 +319,7 @@ RefreshMatViewByOid(Oid matviewOid, bool is_create, bool skipData,
OIDNewHeap = make_new_heap(matviewOid, tableSpace,
matviewRel->rd_rel->relam,
relpersistence, ExclusiveLock);
- LockRelationOid(OIDNewHeap, AccessExclusiveLock);
+ Assert(CheckRelationOidLockedByMe(OIDNewHeap, AccessExclusiveLock, false));

/* Generate the data, if wanted. */
if (!skipData)

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Puedes vivir sólo una vez, pero si lo haces bien, una vez es suficiente"

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nikolay Samokhvalov 2025-01-09 20:32:53 pg_dump, pg_dumpall, pg_restore: Add --no-policies option
Previous Message David G. Johnston 2025-01-09 19:35:00 Re: Psql meta-command conninfo+