Re: why there is not VACUUM FULL CONCURRENTLY?

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Kirill Reshke <reshkekirill(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: why there is not VACUUM FULL CONCURRENTLY?
Date: 2024-07-31 15:25:45
Message-ID: 16859.1722439545@antos
Lists: pgsql-hackers

Kirill Reshke <reshkekirill(at)gmail(dot)com> wrote:

> Also, I was thinking about pg_repack vs pg_squeeze being used for the
> VACUUM FULL CONCURRENTLY feature, and I'm a bit suspicious about the
> latter.

> If I understand correctly, we essentially parse the whole WAL to
> obtain info about changes to one particular relation. That may be a big
> overhead,

pg_squeeze is an extension, but logical decoding is performed by the core, so
the extension has no way to ensure that data changes of the "other tables" are
not decoded. However, it might be possible if we integrate the functionality
into the core. I'll consider doing so in the next version of [1].
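
To illustrate the overhead in question, here is a toy model in Python. It is
not PostgreSQL or pg_squeeze code: the names Change and decode_for_relation,
and the table names, are made up for the example. It only assumes that every
record in the stream has to be decoded before a filter can discard the changes
that belong to other tables.

```python
from dataclasses import dataclass


@dataclass
class Change:
    relation: str  # hypothetical table name
    payload: str   # hypothetical description of the change


def decode_for_relation(wal, target):
    """Toy model: decode every record, then keep only the target's changes."""
    decoded = 0
    kept = []
    for change in wal:
        decoded += 1  # the decoding cost is paid for every record
        if change.relation == target:
            kept.append(change)  # only these changes are actually needed
    return decoded, kept


wal = [
    Change("orders", "insert 1"),
    Change("big_table", "update 7"),
    Change("orders", "insert 2"),
    Change("audit_log", "insert 9"),
]
decoded, kept = decode_for_relation(wal, "big_table")
print(decoded, len(kept))
```

In this sketch four records are decoded although only one of them concerns the
table being rewritten; skipping the others before decoding is the kind of
thing that would need support in the core.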

> whereas the trigger approach does not suffer from this. So, there is the
> chance that VACUUM FULL CONCURRENTLY will never keep up with vacuumed
> relation changes. Am I right?

Perhaps it can happen, but note that trigger processing is not free either,
and that in that case the cost is paid by the applications. So while VACUUM
FULL CONCURRENTLY (based on logical decoding) might fail to catch up, the
trigger-based solution may slow down the applications that execute DML
commands while the table is being rewritten.
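
A rough way to picture the trade-off, as a toy model rather than a
measurement: the function names and all the rates and costs below are
hypothetical. With logical decoding the risk is a backlog that grows whenever
WAL is produced faster than it is decoded; with triggers there is no such
backlog, but each DML statement takes longer.

```python
def decoding_backlog(wal_rate, decode_rate, seconds):
    """Undecoded records left after `seconds` (never below zero)."""
    backlog = 0.0
    for _ in range(seconds):
        # Each second, new WAL arrives and the decoder consumes what it can.
        backlog = max(0.0, backlog + wal_rate - decode_rate)
    return backlog


def trigger_dml_time(n_statements, base_cost, trigger_cost):
    """Total application time when every statement also fires a trigger."""
    return n_statements * (base_cost + trigger_cost)


# Decoder slower than WAL generation: the backlog keeps growing.
print(decoding_backlog(wal_rate=1000, decode_rate=800, seconds=10))
# Decoder faster than WAL generation: it keeps up.
print(decoding_backlog(wal_rate=500, decode_rate=800, seconds=10))
# The trigger overhead is added to every statement the application runs.
print(trigger_dml_time(n_statements=100, base_cost=1.0, trigger_cost=0.5))
```

In the first case the rewrite might never finish catching up; in the last one
the rewrite keeps up by construction, but the applications pay for it.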

[1] https://commitfest.postgresql.org/49/5117/

--
Antonin Houska
Web: https://www.cybertec-postgresql.com
