From: | Antonin Houska <ah(at)cybertec(dot)at> |
---|---|
To: | Kirill Reshke <reshkekirill(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: why there is not VACUUM FULL CONCURRENTLY? |
Date: | 2024-08-02 06:09:26 |
Message-ID: | 1788.1722578966@antos |
Lists: | pgsql-hackers |
Kirill Reshke <reshkekirill(at)gmail(dot)com> wrote:
> What is the size of the biggest relation successfully vacuumed
> via pg_squeeze?
> Looks like in case of a big relation or high insertion load,
> replication may lag and never catch up...
Users report problems rather than successes, so I don't know. 400 GB was
reported in [1], but it's possible that the table size for this test was
determined by the available disk space.
I think that the amount of data changes performed during the "squeezing"
matters more than the table size. In [2] one user reported "thousands of
UPSERTs per second", but the amount of data also depends on row size, which he
didn't mention.
pg_squeeze gives up if it fails to catch up a few times. The first version of
my patch does not check this; I'll add the corresponding code in the next
version.
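To illustrate the "give up after a few failed catch-up attempts" idea, here is
a rough sketch in Python rather than C, for brevity. The names (catch_up,
pending, apply_batch, max_attempts, threshold) and the default limit of 3 are
assumptions made up for this example; it reflects neither the actual
pg_squeeze code nor the patch.

```python
from typing import Callable


def catch_up(pending: Callable[[], int],
             apply_batch: Callable[[], None],
             max_attempts: int = 3,
             threshold: int = 0) -> bool:
    """Try to drain the backlog of concurrent changes a bounded number of times.

    pending()     -- hypothetical callback: number of changes not yet applied
    apply_batch() -- hypothetical callback: apply the changes decoded so far

    Returns True once the backlog is at or below `threshold`, and False if
    that did not happen within `max_attempts` rounds (the caller gives up).
    """
    for _ in range(max_attempts):
        apply_batch()
        if pending() <= threshold:
            return True
    return False
```

With a write rate higher than the apply rate, the backlog grows in every
round and the function returns False, which is the case where processing
would be abandoned instead of lagging forever.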
> However, in general, the 3rd patch is really big, very hard to
> comprehend. Please consider splitting this into smaller (and
> reviewable) pieces.
I'll try to move some preparation steps into separate diffs, but I'm not sure
that will make the main diff much smaller. I prefer self-contained patches, as
also explained in [3].
> Also, we obviously need more tests on this. Both tap-test and
> regression tests I suppose.
Sure. The next version will use injection points to test whether "concurrent
data changes" are processed correctly.
> One more thing is about pg_squeeze background workers. They act in an
> autovacuum-like fashion, aren't they? Maybe we can support this kind
> of relation processing in core too?
Maybe later. Even just adding the CONCURRENTLY option to CLUSTER and VACUUM
FULL requires quite some effort.
[1] https://github.com/cybertec-postgresql/pg_squeeze/issues/51
[2] https://github.com/cybertec-postgresql/pg_squeeze/issues/21#issuecomment-514495369
[3] http://peter.eisentraut.org/blog/2024/05/14/when-to-split-patches-for-postgresql
--
Antonin Houska
Web: https://www.cybertec-postgresql.com