From: | Antonin Houska <ah(at)cybertec(dot)at> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, Michael Banck <mbanck(at)gmx(dot)net>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: why there is not VACUUM FULL CONCURRENTLY? |
Date: | 2025-04-01 13:31:31 |
Message-ID: | 178741.1743514291@localhost |
Lists: | pgsql-hackers |
Antonin Houska <ah(at)cybertec(dot)at> wrote:
> Antonin Houska <ah(at)cybertec(dot)at> wrote:
>
> > Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> >
> > > On 2025-Mar-22, Antonin Houska wrote:
> > >
> > > > Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> > > >
> > > > > I rebased this patch series; here's v09. No substantive changes from v08.
> > > > > I made sure the tree still compiles after each commit.
> > >
> > > I rebased again, fixing a compiler warning reported by CI and applying
> > > pgindent to each individual patch. I'm slowly starting to become more
> > > familiar with the whole of this new code.
> >
> > I'm trying to reflect Robert's suggestions about locking [1]. The next version
> > should be a bit simpler, so maybe wait for it before you continue studying the
> > code.
>
> This is it.
One more version, hopefully to make cfbot happy (I missed the bug because I
did not set the RELCACHE_FORCE_RELEASE macro in my environment.)
Besides that, it occurred to me that 0005 ("Preserve visibility information of
the concurrent data changes.") will probably introduce significant
overhead. The problem is that the table we're repacking is treated like a
catalog table, so that reorderbuffer.c generates the snapshots we need to replay
UPDATE / DELETE commands on the new table.
contrib/test_decoding can be used to demonstrate the difference between
ordinary and catalog tables:
CREATE TABLE t(i int);
SELECT pg_create_logical_replication_slot('test_slot', 'test_decoding');
INSERT INTO t SELECT n FROM generate_series(1, 1000000) g(n);
DELETE FROM t;
EXPLAIN ANALYZE SELECT * FROM pg_logical_slot_get_binary_changes('test_slot', null, null);
...
Execution Time: 3521.190 ms
ALTER TABLE t SET (user_catalog_table = true);
INSERT INTO t SELECT n FROM generate_series(1, 1000000) g(n);
DELETE FROM t;
EXPLAIN ANALYZE SELECT * FROM pg_logical_slot_get_binary_changes('test_slot', null, null);
...
Execution Time: 6561.634 ms
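For anyone reproducing the comparison above, here is a minimal sketch of the surrounding housekeeping. It assumes a scratch database on PostgreSQL 17 with wal_level set to logical and contrib/test_decoding installed; none of these statements are from the original message. Note that pg_logical_slot_get_binary_changes consumes the changes it returns, so the second timing covers only the second INSERT / DELETE batch (plus the ALTER TABLE).

```sql
-- Assumption: scratch database; creating a logical slot fails here
-- unless this reports 'logical'.
SHOW wal_level;

-- Ratio of the two execution times reported above
-- (user_catalog_table = true vs. ordinary table).
SELECT round(6561.634 / 3521.190, 2) AS slowdown;

-- Clean up afterwards: a slot that is never consumed keeps WAL from
-- being removed.
SELECT pg_drop_replication_slot('test_slot');
DROP TABLE t;
```

On the numbers quoted, decoding the catalog-table variant took a little under twice as long as the ordinary table.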
I wanted to avoid the "MVCC unsafety" [1], so that both REPACK and REPACK
CONCURRENTLY work "cleanly". We can try to optimize the logical decoding
for REPACK CONCURRENTLY, or implement 0005 in a different way, but I'm not sure
how much effort that would require. Or should we implement REPACK CONCURRENTLY as
MVCC-unsafe for now? (The pg_squeeze extension is not MVCC-safe either.)
[1] https://www.postgresql.org/docs/17/mvcc-caveats.html
--
Antonin Houska
Web: https://www.cybertec-postgresql.com
Attachment | Content-Type | Size |
---|---|---|
v12-0001-Add-REPACK-command.patch | text/x-diff | 89.8 KB |
v12-0002-Move-conversion-of-a-historic-to-MVCC-snapshot-to-a-.patch | text/x-diff | 5.4 KB |
v12-0003-Move-the-recheck-branch-to-a-separate-function.patch | text/x-diff | 4.8 KB |
v12-0004-Add-CONCURRENTLY-option-to-REPACK-command.patch | text/plain | 137.1 KB |
v12-0005-Preserve-visibility-information-of-the-concurrent-da.patch | text/x-diff | 57.3 KB |
v12-0006-Add-regression-tests.patch | text/x-diff | 10.6 KB |
v12-0007-Introduce-repack_max_xlock_time-configuration-variab.patch | text/x-diff | 20.4 KB |
v12-0008-Enable-logical-decoding-transiently-only-for-REPACK-.patch | text/x-diff | 22.6 KB |
v12-0009-Call-logical_rewrite_heap_tuple-when-applying-concur.patch | text/x-diff | 26.0 KB |