Re: Test slots invalidations in 035_standby_logical_decoding.pl only if dead rows are removed

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>
Cc: "Yu Shi (Fujitsu)" <shiy(dot)fnst(at)fujitsu(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Test slots invalidations in 035_standby_logical_decoding.pl only if dead rows are removed
Date: 2024-01-09 17:00:00
Message-ID: 28fffded-7dc6-3518-7e0e-fc76b9221586@gmail.com
Lists: pgsql-hackers

Hello Michael and Bertrand,

I'd also like to note that even with FREEZE added [1], I happened to see
the test failure:
5       #   Failed test 'inactiveslot slot invalidation is logged with vacuum on pg_class'
5       #   at t/035_standby_logical_decoding.pl line 222.
5
5       #   Failed test 'activeslot slot invalidation is logged with vacuum on pg_class'
5       #   at t/035_standby_logical_decoding.pl line 227.

where 035_standby_logical_decoding_primary.log contains:
...
2024-01-09 07:44:26.480 UTC [820142] 035_standby_logical_decoding.pl LOG:  statement: DROP TABLE conflict_test;
2024-01-09 07:44:26.687 UTC [820142] 035_standby_logical_decoding.pl LOG:  statement: VACUUM (VERBOSE, FREEZE) pg_class;
2024-01-09 07:44:26.687 UTC [820142] 035_standby_logical_decoding.pl INFO:  aggressively vacuuming "testdb.pg_catalog.pg_class"
2024-01-09 07:44:27.099 UTC [820143] DEBUG:  autovacuum: processing database "testdb"
2024-01-09 07:44:27.102 UTC [820142] 035_standby_logical_decoding.pl INFO:  finished vacuuming "testdb.pg_catalog.pg_class": index scans: 1
        pages: 0 removed, 11 remain, 11 scanned (100.00% of total)
        tuples: 0 removed, 423 remain, 4 are dead but not yet removable
        removable cutoff: 762, which was 2 XIDs old when operation ended
        new relfrozenxid: 762, which is 2 XIDs ahead of previous value
        frozen: 1 pages from table (9.09% of total) had 1 tuples frozen
....

Thus, just adding FREEZE is seemingly not enough: VACUUM still removed
0 tuples and left 4 that are dead but not yet removable. It makes me wonder
whether 0174c2d21 should be superseded by a patch like the one discussed
(or simply have autovacuum = off added)...
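
As a rough illustration of how the VACUUM (VERBOSE) output quoted above can be checked mechanically, here is a minimal Python sketch. It is not part of the test or of any posted patch; the function names are hypothetical, and the condition "some tuples were removed and none are left dead but not yet removable" is only an assumption about what a useful check might look like.

```python
import re

# Matches the "tuples:" line printed by VACUUM (VERBOSE), as in the log above.
TUPLES_RE = re.compile(
    r"tuples:\s+(\d+) removed, (\d+) remain, (\d+) are dead but not yet removable"
)


def parse_vacuum_tuples(output):
    """Return (removed, remain, dead_not_removable), or None if no match."""
    m = TUPLES_RE.search(output)
    if m is None:
        return None
    return tuple(int(g) for g in m.groups())


def dead_rows_fully_removed(output):
    """Hypothetical check: VACUUM removed something and left no dead rows."""
    counts = parse_vacuum_tuples(output)
    if counts is None:
        return False
    removed, _remain, dead = counts
    return removed > 0 and dead == 0
```

For the output quoted above (0 removed, 4 dead but not yet removable), dead_rows_fully_removed() returns False.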

09.01.2024 07:59, Michael Paquier wrote:
> Alexander, does the test gain in stability once you begin using the
> patch posted on [2], mentioned by Bertrand?
>
> (Also, perhaps we'd better move the discussion to the other thread
> where the patch has been sent.)
>
> [2]: https://www.postgresql.org/message-id/d40d015f-03a4-1cf2-6c1f-2b9aca860762@gmail.com

09.01.2024 08:29, Bertrand Drouvot wrote:
> Alexander, please find attached v3 which is more or less a rebased version of it.

Bertrand, thank you for updating the patch!

Michael, it definitely increases the stability of the test (tens of
iterations with 20 tests in parallel passed), although I've managed to
see another interesting failure (twice):
13      #   Failed test 'activeslot slot invalidation is logged with vacuum on pg_class'
13      #   at t/035_standby_logical_decoding.pl line 227.

psql:<stdin>:1: INFO:  vacuuming "testdb.pg_catalog.pg_class"
psql:<stdin>:1: INFO:  finished vacuuming "testdb.pg_catalog.pg_class": index scans: 1
pages: 0 removed, 11 remain, 11 scanned (100.00% of total)
tuples: 4 removed, 419 remain, 0 are dead but not yet removable
removable cutoff: 754, which was 0 XIDs old when operation ended
...
Waiting for replication conn standby's replay_lsn to pass 0/403E6F8 on primary

But I see no VACUUM records in WAL:
rmgr: Transaction len (rec/tot):    222/   222, tx:          0, lsn: 0/0403E468, prev 0/0403E370, desc: INVALIDATION ; inval msgs: catcache 55 catcache 54 catcache 55 catcache 54 catcache 55 catcache 54 catcache 55 catcache 54 relcache 2662 relcache 2663 relcache 3455 relcache 1259
rmgr: Standby     len (rec/tot):    234/   234, tx:          0, lsn: 0/0403E548, prev 0/0403E468, desc: INVALIDATIONS ; relcache init file inval dbid 16384 tsid 1663; inval msgs: catcache 55 catcache 54 catcache 55 catcache 54 catcache 55 catcache 54 catcache 55 catcache 54 relcache 2662 relcache 2663 relcache 3455 relcache 1259
rmgr: Heap        len (rec/tot):     60/   140, tx:        754, lsn: 0/0403E638, prev 0/0403E548, desc: INSERT off: 2, flags: 0x08, blkref #0: rel 1663/16384/16385 blk 0 FPW
rmgr: Transaction len (rec/tot):     46/    46, tx:        754, lsn: 0/0403E6C8, prev 0/0403E638, desc: COMMIT 2024-01-09 13:40:59.873385 UTC
rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/0403E6F8, prev 0/0403E6C8, desc: RUNNING_XACTS nextXid 755 latestCompletedXid 754 oldestRunningXid 755
rmgr: XLOG        len (rec/tot):     30/    30, tx:          0, lsn: 0/0403E730, prev 0/0403E6F8, desc: CHECKPOINT_REDO
rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/0403E750, prev 0/0403E730, desc: RUNNING_XACTS nextXid 755 latestCompletedXid 754 oldestRunningXid 755
rmgr: XLOG        len (rec/tot):    114/   114, tx:          0, lsn: 0/0403E788, prev 0/0403E750, desc: CHECKPOINT_ONLINE redo 0/403E730; tli 1; prev tli 1; fpw true; xid 0:755; oid 24576; multi 1; offset 0; oldest xid 728 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 755; online
rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/0403E800, prev 0/0403E788, desc: RUNNING_XACTS nextXid 755 latestCompletedXid 754 oldestRunningXid 755

(Full logs are attached.)
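
For anyone who wants to repeat the "no VACUUM records in WAL" check on their own pg_waldump output, a minimal Python sketch follows. It is only an illustration: the function names are hypothetical, and it relies on the assumption that VACUUM's pruning records are written under the Heap2 resource manager (the exact record names differ between PostgreSQL versions, so the sketch looks only at the rmgr name). Lines that do not start with "rmgr:" are skipped.

```python
import re
from collections import Counter

# One pg_waldump record header: resource manager name, then the first
# word of the description (the record type), if it is on the same line.
RECORD_RE = re.compile(r"^rmgr:\s+(\S+)\s+len .*?desc:\s*(\S*)")


def count_records(waldump_text):
    """Count WAL records by (rmgr, record type) in pg_waldump text output."""
    counts = Counter()
    for line in waldump_text.splitlines():
        m = RECORD_RE.match(line)
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts


def has_heap2_records(waldump_text):
    """Assumption: VACUUM pruning shows up as records with rmgr Heap2."""
    return any(rmgr == "Heap2" for rmgr, _ in count_records(waldump_text))
```

On the records quoted above, the only resource managers seen are Transaction, Standby, Heap and XLOG, so has_heap2_records() returns False.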

[1] https://www.postgresql.org/message-id/4fd52508-54d7-0202-5bd3-546c2295967f%40gmail.com

Best regards,
Alexander

Attachment Content-Type Size
035-failures-after-vacuum-on-pg_class.tar.gz application/gzip 34.8 KB
