Use heap scan routines directly in vac_update_datfrozenxid()

From: Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Use heap scan routines directly in vac_update_datfrozenxid()
Date: 2024-10-06 20:39:42
Message-ID: CAE-ML+8kNjRWO9YwWie2DdSbmRAgMPUO5mqtq0A15Uj3-MZXfA@mail.gmail.com
Lists: pgsql-hackers

Hi hackers,

Attached is a simple patch that uses the heap scan routines directly in
vac_update_datfrozenxid(), avoiding the multilayer overhead of the
sysscan infrastructure. The speedup can be noticeable in databases
containing a large number of relations (for example, due to heavy use
of partitioned tables). This was proposed in [1].

Experiment setup:

* Use an -O3 optimized build without asserts, with fsync and autovacuum off,
on my laptop. All other GUCs are at their defaults.

* Create tables using pgbench to inflate pg_class to a decent size.

$ cat << EOF > bench.sql
> select txid_current() AS txid \gset
> CREATE TABLE t:txid(a int);
> EOF

$ pgbench -f ./bench.sql -t 200000 -c 100 -n bench

select pg_size_pretty(pg_relation_size('pg_class'));
pg_size_pretty
----------------
3508 MB
(1 row)

* Use instr_time to record the scan time. See attached instr_vac.diff.

* Run vacuum on any of the created empty tables in the database bench.

Results:

* main as of 68dfecbef2:

bench=# vacuum t1624;
NOTICE: scan took 796.862142 ms
bench=# vacuum t1624;
NOTICE: scan took 793.730688 ms
bench=# vacuum t1624;
NOTICE: scan took 793.963655 ms

* patch:

bench=# vacuum t1624;
NOTICE: scan took 682.283366 ms
bench=# vacuum t1624;
NOTICE: scan took 670.816975 ms
bench=# vacuum t1624;
NOTICE: scan took 683.821717 ms
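
To put the two sets of timings side by side, here is a minimal sketch that averages the three runs per build and reports the relative drop in scan time. The timings are copied from the NOTICE lines above; the variable and function names are only illustrative and not part of the patch. Three runs on a laptop are a small sample, so treat the result as a rough indication.

```python
from statistics import mean

# Scan times in milliseconds, copied from the NOTICE lines above.
main_ms = [796.862142, 793.730688, 793.963655]
patch_ms = [682.283366, 670.816975, 683.821717]


def relative_reduction(before, after):
    """Return the fraction of the mean 'before' time saved by 'after'."""
    return (mean(before) - mean(after)) / mean(before)


main_avg = mean(main_ms)
patch_avg = mean(patch_ms)
reduction = relative_reduction(main_ms, patch_ms)

print(f"main:      {main_avg:.1f} ms")
print(f"patch:     {patch_avg:.1f} ms")
print(f"reduction: {reduction:.1%}")
```

On these numbers the mean scan time goes from about 795 ms to about 679 ms, a reduction of roughly 14.6%.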

Regards,
Soumyadeep (Broadcom)

[1] https://www.postgresql.org/message-id/20221229030329.fbpiitatmowzza6c%40awork3.anarazel.de

Attachment                                                 Content-Type  Size
v1-0001-Use-heap_getnext-in-vac_update_datfrozenxid.patch  text/x-patch  1.6 KB
instr_vac.diff                                             text/x-patch  1.1 KB
