n_ins_since_vacuum stats for aborted transactions

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: n_ins_since_vacuum stats for aborted transactions
Date: 2025-04-09 16:57:10
Message-ID: CAA5RZ0sFabU2sAJft5JAqBwrQAbM3X81X-CGUY8CdOdVM+58Og@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I came across what appears to be incorrect behavior in the
pg_stat_all_tables.n_ins_since_vacuum
counter after a rollback.

As shown below, the first two inserts of 1,000 rows were rolled back,
yet they are still counted toward
n_ins_since_vacuum.Consequently, they influence the vacuum insert
threshold calculation—even though
such rolled-back rows are dead tuples and should only affect the
vacuum threshold calculation.

Notice that the n_mod_since_analyze actually does the correct thing
here and does
not take into account the rolledback inserts.

Rollbacks are not common, so this may go unnoticed, but I think it should be
corrected, which means that only committed inserts should count
towards n_ins_since_vacuum.

the n_tup_ins|del|upd should continue to track both committed and rolledback
rows, but I also think the documentation [0] for these fields could be improved
to clarify this point, i.e. n_tup_ins should be documented as
"Total number of rows inserted, including those from aborted transactions"
instead of just "Total number of rows inserted"

```
DROP TABLE t;
CREATE TABLE t (id INT);
ALTER TABLE t SET (autovacuum_enabled = OFF);
BEGIN;
INSERT INTO t SELECT n FROM generate_series(1, 1000) AS n;
INSERT INTO t SELECT n FROM generate_series(1, 1000) AS n;
ROLLBACK;
INSERT INTO t SELECT n FROM generate_series(1, 1000) AS n;
INSERT INTO t SELECT n FROM generate_series(1, 1000) AS n;
INSERT INTO t SELECT n FROM generate_series(1, 1000) AS n;
SELECT
n_tup_ins,
n_tup_upd,
n_tup_del,
n_live_tup,
n_dead_tup,
n_mod_since_analyze,
n_ins_since_vacuum
FROM
pg_stat_all_tables
WHERE
relname = 't';

n_tup_ins | n_tup_upd | n_tup_del | n_live_tup | n_dead_tup |
n_mod_since_analyze | n_ins_since_vacuum
-----------+-----------+-----------+------------+------------+---------------------+--------------------
5000 | 0 | 0 | 3000 | 2000 |
3000 | 5000
(1 row)

```

Thoughts? before I prepare patches for this.

[0] https://www.postgresql.org/docs/current/monitoring-stats.html

--
Sami Imseih
Amazon Web Services (AWS)

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2025-04-09 16:58:23 Re: Add missing PGDLLIMPORT markings
Previous Message Andres Freund 2025-04-09 15:51:12 Re: Draft for basic NUMA observability