From: | Renan Alves Fonseca <renanfonseca(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Prototype: Implement dead tuples xid histograms |
Date: | 2025-04-16 18:38:54 |
Message-ID: | 87sem86jep.fsf@gmail.com |
Lists: | pgsql-hackers |
Hi hackers,
In a recent hacking workshop organized by Robert Haas, we discussed
[1]. Among the autovacuum issues exposed, the useless vacuum case caught
my attention. I've started to study the respective code and I came up
with a prototype to improve the statistics system regarding dead tuples.
The attached patch only partially implements the dead tuples histogram
mentioned in [1]. But, since I'm a beginner, I thought it would be nice
to get early feedback, just to make sure I'm not doing anything very
wrong.
My initial idea was to implement a growing histogram with a linked list
of bins, exploiting the fact that most dead tuples are added to the
last bin. Then I realized that there are no other cases of dynamic
data structures in pg_stats and that it would be harder to serialize.
That's why I chose to implement the histogram as a static data
structure inside one of the pg_stats data structures. It does require a
little more logic to maintain the histogram, but it is well
integrated into the whole pg_stats architecture.
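
As a rough illustration of the static layout described above (this is
not the code in the attached patch), a fixed-size histogram could look
like the sketch below. All names, the bin count, and the bin width are
hypothetical. The sketch assumes plain integer comparison of xids and
ignores wraparound, and it picks one possible overflow policy: when the
array is full, the two oldest bins are merged.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HIST_NBINS     4     /* hypothetical: fixed number of bins */
#define HIST_BIN_WIDTH 100   /* hypothetical: xids covered by a new bin */

/* Hypothetical fixed-size histogram; no pointers, so it is trivially copyable. */
typedef struct DeadXidHist
{
    int      nused;                 /* bins currently in use */
    uint32_t upper[HIST_NBINS];     /* inclusive upper xid bound, ascending */
    uint64_t freq[HIST_NBINS];      /* dead tuples counted per bin */
} DeadXidHist;

static void
dead_xid_hist_init(DeadXidHist *h)
{
    memset(h, 0, sizeof(*h));
}

/* Count ntuples dead tuples attributed to xid (wraparound is ignored). */
static void
dead_xid_hist_add(DeadXidHist *h, uint32_t xid, uint64_t ntuples)
{
    assert(h != NULL);

    /* Find the first bin whose upper bound covers xid. */
    for (int i = 0; i < h->nused; i++)
    {
        if (xid <= h->upper[i])
        {
            h->freq[i] += ntuples;
            return;
        }
    }

    /* xid is newer than every bin: if full, merge the two oldest bins. */
    if (h->nused == HIST_NBINS)
    {
        h->freq[1] += h->freq[0];
        memmove(&h->upper[0], &h->upper[1],
                (HIST_NBINS - 1) * sizeof(h->upper[0]));
        memmove(&h->freq[0], &h->freq[1],
                (HIST_NBINS - 1) * sizeof(h->freq[0]));
        h->nused--;
    }

    /* Open a new last bin starting at xid. */
    h->upper[h->nused] = xid + HIST_BIN_WIDTH - 1;
    h->freq[h->nused] = ntuples;
    h->nused++;
}
```

Merging the oldest bins keeps resolution for recent xids at the cost of
coarser information about old dead tuples; other policies (for example,
merging the adjacent pair with the smallest combined count) are equally
possible.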
As discussed in the hacking workshop, one of the problems is to capture
the exact xmin of the dead tuple. In my tests, I've observed that,
outside of a transaction, xmin corresponds to
GetCurrentTransactionId(). But inside a transaction, xmin receives
incremental xids on successive DML statements. Capturing xids for every
statement inside a transaction seems overkill. So, I decided to
attribute the highest xmin/xid of a transaction to all dead tuples
of that transaction.
To see the statistics for a table t1, we do:
    select pg_stat_get_dead_tuples_xid_freqs('t1'::regclass),
           pg_stat_get_dead_tuples_xid_bounds('t1'::regclass);
Then, to verify that the bounds make sense, I've used:
    select xmin from t1;
In this version, the removal of dead tuples is not yet implemented, so
these histograms only grow.
I would really appreciate any kind of feedback.
Best regards,
Renan Fonseca
[1] How Autovacuum Goes Wrong: And Can We Please Make It Stop Doing
That? (PGConf.dev 2024)
Attachment | Content-Type | Size |
---|---|---|
0001-Implement-dead-tuples-xid-histograms.patch | text/x-patch | 16.5 KB |