Re: Vacuum statistics

From: Jim Nasby <jnasby(at)upgrade(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-11-13 00:24:03
Message-ID: 9C7A167C-DCDE-4A17-9ABE-6276723FEC50@upgrade.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Nov 10, 2024, at 2:09 PM, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>
> On 08.11.2024 22:34, Jim Nasby wrote:
>>
>>> On Nov 2, 2024, at 7:22 AM, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> <mailto:a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>>>
>>>>> The second is the interrupts field. It is needed for monitoring to know
>>>>> do we have them or not, so tracking them on the database level will do
>>>>> the trick. Interrupt is quite rare event, so once the monitoring system
>>>>> will catch one the DBA can go to the server log for the details.
>>>> Just to confirm… by “interrupt” you mean vacuum encountered an error?
>>> Yes it is.
>> In that case I feel rather strongly that we should label that as “errors”. “Interrupt” could mean a few different things, but “error” is very clear.
>>> I updated patches. I excluded system and user time statistics and save number of interrupts only for database.
>>> I removed the ability to get statistics for all tables, now they can only be obtained for an oid table [0], as suggested here. I also renamed the statistics from pg_stat_vacuum_tables to pg_stat_get_vacuum_tables and similarly for indexes and databases. I noticed that that’s what they’re mostly called. Ready for discussion.
>>>
>> I think it’s better that the views follow the existing naming conventions (which don’t include “_get_”; only the functions have that in their names). Assuming that, the only question becomes pg_stat_vacuum_* vs pg_stat_*_vacuum. Given the existing precedent of pg_statio_*, I’m inclined to go with pg_stat_vacuum_*.
> I have fixed it.

I’ve reviewed and made some cosmetic changes to patch 1, though of note it looks like an effort has been made to keep stat_reset_timestamp at the end of PgStat_StatDBEntry, so I re-arranged that. I also removed some obviously dead code. It appears that pgstat_update_snapshot(), InitSnapshotIterator() and ScanStatSnapshot() are also dead, but I’ve left it in incase I’m missing something. The tests are also failing for me because a number of psql variables aren’t set.

I do think we should separate out the counts for deleted but still visible tuples vs tuples where we couldn’t get a cleanup lock (in other words, recently_dead_tuples and missed_dead_tuples from LVRelState). I realize that’s a departure from how some of the existing reporting works, but IMO combining them together isn’t a pattern we should be repeating since they mean completely different things. Towards that end I did remove missed_dead_tuples from the reporting, and renamed ExtVacReport.dead_tuples to recently_dead_tuples, but I stopped short of creating a separate entry for missed_dead_tuples. Note that while recently_dead_tuples is really a global thing (so only needs to be reported at a global (or at most per-database) level, but missed_dead_tuples should really be at a per-table level.

Updated 0001-v13 attached, as well as the diff between v12 and v13.



In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-11-13 00:25:37 Re: define pg_structiszero(addr, s, r)
Previous Message Jim Jones 2024-11-13 00:16:42 Re: [PoC] XMLCast (SQL/XML X025)