Re: Vacuum statistics

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, jian he <jian(dot)universality(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2024-08-23 01:07:37
Message-ID: CAPpHfdug0s2MD7bBf-5nDQGn1WBxCKiTmZyGfxHz_7P0CDOjbg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 21, 2024 at 1:39 AM Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
wrote:
>
> I think you've counted the above system tables from the database, but
> I'll double-check it. Thank you for your review!
>
> On 19.08.2024 19:28, Ilia Evdokimov wrote:
> > Are you certain that all tables are included in
> > `pg_stat_vacuum_tables`? I'm asking because of the following:
> >
> >
> > SELECT count(*) FROM pg_stat_all_tables ;
> > count
> > -------
> > 108
> > (1 row)
> >
> > SELECT count(*) FROM pg_stat_vacuum_tables ;
> > count
> > -------
> > 20
> > (1 row)
> >

I'd like to do some review a well.

+ MyDatabaseId = dbid;
+
+ PG_TRY();
+ {
+ tabentry = pgstat_fetch_stat_tabentry(relid);
+ MyDatabaseId = storedMyDatabaseId;
+ }
+ PG_CATCH();
+ {
+ MyDatabaseId = storedMyDatabaseId;
+ }
+ PG_END_TRY();

I think this is generally wrong to change MyDatabaseId, especially if you
have to wrap it with PG_TRY()/PG_CATCH(). I think, instead we need proper
API changes, i.e. make pgstat_fetch_stat_tabentry() and others take dboid
as an argument.

+/*
+ * Get the vacuum statistics for the heap tables.
+ */
+Datum
+pg_stat_vacuum_tables(PG_FUNCTION_ARGS)
+{
+ return pg_stats_vacuum(fcinfo, PGSTAT_EXTVAC_HEAP,
EXTVACHEAPSTAT_COLUMNS);
+
+ PG_RETURN_NULL();
+}

The PG_RETURN_NULL() is unneeded after another return statement. However,
does pg_stats_vacuum() need to return anything? What about making its
return type void?

@@ -874,4 +874,38 @@ pgstat_get_custom_snapshot_data(PgStat_Kind kind)
return pgStatLocal.snapshot.custom_data[idx];
}

+/* hash table for statistics snapshots entry */
+typedef struct PgStat_SnapshotEntry
+{
+ PgStat_HashKey key;
+ char status; /* for simplehash use */
+ void *data; /* the stats data itself */
+} PgStat_SnapshotEntry;

It would be nice to preserve encapsulation and don't expose pgstat_snapshot
hash in the headers. I see there is only one usage of it outside of
pgstat.c: pg_stats_vacuum().

+ Oid storedMyDatabaseId = MyDatabaseId;
+
+ pgstat_update_snapshot(PGSTAT_KIND_RELATION);
+ MyDatabaseId = storedMyDatabaseId;

This manipulation with storedMyDatabaseId looks pretty useless. It seems
to be intended to change MyDatabaseId, while I'm not fan of this as I
mentioned above.

+static PgStat_StatTabEntry *
+fetch_dbstat_tabentry(Oid dbid, Oid relid)
+{
+ Oid storedMyDatabaseId = MyDatabaseId;
+ PgStat_StatTabEntry *tabentry = NULL;
+
+ if (OidIsValid(CurrentDatabaseId) && CurrentDatabaseId == dbid)
+ /* Quick path when we read data from the same database */
+ return pgstat_fetch_stat_tabentry(relid);
+
+ pgstat_clear_snapshot();

It looks scary to reset the whole snapshot each time we access another
database. Need to also mention that the CurrentDatabaseId machinery isn't
implemented.

New functions
pg_stat_vacuum_tables(), pg_stat_vacuum_indexes(), pg_stat_vacuum_database()
are SRFs. When zero Oid is passed they report all the objects. However,
it seems they aren't intended to be used directly. Instead, there are
views with the same names. These views always call them with particular
Oids, therefore SRFs always return one row. Then why bother with SRF?
They could return plain records instead.

Also, as I mentioned above patchset makes a lot of trouble accessing
statistics of relations of another database. But that seems to be useless
given corresponding views allow to see only relations of the current
database. Even if you call functions directly, what is the value of this
information given that you don't know the relation oids in another
database? So, I think if we will give up and limit access to the relations
of the current database patch will become simpler and clearer.

------
Regards,
Alexander Korotkov
Supabase

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-08-23 02:03:37 Re: Improving the notation for ecpg.addons rules
Previous Message Alexander Korotkov 2024-08-23 00:56:23 Re: Add SPLIT PARTITION/MERGE PARTITIONS commands