Re: relfilenode statistics

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: relfilenode statistics
Date: 2024-07-10 13:38:06
Message-ID: Zo6OvsBLUbYm6nQM@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Wed, Jul 10, 2024 at 03:02:34PM +0900, Michael Paquier wrote:
> On Sat, May 25, 2024 at 07:52:02AM +0000, Bertrand Drouvot wrote:
> > But I think that it is in a state that can be used to discuss the approach it
> > is implementing (so that we can agree or not on it) before moving
> > forward.
>
> I have read through the patch to get an idea of how things are done,

Thanks!

> and I am troubled by the approach taken (mentioned down by you), but
> that's invasive compared to how pgstats wants to be transparent with
> its stats kinds.
>
> + Oid objoid; /* object ID, either table or function
> or tablespace. */
> + RelFileNumber relfile; /* relfilenumber for RelFileLocator. */
> } PgStat_HashKey;
>
> This adds a relfilenode component to the central hash key used for the
> dshash of pgstats, which is something most stats types don't care
> about.

That's right but that's an existing behavior without the patch as:

PGSTAT_KIND_DATABASE does not care care about the objoid
PGSTAT_KIND_REPLSLOT does not care care about the dboid
PGSTAT_KIND_SUBSCRIPTION does not care care about the dboid

That's 3 kinds out of the 5 non fixed stats kind.

Not saying it's good, just saying that's an existing behavior.

> That looks like the incorrect thing to do to me, particularly
> seeing a couple of lines down that a stats kind is assigned so the
> HashKey uniqueness is ensured by the KindInfo:
> + [PGSTAT_KIND_RELFILENODE] = {
> + .name = "relfilenode",

You mean, just rely on kind, dboid and relfile to ensure uniqueness?

I'm not sure that would work as there is this comment in relfilelocator.h:

"
* Notice that relNumber is only unique within a database in a particular
* tablespace.
"

So, I think it makes sense to link the hashkey to all the RelFileLocator
fields, means:

dboid (linked to RelFileLocator's dbOid)
objoid (linked to RelFileLocator's spcOid)
relfile (linked to RelFileLocator's relNumber)

> FWIW, I have on my stack of patches something to switch the objoid to
> 8 bytes, actually, which is something that would be required for
> pg_stat_statements as query IDs are wider than that and affect all
> databases, FWIW. Relfilenodes are 4 bytes, okay still Robert has
> proposed a couple of years ago a patch set to bump that to 56 bits,
> change reverted in a448e49bcbe4.

Right, but it really looks like this extra field is needed to ensure
uniqueness (see above).

> What you would be looking instead is to use the relfilenode as an
> objoid

Not sure that works, as it looks like uniqueness won't be ensured (see above).

> and keep track of the OID of the original relation in each
> PgStat_StatRelFileNodeEntry so as it is possible to know where a past
> relfilenode was used? That makes looking back at the past relation's
> elfilenodes stats more complicated as it would be necessary to keep a
> list of the past relfilenodes for a relation, as well. Perhaps with
> some kind of cache that maintains a mapping between the relation and
> its relfilenode history?

Yeah, I also thought about keeping a list of "previous" relfilenodes stats for a
relation but that would lead to:

1. Keep previous relfilnode stats
2. A more complicated way to look at relation stats (as you said)
3. Extra memory usage

I think the only reason "previous" relfilenode stats are needed is to provide
accurate stats for the relation. Outside of this need, I don't think we would
want to retrieve "individual" previous relfilenode stats in the past.

That's why the POC patch "simply" copies the stats to the relation during a
rewrite (before getting rid of the "previous" relfilenode stats).

What do you think?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2024-07-10 14:03:50 Re: tests fail on windows with default git settings
Previous Message Masahiko Sawada 2024-07-10 13:35:37 Re: MERGE/SPLIT partition commands should create new partitions in the parent's tablespace?