Re: relfilenode statistics

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: relfilenode statistics
Date: 2024-06-03 11:11:46
Message-ID: Zl2k8u4HDTUW6QlC@ip-10-97-1-34.eu-west-3.compute.internal
Lists: pgsql-hackers

Hi Robert,

On Mon, May 27, 2024 at 09:10:13AM -0400, Robert Haas wrote:
> Hi Bertrand,
>
> It would be helpful to me if the reasons why we're splitting out
> relfilenodestats could be more clearly spelled out. I see Andres's
> comment in the thread to which you linked, but it's pretty vague about
> why we should do this ("it's not nice") and whether we should do this
> ("I wonder if this is an argument for") and maybe that's all fine if
> Andres is going to be the one to review and commit this, but even if
> then it would be nice if the rest of us could follow along from home,
> and right now I can't.

Thanks for the feedback!

You’re completely right, my previous message is missing a clear explanation of
why I think that relfilenode stats could be useful. Let me try to fix this.

The main argument is that we currently don’t have write counters for relations.
The reason is that we don’t have the relation OID when writing buffers out.
Tracking writes per relfilenode would allow us to track/consolidate writes per
relation (example in the v1 patch and in the message up-thread).
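
To make the idea concrete, here is a minimal sketch in C. It is not taken from
the v1 patch: FileKey, FileStats, count_block_write and blocks_written_for are
hypothetical names, and a small fixed-size array stands in for the real stats
infrastructure. It only shows that the write path needs nothing but the file
identity, while the relation-level view is built later by looking up the
counter for the relation's current relfilenode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key: identifies a file on disk, not a relation OID
 * (the relation OID is not known when a buffer is written out). */
typedef struct FileKey {
    uint32_t tablespace;
    uint32_t database;
    uint32_t relfilenode;
} FileKey;

/* Hypothetical per-file counters. */
typedef struct FileStats {
    FileKey  key;
    uint64_t blocks_written;
    int      in_use;
} FileStats;

#define MAX_ENTRIES 64
static FileStats stats_table[MAX_ENTRIES];   /* zero-initialized */

static int key_equal(const FileKey *a, const FileKey *b)
{
    return a->tablespace == b->tablespace &&
           a->database == b->database &&
           a->relfilenode == b->relfilenode;
}

/* Linear scan: return the entry for key, optionally creating it. */
static FileStats *lookup_stats(FileKey key, int create)
{
    FileStats *free_slot = NULL;

    for (size_t i = 0; i < MAX_ENTRIES; i++) {
        if (stats_table[i].in_use) {
            if (key_equal(&stats_table[i].key, &key))
                return &stats_table[i];
        } else if (free_slot == NULL) {
            free_slot = &stats_table[i];
        }
    }
    if (!create || free_slot == NULL)
        return NULL;

    free_slot->key = key;
    free_slot->blocks_written = 0;
    free_slot->in_use = 1;
    return free_slot;
}

/* Write path: only the file identity is available here. */
void count_block_write(FileKey key)
{
    FileStats *entry = lookup_stats(key, 1);

    assert(entry != NULL);   /* sketch only: the table is assumed large enough */
    entry->blocks_written++;
}

/* Reporting path: the caller first resolves relation -> current file key
 * (that mapping is assumed to exist elsewhere), then reads the counter. */
uint64_t blocks_written_for(FileKey key)
{
    FileStats *entry = lookup_stats(key, 0);

    return entry != NULL ? entry->blocks_written : 0;
}
```

With something of this shape, a view such as pg_statio_all_tables could report
writes for a relation by resolving its relfilenode at query time and calling
the reporting function with that key.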

I think that adding instrumentation in this area (write counters) could be
beneficial (as the counters we currently have for reads are).

The second argument is that this is also beneficial for the "Split index and
table statistics into different types of stats" thread (mentioned in the previous
message). It would allow us to avoid additional branches in some situations (like
the one mentioned by Andres in the link I provided up-thread).

If we agree that the main argument is reason enough to consider relfilenode
stats, then I think using them as proposed in the second argument makes sense too:

We’d move the current buffer read and buffer hit counters from the relation stats
to the relfilenode stats (while still being able to retrieve them from the
pg_statio_all_tables/indexes views: see the example for the new heap_blks_written
stat added in the patch). Generally speaking, I think that tracking counters at
a common level (i.e., the relfilenode level instead of the table or index level)
is beneficial (it avoids storing/allocating space for the same counters in
multiple structs) and sounds more intuitive to me.

Also, I think this opens the door to new ideas: for example, with relfilenode
statistics in place, we could probably also start thinking about tracking
checksum errors per relfilenode.

> The commit message is often a good place to spell this kind of thing
> out, because then it's included with every version of the patch you
> post, and may be of some use to the eventual committer in writing
> their commit message. The body of the email where you post the patch
> set can be fine, too.
>

Yeah, I’ll update the commit message in v2 with better explanations once I get
feedback on v1 (should we decide to move forward with the relfilenode stats idea).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
