Re: per backend I/O statistics

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: per backend I/O statistics
Date: 2024-11-25 07:12:56
Message-ID: Z0QjeIkwC0HNI16K@ip-10-97-1-34.eu-west-3.compute.internal
Lists: pgsql-hackers

Hi,

On Mon, Nov 25, 2024 at 10:06:44AM +0900, Michael Paquier wrote:
> On Fri, Nov 22, 2024 at 07:49:58AM +0000, Bertrand Drouvot wrote:
> > On Fri, Nov 22, 2024 at 10:36:29AM +0900, Michael Paquier wrote:
> >> Hmm. created_entry only matters for pgstat_init_function_usage().
> >> All the other callers of pgstat_prep_pending_entry() pass a NULL
> >> value.
> >
> > I meant to say all the calls that pass "create" as true in pgstat_get_entry_ref().
>
> Ah, OK, I think that I see your point here.
>
> I am wondering how much this would matter as well for custom stats,
> but we're not there yet without at least one release out and folks trying
> new things with these APIs and variable-numbered kinds.

Not sure here. Could custom stats start incrementing before the database system
is ready to accept connections?

> pgstat_prep_pending_entry() to return NULL even if "create" is true
> may be a good thing, at the end, because that's the only way I can see
> based on the current APIs where we could say "Sorry, but the stats
> have not been loaded yet, so you cannot try to do anything related to
> the dshash".

Yeah, same here.
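
To make the idea concrete, here is a minimal sketch of a "prep" function that
returns NULL until the stats are loaded, even when "create" is true. All the
names (sketch_*) are hypothetical stand-ins used for illustration; this is not
the actual pgstat code and not a proposed patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a stats entry, not a real pgstat structure. */
typedef struct SketchEntry
{
    long        counter;
} SketchEntry;

static bool sketch_stats_loaded = false;    /* set once stats are restored */
static bool sketch_entry_exists = false;
static SketchEntry sketch_entry;            /* stands in for a dshash entry */

/*
 * Return NULL while the stats are not loaded, even if "create" is true, so
 * that the caller knows it must not touch the shared hash table yet.
 */
static SketchEntry *
sketch_prep_pending_entry(bool create)
{
    if (!sketch_stats_loaded)
        return NULL;            /* too early: the dshash is off limits */

    if (!sketch_entry_exists)
    {
        if (!create)
            return NULL;
        sketch_entry.counter = 0;
        sketch_entry_exists = true;
    }
    return &sketch_entry;
}

/* Caller side: skip the counting when no entry is available. */
static bool
sketch_count_io(void)
{
    SketchEntry *entry = sketch_prep_pending_entry(true);

    if (entry == NULL)
        return false;
    entry->counter++;
    return true;
}
```

With this shape, every caller that passes "create" as true has to cope with a
NULL result, which is the cost of not having a dedicated barrier.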

> From my view having a kind of barrier would be cleaner in the long
> run, but it's true that it may not be mandatory, as well. pg_stat_io
> is currently OK to be called because the stats are loaded for
> auxiliary processes because it uses fixed-numbered stats in shmem.
> And it means we already have early calls that add stats getting
> overwritten once the stats are loaded from the on-disk file (Am I
> getting this part right?).

Yeah, we can already see that, for example, the background writer could enter
pgstat_io_flush_cb() before the stats are reset or restored.
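
As a toy illustration of that ordering (hypothetical toy_* names, assuming the
restore simply overwrites the shared counters with the on-disk contents; this
is not the real pgstat code):

```c
/* Hypothetical stand-in for fixed-numbered I/O stats kept in shared memory. */
typedef struct ToyIoStats
{
    long        writes;
} ToyIoStats;

static ToyIoStats toy_shared = {0};

/* Flush: add a process's pending counts to the shared counters. */
static void
toy_flush(long pending_writes)
{
    toy_shared.writes += pending_writes;
}

/* Restore: overwrite the shared counters with what was read from disk. */
static void
toy_restore(const ToyIoStats *on_disk)
{
    toy_shared = *on_disk;
}
```

Calling toy_flush() before toy_restore() loses the flushed counts, while a
flush done after the restore is kept on top of the restored values.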

> Anyway, do we really require that for the sake of this thread? We
> know that there's only one of each auxiliary process at a time, and
> they keep a footprint in pg_stat_io already. So we could just limit
> ourselves to live database backends, WAL senders and autovacuum
> workers, everything that's not auxiliary and spawned on request?

I think that's a fair starting point and that we will not lose any information
by doing so (as you said, there is only one of each auxiliary process at a time,
so one can already see their stats in pg_stat_io).
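
The filtering could look roughly like the sketch below. The enum and the
function name are hypothetical and only list a few process types for
illustration; the real backend type list is longer and the final patch may
well differ.

```c
#include <stdbool.h>

/* Hypothetical, shortened list of process types. */
typedef enum SketchBackendType
{
    SKETCH_B_BACKEND,           /* regular client backend */
    SKETCH_B_WAL_SENDER,
    SKETCH_B_AUTOVAC_WORKER,
    SKETCH_B_BG_WRITER,         /* auxiliary process */
    SKETCH_B_CHECKPOINTER,      /* auxiliary process */
    SKETCH_B_WAL_WRITER         /* auxiliary process */
} SketchBackendType;

/*
 * Per-backend I/O stats are tracked only for processes that are spawned on
 * request.  Auxiliary processes are skipped: there is only one of each at a
 * time, so pg_stat_io already shows their I/O.
 */
static bool
sketch_tracks_per_backend_io(SketchBackendType bktype)
{
    switch (bktype)
    {
        case SKETCH_B_BACKEND:
        case SKETCH_B_WAL_SENDER:
        case SKETCH_B_AUTOVAC_WORKER:
            return true;

        case SKETCH_B_BG_WRITER:
        case SKETCH_B_CHECKPOINTER:
        case SKETCH_B_WAL_WRITER:
            return false;
    }

    return false;               /* unknown type: do not track */
}
```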

The only downside I can see is that we will not be able to merge the flush
callbacks, but I don't think that's a blocker (the flushes are done in shared
memory, so the impact on performance should be small).

I'll come back with a new version implementing the above.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
