Re: per backend I/O statistics

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: per backend I/O statistics
Date: 2024-09-17 13:07:07
Message-ID: Zul++79EOPsFB+Gd@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Tue, Sep 17, 2024 at 02:52:01PM +0300, Nazir Bilal Yavuz wrote:
> Hi,
>
> On Fri, 13 Sept 2024 at 19:09, Bertrand Drouvot
> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> > On Fri, Sep 13, 2024 at 04:45:08PM +0300, Nazir Bilal Yavuz wrote:
> > > - The pgstat_reset_io_counter_internal() is called in the
> > > pgstat_shutdown_hook(). This causes the stats_reset column showing the
> > > termination time of the old backend when its proc num is reassigned to
> > > a new backend.
> >
> > doh! Nice catch, thanks!
> >
> > And also new backends that are not re-using a previous "existing" process slot
> > are getting the last reset time (if any). So the right place to fix this is in
> > pgstat_io_init_backend_cb(): done in v4 attached. v4 also sets the reset time to
> > 0 in pgstat_shutdown_hook() (but that's not necessary though, that's just to be
> > right "conceptually" speaking).
>
> Thanks, I confirm that it is fixed.

Thanks!

> You mentioned some refactoring upthread:
>
> On Fri, 6 Sept 2024 at 18:03, Bertrand Drouvot
> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> >
> > - If we agree on the current proposal then I'll do some refactoring in
> > pg_stat_get_backend_io(), pg_stat_get_my_io() and pg_stat_get_io() to avoid
> > duplicated code (it's not done yet to ease the review).
>
> Could we remove pg_stat_get_my_io() completely and use
> pg_stat_get_backend_io() with the current backend's pid to get the
> current backend's stats?

The reason why I keep pg_stat_get_my_io() is because (as mentioned in [1]), the
statistics snapshot is build for "my backend stats" (means it depends of the
stats_fetch_consistency value). I can see use cases for that.

OTOH, pg_stat_get_backend_io() behaves as if stats_fetch_consistency is set to
none (each execution re-fetches counters from shared memory) (see [2]). Indeed,
the snapshot is not build in each backend to copy all the others backends stats,
as 1/ I think that there is no use case (there is no need to get others backends
I/O statistics while taking care of the stats_fetch_consistency) and 2/ that
could be memory expensive depending of the number of max connections.

So I think it's better to keep both functions as they behave differently.

Thoughts?

> If you meant the same thing, please don't
> mind it.

What I meant to say is to try to remove the duplicate code that we can find in
those 3 functions (say creating a new function that would contain the duplicate
code and make use of this new function in the 3 others). Did not look at it in
details yet.

[1]: https://www.postgresql.org/message-id/ZtXR%2BCtkEVVE/LHF%40ip-10-97-1-34.eu-west-3.compute.internal
[2]: https://www.postgresql.org/message-id/ZtsZtaRza9bFFeF8%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nazir Bilal Yavuz 2024-09-17 13:47:51 Re: per backend I/O statistics
Previous Message Amit Langote 2024-09-17 12:57:03 Re: generic plans and "initial" pruning