Re: per backend I/O statistics

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: per backend I/O statistics
Date: 2024-09-03 07:21:23
Message-ID: Zta482TuN7ri1Bis@ip-10-97-1-34.eu-west-3.compute.internal
Lists: pgsql-hackers

Hi,

On Tue, Sep 03, 2024 at 03:37:49PM +0900, Kyotaro Horiguchi wrote:
> At Mon, 2 Sep 2024 14:55:52 +0000, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote in
> > Hi hackers,
> >
> > Please find attached a patch to implement $SUBJECT.
> >
> > While pg_stat_io provides cluster-wide I/O statistics, this patch adds a new
> > pg_my_stat_io view to display "my" backend I/O statistics and a new
> > pg_stat_get_backend_io() function to retrieve the I/O statistics for a given
> > backend pid.
> >
>
> I'm not sure about the usefulness of having the stats only available
> from the current session. Since they are stored in shared memory,
> shouldn't we make them accessible to all backends?

Thanks for the feedback!

The stats are accessible to all backends thanks to 0002 and the introduction
of the pg_stat_get_backend_io() function.

> However, this would
> introduce permission considerations and could become complex.

Not sure that the data exposed here is sensitive enough to warrant permission
restrictions.

> When I first looked at this patch, my initial thought was whether we
> should let these stats stay "fixed." The reason why the current
> PGSTAT_KIND_IO is fixed is that there is only one global statistics
> storage for the entire database. If we have stats for a flexible
> number of backends, it would need to be non-fixed, perhaps with the
> entry for INVALID_PROC_NUMBER storing the global I/O stats, I
> suppose. However, one concern with that approach would be the impact
> on performance due to the frequent creation and deletion of stats
> entries caused by high turnover of backends.
>

The pros of the fixed-size approach are:

- less code change (I think, as I did not write the non-fixed counterpart)
- probably better performance and less scalability impact (in case of a high rate
of backend creation/deletion)

The con is probably allocating shared memory that might not be used
(sizeof(PgStat_IO) is 16392 bytes, so that could be a concern with a high number
of unused connections). OTOH, if a high number of connections is unused, it
might be worth reducing the max_connections setting.
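
To put a rough number on that cost, here is a minimal sketch of the arithmetic.
It assumes the 16392-byte figure quoted above holds for the build in question,
and the function name and the nslots parameter are hypothetical: the real slot
count would depend on max_connections plus other backend types, so treat the
result as an order of magnitude only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* sizeof(PgStat_IO) as quoted above; it may differ across builds and versions. */
#define PGSTAT_IO_BYTES ((size_t) 16392)

/*
 * Shared memory needed for nslots per-backend I/O stats slots.
 * nslots is a hypothetical input, not a value taken from the patch.
 */
static size_t
backend_io_stats_bytes(size_t nslots)
{
    /* Guard against overflow of the multiplication. */
    assert(nslots <= SIZE_MAX / PGSTAT_IO_BYTES);
    return nslots * PGSTAT_IO_BYTES;
}
```

For instance, backend_io_stats_bytes(100) is 1639200 bytes (roughly 1.6 MB) and
backend_io_stats_bytes(1000) is 16392000 bytes (roughly 16.4 MB).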

"Conceptually" speaking, we know what the maximum number of backend is, so I
think the fixed-size approach makes sense (it is somewhat comparable to
PGSTAT_KIND_SLRU, which relies on SLRU_NUM_ELEMENTS).
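
To illustrate the idea (this is not the patch's code), here is a minimal sketch
of a fixed-size layout: one slot per possible backend, allocated once and
indexed by a backend slot number, so a backend starting or exiting only resets
a slot instead of creating or dropping a stats entry. All names, the struct
fields and the array size are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical upper bound on the number of backends, known at startup. */
#define DEMO_MAX_BACKENDS 8

/* Hypothetical, heavily simplified stand-in for per-backend I/O counters. */
typedef struct DemoBackendIO
{
    unsigned long reads;
    unsigned long writes;
} DemoBackendIO;

/* Allocated once; its size never changes as backends come and go. */
static DemoBackendIO demo_io_slots[DEMO_MAX_BACKENDS];

/*
 * Called when a backend takes over slot procno: zero the counters left
 * behind by the previous owner and hand back the slot.
 */
static DemoBackendIO *
demo_backend_start(int procno)
{
    assert(procno >= 0 && procno < DEMO_MAX_BACKENDS);
    memset(&demo_io_slots[procno], 0, sizeof(DemoBackendIO));
    return &demo_io_slots[procno];
}
```

The trade-off is the one described above: slots for backends that never start
still occupy memory, but a high backend turnover costs only a memset per start.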

> Just to be clear, the above comments are not meant to oppose the
> current implementation approach. They are purely for the sake of
> discussing comparisons with other possible approaches.

No problem at all, thanks for your feedback and sharing your thoughts!

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
