per backend I/O statistics

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: per backend I/O statistics
Date: 2024-09-02 14:55:52
Message-ID: ZtXR+CtkEVVE/LHF@ip-10-97-1-34.eu-west-3.compute.internal
Lists: pgsql-hackers

Hi hackers,

Please find attached a patch to implement $SUBJECT.

While pg_stat_io provides cluster-wide I/O statistics, this patch adds a new
pg_my_stat_io view to display "my" backend I/O statistics and a new
pg_stat_get_backend_io() function to retrieve the I/O statistics for a given
backend pid.

With per-backend granularity, one could, for example, identify which running
backend is responsible for most of the reads, most of the extends, and so on.
The pg_my_stat_io view could also be useful to check the I/O impact of specific
operations or queries in the current session.

Some remarks:

- The patch is split into 2 sub-patches: 0001 introduces the necessary changes
to provide the pg_my_stat_io view and 0002 adds the pg_stat_get_backend_io()
function.
- The idea of having per backend I/O statistics has already been mentioned by
Andres in [1].

Some implementation choices:

- The KIND_IO stats are still "fixed amount" ones as the maximum number of
backends is fixed.
- The statistics snapshot is made for the global stats (the aggregated ones) and
for my backend stats. The snapshot is not built for all the backend stats (that
could be memory expensive depending on the number of max connections and given
the fact that PgStat_IO is 16KB in size).
- The above point means that pg_stat_get_backend_io() behaves as if
stats_fetch_consistency is set to none (each execution re-fetches counters
from shared memory).
- The above 2 points are also the reasons why the pg_my_stat_io view has been
added (as its results honor the stats_fetch_consistency setting). I think it
makes sense to rely on the setting in that case, while I'm not sure it would
make much sense to honor stats_fetch_consistency when retrieving other
backends' I/O stats.
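
To put the memory concern above in numbers, here is a rough back-of-the-envelope
sketch (not part of the patch). It assumes one PgStat_IO-sized copy per backend
at the 16KB quoted above; the function name and the backend counts are
hypothetical, and the real number of backend slots also includes auxiliary
processes, so treat the results as an order of magnitude only.

```python
# Rough estimate only: 16 KB is the PgStat_IO size quoted in this message;
# the backend counts below are hypothetical examples, not measurements.
PGSTAT_IO_BYTES = 16 * 1024


def snapshot_bytes(num_backends: int, per_backend_bytes: int = PGSTAT_IO_BYTES) -> int:
    """Bytes needed to copy one PgStat_IO-sized entry for each backend."""
    if num_backends < 0:
        raise ValueError("num_backends must be non-negative")
    return num_backends * per_backend_bytes


if __name__ == "__main__":
    for n in (100, 1000, 5000):
        mib = snapshot_bytes(n) / (1024 * 1024)
        print(f"{n:>5} backends: {mib:.1f} MiB per snapshot")
```

Snapshotting only the global stats plus the current backend's stats keeps the
cost at two such entries regardless of how many backends exist.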

[1]: https://www.postgresql.org/message-id/20230309003438.rectf7xo7pw5t5cj%40awork3.anarazel.de

Looking forward to your feedback,

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment                                  Content-Type  Size
v1-0001-per-backend-I-O-statistics.patch    text/x-diff   27.2 KB
v1-0002-Add-pg_stat_get_backend_io.patch    text/x-diff   14.4 KB
