Re: per backend I/O statistics

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: per backend I/O statistics
Date: 2024-09-06 15:03:17
Message-ID: ZtsZtaRza9bFFeF8@ip-10-97-1-34.eu-west-3.compute.internal
Lists: pgsql-hackers

Hi,

On Thu, Sep 05, 2024 at 02:14:47PM +0000, Bertrand Drouvot wrote:
> Hi,
>
> On Thu, Sep 05, 2024 at 03:03:32PM +0200, Alvaro Herrera wrote:
> > So it seems to
> > me that storing one of these struct for "my backend stats" is wrong: I
> > think you should be using PgStat_BktypeIO instead (or maybe another
> > struct which has a reset timestamp plus BktypeIO, if you care about the
> > reset time). That struct is only 1024 bytes long, so it seems much more
> > reasonable to have one of those per backend.
>
> Yeah, that could be an area of improvement. Thanks, I'll look at it.
> Currently the filtering is done when retrieving the per backend stats but better
> to do it when storing the stats.

I ended up creating (in v2 attached):

"
typedef struct PgStat_Backend_IO
{
	TimestampTz stat_reset_timestamp;
	BackendType bktype;
	PgStat_BktypeIO stats;
} PgStat_Backend_IO;
"

The stat_reset_timestamp is there so that one knows when a particular backend
had its I/O stats reset. The backend type is also part of the struct so that
the stats can be filtered correctly when we display them.

The struct size is 1040 bytes, which is much more reasonable than the size
needed for per backend I/O stats in v1 (about 16KB).
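
To make the 1040 bytes figure easy to check, here is a minimal standalone sketch of the layout. It does not include any PostgreSQL header: the typedefs, the placeholder BackendType enum and the three array dimensions are stand-ins that are assumed to match the PostgreSQL tree at the time of this thread, and the result assumes a typical 64-bit platform (4-byte enum, 8-byte alignment for int64_t).

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL typedefs (assumed to be 64-bit integers). */
typedef int64_t TimestampTz;
typedef int64_t PgStat_Counter;

/* Assumed array dimensions, not taken from the patch itself. */
#define IOOBJECT_NUM_TYPES	2	/* relation, temp relation */
#define IOCONTEXT_NUM_TYPES	4	/* bulkread, bulkwrite, normal, vacuum */
#define IOOP_NUM_TYPES		8	/* evict, extend, fsync, hit, read, reuse, write, writeback */

/* Placeholder enum: only its size (usually 4 bytes) matters here. */
typedef enum BackendType
{
	B_INVALID = 0,
	B_BACKEND
} BackendType;

/* Counters and timings for one backend type: 2 * (2 * 4 * 8) * 8 bytes. */
typedef struct PgStat_BktypeIO
{
	PgStat_Counter counts[IOOBJECT_NUM_TYPES][IOCONTEXT_NUM_TYPES][IOOP_NUM_TYPES];
	PgStat_Counter times[IOOBJECT_NUM_TYPES][IOCONTEXT_NUM_TYPES][IOOP_NUM_TYPES];
} PgStat_BktypeIO;

/* Same member order as the struct proposed in v2. */
typedef struct PgStat_Backend_IO
{
	TimestampTz stat_reset_timestamp;	/* 8 bytes */
	BackendType bktype;					/* 4 bytes, then 4 bytes of padding */
	PgStat_BktypeIO stats;				/* 1024 bytes */
} PgStat_Backend_IO;

/* Compilation fails if the layout does not add up as described. */
_Static_assert(sizeof(PgStat_BktypeIO) == 1024, "counts + times should be 1024 bytes");
_Static_assert(offsetof(PgStat_Backend_IO, stats) == 16, "stats should start after 8 + 4 + 4 bytes");
_Static_assert(sizeof(PgStat_Backend_IO) == 1040, "whole entry should be 1040 bytes");
```

Under those assumptions the file compiles cleanly with a C11 compiler (for example with "cc -std=c11 -c"), which confirms the arithmetic: 8 bytes of timestamp, 4 bytes of backend type, 4 bytes of padding and 1024 bytes of counters.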

> One backend can already see the stats of the other backends thanks to the
> pg_stat_get_backend_io() function (that takes a backend pid as parameter)
> that is introduced in the 0002 sub-patch.

0002 still provides the pg_stat_get_backend_io() function so that one could
get the stats of other backends.

Example:

postgres=# select backend_type,object,context,reads,extends,hits from pg_stat_get_backend_io(3779502);
  backend_type  |    object     |  context  | reads | extends |  hits
----------------+---------------+-----------+-------+---------+--------
 client backend | relation      | bulkread  |     0 |         |      0
 client backend | relation      | bulkwrite |     0 |       0 |      0
 client backend | relation      | normal    |    73 |    2216 | 504674
 client backend | relation      | vacuum    |     0 |       0 |      0
 client backend | temp relation | normal    |     0 |       0 |      0
(5 rows)

The target could also be an individual walsender:

postgres=# select pid, backend_type from pg_stat_activity where backend_type = 'walsender';
   pid   | backend_type
---------+--------------
 3779565 | walsender
(1 row)

postgres=# select backend_type,object,context,reads,hits from pg_stat_get_backend_io(3779565);
 backend_type |    object     |  context  | reads | hits
--------------+---------------+-----------+-------+------
 walsender    | relation      | bulkread  |     0 |    0
 walsender    | relation      | bulkwrite |     0 |    0
 walsender    | relation      | normal    |     6 |   48
 walsender    | relation      | vacuum    |     0 |    0
 walsender    | temp relation | normal    |     0 |    0
(5 rows)

and so on...

Remarks:

- As stated up-thread, pg_stat_get_backend_io() behaves as if
stats_fetch_consistency is set to none (each execution re-fetches counters
from shared memory). Indeed, no snapshot is built in each backend to copy
all the other backends' stats, as 1/ there is no use case (there is no need to
get other backends' I/O statistics while honoring stats_fetch_consistency)
and 2/ that could be memory expensive depending on the number of max connections.

- If we agree on the current proposal then I'll do some refactoring in
pg_stat_get_backend_io(), pg_stat_get_my_io() and pg_stat_get_io() to avoid
duplicated code (it's not done yet to ease the review).
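
The memory concern in the first remark can be made concrete with a rough back-of-the-envelope sketch. It assumes the 1040 bytes per backend entry mentioned above and a hypothetical worst case where every backend holds a snapshot of every backend's entry. The function names are made up for illustration, and any per-entry overhead of the snapshot itself is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of one per-backend entry, taken from the 1040 bytes figure above. */
#define BACKEND_IO_ENTRY_BYTES ((size_t) 1040)

/* Bytes one backend would need to hold a copy of every backend's entry. */
size_t
snapshot_bytes_per_backend(size_t n_backends)
{
	/* Guard against overflow of the multiplication below. */
	assert(n_backends <= SIZE_MAX / BACKEND_IO_ENTRY_BYTES);
	return n_backends * BACKEND_IO_ENTRY_BYTES;
}

/* Hypothetical worst case: every backend holds such a snapshot at once. */
size_t
snapshot_bytes_total(size_t n_backends)
{
	size_t		per_backend = snapshot_bytes_per_backend(n_backends);

	/* Guard against overflow of the multiplication below. */
	assert(n_backends == 0 || per_backend <= SIZE_MAX / n_backends);
	return n_backends * per_backend;
}
```

With 100 backends that is 104000 bytes per snapshot and about 10.4 MB in total; with 1000 backends it grows to about 1.04 MB per snapshot and about 1.04 GB in total, since the total grows with the square of the number of backends.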

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment                                  Content-Type  Size
v2-0001-per-backend-I-O-statistics.patch    text/x-diff   30.6 KB
v2-0002-Add-pg_stat_get_backend_io.patch    text/x-diff   14.4 KB
