Re: per backend I/O statistics

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: per backend I/O statistics
Date: 2024-12-16 15:42:04
Message-ID: Z2BKTK9KX5Kc3HaU@ip-10-97-1-34.eu-west-3.compute.internal
Lists: pgsql-hackers

Hi,

On Mon, Dec 16, 2024 at 05:07:52PM +0900, Michael Paquier wrote:
> On Fri, Dec 13, 2024 at 09:20:13AM +0000, Bertrand Drouvot wrote:
>
> Not feeling so sure about the value brought by the backend_type
> returned in pg_stat_get_backend_io(), but well..

It's not strictly necessary per se, but it ensures that pg_stat_get_backend_io() does not
return rows for "invalid" combinations.

For example:

With the filtering in place, on a "client backend" we would get:

postgres=# select * from pg_stat_get_backend_io(pg_backend_pid());
backend_type | object | context | reads | read_time | writes | write_time | writebacks | writeback_time | extends | extend_time | op_bytes | hits | evictions | reuses | fsyncs | fsync_time | stats_reset
----------------+---------------+-----------+-------+-----------+--------+------------+------------+----------------+---------+-------------+----------+------+-----------+--------+--------+------------+-------------
client backend | relation | bulkread | 0 | 0 | 0 | 0 | 0 | 0 | | | 8192 | 0 | 0 | 0 | | |
client backend | relation | bulkwrite | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 0 | 0 | 0 | | |
client backend | relation | normal | 86 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 1604 | 0 | | 0 | 0 |
client backend | relation | vacuum | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 0 | 0 | 0 | | |
client backend | temp relation | normal | 0 | 0 | 0 | 0 | | | 0 | 0 | 8192 | 0 | 0 | | | |
(5 rows)

and for example for the walsender:

postgres=# select * from pg_stat_get_backend_io(3982910);
backend_type | object | context | reads | read_time | writes | write_time | writebacks | writeback_time | extends | extend_time | op_bytes | hits | evictions | reuses | fsyncs | fsync_time | stats_reset
--------------+---------------+-----------+-------+-----------+--------+------------+------------+----------------+---------+-------------+----------+------+-----------+--------+--------+------------+-------------
walsender | relation | bulkread | 0 | 0 | 0 | 0 | 0 | 0 | | | 8192 | 0 | 0 | 0 | | |
walsender | relation | bulkwrite | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 0 | 0 | 0 | | |
walsender | relation | normal | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 54 | 0 | | 0 | 0 |
walsender | relation | vacuum | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 0 | 0 | 0 | | |
walsender | temp relation | normal | 0 | 0 | 0 | 0 | | | 0 | 0 | 8192 | 0 | 0 | | | |
(5 rows)

While, without the filtering, we would get:

postgres=# select * from pg_stat_get_backend_io(pg_backend_pid());
backend_type | object | context | reads | read_time | writes | write_time | writebacks | writeback_time | extends | extend_time | op_bytes | hits | evictions | reuses | fsyncs | fsync_time | stats_reset
----------------+---------------+-----------+-------+-----------+--------+------------+------------+----------------+---------+-------------+----------+------+-----------+--------+--------+------------+-------------
client backend | relation | bulkread | 0 | 0 | 0 | 0 | 0 | 0 | | | 8192 | 0 | 0 | 0 | | |
client backend | relation | bulkwrite | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 0 | 0 | 0 | | |
client backend | relation | normal | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 50 | 0 | | 0 | 0 |
client backend | relation | vacuum | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 0 | 0 | 0 | | |
client backend | temp relation | bulkread | | | | | | | | | 8192 | | | | | |
client backend | temp relation | bulkwrite | | | | | | | | | 8192 | | | | | |
client backend | temp relation | normal | 0 | 0 | 0 | 0 | | | 0 | 0 | 8192 | 0 | 0 | | | |
client backend | temp relation | vacuum | | | | | | | | | 8192 | | | | | |
(8 rows)

and for a walsender:

postgres=# select * from pg_stat_get_backend_io(3981588);
backend_type | object | context | reads | read_time | writes | write_time | writebacks | writeback_time | extends | extend_time | op_bytes | hits | evictions | reuses | fsyncs | fsync_time | stats_reset
--------------+---------------+-----------+-------+-----------+--------+------------+------------+----------------+---------+-------------+----------+------+-----------+--------+--------+------------+-------------
walsender | relation | bulkread | 0 | 0 | 0 | 0 | 0 | 0 | | | 8192 | 0 | 0 | 0 | | |
walsender | relation | bulkwrite | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 0 | 0 | 0 | | |
walsender | relation | normal | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 48 | 0 | | 0 | 0 |
walsender | relation | vacuum | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 0 | 0 | 0 | | |
walsender | temp relation | bulkread | | | | | | | | | 8192 | | | | | |
walsender | temp relation | bulkwrite | | | | | | | | | 8192 | | | | | |
walsender | temp relation | normal | 0 | 0 | 0 | 0 | | | 0 | 0 | 8192 | 0 | 0 | | | |
walsender | temp relation | vacuum | | | | | | | | | 8192 | | | | | |
(8 rows)

It would not be possible to remove those 3 extra rows with a join on the
backend type in pg_stat_activity, because only the C code knows which
combinations are valid (through pgstat_tracks_io_object()).
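
To illustrate, the filtering boils down to something like this inside
pg_stat_get_backend_io() (a simplified sketch of the idea, not the exact v9
code; bktype stands for the target backend's BackendType):

    for (int io_obj = 0; io_obj < IOOBJECT_NUM_TYPES; io_obj++)
    {
        for (int io_context = 0; io_context < IOCONTEXT_NUM_TYPES; io_context++)
        {
            /* skip combinations this backend type can never produce */
            if (!pgstat_tracks_io_object(bktype,
                                         (IOObject) io_obj,
                                         (IOContext) io_context))
                continue;

            /* ... build and emit one result row ... */
        }
    }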

> + /* drop the backend stats entry */
> + pgstat_drop_entry(PGSTAT_KIND_BACKEND, InvalidOid, MyProcNumber);
>
> Oh, I've missed something. Shouldn't pgstat_request_entry_refs_gc()
> be called when this returns false?

I'm not sure pgstat_drop_entry() can actually return false for a
PGSTAT_KIND_BACKEND entry here, but I agree it's better to call
pgstat_request_entry_refs_gc() if that happens. Done in v9 attached.
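
Concretely, that's something like the following in the shutdown path (a
sketch; the v9 patch may word it slightly differently):

    /*
     * Drop the backend stats entry; if something still holds a reference
     * to it, ask for a garbage collection of the entry refs instead.
     */
    if (!pgstat_drop_entry(PGSTAT_KIND_BACKEND, InvalidOid, MyProcNumber))
        pgstat_request_entry_refs_gc();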

> The creation of the dshash entry is a bit too early, I think.. How
> about delaying it more so as we don't create entries that could be
> useless if we fail the last steps of authentication? One spot would
> be to delay the creation of the new entry at the end of
> pgstat_bestart(), where we know that we are done with authentication
> and that the backend is ready to report back to the client connected
> to it. It is true that some subsystems could produce stats as of the
> early transactions they generate, which is why pgstat_initialize() is
> done that early in BaseInit(), but that's not really relevant for this
> case?

I think it makes sense to move the stats entry creation to pgstat_bestart();
done that way in v9.
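
For illustration, the move looks roughly like this (the helper name below is
hypothetical, see the v9 patch for the actual call):

    void
    pgstat_bestart(void)
    {
        /* ... existing beentry setup ... */

        /*
         * Hypothetical helper: create the per-backend stats entry only
         * now that authentication is done and the backend is fully
         * initialized.
         */
        pgstat_create_backend_entry(MyProcNumber);
    }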

> I'm still feeling a bit uneasy about the drop done in
> pgstat_shutdown_hook(); it would be nice to make sure that this
> happens in a path that would run just after we're done with the
> creation of the entry to limit windows where we have an entry but no
> way to drop it, or vice-versa,

I don't believe that's possible, as MyProcNumber can't be "reused" (and is
guaranteed to remain valid) until ProcKill() is done (which happens after
pgstat_shutdown_hook()).

> Perhaps there's an argument for an entirely separate
> callback that would run before pgstat is plugged off, like a new
> before_shmem_exit() callback registered after the entry is created?

As ProcKill() is run in shmem_exit() (and so after the before_shmem_exit()
callbacks), I think the way we currently drop the backend stats entry is fine
(unless I'm missing something about your concern).
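
For context, a rough sketch of the ordering involved (simplified from how
these callbacks are registered today):

    /* pgstat_initialize(), called from BaseInit(): */
    before_shmem_exit(pgstat_shutdown_hook, 0);   /* drops the backend entry */

    /* InitProcess(): */
    on_shmem_exit(ProcKill, 0);                   /* releases MyProcNumber */

shmem_exit() runs the before_shmem_exit() callbacks first and the
on_shmem_exit() callbacks after them, so the entry is dropped while
MyProcNumber is still valid.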

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v9-0001-per-backend-I-O-statistics.patch text/x-diff 36.9 KB
