From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: per backend I/O statistics
Date: 2024-12-16 15:42:04
Message-ID: Z2BKTK9KX5Kc3HaU@ip-10-97-1-34.eu-west-3.compute.internal
Lists: pgsql-hackers
Hi,
On Mon, Dec 16, 2024 at 05:07:52PM +0900, Michael Paquier wrote:
> On Fri, Dec 13, 2024 at 09:20:13AM +0000, Bertrand Drouvot wrote:
>
> Not feeling so sure about the value brought by the backend_type
> returned in pg_stat_get_backend_io(), but well..
It's not strictly necessary per se, but it ensures that pg_stat_get_backend_io() does
not return rows full of "invalid" combinations.
For example:
With the filtering in place, on a "client backend" we would get:
postgres=# select * from pg_stat_get_backend_io(pg_backend_pid());
backend_type | object | context | reads | read_time | writes | write_time | writebacks | writeback_time | extends | extend_time | op_bytes | hits | evictions | reuses | fsyncs | fsync_time | stats_reset
----------------+---------------+-----------+-------+-----------+--------+------------+------------+----------------+---------+-------------+----------+------+-----------+--------+--------+------------+-------------
client backend | relation | bulkread | 0 | 0 | 0 | 0 | 0 | 0 | | | 8192 | 0 | 0 | 0 | | |
client backend | relation | bulkwrite | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 0 | 0 | 0 | | |
client backend | relation | normal | 86 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 1604 | 0 | | 0 | 0 |
client backend | relation | vacuum | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 0 | 0 | 0 | | |
client backend | temp relation | normal | 0 | 0 | 0 | 0 | | | 0 | 0 | 8192 | 0 | 0 | | | |
(5 rows)
and for example for the walsender:
postgres=# select * from pg_stat_get_backend_io(3982910);
backend_type | object | context | reads | read_time | writes | write_time | writebacks | writeback_time | extends | extend_time | op_bytes | hits | evictions | reuses | fsyncs | fsync_time | stats_reset
--------------+---------------+-----------+-------+-----------+--------+------------+------------+----------------+---------+-------------+----------+------+-----------+--------+--------+------------+-------------
walsender | relation | bulkread | 0 | 0 | 0 | 0 | 0 | 0 | | | 8192 | 0 | 0 | 0 | | |
walsender | relation | bulkwrite | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 0 | 0 | 0 | | |
walsender | relation | normal | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 54 | 0 | | 0 | 0 |
walsender | relation | vacuum | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 0 | 0 | 0 | | |
walsender | temp relation | normal | 0 | 0 | 0 | 0 | | | 0 | 0 | 8192 | 0 | 0 | | | |
(5 rows)
While, without the filtering we would get:
postgres=# select * from pg_stat_get_backend_io(pg_backend_pid());
backend_type | object | context | reads | read_time | writes | write_time | writebacks | writeback_time | extends | extend_time | op_bytes | hits | evictions | reuses | fsyncs | fsync_time | stats_reset
----------------+---------------+-----------+-------+-----------+--------+------------+------------+----------------+---------+-------------+----------+------+-----------+--------+--------+------------+-------------
client backend | relation | bulkread | 0 | 0 | 0 | 0 | 0 | 0 | | | 8192 | 0 | 0 | 0 | | |
client backend | relation | bulkwrite | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 0 | 0 | 0 | | |
client backend | relation | normal | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 50 | 0 | | 0 | 0 |
client backend | relation | vacuum | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 0 | 0 | 0 | | |
client backend | temp relation | bulkread | | | | | | | | | 8192 | | | | | |
client backend | temp relation | bulkwrite | | | | | | | | | 8192 | | | | | |
client backend | temp relation | normal | 0 | 0 | 0 | 0 | | | 0 | 0 | 8192 | 0 | 0 | | | |
client backend | temp relation | vacuum | | | | | | | | | 8192 | | | | | |
(8 rows)
and for a walsender:
postgres=# select * from pg_stat_get_backend_io(3981588);
backend_type | object | context | reads | read_time | writes | write_time | writebacks | writeback_time | extends | extend_time | op_bytes | hits | evictions | reuses | fsyncs | fsync_time | stats_reset
--------------+---------------+-----------+-------+-----------+--------+------------+------------+----------------+---------+-------------+----------+------+-----------+--------+--------+------------+-------------
walsender | relation | bulkread | 0 | 0 | 0 | 0 | 0 | 0 | | | 8192 | 0 | 0 | 0 | | |
walsender | relation | bulkwrite | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 0 | 0 | 0 | | |
walsender | relation | normal | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 48 | 0 | | 0 | 0 |
walsender | relation | vacuum | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8192 | 0 | 0 | 0 | | |
walsender | temp relation | bulkread | | | | | | | | | 8192 | | | | | |
walsender | temp relation | bulkwrite | | | | | | | | | 8192 | | | | | |
walsender | temp relation | normal | 0 | 0 | 0 | 0 | | | 0 | 0 | 8192 | 0 | 0 | | | |
walsender | temp relation | vacuum | | | | | | | | | 8192 | | | | | |
(8 rows)
It would not be possible to remove those "extra 3 rows" with a join on the
backend type in pg_stat_activity, because only the C code knows what the
compatible combinations are (through pgstat_tracks_io_object()).
> + /* drop the backend stats entry */
> + pgstat_drop_entry(PGSTAT_KIND_BACKEND, InvalidOid, MyProcNumber);
>
> Oh, I've missed something. Shouldn't pgstat_request_entry_refs_gc()
> be called when this returns false?
I'm not sure it's possible for a PGSTAT_KIND_BACKEND entry drop to return
false here, but I agree it's better to call pgstat_request_entry_refs_gc()
in that case. Done in v9 attached.
> The creation of the dshash entry is a bit too early, I think.. How
> about delaying it more so as we don't create entries that could be
> useless if we fail the last steps of authentication? One spot would
> be to delay the creation of the new entry at the end of
> pgstat_bestart(), where we know that we are done with authentication
> and that the backend is ready to report back to the client connected
> to it. It is true that some subsystems could produce stats as of the
> early transactions they generate, which is why pgstat_initialize() is
> done that early in BaseInit(), but that's not really relevant for this
> case?
I think it makes sense to move the stats entry creation into pgstat_bestart();
done that way in v9.
> I'm still feeling a bit uneasy about the drop done in
> pgstat_shutdown_hook(); it would be nice to make sure that this
> happens in a path that would run just after we're done with the
> creation of the entry to limit windows where we have an entry but no
> way to drop it, or vice-versa,
I don't believe that's possible, as MyProcNumber can't be "reused" (and is
guaranteed to remain valid) until ProcKill() is done (which happens after
pgstat_shutdown_hook()).
> Perhaps there's an argument for an entirely separate
> callback that would run before pgstat is plugged off, like a new
> before_shmem_exit() callback registered after the entry is created?
As ProcKill() runs in shmem_exit() (and so after the before_shmem_exit()
callbacks), I think the way we currently drop the backend stats entry is fine
(unless I'm missing something about your concern).
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment: v9-0001-per-backend-I-O-statistics.patch (text/x-diff, 36.9 KB)