Re: per backend I/O statistics

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: per backend I/O statistics
Date: 2024-12-12 14:02:38
Message-ID: Z1rs/j5JMdTbUIJJ@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Thu, Dec 12, 2024 at 01:52:03PM +0900, Michael Paquier wrote:
> On Mon, Nov 25, 2024 at 03:47:59PM +0000, Bertrand Drouvot wrote:

> + view. The function does not return I/O statistics for the checkpointer,
> + the background writer, the startup process and the autovacuum launcher
> + as they are already visible in the <link linkend="monitoring-pg-stat-io-view"> <structname>pg_stat_io</structname></link>
> + view and there is only one of those.
>
> This last sentence seems unnecessary? The function is named
> "backend", and well, all these processes are not backends.

Yeah, but you can find their pid through pg_stat_activity for which the pid field
description is "Process ID of this backend". Also we can find "backend_type" in
both pg_stat_activity and pg_stat_io, so I think this last sentence could help
to avoid any confusion though.

> + /*
> + * Maybe an auxiliary process? That should not be possible, due to
> + * pgstat_tracks_per_backend_bktype() though.
> + */
> + if (proc == NULL)
> + proc = AuxiliaryPidGetProc(backend_pid);
> [...]
> + /*
> + * Maybe an auxiliary process? That should not be possible, due to
> + * pgstat_tracks_per_backend_bktype() though.
> + */
> + if (proc == NULL)
> + proc = AuxiliaryPidGetProc(pid);
>
> This does not seem right. Shouldn't we return immediately if
> BackendPidGetProc() finds nothing matching with the PID?

Yeah, that would work. I was keeping the AuxiliaryPidGetProc() calls just in case
we want to add the Aux processes back in the future. Replaced with a comment
in v7 attached instead.

>
> + /* Look for the backend type */
> + for (curr_backend = 1; curr_backend <= num_backends; curr_backend++)
> + {
> + LocalPgBackendStatus *local_beentry;
> + PgBackendStatus *beentry;
> +
> + /* Get the next one in the list */
> + local_beentry = pgstat_get_local_beentry_by_index(curr_backend);
> + beentry = &local_beentry->backendStatus;
> +
> + /* looking for specific PID, ignore all the others */
> + if (beentry->st_procpid != pid)
> + continue;
> +
> + bktype = beentry->st_backendType;
> + break;
> + }
>
> Sounds to me that the backend type is not strictly required in this
> function call if pg_stat_activity can tell already that?

Yeah, but it's needed in pg_stat_get_backend_io() for the stats filtering (to
display only those linked to this backend type), later in the function here:

+ /*
+ * Some combinations of BackendType, IOObject, and IOContext are
+ * not valid for any type of IOOp. In such cases, omit the entire
+ * row from the view.
+ */
+ if (!pgstat_tracks_io_object(bktype, io_obj, io_context))
+ continue;

OTOH, now that we get rid of the AuxiliaryPidGetProc() call in
pg_stat_reset_single_backend_io_counters() we can also remove the need to
look for the backend type in this function.

>
> + (void) pgstat_per_backend_flush_cb(entry_ref, nowait);
>
> I'd recommend to not directly call the callback, use a wrapper
> function instead if need be.

Makes sense, created its own callback in pgstat_backend.c. Bonus point, it avoids
unnecessary pgstat_tracks_per_backend_bktype() checks in :

pgstat_report_bgwriter()
pgstat_report_checkpointer()
pgstat_report_wal()

as there is no attempt to call the callback anymore in those places.

> /*
> - * Simpler wrapper of pgstat_io_flush_cb()
> + * Simpler wrapper of pgstat_io_flush_cb() and pgstat_per_backend_flush_cb().
> */
> void
> pgstat_flush_io(bool nowait)
>
> This is also called in the checkpointer and the bgwriter and the
> walwriter via pgstat_report_wal(), which is kind of useless. Perhaps
> just use a different, separate function instead and use that where it
> makes sense (per se also the argument of upthread that backend stats
> may not be only IO-related..).

Yeah, done that way.

BTW, not related to this particular patch but I realized that pgstat_flush_io()
is called for the walwriter. Indeed, it's coming from the pgstat_report_wal()
call in WalWriterMain(). That can not report any I/O stats activity (as the
walwriter is not part of the I/O stats tracking, see pgstat_tracks_io_bktype()).

So it looks like, we could move pgstat_flush_io() outside of pgstat_report_wal()
and add the pgstat_flush_io() calls only where they need to be made (and so, not
in WalWriterMain()).

Maybe a dedicated thread is worth it for that, thoughts?

> Sounds to me that PgStat_BackendPendingIO should be
> PgStat_BackendPendingStats?

Yeah, can do that as that's what is being used for the pending_size for
example.

OTOH, it's clear that this one in pgstat_io.c has to be "linked" to an "IO"
related one:

"
typedef PgStat_BackendPendingStats PgStat_PendingIO;
"

The right way would probably be to do something like:

"
typedef struct PgStat_BackendPendingStats {
PgStat_BackendPendingIO pendingio;
} PgStat_BackendPendingStats;
"

But I'm not sure that's worth it until we don't have a need to add more
pending stats per-backend, thoughts?

> +{ oid => '8806', descr => 'statistics: per backend IO statistics',
> + proname => 'pg_stat_get_backend_io', prorows => '5', proretset => 't',
>
> Similarly, s/pg_stat_get_backend_io/pg_stat_get_backend_stats/?

I think that's fine to keep pg_stat_get_backend_io(). If we add more per-backend
stats in the future then we could add "dedicated" get functions too and a generic
one retrieving all of them.

> + descr => 'statistics: reset collected IO statistics for a single backend',
> + proname => 'pg_stat_reset_single_backend_io_counters', provolatile => 'v',
>
> And here, pg_stat_reset_backend_stats?

Same as above, we could imagine that in the future the backend would get mutiple
stats and that one would want to reset only the I/O ones for example.

> #define PGSTAT_KIND_SUBSCRIPTION 5 /* per-subscription statistics */
> +#define PGSTAT_KIND_PER_BACKEND 6
>
> Missing one comment here.

Yeap.

> FWIW, I'm so-so about the addition of pg_my_stat_io, knowing that
> pg_stat_get_backend_io(NULL/pg_backend_pid()) does the same job.

Okay, I don't have a strong opinion about that. Removed in v7.

> I
> would just add a note in the docs with a query showing how to use it
> with pg_stat_activity. An example with LATERAL, doing the same work:
> select a.pid, s.* from pg_stat_activity as a,
> lateral pg_stat_get_backend_io(a.pid) as s
> where pid = pg_backend_pid();

I'm not sure it's worth it. I think that's clear that to get our own stats
then we need to provide our own backend pid. For example pg_stat_get_activity()
does not provide such an example using pg_stat_activity or using something like
pg_stat_get_activity(pg_backend_pid()).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v7-0001-per-backend-I-O-statistics.patch text/x-diff 37.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Yurii Rashkovskii 2024-12-12 14:02:46 Re: Add Postgres module info
Previous Message RECHTÉ Marc 2024-12-12 13:50:49 Re: Logical replication timeout