Re: per backend I/O statistics

From:	Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To:	Michael Paquier <michael(at)paquier(dot)xyz>
Cc:	Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: per backend I/O statistics
Date:	2024-10-31 05:09:56
Message-ID:	ZyMRJIbUpNPoCXUe@ip-10-97-1-34.eu-west-3.compute.internal
Lists:	pgsql-hackers

Hi,

On Tue, Oct 08, 2024 at 04:28:39PM +0000, Bertrand Drouvot wrote:
> > > On Fri, Sep 20, 2024 at 01:26:49PM +0900, Michael Paquier wrote:
> >
> > Okay, per the above and the persistency of the stats.
>
> Great, I'll work on an updated patch version then.
>

I spent some time on this during the last 2 days and I think we have 3 design
options.

=== GOALS ===

But first let's sum up the goals that I think we agreed on:

- Keep pg_stat_io as it is today: give the whole server picture and serialize
the stats to disk.

- Introduce per-backend IO stats and 2 new APIs to:

1. Provide the IO stats for "my backend" (through say pg_my_stat_io), this
would take care of the stats_fetch_consistency.

2. Retrieve the IO stats for another backend (through say pg_stat_get_backend_io(pid))
that would _not_ take care of stats_fetch_consistency, as:

2.1/ I think that there is no use case (there is no need to get other
backends' I/O statistics while taking care of the stats_fetch_consistency)

2.2/ That could be memory expensive to store a snapshot for all the backends
(depending on the number of backends created)

- There is no need to serialize the per-backend IO stats to disk (no point in
seeing stats for backends that do not exist anymore after a restart).

- The per-backend IO stats should be variable-numbered (not fixed), as per
up-thread discussion.
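
To make the difference between the two access paths in goals 1 and 2 more
concrete, here is a toy model in C. It is only a sketch and not PostgreSQL
code: every name in it (ToyIO, ToySession, toy_fetch_mine(), toy_fetch_other(),
toy_end_xact()) is made up for illustration, and the caching is a simplified
stand-in for what stats_fetch_consistency does. The "my backend" path remembers
the first value it fetched until the end of the transaction, while the "other
backend" path always returns the current value and stores nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical I/O counters; not a PostgreSQL struct. */
typedef struct ToyIO
{
	uint64_t	reads;
	uint64_t	writes;
} ToyIO;

/* Per-session state, used by the "my backend" path only. */
typedef struct ToySession
{
	bool		have_cached;	/* did we already fetch in this xact? */
	ToyIO		cached;			/* value remembered for this xact */
} ToySession;

/*
 * "My backend" path: the first fetch is remembered and returned again
 * until toy_end_xact() is called (simplified consistency model).
 */
ToyIO
toy_fetch_mine(ToySession *sess, const ToyIO *shared_mine)
{
	assert(sess != NULL && shared_mine != NULL);

	if (!sess->have_cached)
	{
		sess->cached = *shared_mine;
		sess->have_cached = true;
	}
	return sess->cached;
}

/*
 * "Other backend" path: always returns the current shared value, so no
 * per-backend snapshot has to be kept in memory.
 */
ToyIO
toy_fetch_other(const ToyIO *shared_other)
{
	assert(shared_other != NULL);
	return *shared_other;
}

/* End of transaction: forget the remembered value. */
void
toy_end_xact(ToySession *sess)
{
	assert(sess != NULL);
	sess->have_cached = false;
}
```

In this model the memory cost of the second path does not depend on the
number of backends, which is the point of 2.2/.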

=== OPTIONS ===

So, based on this, I think that we could:

Option 1: "move" the existing PGSTAT_KIND_IO to variable-numbered and let this
KIND take care of the aggregated view (pg_stat_io) and the per-backend stats.

Option 2: leave PGSTAT_KIND_IO as it is and introduce a new PGSTAT_KIND_BACKEND_IO
that would be variable-numbered.

Option 3: Remove PGSTAT_KIND_IO, introduce a new PGSTAT_KIND_BACKEND_IO that
would be variable-numbered and store the "aggregated stats aka pg_stat_io" in
shared memory (not part of the variable-numbered hash). Per-backend stats
could be aggregated into "pg_stat_io" during the flush_pending_cb call for example.
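
To illustrate the data flow common to Options 2 and 3 (an aggregated view
plus one entry per live backend), here is a toy model in C. It is a sketch
under assumptions, not a proposal for the actual code: all the names
(ToyStats, ToyBackendEntry, toy_flush_pending(), ...) are hypothetical, a small
fixed array stands in for the variable-numbered hash, and there is no locking.
A flush adds the backend's pending counts both to its own entry and to the
aggregate, and dropping the entry at backend exit leaves the aggregate
untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: none of these names exist in PostgreSQL. */
#define TOY_MAX_BACKENDS 8

typedef struct ToyIOCounts
{
	uint64_t	reads;
	uint64_t	writes;
} ToyIOCounts;

/* One slot per live backend; stands in for a variable-numbered entry. */
typedef struct ToyBackendEntry
{
	int			pid;		/* 0 means the slot is free */
	ToyIOCounts shared;		/* flushed counts, visible to other backends */
	ToyIOCounts pending;	/* backend-local counts, not flushed yet */
} ToyBackendEntry;

typedef struct ToyStats
{
	ToyIOCounts aggregate;	/* stands in for the pg_stat_io totals */
	ToyBackendEntry backends[TOY_MAX_BACKENDS];
} ToyStats;

/* Return the entry for pid, or NULL if there is none. */
ToyBackendEntry *
toy_find(ToyStats *s, int pid)
{
	for (size_t i = 0; i < TOY_MAX_BACKENDS; i++)
		if (s->backends[i].pid == pid)
			return &s->backends[i];
	return NULL;
}

/* Create the entry when a backend starts (pid must be non-zero). */
ToyBackendEntry *
toy_backend_start(ToyStats *s, int pid)
{
	ToyBackendEntry *e;

	assert(pid != 0);
	e = toy_find(s, 0);		/* first free slot */
	assert(e != NULL);
	memset(e, 0, sizeof(*e));
	e->pid = pid;
	return e;
}

/* Count I/O locally; nothing is visible to others until the flush. */
void
toy_count_io(ToyBackendEntry *e, uint64_t reads, uint64_t writes)
{
	e->pending.reads += reads;
	e->pending.writes += writes;
}

/* Flush: add pending counts to the backend entry and to the aggregate. */
void
toy_flush_pending(ToyStats *s, ToyBackendEntry *e)
{
	e->shared.reads += e->pending.reads;
	e->shared.writes += e->pending.writes;
	s->aggregate.reads += e->pending.reads;
	s->aggregate.writes += e->pending.writes;
	memset(&e->pending, 0, sizeof(e->pending));
}

/* Backend exit: flush what is left, then drop the per-backend entry. */
void
toy_backend_exit(ToyStats *s, int pid)
{
	ToyBackendEntry *e = toy_find(s, pid);

	assert(e != NULL);
	toy_flush_pending(s, e);
	memset(e, 0, sizeof(*e));	/* pid = 0 marks the slot free again */
}
```

Only the aggregate would need to survive a restart in such a model; the
per-backend entries disappear with their backends.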

=== BEST OPTION? ===

I would opt for Option 2 as:

- The stats system is currently not designed for Option 1 and our goals (for
example, shared_data_len is used to serialize but also to fetch the entries,
see pgstat_fetch_entry()), so that would need some hack to serialize only a part
of them and still be able to fetch them all.

- Mixing "fixed" and "variable" in the same KIND does not sound like a good idea
(though that might be possible with some hacks, I don't think that would be
easy to maintain).

- Having the per-backend as "variable" in its dedicated kind looks more reasonable
and less error-prone.

- I don't think there is a stats design similar to option 3 currently, so I'm
not sure there is a need to develop something new while Option 2 could be done.

- Option 3 would need some hack for (at least) the "pg_stat_io" [de]serialization
part.

- Option 2 seems to offer more flexibility (as compared to Options 1 and 3).

Thoughts?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Re: per backend I/O statistics at 2024-10-08 16:28:39 from Bertrand Drouvot

Responses

Re: per backend I/O statistics at 2024-11-04 10:01:50 from Bertrand Drouvot
