Re: per backend I/O statistics

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: per backend I/O statistics
Date: 2024-09-05 14:14:47
Message-ID: Ztm811fM73GFt4z/@ip-10-97-1-34.eu-west-3.compute.internal
Lists: pgsql-hackers

Hi,

On Thu, Sep 05, 2024 at 03:03:32PM +0200, Alvaro Herrera wrote:
> On 2024-Sep-03, Bertrand Drouvot wrote:
>
> > The con is probably allocating shared memory space that might not be used (
> > sizeof(PgStat_IO) is 16392 so that could be a concern for a high number of
> > unused connections). OTOH, if a high number of connections is not used, it
> > might be worth reducing the max_connections setting.
>
> I was surprised by the size you mention so I went to look for the
> definition of that struct:
>

Thanks for looking at it!


> typedef struct PgStat_IO
> {
>     TimestampTz stat_reset_timestamp;
>     PgStat_BktypeIO stats[BACKEND_NUM_TYPES];
> } PgStat_IO;
>
> (It would be good to have more than zero comments here BTW)
>
> So what's happening is that struct PgStat_IO stores the data for every
> single process type, be it regular backends, backend-like auxiliary
> processes, and all other potential postmaster children.

Yep.

> So it seems to
> me that storing one of these structs for "my backend stats" is wrong: I
> think you should be using PgStat_BktypeIO instead (or maybe another
> struct which has a reset timestamp plus BktypeIO, if you care about the
> reset time). That struct is only 1024 bytes long, so it seems much more
> reasonable to have one of those per backend.

Yeah, that could be an area of improvement. Thanks, I'll look at it.
Currently the filtering is done when retrieving the per-backend stats, but it
would be better to do it when storing the stats.
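To make the size difference concrete, here is a minimal sketch of the arithmetic. The array dimensions and the number of backend types are assumptions chosen only so that the stand-in structs reproduce the sizes quoted in this thread (1024 and 16392 bytes); they are not copied from the PostgreSQL headers. PgStat_BackendIO and shmem_bytes_for_backends() are hypothetical names for the "reset timestamp plus BktypeIO" idea, not something the patch defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed dimensions, picked to match the sizes quoted in the thread. */
#define IOOBJECT_NUM_TYPES   2
#define IOCONTEXT_NUM_TYPES  4
#define IOOP_NUM_TYPES       8
#define BACKEND_NUM_TYPES    16

typedef int64_t PgStat_Counter;   /* stand-in for the real typedef */
typedef int64_t TimestampTz;      /* stand-in for the real typedef */

/* Simplified stand-in: I/O counters and timings for one backend type. */
typedef struct PgStat_BktypeIO
{
    PgStat_Counter counts[IOOBJECT_NUM_TYPES][IOCONTEXT_NUM_TYPES][IOOP_NUM_TYPES];
    PgStat_Counter times[IOOBJECT_NUM_TYPES][IOCONTEXT_NUM_TYPES][IOOP_NUM_TYPES];
} PgStat_BktypeIO;

/* Same shape as the struct quoted above: one entry per backend type. */
typedef struct PgStat_IO
{
    TimestampTz     stat_reset_timestamp;
    PgStat_BktypeIO stats[BACKEND_NUM_TYPES];
} PgStat_IO;

/* Hypothetical per-backend entry: reset timestamp plus a single BktypeIO. */
typedef struct PgStat_BackendIO
{
    TimestampTz     stat_reset_timestamp;
    PgStat_BktypeIO stats;
} PgStat_BackendIO;

/* Shared memory needed if every connection slot gets one entry. */
static size_t
shmem_bytes_for_backends(size_t max_backends, size_t entry_size)
{
    return max_backends * entry_size;
}

/* Sanity check that the stand-ins match the figures from the thread. */
static void
check_thread_sizes(void)
{
    assert(sizeof(PgStat_BktypeIO) == 1024);
    assert(sizeof(PgStat_IO) == 16392);
}
```

Under these assumptions, 100 connection slots would need 1639200 bytes with one PgStat_IO each, versus 103200 bytes with one PgStat_BackendIO each.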

> Another way to think about this might be to say that the B_BACKEND
> element of the PgStat_Snapshot.io array should really be spread out to
> MaxBackends separate elements. This would make it more difficult to
> obtain a total value accumulating ops done by all backends (since it
> requires summing the values of each backend), but it allows you to obtain
> backend-specific values, including those of remote backends rather than
> only your own, as Kyotaro suggests.
>

One backend can already see the stats of the other backends thanks to the
pg_stat_get_backend_io() function (which takes a backend pid as a parameter)
that is introduced in the 0002 sub-patch.

I'll ensure that's still the case with the next version of the patch.
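As an illustration of the trade-off Alvaro mentions (a total across backends requires summing each backend's values), here is a rough sketch. It is not code from the patch: the PgStat_BktypeIO layout is a simplified stand-in with assumed dimensions, and bktype_io_add() and sum_backend_io() are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed dimensions; the real values live in the PostgreSQL headers. */
#define IOOBJECT_NUM_TYPES   2
#define IOCONTEXT_NUM_TYPES  4
#define IOOP_NUM_TYPES       8

typedef int64_t PgStat_Counter;   /* stand-in for the real typedef */

/* Simplified stand-in for the per-backend-type I/O counters. */
typedef struct PgStat_BktypeIO
{
    PgStat_Counter counts[IOOBJECT_NUM_TYPES][IOCONTEXT_NUM_TYPES][IOOP_NUM_TYPES];
    PgStat_Counter times[IOOBJECT_NUM_TYPES][IOCONTEXT_NUM_TYPES][IOOP_NUM_TYPES];
} PgStat_BktypeIO;

/* Hypothetical helper: add one backend's counters into a running total. */
static void
bktype_io_add(PgStat_BktypeIO *total, const PgStat_BktypeIO *src)
{
    for (int obj = 0; obj < IOOBJECT_NUM_TYPES; obj++)
        for (int ctx = 0; ctx < IOCONTEXT_NUM_TYPES; ctx++)
            for (int op = 0; op < IOOP_NUM_TYPES; op++)
            {
                total->counts[obj][ctx][op] += src->counts[obj][ctx][op];
                total->times[obj][ctx][op] += src->times[obj][ctx][op];
            }
}

/* Hypothetical helper: total over an array of per-backend entries. */
static PgStat_BktypeIO
sum_backend_io(const PgStat_BktypeIO *per_backend, int nbackends)
{
    PgStat_BktypeIO total;

    memset(&total, 0, sizeof(total));
    for (int i = 0; i < nbackends; i++)
        bktype_io_add(&total, &per_backend[i]);
    return total;
}
```

The cost of the total is linear in the number of per-backend entries, while reading a single backend's values stays a direct lookup.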

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
