Re: per backend I/O statistics

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: per backend I/O statistics
Date: 2024-09-20 03:53:43
Message-ID: Zuzxx8sd8yg6CpcV@paquier.xyz
Lists: pgsql-hackers

On Wed, Sep 04, 2024 at 04:45:24AM +0000, Bertrand Drouvot wrote:
> On Tue, Sep 03, 2024 at 04:07:58PM +0900, Kyotaro Horiguchi wrote:
>> As an additional benefit of this approach, the client can set a
>> connection variable, for example, no_backend_iostats to true, or set
>> its inverse variable to false, to restrict memory usage to only the
>> required backends.
>
> Thanks for the feedback!
>
> If we were to add an on/off switch, I think I'd vote for a global one
> instead. Indeed, I see this feature more like an "Administrator" one, where
> the administrator wants to be able to find out which session is responsible
> for what (from an I/O point of view): like being able to answer "which
> session is generating this massive amount of reads?"
>
> If we allow each session to disable the feature then the administrator
> would lose this ability.

Hmm, I've been studying this patch, and after reading the whole thing
and seeing its structure, I am not completely sure I agree with the
choice of fixed-numbered stats (FLEXIBLE_ARRAY_MEMBER is a new way to
handle the fact that we don't know exactly how many slots the
fixed-numbered stats need, as MaxBackends may change).  If we make
these kinds of stats variable-numbered, does it actually have to
involve many creations or removals of stats entries, though?  One
point is that the number of entries to know about is capped by
max_connections, which is a PGC_POSTMASTER parameter.  That's the same
kind of control as replication slots.  So one approach would be to
reuse entries in the dshash and use the number in the procarray in
the hashing key.  If a new connection spawns and reuses a slot that
was used in the past, then reset all the existing fields and assign
its PID.
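
To make the slot-reuse idea concrete, here is a minimal standalone
sketch in C.  It is only an illustration under stated assumptions: a
plain array stands in for the dshash, and MAX_SLOTS, BackendIOEntry
and backend_io_attach() are hypothetical names, not anything from the
patch or from the pgstat code.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: stands in for a limit derived from max_connections. */
#define MAX_SLOTS 8

/* Hypothetical per-slot entry; the field names are illustrative only. */
typedef struct BackendIOEntry
{
    int      pid;       /* 0 means the slot has never been used */
    uint64_t reads;
    uint64_t writes;
} BackendIOEntry;

/* Stand-in for the shared hash table, keyed by the slot number. */
static BackendIOEntry entries[MAX_SLOTS];

/*
 * Called when a backend starts in the given slot: the entry left by any
 * previous owner is reused, its counters are reset, and the new PID is
 * assigned.  Returns NULL if the slot number is out of range.
 */
static BackendIOEntry *
backend_io_attach(int slot, int pid)
{
    BackendIOEntry *e;

    if (slot < 0 || slot >= MAX_SLOTS)
        return NULL;

    e = &entries[slot];
    memset(e, 0, sizeof(*e));   /* reset all the existing fields */
    e->pid = pid;               /* assign the new owner's PID */
    return e;
}
```

The point of the sketch is that entries are never created or dropped
at connection time; the set of keys is fixed by the slot count, and
only the contents of an entry change when a slot changes owner.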

Another thing is the consistency of the data that we'd like to keep at
shutdown.  If the connections have a balanced amount of stats shared
among them, making decisions based on them is fairly easy.  But
that may cause confusion if the activity is unbalanced across the
sessions.  We could also not flush them to disk as an option, but it
still seems more useful to me to save this data across restarts if one
takes frequent snapshots of the new system view reporting everything,
so that it is possible to get an idea of the deltas across the
snapshots for each connection slot.
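
As a rough sketch of the per-slot delta computation, assuming each
snapshot row carries the PID that owned the slot plus cumulative
counters (SlotSnapshot and slot_delta() are hypothetical names, and
treating a PID change as a counter reset is an assumption of this
example, not something the patch defines):

```c
#include <stdint.h>

/* Hypothetical snapshot of one connection slot at one point in time. */
typedef struct SlotSnapshot
{
    int      pid;
    uint64_t reads;
    uint64_t writes;
} SlotSnapshot;

/*
 * Activity in one slot between two snapshots.  If the slot changed
 * owner, or the counters went backwards, assume the entry was reset
 * in between and report the current values as the delta.
 */
static SlotSnapshot
slot_delta(SlotSnapshot prev, SlotSnapshot cur)
{
    SlotSnapshot d = cur;

    if (prev.pid == cur.pid &&
        cur.reads >= prev.reads &&
        cur.writes >= prev.writes)
    {
        d.reads = cur.reads - prev.reads;
        d.writes = cur.writes - prev.writes;
    }
    return d;
}
```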
--
Michael
