Re: per backend I/O statistics

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: per backend I/O statistics
Date: 2024-11-14 13:30:11
Message-ID: ZzX7Y5i56UrOtlRC@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Thu, Nov 14, 2024 at 03:31:51PM +0900, Michael Paquier wrote:
> On Fri, Nov 08, 2024 at 02:09:30PM +0000, Bertrand Drouvot wrote:
> > 1. one assertion in pgstat_drop_entry_internal() is not necessary true anymore
> > with this new stat kind. So, adding an extra bool as parameter to take care of it.
>
> Why? I have to admit that the addition of this argument at this level
> specific to a new stats kind feels really weird and it looks like a
> layer violation, while this happens only when resetting the whole
> pgstats state because we have loaded a portion of the stats but the
> file loaded was in such a state that we don't have a consistent
> picture.

Yes. I agree that looks weird and I don't like it that much. But the assertion
is not true anymore. If you:

- change the arguments to false in the pgstat_drop_entry_internal() call in
pgstat_drop_all_entries()
- start the engine
- kill -9 postgres
- restart the engine

You'll see the assert failing due to the startup process. I don't think it is
doing something wrong though, just populating its backend stats. Do you have
another view on it?

> > 2. a new struct "PgStat_Backend" is created and does contain the backend type.
> > The backend type is used for filtering purpose when the stats are displayed.
>
> Is that necessary? We can guess that from the PID with a join in
> pg_stat_activity. pg_stat_get_backend_io() from 0004 returns a set of
> tuples for a specific PID, as well..

How would that work for someone doing "select * from pg_stat_get_backend_io(<pid>)"?

Do you mean copy/paste part of the code from pg_stat_get_activity() into
pg_stat_get_backend_io() to get the backend type? That sounds like an idea,
I'll have a look at it.

> + /* save the bktype */
> + if (kind == PGSTAT_KIND_PER_BACKEND)
> + bktype = ((PgStatShared_Backend *) header)->stats.bktype;
> [...]
> + /* restore the bktype */
> + if (kind == PGSTAT_KIND_PER_BACKEND)
> + ((PgStatShared_Backend *) header)->stats.bktype = bktype;
>
> Including the backend type results in these blips in
> shared_stat_reset_contents() which should not have anything related
> to stats kinds and should remain neutral, as well.

Yeah, we can simply get rid of it if we remove the backend type in PgStat_Backend.

> pgstat_prep_per_backend_pending() is used only in pgstat_io.c, so I
> guess that it should be static in this file rather than declared in
> pgstat.h?

Good catch, thanks!

> +typedef struct PgStat_PendingIO
>
> Perhaps this part should use a separate structure named
> "BackendPendingIO"? The definition of the structure has to be in
> pgstat.h as this is the pending_size of the new stats kind. It looks
> like it would be cleaner to keep PgStat_PendingIO local to
> pgstat_io.c, and define PgStat_PendingIO based on
> PgStat_BackendPendingIO?

I see what you meean, what about simply "PgStat_BackendPending" in pgstat.h?

> + /*
> + * Do serialize or not this kind of stats.
> + */
> + bool to_serialize:1;
>
> Not sure that "serialize" is the best term that applies here. For
> pgstats entries, serialization refers to the matter of writing their
> entries with a "serialized" name because they have an undefined number
> when stored locally after a reload. I'd suggest to split this concept
> into its own patch, rename the flag as "write_to_file" (or what you
> think is suited), and also apply the flag in the fixed-numbered loop
> done in pgstat_write_statsfile() before going through the dshash.

Makes sense to create its own patch, will have a look.

> + <row>
> + <entry role="func_table_entry"><para role="func_signature">
> + <indexterm>
> + <primary>pg_stat_reset_single_backend_io_counters</primary>
> + </indexterm>
> + <function>pg_stat_reset_single_backend_io_counters</function> ( <type>int4</type> )
> + <returnvalue>void</returnvalue>
>
> This should document that the input argument is a PID.

Yeap, will add.

>
> Is pg_my_stat_io() the best name ever?

Not 100% sure, but there is already "pg_my_temp_schema" so I thought that
pg_my_stat_io would not be that bad. Open to suggestions though.

> I'd suggest to just merge 0004 with 0001.

Sure.

> Structurally, it may be cleaner to keep all the callbacks and the
> backend-I/O specific logic into a separate file, perhaps
> pgstat_io_backend.c or pgstat_backend_io?

Yeah, I was wondering the same. What about pgstat_backend.c (that would contain
only one "section" dedicated to IO currently)?

> > ==== 0002
> >
> > Merge both IO stats flush callback. There is no need to keep both callbacks.
> >
> > Merging both allows to save O(N^3) while looping on IOOBJECT_NUM_TYPES,
> > IOCONTEXT_NUM_TYPES and IOCONTEXT_NUM_TYPES.
>
> Not sure to be a fan of that, TBH, still I get the performance
> argument of the matter. Each stats kind has its own flush path, and
> this assumes that we'll never going to diverge in terms of the
> information maintained for each backend and the existing pg_stat_io.
> Perhaps there could be a divergence at some point?

Yeah perhaps, so in case of divergence let's split "again"? I mean for the moment
I don't see any reason to keep both and we have pros related to efficiency during
flush.

> > === Remarks
> >
> > R1. The key field is still "objid" even if it stores the proc number. I think
> > that's fine and there is no need to rename it as it's related comment states:
>
> Let's not change the object ID and the hash key. The approach you are
> using here with a variable-numbered stats kinds and a lookup with the
> procnumber is sensible here.

Agree, thanks for confirming.

> > R2. pg_stat_get_io() and pg_stat_get_backend_io() could probably be merged (
> > duplicated code). That's not mandatory to provide the new per backend I/O stats
> > feature. So let's keep it as future work to ease the review.
>
> Not sure about that.

I don't have a strong opinion about this one, so I'm ok to not merge those.

> > R3. There is no use case of retrieving other backend's IO stats with
> > stats_fetch_consistency set to 'snapshot'. Avoiding this behavior allows us to
> > save memory (that could be non negligeable as pgstat_build_snapshot() would add
> > _all_ the other backend's stats in the snapshot). I choose to document it and to
> > not return any rows: I think that's better than returning some data (that would
> > not "ensure" the snapshot consistency) or an error.
>
> I'm pretty sure that there is a sensible argument about being able to
> get a snapshot of the I/O stat of another backend and consider that a
> feature. For example, keeping a look at the activity of the
> checkpointer while doing a scan at a certain point in time?

hm, not sure how the stats_fetch_consistency set to 'snapshot' could help here:

- the checkpointer could write stuff for reason completly unrelated to "your"
backend

- if the idea is to 1. record checkpointer stats, 2. do stuff in the backend and
3. check again the checkpointer stats then I don't think setting stats_fetch_consistency
to snapshot is needed at all. In fact it's quite the opposite as 1. and 3. would
report the same values with stats_fetch_consistency set to 'snapshot' (if done
in the same transaction).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amul Sul 2024-11-14 13:40:03 Re: NOT ENFORCED constraint feature
Previous Message Jonathan S. Katz 2024-11-14 13:26:21 Re: 2024-11-14 release announcement draft